- Infos im HLRS Wiki sind nicht rechtsverbindlich und ohne Gewähr -
- Information contained in the HLRS Wiki is not legally binding and HLRS is not responsible for any damages that might result from its use -

Workspace mechanism: Difference between revisions

From HLRS Platforms
 
(38 intermediate revisions by 4 users not shown)
Line 2: Line 2:
The toolset is an Open Source Project (see [https://github.com/holgerBerger/hpc-workspace hpc-workspace@github]). Further documentation can be found at github as well, e.g. the [https://github.com/holgerBerger/hpc-workspace/blob/master/user-guide.md Workspace user guide@github].
The toolset is an Open Source Project (see [https://github.com/holgerBerger/hpc-workspace hpc-workspace@github]). Further documentation can be found at github as well, e.g. the [https://github.com/holgerBerger/hpc-workspace/blob/master/user-guide.md Workspace user guide@github].


== Allocating new workspace:==  
== Allocating new workspace:==
Workspaces can be allocated with the following command:
{{command|command=MYSCR=`ws_allocate SimulatesomeThing 10` }}
MYSCR will contain the name of a directory that exists for 10 days, is on a temporary filesystem, and is owned by the caller. The directory is not deleted after the job, but after 10 days of realworld time. In a second job, you can just use the same line to get the same directory. Please note that the directory of the example will be deleted 10 days after first usage, no matter how often it is re-used and what duration was specified in the subsequent calls. The name may not contain any special characters, only digits and letters are allowed (the only exceptions are dash, dot, and underscore, which are also possible but not as first character of the name).


  MYSCR=`ws_allocate SimulatesomeThing 10`
=== Useful ws_allocate options: ===
<UL>
 
MYSCR will contain the name of a directory which exists for 10 days, is on a temporary filesystem, and is owned by the caller. The directory is not deleted after the job, but after 10 days of realworld time. In a second job, you can just use the same line to get the same directory. Please note that the directory of the example will be deleted 10 days after first usage, no matter how often it is used and what duration was specified in the subsequent calls. The name may not contain any special characters, only digits and letters are allowed (the only exceptions are dash, dot, and underscore which are also possible, but not as first character of the name).<BR>
The ws_allocate command comes with a number of options. Some useful ones are described below. For the full description see the man page of ws_allocate.
 
'''Options:'''<BR>  
<LI> Non-default filesystem
</UL>
 
    ws_allocate [-F filesystem] name duration  
{{command|command=ws_allocate [-F <filesystem>] <name> <duration>}}
<UL>  
The option -F <filesystem> specifies the filesystem on which your workspace will be located. If this option is omitted, then your workspace will be located on a default filesystem.<BR>
The option -F filesystem specifies the filesystem on which your workspace will be located. If this option is omitted, then your workspace will be located on a default filesystem.<BR>
If <duration> is omitted your workspace will have default lifetime of 1 day.
</LI>
</UL>
 
<LI> Expiration reminder
 
You can enable a reminder mail some days before your workspace expires and is deleted using the options -r <days> and -m <mailaddress>:
{{command|command=ws_allocate -r <days before ws expires> -m <mailaddress> ...}}
If you don't want to add these options each time to your ws_allocate command you can add these information to the '~/.ws_user.conf' configuration file:
{{file|filename=~/.ws_user.conf
|content=<pre>
mail:    <mailaddress>
reminder: <days>
</pre>
}}
</LI>


== Find your existing workspace path:==  
== Find your existing workspace path:==  
  MYSCR=`ws_find SimulatesomeThing`
{{Command|command=MYSCR=$(ws_find myWorkspaceName)}}
<UL>
MYSCR will contain the name of the directory where your prior allocated workspace is located
MYSCR will contain the name of the directory where your prior allocated workspace is located
</UL>
</UL>
Line 38: Line 52:
   ws_list -F <fstype>
   ws_list -F <fstype>
<UL>  
<UL>  
If several filesystems are used, you can limit the listing to workspaces of a specific filesystem type only. On Laki the filesystem NEC_lustre is available, on Hazelhen the filesystem univ_1. default is also a possible value.
If several filesystems are used, you can limit the listing to workspaces of a specific filesystem type only. On vulcan the filesystem ws3 is available, on hawk the filesystem ws10.*, ws11.*. default is also a possible value.
</UL>  
</UL>  
   ws_list -l
   ws_list -l
Line 45: Line 59:
</UL>
</UL>


== Release a workspace:==  
== Release a workspace:==
If a workspace <tt>wsname</tt> is not needed any more it is good practice to release it with
{{Command|command=ws_release [-F filesystem] <wsname>}}
If the option <tt>-F filesystem</tt> is omitted, then the workspace <tt>wsname</tt> will be released on the default workspace filesystem.
 
Users are responsible for releasing the correct workspace. After a workspace is released or the workspace is expired, the directory of the workspace is moved to some kind of a trash can.
 
{{Warning|text=Released workspaces may still account for quota limits of the user and group.  Please see also the [[#Notes_for_expired_or_released_workspaces|note w.r.t. quota]] before doing a ws_release!}}


  ws_release [-F filesystem] name
<UL>
Release the workspace 'name' on the specified filesystem. If the optione '-F filesystem' is omitted, then the workspace 'name' will be released on the default workspace filesystem. The user are responsible for releasing the correct workspace. After a workspace is released or the workspace is expired, then the directory of this workspace will be deleted.
</UL>
==Register your workspaces:==  
==Register your workspaces:==  


Line 60: Line 77:
== Extend your workspace duration:==  
== Extend your workspace duration:==  


Starting with Version 3.0 of the workspace tools there is a possibility to extend an existing workspace:
It is possible to extend an existing workspace <tt>wsname</tt> within the policies set by the administrators with:
ws_extend [-f <fsname>] <wsname> [<duration>]
{{Command|command=ws_extend [-f <filesystem>] <wsname> [<duration>]}}
or, alternatively, using <tt>ws_allocate</tt> with the flag <tt>-x</tt>
or, alternatively, using <tt>ws_allocate</tt> with the flag <tt>-x</tt>
ws_allocate -x [-f <fsname>] <wsname> <duration>
{{Command|command=ws_allocate -x [-f <filesystem>] <wsname> <duration>}}
with <tt>wsname</tt> being the name of an existing workspace. When using <tt>ws_extend</tt> the duration may be ommitted.
When using <tt>ws_extend</tt> the duration may be omitted.
Extension is allowed for a small number of times, which is displayed with <tt>ws_list <wsname></tt>.
Extension is allowed for a small number of times (typically 3 times). You can list the number of remaining extensions with <tt>ws_list <wsname></tt>.
 


== Remind your workspace expiration date ==
== Remind your workspace expiration date ==
Line 78: Line 94:
programm with an integrated calendar function, you can accept
programm with an integrated calendar function, you can accept
this event and your calendar / PDA will remind you ...
this event and your calendar / PDA will remind you ...
== Share your workspace with another user ==
To grant and manage read access to a workspace for other users, e.g., support staff, the <tt>ws_share</tt> command can be used. (shipped with ws-tools 1.4)
To grant read access to a workspace <tt><wsname></tt> for a user <tt><user></tt> use
{{Command|command=ws_share share <wsname> <user>}}
You can list the users, which you granted access to your workspaces using the command
{{Command|command=ws_share list <wsname>}}
To revoke the access for a user run
{{Command|command=ws_share unshare <wsname> <user>}}
Run <tt>ws_share --help</tt> to get more help or consult the full documentation in the man page of <tt>ws_share</tt>.


== Quota limits ==
== Quota limits ==


==== New mechanism for Hawk ====
==== New mechanism and policy for Hawk and vulcan (ws3 filesystem) ====
Due to the cache-based structure, the native quota tools are not
Due to the cache-based structure, the native quota tools are not
sufficient to obtain the information. HLRS has provided special commands ws_quota to display the limits as well as the current usage of both the user and the group.  
sufficient to obtain the information. HLRS has provided a special command '''''ws_quota''''' to display the limits as well as the current usage of both the user and the group.  
<font color="red">If you as a user or your group exceed the quota limit, no further
jobs will be executed in the batch queues!</font> In this case, you and your group will receive an email in order to let you know about this fact. ''This policy is due to the fact that also the filesystem is an expensive resource which should be used as less as possible hence!''
 
If you can not run jobs anymore due to hitting the quota limit, please pursue the following strategy:
* All members of the respective group should check whether they are a member of this group only. You can check the groups you are assigned to by the id command.
* If (and only if!) all members of the respective group are assigned to this group only, every member of your group should check his/her personal contribution w.r.t. data volume and file count by means of the ws_quota command. If the sum of the results is close to the overall numbers of the group, you can use the individual numbers to figure out who contributes most to the quota and hence should reduce data volume and/or file count.
* If some members of the respective group are also assigned to other groups, it's not reasonable to rely on the numbers given by ws_quota in order to assess who contributes a large amount of the overall quota! This is due to the fact that users might have files which are assigned to different groups. ws_quota - however - prints the ''overall'' numbers only, independent of which group those files are assigned to!


==== lustre usage on vulcan (old ws2 filesystem) ====
usually quotas are enforced for all filesystems holding workspaces on user and group basis. To check the current usage, use following commands:
usually quotas are enforced for all filesystems holding workspaces on user and group basis. To check the current usage, use following commands:
         lfs quota <file_system>
         lfs quota <file_system>
Line 91: Line 126:
Please note, for a lustre based filesystem one has to use the ''lfs'' command to get the quota information.
Please note, for a lustre based filesystem one has to use the ''lfs'' command to get the quota information.
This is different to e.g. the HOME, for which the ''quota'' command shows a different quota.
This is different to e.g. the HOME, for which the ''quota'' command shows a different quota.
==== Notes for expired or released workspaces ====
Expired or released workspace directories are kept for a few days and will be counted to the quota!
If you want to delete files immediately to free up space, use the "rm" command before "ws_release".


== Workspace was expired, what can I do? ==
== Workspace was expired, what can I do? ==
After a workspace was expired, a system cleaner moves that workspace for some days to a trash directory which is not accessible by users.
After a workspace was expired, a system cleaner moves that workspace for some days to a trash directory which is not accessible by users.
Some days later the trash directory will be really cleaned.
Some days later the trash directory will be really cleaned.
Users can list their own expired workspace which are still in the trash directory by using:
        ws_restore -l
To restore expired workspaces is also possible by users.
First you need a valid workspace. If you don't have a valid workspace, so please create one (see above).
Then you can restore a expired workspace which are listed with "ws_restore -l" using:
        ws_restore [options] workspace_name target_name


workspace_name ''is one of the names listed by "ws_restore -l".'' target_name ''is one of your valid workspace id's listed with "ws_list".''
Users can list their own expired workspaces that are still in the trash directory by using:
{{Command|command=ws_restore -l}}
Restoring expired workspaces from the trash is possible by users.
First you need a valid workspace as a target for the restore. If you don't have such a workspace yet, please create one (see <tt>ws_allocate</tt> above).
Then you can restore an expired workspace using:
{{Command|command=ws_restore [options] <workspace_name> <target_name>}}
Here, <tt>workspace_name</tt> is one of the names listed by <tt>ws_restore -l</tt> and <tt>target_name</tt> is one of your valid workspace names listed by <tt>ws_list</tt>.
Please have in mind that the expired and new workspace have to be located on the same filesystem!
If the workspaces are located on a non-default filesystem, please also provide the filesystem to ws_restore via the -F flag!

Latest revision as of 14:29, 19 August 2024

This mechanism allows you to keep data outside your home directory not only during a run, but also after it. The idea is to allocate disk space for a number of days and to give it a name, which lets you identify a workspace and distinguish between several workspaces. Workspaces can also be allocated on different filesystems, provided these are prepared for workspaces on the local host. The toolset is an Open Source Project (see hpc-workspace@github). Further documentation can be found on GitHub as well, e.g. the Workspace user guide@github.

Allocating new workspace:

Workspaces can be allocated with the following command:

MYSCR=`ws_allocate SimulatesomeThing 10`

MYSCR will contain the name of a directory that exists for 10 days, is located on a temporary filesystem, and is owned by the caller. The directory is not deleted when the job ends, but after 10 days of real-world (wall-clock) time. In a second job, you can use the same line to get the same directory. Please note that the directory in this example will be deleted 10 days after first usage, no matter how often it is re-used and what duration is specified in subsequent calls. The name may contain only letters and digits, plus dash, dot, and underscore, which are allowed anywhere except as the first character of the name.
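The naming rule can be checked before ws_allocate is called. The following is a minimal sketch and not part of the workspace tools: the function name is_valid_ws_name is hypothetical, and the patterns only encode the rule as stated here. The tool itself may apply further restrictions.

```shell
# Hypothetical helper: returns 0 if the name follows the rule described above.
is_valid_ws_name() {
    case "$1" in
        "")                  return 1 ;;  # empty name
        [!A-Za-z0-9]*)       return 1 ;;  # must start with a letter or digit
        *[!A-Za-z0-9._-]*)   return 1 ;;  # only letters, digits, '.', '_', '-'
        *)                   return 0 ;;
    esac
}

if is_valid_ws_name "SimulatesomeThing"; then
    echo "name is acceptable"
fi
```

Checking the name first gives a clear error in a job script before any allocation is attempted.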

Useful ws_allocate options:

The ws_allocate command comes with a number of options. Some useful ones are described below. For the full description see the man page of ws_allocate.

  • Non-default filesystem
    ws_allocate [-F <filesystem>] <name> <duration>

    The option -F <filesystem> specifies the filesystem on which your workspace will be located. If this option is omitted, then your workspace will be located on a default filesystem.
    If <duration> is omitted, your workspace will have a default lifetime of 1 day.

  • Expiration reminder
    You can enable a reminder mail that is sent some days before your workspace expires and is deleted, using the options -r <days> and -m <mailaddress>:
    ws_allocate -r <days before ws expires> -m <mailaddress> ...

    If you don't want to add these options to your ws_allocate command each time, you can add this information to the '~/.ws_user.conf' configuration file:

    File: ~/.ws_user.conf
    mail:     <mailaddress>
    reminder: <days>
    
    Find your existing workspace path:

    MYSCR=$(ws_find myWorkspaceName)

    MYSCR will contain the name of the directory where your previously allocated workspace is located.
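In job scripts it can be convenient to look up a workspace and allocate it only if it does not exist yet. The following is a minimal sketch: the function name get_workspace is hypothetical, and it assumes that ws_find prints nothing on standard output when the workspace is unknown.

```shell
# Hypothetical helper: print the path of workspace $1, allocating it for
# $2 days if ws_find reports no existing path.
# Assumption: ws_find prints nothing on stdout when the workspace is unknown.
get_workspace() {
    name=$1
    days=$2
    dir=$(ws_find "$name" 2>/dev/null) || dir=""
    if [ -z "$dir" ]; then
        dir=$(ws_allocate "$name" "$days") || return 1
    fi
    printf '%s\n' "$dir"
}

# Usage inside a job script (not executed here):
#   MYSCR=$(get_workspace myWorkspaceName 10) && cd "$MYSCR"
```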

    Listing your workspaces and available workspace filesystems:

     ws_list
    
      Lists all workspaces of the default workspace filesystem, their names and locations as well as their remaining lifetime.
     ws_list -a
    
      Lists your workspaces in all workspace filesystems.
     ws_list -s
    
      short output: list names of workspaces only (this is useful for scripting, e.g. the output can be used for ws_find to finally obtain the directories)
     ws_list -F <fstype>
    
      If several filesystems are used, you can limit the listing to workspaces of a specific filesystem type only. On vulcan the filesystem ws3 is available, on hawk the filesystems ws10.* and ws11.*. The value default is also possible.
     ws_list -l
    
      list all available workspace filesystems
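As noted for ws_list -s, the short output can be fed into ws_find to obtain the directories. The following is a minimal sketch: the function name list_workspace_dirs is hypothetical, and it assumes that ws_list -s prints one workspace name per line.

```shell
# Hypothetical helper: print "name<TAB>directory" for each workspace
# on the default workspace filesystem.
# Assumption: ws_list -s prints one workspace name per line.
list_workspace_dirs() {
    ws_list -s | while IFS= read -r name; do
        [ -n "$name" ] || continue
        # </dev/null keeps ws_find from consuming the name list on stdin
        printf '%s\t%s\n' "$name" "$(ws_find "$name" </dev/null)"
    done
}

# Usage (not executed here):
#   list_workspace_dirs
```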

    Release a workspace:

    If a workspace wsname is no longer needed, it is good practice to release it with

    ws_release [-F filesystem] <wsname>

    If the option -F filesystem is omitted, then the workspace wsname will be released on the default workspace filesystem.

    Users are responsible for releasing the correct workspace. After a workspace is released or has expired, its directory is moved to a trash directory.

    Warning: Released workspaces may still account for quota limits of the user and group. Please see also the note w.r.t. quota before doing a ws_release!


    Register your workspaces:

     ws_register -F filesystem dir
    
      This command will create/update in directory 'dir/filesystem' symbolic links of your workspaces in the specified 'filesystem'. If filesystem = ALL, then all of your workspaces in all available filesystems will be registered in the specified 'dir'. The symbolic links could be useful for getting some more information about your workspaces e.g. find ..., du ..., ls ..., ....

    Extend your workspace duration:

    It is possible to extend an existing workspace wsname within the policies set by the administrators with:

    ws_extend [-f <filesystem>] <wsname> [<duration>]

    or, alternatively, using ws_allocate with the flag -x

    ws_allocate -x [-f <filesystem>] <wsname> <duration>

    When using ws_extend the duration may be omitted. Extension is allowed only a small number of times (typically 3 times). You can list the number of remaining extensions with ws_list <wsname>.

    Remind your workspace expiration date

     ws_send_ical  <WS_name> <your_Email-address>
    

    This little script may help you keep track of your workspace. It checks the remaining time of a given workspace (first parameter) and sends a calendar invitation to the e-mail address (second parameter). If you are using a mail program with an integrated calendar function, you can accept this event and your calendar / PDA will remind you ...

    Share your workspace with another user

    To grant and manage read access to a workspace for other users, e.g., support staff, the ws_share command can be used. (shipped with ws-tools 1.4)

    To grant read access to a workspace <wsname> for a user <user> use

    ws_share share <wsname> <user>

    You can list the users to whom you have granted access to a workspace using the command

    ws_share list <wsname>

    To revoke the access for a user run

    ws_share unshare <wsname> <user>


    Run ws_share --help to get more help or consult the full documentation in the man page of ws_share.

    Quota limits

    New mechanism and policy for Hawk and vulcan (ws3 filesystem)

    Due to the cache-based structure, the native quota tools are not sufficient to obtain the information. HLRS therefore provides a special command, ws_quota, which displays the limits as well as the current usage of both the user and the group. If you as a user or your group exceed the quota limit, no further jobs will be executed in the batch queues! In this case, you and your group will receive an email notifying you. This policy exists because the filesystem is also an expensive resource that should be used as sparingly as possible.

    If you cannot run jobs anymore because you hit the quota limit, please pursue the following strategy:

    • All members of the respective group should check whether they are a member of this group only. You can check the groups you are assigned to with the id command.
    • If (and only if!) all members of the respective group are assigned to this group only, every member of the group should check their personal contribution w.r.t. data volume and file count by means of the ws_quota command. If the sum of the results is close to the overall numbers of the group, you can use the individual numbers to figure out who contributes most to the quota and hence should reduce data volume and/or file count.
    • If some members of the respective group are also assigned to other groups, it is not reasonable to rely on the numbers given by ws_quota to assess who contributes a large share of the overall quota! Users might have files that are assigned to different groups, whereas ws_quota prints the overall numbers only, independent of which group those files are assigned to!
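The first step of this strategy, checking group membership with id, can be scripted. The following is a minimal sketch: the function name count_groups is hypothetical, and the messages only restate the rule of thumb from the list above.

```shell
# Hypothetical helper: print the number of groups a user belongs to
# (default: the calling user).
count_groups() {
    # id -G prints the numeric group IDs separated by blanks
    echo $(( $(id -G "$@" | wc -w) ))
}

if [ "$(count_groups)" -eq 1 ]; then
    echo "single group: per-user ws_quota numbers can be compared within the group"
else
    echo "several groups: per-user ws_quota numbers may mix files of different groups"
fi
```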

    Lustre usage on vulcan (old ws2 filesystem)

    Usually, quotas are enforced on a per-user and per-group basis for all filesystems holding workspaces. To check the current usage, use the following command:

            lfs quota <file_system>
    

    Please note that for a Lustre-based filesystem one has to use the lfs command to get the quota information. This differs from e.g. HOME, for which the quota command shows a different quota.

    Notes for expired or released workspaces

    Expired or released workspace directories are kept for a few days and still count towards the quota! If you want to delete files immediately to free up space, use the "rm" command before "ws_release".
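The "rm before ws_release" advice can be wrapped in a small helper. The following is a minimal sketch: the function name empty_and_release is hypothetical, it uses find with -mindepth and -delete (available in GNU and BSD find) instead of rm, and it works on the default workspace filesystem only. Deletion is irreversible, so the path is checked before anything is removed.

```shell
# Hypothetical helper: delete the contents of workspace $1, then release it.
# Deletion is irreversible; the directory check guards against an empty path.
empty_and_release() {
    name=$1
    dir=$(ws_find "$name") || return 1
    if [ -z "$dir" ] || [ ! -d "$dir" ]; then
        echo "workspace '$name' not found" >&2
        return 1
    fi
    # Remove everything below the workspace directory, but not the directory itself
    find "$dir" -mindepth 1 -delete || return 1
    ws_release "$name"
}

# Usage (not executed here):
#   empty_and_release myWorkspaceName
```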

    Workspace has expired, what can I do?

    After a workspace has expired, a system cleaner moves it to a trash directory, which is not accessible to users, for some days. Some days later the trash directory is cleaned permanently.

    Users can list their own expired workspaces that are still in the trash directory by using:

    ws_restore -l

    Users can restore expired workspaces from the trash themselves. First you need a valid workspace as a target for the restore. If you don't have such a workspace yet, please create one (see ws_allocate above). Then you can restore an expired workspace using:

    ws_restore [options] <workspace_name> <target_name>

    Here, workspace_name is one of the names listed by ws_restore -l and target_name is one of your valid workspace names listed by ws_list. Please keep in mind that the expired workspace and the target workspace have to be located on the same filesystem! If the workspaces are located on a non-default filesystem, please also pass the filesystem to ws_restore via the -F flag!
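The two restore steps (allocate a target, then restore into it) can be combined. The following is a minimal sketch: the function name restore_expired and its argument order are hypothetical, and only the invocations described above are used. ws_restore may ask for confirmation or accept further options; see its man page.

```shell
# Hypothetical helper: allocate target workspace $2 for $3 days and restore
# the expired workspace $1 into it. An optional $4 names a non-default
# filesystem, which is passed to both commands via -F.
restore_expired() {
    expired=$1
    target=$2
    days=$3
    fs=${4:-}
    if [ -n "$fs" ]; then
        ws_allocate -F "$fs" "$target" "$days" >/dev/null || return 1
        ws_restore -F "$fs" "$expired" "$target"
    else
        ws_allocate "$target" "$days" >/dev/null || return 1
        ws_restore "$expired" "$target"
    fi
}

# Usage (not executed here), with a name taken from "ws_restore -l":
#   restore_expired <workspace_name> <target_name> 5
```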