New file transfer guide #824

Merged
merged 26 commits into from
Mar 4, 2025
Changes from 21 commits
1f8d1a1
Revise file transfer draft
xamberl Jul 19, 2024
1726f26
Expand transfer_output_files, more formatting
xamberl Jul 19, 2024
5ecba19
language update
xamberl Sep 13, 2024
4a76d52
Reformat header level
xamberl Sep 20, 2024
acdd6d4
Remove disk usage/tar info
xamberl Sep 30, 2024
651e74b
Rename page, add file sizes to table, text changes
xamberl Oct 4, 2024
0707294
Clarify language for transferring outputs
xamberl Oct 4, 2024
f003ab3
typo
xamberl Oct 16, 2024
83e4367
Moved old guides; updated new guides
xamberl Oct 18, 2024
3d121a1
fix table of contents
xamberl Oct 18, 2024
b31a53e
Change layout to file_avail
xamberl Oct 18, 2024
b90dbce
Update _layouts/file_avail.html
xamberl Oct 24, 2024
66e0ded
Update _layouts/file_avail.html
xamberl Oct 24, 2024
9e15f73
Update _uw-research-computing/file-avail-largedata.md
xamberl Oct 24, 2024
b967d0e
Update _uw-research-computing/htc-job-file-transfer.md
xamberl Oct 24, 2024
7912001
Update example for `transfer_input_files`
xamberl Oct 24, 2024
7352fc0
unpublish the archived pages
xamberl Oct 24, 2024
98b11a5
Remove section for large std output
xamberl Oct 24, 2024
97c14aa
Update links and toc
xamberl Oct 24, 2024
b8a8fcb
Add redirects; add info for datasets > 100GB
xamberl Oct 24, 2024
b146677
Move checking quota to one page
xamberl Oct 25, 2024
7182b2a
Update title; add more to `transfer_output_remaps`
xamberl Oct 28, 2024
0452a7d
Find and replace osdf with 2 slashes to 3
xamberl Dec 9, 2024
b3831f9
Remove unused draft
xamberl Dec 9, 2024
04a391b
Merge branch 'master' into preview-xalim-osdf-guide
xamberl Mar 3, 2025
391b846
Add specify `file:///` for group directories
xamberl Mar 3, 2025
23 changes: 5 additions & 18 deletions _layouts/file_avail.html
@@ -9,40 +9,27 @@ <h2>Which Option is the Best for Your Files?</h2>
<tr>
<th>Input Sizes</th>
<th>Output Sizes</th>
<th>Link to Guide</th>
<th>File Location</th>
<th>How to Transfer</th>
<th>Syntax for <code>transfer_input_files</code></th>
<th>Availability, Security</th>
<!--<th>Special Considerations</th>-->
</tr>

<tr>
<td>0 - 100 MB per file, up to 500 MB per job</td>
<td>0 - 5 GB per job</td>
<td><a href="file-availability.html">Small Input/Output File Transfer via <b>HTCondor</b></a></td>
<td><code>/home</code></td>
<td><b>submit file</b>; filename in <small><code>transfer_input_files</code></small></td>
<td>CHTC, UW Grid, and OSG; works for <i>your</i> jobs</td>
<td>No special syntax</td>
<td>CHTC and external pools</td>
<!--<td><font color="red">DO NOT USE HTCondor transfer for files in <code>/staging</code> OR <code>/squid</code></font></td>-->
</tr>

<tr>
<td>100 MB - 1 GB per repeatedly-used file</td>
<td>Not available for output</td>
<td><a href="file-avail-squid.html">Large Input File Availability Via <b>Squid</b></a></td>
<td><code>/squid</code></td>
<td><b>submit file</b>; http link in <small><code>transfer_input_files</code></small></td>
<td>CHTC, UW Grid, and OSG; files are made *publicly-readable* via an HTTP address</i></td>
<!--<td>large files unique to individual jobs are better for large data staging</td>-->
</tr>

<tr>
<td>100 MB - TBs per job-specific file; repeatedly-used files > 1GB</td>
<td>4 GB - TBs per job</td>
<td><a href="file-avail-largedata.html">Large Input and Output File Availability Via <b>Staging</b></a></td>
<td><code>/staging</code></td>
<td><b>job executable</b>; copy or move within the job</td>
<td>a portion of CHTC; accessible only to <i>your</i> jobs</td>
<td><code>osdf://</code> or <code>file:///</code></td>
<td>all of CHTC/external pools or a subset of CHTC</td>
<!--<td>special submit "Requirements"</td>-->
</tr>

5 changes: 5 additions & 0 deletions _redirects/file-avail-squid.md
@@ -0,0 +1,5 @@
---
layout: redirect
redirect_url: /uw-research-computing/htc-job-file-transfer
permalink: /uw-research-computing/file-avail-squid
---
5 changes: 5 additions & 0 deletions _redirects/file-availability.md
@@ -0,0 +1,5 @@
---
layout: redirect
redirect_url: /uw-research-computing/htc-job-file-transfer
permalink: /uw-research-computing/file-availability
---
@@ -2,9 +2,8 @@
highlighter: none
layout: file_avail
title: Transfer Large Input Files Via Squid
published: false
guide:
order: 1
category: Handling Data in Jobs
tag:
- htc
---
@@ -3,9 +3,8 @@ highlighter: none
layout: file_avail
title: Small Input and Output File Availability Via HTCondor
alt_title: Transfer Small Input and Output
published: false
guide:
order: 0
category: Handling Data in Jobs
tag:
- htc
---
63 changes: 40 additions & 23 deletions _uw-research-computing/check-quota.md
@@ -10,29 +10,25 @@ guide:
---

The following commands will allow you to monitor the amount of disk
space you are using in your home directory on our (or another) submit node and to determine the
amount of disk space you have been allotted (your quota).

If you also have a `/staging` directory on the HTC system, see our
[staging guide](file-avail-largedata.html#5-checking-your-quota-data-use-and-file-counts) for
details on how to check your quota and usage.
\
The default quota allotment on CHTC submit nodes is 20 GB with a hard
limit of 30 GB (at which point you cannot write more files).\
\
**Note: The CHTC submit nodes are not backed up, so you will want to
space you are using in your home directory on the access point and to determine the
amount of disk space you have been allotted (your quota).

The default quota allotment in your `/home` directory is 20 GB with a hard
limit of 30 GB (at which point you cannot write more files).
Comment on lines +15 to +16
Contributor

I think this got upped when we were upgrading the OS last spring.

Suggested change
The default quota allotment in your `/home` directory is 20 GB with a hard
limit of 30 GB (at which point you cannot write more files).
The default quota allotment in your `/home` directory is 40 GB with a hard
limit of 50 GB (at which point you cannot write more files).


**Note: The CHTC access points are not backed up, so you should
copy completed jobs to a secure location as soon as a batch completes,
and then delete them on the submit node in order to make room for future
jobs.** If you need more disk space to run a single batch or concurrent
batches of jobs, please contact us ([Get Help!](get-help.html)). We have multiple ways of dealing with large disk space
requirements to make things easier for you.
jobs.** Disk space provided is intended for *active* calculations only, not permanent storage.
If you need more disk space to run a single batch or concurrent
batches of jobs, please contact us ([Get Help!](get-help.html)). We have multiple ways of dealing with large disk space requirements to make things easier for you.

If you wish to change your quotas, please see [Request a Quota Change](quota-request).

**1. Checking Your User Quota and Usage**
**1. Checking Your `/home` Quota and Usage**
-------------------------------------

From any directory location within your home directory, type
From any directory location within your `/home` directory, use the command
`quota -vs`. See the example below:

```
@@ -43,18 +39,39 @@ Disk quotas for user alice (uid 20384):
```
{:.term}

The output will list your total data usage under `blocks`, your soft
The output will list your total data usage under `space`, your soft
`quota`, and your hard `limit` at which point your jobs will no longer
be allowed to save data. Each of the values given are in 1-kilobyte
be allowed to save data. Each value is given in 1-kilobyte
blocks, so you can divide each number by 1024 to get megabytes (MB), and
again for gigabytes (GB). (It also lists information for ` files`, but
we don\'t typically allocate disk space by file count.)
again for gigabytes (GB). (It also lists information for number of `files`, but
Comment on lines +43 to +45
Contributor

With the `-s` option, `quota` should return mebibytes, not kibibytes.

Also, decimal units (kB, MB, GB) are multiples of 1000, while binary units (KiB, MiB, GiB) are multiples of 1024; the "bi" denotes a power of 2 (1024 = 2^10).
We use binary units for setting quotas.

In general, I think we use decimal units when communicating with users, but if users need to do math to get their quota values, we will need to use binary units instead.

we don't typically allocate disk space in `/home` by file count.)
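
To make the unit question raised in the review comment above concrete, here is a minimal Python sketch of the arithmetic. It assumes the input is a count of 1-KiB blocks (the reading the guide text uses); the helper name `kib_blocks_to_units` is hypothetical and not part of any CHTC tool. If `quota -s` already prints values with a unit suffix, no conversion is needed.

```python
BYTES_PER_KIB = 1024  # binary kilobyte (kibibyte)


def kib_blocks_to_units(blocks_kib: int) -> dict:
    """Convert a count of 1-KiB blocks to binary (MiB, GiB) and decimal (MB, GB) units."""
    n_bytes = blocks_kib * BYTES_PER_KIB
    return {
        "MiB": blocks_kib / 1024,       # divide by 1024 once
        "GiB": blocks_kib / 1024**2,    # divide by 1024 twice
        "MB": n_bytes / 1000**2,        # decimal megabytes
        "GB": n_bytes / 1000**3,        # decimal gigabytes
    }


# Illustrative value only: a 20 GiB quota expressed in 1-KiB blocks.
usage = kib_blocks_to_units(20 * 1024**2)
print(usage)
```

The same block count yields a slightly larger number in decimal gigabytes than in gibibytes, which is why the choice of unit matters when users compare their usage against a quota.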

**2. Checking Your `/staging` Quota and Usage**
------------------------------------------------
Users may have a `/staging` directory, meant for staging large files and data intended for
job submission. See our [Managing Large Data in HTC Jobs](file-avail-largedata) guide for
more information.

To check your `/staging` quota, use the command `get_quotas /staging/username`.

```
[alice@submit]$ get_quotas /staging/alice
Path Quota(GB) Items Disk_Usage(GB) Items_Usage
/staging/alice 20 5 3.18969 5
```
{:.term}

Your `/staging` directory has a disk and item quota. In the example above, the disk quota is
20 GB, and the items quota is 5 items. The current usage is printed in the following columns;
in the example, the user has used 3.19 GB and 5 items.
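
As a rough illustration of how the columns described above relate to each other, the following Python sketch parses the sample `get_quotas` output and computes the fraction of each quota in use. It assumes the whitespace-separated, five-column layout shown in the example; the function name `parse_get_quotas` is hypothetical, and the real tool's output format may differ.

```python
SAMPLE_OUTPUT = """\
[alice@submit]$ get_quotas /staging/alice
Path Quota(GB) Items Disk_Usage(GB) Items_Usage
/staging/alice 20 5 3.18969 5
"""


def parse_get_quotas(output: str) -> dict:
    """Parse the header-plus-one-row layout shown in the guide's example."""
    rows = [ln.split() for ln in output.strip().splitlines() if ln.strip()]
    # Find the header row; anything before it (e.g. the shell prompt) is ignored.
    header_idx = next(i for i, cols in enumerate(rows) if cols[0] == "Path")
    path, quota_gb, items_quota, used_gb, items_used = rows[header_idx + 1]
    quota_gb, used_gb = float(quota_gb), float(used_gb)
    items_quota, items_used = int(items_quota), int(items_used)
    return {
        "path": path,
        "quota_gb": quota_gb,
        "items_quota": items_quota,
        "used_gb": used_gb,
        "items_used": items_used,
        # Percent of each quota consumed; guard against a zero quota.
        "disk_pct": used_gb / quota_gb * 100 if quota_gb else 0.0,
        "items_pct": items_used / items_quota * 100 if items_quota else 0.0,
    }


report = parse_get_quotas(SAMPLE_OUTPUT)
print(f"{report['path']}: {report['disk_pct']:.1f}% of disk quota, "
      f"{report['items_pct']:.0f}% of item quota")
```

In the sample data the disk usage is well under the quota, but the item count has already reached its limit, so either column can be the one that blocks new writes.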

To request a quota increase, [fill out our quota request form](quota-request).

**2. Checking the Size of Directories and Contents**
**3. Checking the Size of Directories and Contents**
------------------------------------------------

Move to the directory you\'d like to check and type `du` . After several
moments (longer if you\'re directory contents are large), the command
Move to the directory you'd like to check and type `du` . After several
moments (longer if the contents of your directory are large), the command
will add up the sizes of directory contents and output the total size of
each contained directory in units of kilobytes with the total size of
that directory listed last. See the example below: