diff --git a/docs/data-management/acdc/acdc-overview.md b/docs/data-management/acdc/acdc-overview.md index ecd1817f36..6559bdc00c 100644 --- a/docs/data-management/acdc/acdc-overview.md +++ b/docs/data-management/acdc/acdc-overview.md @@ -4,10 +4,10 @@ The ALCF Community Data Co-Op (ACDC) powers data-driven research by providing a platform for data access and sharing, and value-added services for data discovery and analysis. -A fundamental aspect of ACDC is a data fabric that allows programmatic data access, and straightforward large-scale data sharing with collaborators via [Globus services](https://www.globus.org).This provides a platform to build out different modalities for data access and use, such as indexing of data for discovery, data portals for interactive search and access, and accessible analysis services. ACDC will continue to be expanded to deliver ALCF users the platform to build customizable and accessible services towards the goal of supporting data-driven discoveries. +A fundamental aspect of ACDC is a data fabric that allows programmatic data access and straightforward large-scale data sharing with collaborators via [Globus services](https://www.globus.org). This provides a platform to build out different modalities for data access and use, such as indexing of data for discovery, data portals for interactive search and access, and accessible analysis services. ACDC will continue to be expanded to deliver ALCF users the platform to build customizable and accessible services towards the goal of supporting data-driven discoveries. ## Data access and sharing -ALCF project PIs can share data on Eagle with their collaborators, making facility accounts unnecessary. With this service, the friction of data sharing amongst collaborators is eliminated – there is no need to create copies of data for sharing, or allocation and accounts just to access data. ALCF PIs can grant access to data, at read-only or read/write access levels. 
Non-ALCF users throughout the scientific community, who have been granted permissions, can access the data on Eagle filesystem using Globus. +ALCF project PIs can share data on Eagle with their collaborators, making facility accounts unnecessary. With this service, the friction of data sharing amongst collaborators is eliminated – there is no need to create copies of data for sharing, or allocation and accounts just to access data. ALCF PIs can grant access to data, at read-only or read/write access levels. Non-ALCF users throughout the scientific community, who have been granted permissions, can access the data on the Eagle filesystem using Globus. Access to the data for ALCF users and collaborators is supported via bulk transfer (Globus transfer) or direct browser-based access (HTTP/S). Direct connections to high-speed external networks permit data access at many gigabytes per second. Management of permissions and access is via a web application or command line clients, or directly via an Application Programming Interface (APIs). The interactivity permitted by the APIs distinguishes ACDC from the ALCF’s previous storage systems and presents users with many possibilities for data control and distribution. @@ -16,12 +16,12 @@ ACDC’s fully supported production environment is the next step in the expansio ACDC includes several project-specific data portals that enable search and discovery of the data hosted on Eagle. The portals allow users to craft queries and filters to find specific sets of data that match their criteria and use faceted search for the discovery of data. Portals also provide the framework for other interfaces including data processing capabilities, all secured with authentication and configured authorization policy. 
-The ACDC portal is a deployment of Django Globus Portal Framework customized for a variety of different projects For most of these projects, the search metadata links directly to data on Eagle, with browser-based download, preview, and rendering of files, and bulk data access. +The ACDC portal is a deployment of Django Globus Portal Framework customized for a variety of different projects. For most of these projects, the search metadata links directly to data on Eagle, with browser-based download, preview, and rendering of files, and bulk data access. ## Getting Started 1. **Request an allocation:** Researchers or PIs request an allocation on Eagle, and a project allocation is created upon request acceptance. 2. **Manage Access:** PIs can manage the space independently or assign other users to manage the space, as well as provide other users with read or read/write access for folders in the space. Globus groups and identities are used to manage such access. -3. **Authentication:** Globus is used for authentication and identity needed to access the system. As Globus has built-in support for federated logins, users can access ACDC using their campus or institution federated username and passcode +3. **Authentication:** Globus is used for authentication and identity needed to access the system. As Globus has built-in support for federated logins, users can access ACDC using their campus or institution federated username and passcode. 
If you are new to the ALCF, follow these instructions on how to transfer your data to ACDC: [Transferring Data to Eagle](transferring-data-to-eagle.md) diff --git a/docs/data-management/acdc/eagle-data-sharing.md b/docs/data-management/acdc/eagle-data-sharing.md index d1588ee9c7..42c1a49b3c 100644 --- a/docs/data-management/acdc/eagle-data-sharing.md +++ b/docs/data-management/acdc/eagle-data-sharing.md @@ -1,15 +1,17 @@ # Sharing Data on Eagle Using Globus Guest Collections + ## Overview Collaborators throughout the scientific community have the ability to write data to and read scientific data from the Eagle filesystem using Globus sharing capability. This capability provides PIs with a natural and convenient storage space for collaborative work. -**Note:** The project PI needs to have an **active** ALCF account to set up Globus guest collections on Eagle, and set permissions for collaborators to access data. If the PI does not have an account or has an inactive account, they will not be able to create a Globus guest collectively. If a PI's account goes inactive after the Globus guest collection was created and shared, the collection will become inaccessible until the PI's account is reactivated. Only the project PI has the ability to create a collection; project proxies cannot create a collection. +!!! note -Globus is a service that provides research data management, including managed transfer and sharing. It makes it easy to move, sync, and share large amounts of data. Globus will manage file transfers, monitor performance, retry failures, recover from faults automatically when possible, and report the status of your data transfer. Globus supports GridFTP for bulk and high-performance file transfer, and direct HTTPS for download. The service allows the user to submit a data transfer request, and performs the transfer asynchronously in the background. For more information, see Globus data transfer and Globus data sharing. 
+The project PI needs to have an **active** ALCF account to set up Globus guest collections on Eagle and set permissions for collaborators to access data. If the PI does not have an account or has an inactive account, they will not be able to create a Globus guest collection. If a PI's account goes inactive after the Globus guest collection was created and shared, the collection will become inaccessible until the PI's account is reactivated. Only the project PI has the ability to create a collection; project proxies cannot create a collection. +Globus is a service that provides research data management, including managed transfer and sharing. It makes it easy to move, sync, and share large amounts of data. Globus will manage file transfers, monitor performance, retry failures, recover from faults automatically when possible, and report the status of your data transfer. Globus supports GridFTP for bulk and high-performance file transfer, and direct HTTPS for download. The service allows the user to submit a data transfer request and performs the transfer asynchronously in the background. For more information, see Globus data transfer and Globus data sharing. +## Logging into Globus with your ALCF Login -## Logging into Globus with your ALCF Login ## ALCF researchers can use their ALCF Login username and password to access Globus. Go to the [Globus website](https://www.globus.org/) and click on Log In in the upper right corner of the page.
@@ -26,52 +28,39 @@ Type or scroll down to "Argonne LCF" in the "Use your existing organizational lo You will be taken to a familiar-looking page for ALCF login. Enter your ALCF login username and password. -## Accessing your Eagle Project Directory ## - - -**Note:** Specifically for PIs with Eagle 'Data-Only' projects (no compute allocations), logging in through Globus is the only way to access the project directory. +## Accessing your Eagle Project Directory -PIs with data and compute allocations will have access to the required compute-system login nodes (along with the Globus Web Interface) to access their project directory. - +PIs with data and compute allocations will have access to the required compute-system login nodes (along with the Globus Web Interface) to access their project directory. ## Creating a Guest Collection -A project PI needs to have an 'active' ALCF account in place to create and share guest collections with collaborators. Please note that ONLY a PI has the ability to create guest collections. -> PIs with an "Inactive/Deleted" ALCF account, should submit a reactivation request by filling out this form: [Re-activation Form](https://my.alcf.anl.gov/accounts/#/accountReactivate) +A project PI needs to have an 'active' ALCF account in place to create and share guest collections with collaborators. Please note that ONLY a PI has the ability to create guest collections. -> PIs without an ALCF account should submit an ALCF account request by filling out this form: [Account Request Form](https://my.alcf.anl.gov/accounts/#/accountRequest) +!!! info + PIs with an "Inactive/Deleted" ALCF account should submit a reactivation request by filling out this form: [Re-activation Form](https://my.alcf.anl.gov/accounts/#/accountReactivate) -### Navigate to the Collections tab ### +!!! 
info + PIs without an ALCF account should submit an ALCF account request by filling out this form: [Account Request Form](https://my.alcf.anl.gov/accounts/#/accountRequest) -There are multiple ways to Navigate to the Collections tab in "Endpoints": +### Navigate to the Collections tab + +There are multiple ways to navigate to the Collections tab in "Endpoints": 1. [Click the link to get started](https://app.globus.org/file-manager/collections/05d2c76a-e867-4f67-aa57-76edeb0beda0/shares). It will take you to the Collections tab for Eagle. **OR** -2. Click on 'Endpoints' located in the left panel of the [Globus web app](https://app.globus.org/endpoints). Type "alcf#dtn_eagle" (for Eagle) or "alcf#dtn_eagle" (for Eagle) in the search box located at the top of the page and click the magnifying glass to search. Click on the Managed Public Endpoint "alcf#dtn_eagle" or "alcf#dtn_eagle" from the search results. Click on the Collections tab. **OR** -3. Click on 'File Manager' located in the left panel of the Globus web app. Search for 'alcf#dtn_Eagle' (or "alcf#dtn_eagle") and select it in the Collection field. Select your project directory or a sub directory that you would like to share with collaborators as a Globus guest collection. Click on 'Share' on the right side of the panel, which will take you to the Collections tab. +2. Click on 'Endpoints' located in the left panel of the [Globus web app](https://app.globus.org/endpoints). Type "alcf#dtn_eagle" (for Eagle) in the search box located at the top of the page and click the magnifying glass to search. Click on the Managed Public Endpoint "alcf#dtn_eagle" from the search results. Click on the Collections tab. **OR** +3. Click on 'File Manager' located in the left panel of the Globus web app. Search for 'alcf#dtn_eagle' and select it in the Collection field. Select your project directory or a subdirectory that you would like to share with collaborators as a Globus guest collection. 
Click on 'Share' on the right side of the panel, which will take you to the Collections tab. -**Note:** When you select an endpoint to transfer data to/from, you may be asked to authenticate with that endpoint. Follow the instructions on screen to activate the endpoint and to authenticate. You may also have to provide Authentication/Consent for the Globus web app to manage collections on this endpoint +**Note:** When you select an endpoint to transfer data to/from, you may be asked to authenticate with that endpoint. Follow the instructions on screen to activate the endpoint and to authenticate. You may also have to provide Authentication/Consent for the Globus web app to manage collections on this endpoint. -### Adding a Guest Collection ### -In the Collections tab, click 'Add a Guest Collection' located at the top right hand corner. +### Adding a Guest Collection -1. Fill out the form: - 1. If the path to the directory is not pre-populated, click the browse button, navigate and select the directory. Note that you can create a single guest collection and set permissions for folders within a guest collection. There is no reason to create multiple guest collections to share for a single project. +In the Collections tab, click 'Add a Guest Collection' located at the top right-hand corner. +1. Fill out the form: + 1. If the path to the directory is not pre-populated, click the browse button, navigate and select the directory. Note that you can create a single guest collection and set permissions for folders within a guest collection. There is no reason to create multiple guest collections to share for a single project. 2. Give the collection a Display Name (choose a descriptive name) 2. Click "Create Collection" @@ -82,9 +71,8 @@ In the Collections tab, click 'Add a Guest Collection' located at the top right
## Sharing Data with Collaborators Using Guest Collections -Your data in the Guest Collections can be easily shared with collaborators at ALCF or elsewhere. You have full control over which files your collaborators can access, and whether they have read-only or read-write permissions. - +Your data in the Guest Collections can be easily shared with collaborators at ALCF or elsewhere. You have full control over which files your collaborators can access, and whether they have read-only or read-write permissions. To share data with collaborators (that either have a Globus account or an ALCF account), click on 'Endpoints', select your newly created Guest Collection (as described in the section above), and go to the 'Permissions' tab. Click on 'Add Permissions - Share With': @@ -93,17 +81,17 @@ To share data with collaborators (that either have a Globus account or an ALCF a
Add Permissions
-You can share with other Globus users or Globus Groups (for more information on Groups, scroll down to Groups). You can give the collaborators read, write or read+write permissions. Once the options have been selected, click 'Add Permission'. +You can share with other Globus users or Globus Groups (for more information on Groups, scroll down to Groups). You can give the collaborators read, write, or read+write permissions. Once the options have been selected, click 'Add Permission'.
![Add Permissions - Share With](files/permissions-share-with.png){ width="700" }
Add Permissions - Share With
-PI can also choose to share their data with 'Public' with anonymous read access (and anonymous write disabled). This allows anyone that has access to the data read and/or download it without authorizing the request. +The PI can also choose to share their data with 'Public', which grants anonymous read access (anonymous write stays disabled). This allows anyone who has access to the data to read and/or download it without authorizing the request.
- ![HTTP - Share With](files/https-read.png){ width="700" } + ![HTTP - Share With](files/https-read.png){ width="700" }
Add Permissions - Share With
@@ -115,26 +103,21 @@ You should then see the share and the people you have shared it with. You can re ## Additional information on Globus Guest Collections -1. ONLY a project PI can create guest collections and make them accessible to collaborators. Project proxies cannot create guest collections. +1. ONLY a project PI can create guest collections and make them accessible to collaborators. Project proxies cannot create guest collections. 2. You can only share directories, not individual files. - -3. Globus allows directory trees to be shared as either read or read/write. This means that any subdirectories within that tree also have the same permissions. -Globus supports setting permissions at a folder level, so there is no need to create multiple guest collections for a project. You can create a guest collection at the top level and share sub-directories with the collaborators by assigning the appropriate permissions. - +3. Globus allows directory trees to be shared as either read or read/write. This means that any subdirectories within that tree also have the same permissions. Globus supports setting permissions at a folder level, so there is no need to create multiple guest collections for a project. You can create a guest collection at the top level and share sub-directories with the collaborators by assigning the appropriate permissions. 4. When you create a guest collection endpoint and give access to one or more Globus users, you can select whether each person has read or read/write access. If they have write access, they can also delete files within that directory tree, so you should be careful about providing write access. - 5. Globus guest collections are created and managed by project PIs. If the PI of a project changes, the new PI will have to create a new guest collection and share them with the users. Contact ALCF Support (support@alcf.anl.gov) in such cases. Globus guest collections' ownership cannot be transferred. - 6. 
Guest collections are active as long as the project directory is available **and** the PI's ALCF account is active. If the PI's ALCF account goes inactive, the collections become inaccessible to all its collaborators. Access is restored once the PI's account is reactivated. - -7. All RW actions are performed as the PI, when using Guest Collections. If a PI does not have permissions to read or write a file or a directory, then the Globus guest collection users won't either. +7. All RW actions are performed as the PI when using Guest Collections. If a PI does not have permissions to read or write a file or a directory, then the Globus guest collection users won't either. ## Creating a group + 1. Go to Groups on the left panel 2. Click on ‘Create a new group’ at the top -3. Give the group a descriptive name and add Description for more information -4. Make sure you select ‘group members only’ radio button +3. Give the group a descriptive name and add a Description for more information +4. Make sure you select the ‘group members only’ radio button 5. Click on ‘Create Group’
@@ -142,17 +125,18 @@ Globus supports setting permissions at a folder level, so there is no need to cr
Create new group
-## Transferring data from Eagle -Log in to [Globus](https://app.globus.org) using your ALCF credentials. After authenticating, you will be taken to the Globus File Manager tab. In the 'Collection' box, type the name of Eagle managed endpoint (```alcf#dtn_eagle```) Navigate to the folder/file you want to transfer. HTTPS access (read-only) is enabled so you can download files by clicking the "Download" button. +## Transferring data from Eagle -Click on 'Download' to download the required file. +Log in to [Globus](https://app.globus.org) using your ALCF credentials. After authenticating, you will be taken to the Globus File Manager tab. In the 'Collection' box, type the name of the Eagle managed endpoint (`alcf#dtn_eagle`). Navigate to the folder/file you want to transfer. HTTPS access (read-only) is enabled so you can download files by clicking the "Download" button. + +Click on 'Download' to download the required file.
![Download](files/download.png){ width="700" }
Download the required file
-To transfer files to another Globus endpoint, in the "collection" search box in the RHS panel, enter the destination endpoint (which could also be your Globus Connect Personal endpoint). +To transfer files to another Globus endpoint, in the "collection" search box in the right-hand panel, enter the destination endpoint (which could also be your Globus Connect Personal endpoint).
![Endpoint transfer](files/Source_Destination_Endpoints.png){ width="700" } @@ -188,7 +172,8 @@ You will also receive an email when the transfer is complete.
## Deleting a guest collection -To see all guest collections you have shared, go to 'Endpoints' in the left hand navigation bar, then 'Administered by You'. Select the guest collection endpoint you wish to delete, and click on 'Delete endpoint'. + +To see all guest collections you have shared, go to 'Endpoints' in the left-hand navigation bar, then 'Administered by You'. Select the guest collection endpoint you wish to delete, and click on 'Delete endpoint'.
![Deleting collections](files/delete_endpoint.png){ width="700" } @@ -196,11 +181,12 @@ To see all guest collections you have shared, go to 'Endpoints' in the left hand
## What to tell your Collaborators + If you set up a shared endpoint and want your collaborator to download the data, this is what you need to tell them. First, the collaborator needs to get a Globus account. The instructions for setting up a Globus account are as described above. This account is free. They may already have Globus access via their institution. -If the collaborator is downloading the data to his/her personal workstation, they need to install the Globus Connect client. Globus connect clients are available for Mac, Windows or Linux systems and are free. +If the collaborator is downloading the data to their personal workstation, they need to install the Globus Connect client. Globus Connect clients are available for Mac, Windows, or Linux systems and are free. If you clicked on the 'notify users via email' button when you added access for this user, they should have received a message that looks like this: @@ -209,22 +195,23 @@ If you clicked on the 'notify users via email' button when you added access for
Click on the 'notify users via email' button for collaborators to receive an email
-You can, of course, also send email to your collaborators yourself, telling them you've shared a folder with them. The collaborator should click on the link, which will require logging in with their institutional or Globus login username and password. They should then be able to see the files you shared with them. External collaborator's view of the shared collection is shown below: +You can, of course, also send an email to your collaborators yourself, telling them you've shared a folder with them. The collaborator should click on the link, which will require logging in with their institutional or Globus login username and password. They should then be able to see the files you shared with them. The external collaborator's view of the shared collection is shown below:
![Collaborator view](files/collaborator_view.png){ width="700" }
Collaborator transfer or sync to
-They should click on the files they want to transfer, then 'Transfer or Sync to', enter their own endpoint name and desired path and click the 'Start' button near the bottom to start the transfer. +They should click on the files they want to transfer, then 'Transfer or Sync to', enter their own endpoint name and desired path, and click the 'Start' button near the bottom to start the transfer.
![Transfer path](files/collab-transfer.png){ width="700" } -
Chossing transfer path
+
Choosing transfer path
## Encryption and Security -Data can be encrypted during Globus file transfers. In some cases encryption cannot be supported by an endpoint, and Globus Online will signal an error. + +Data can be encrypted during Globus file transfers. In some cases, encryption cannot be supported by an endpoint, and Globus Online will signal an error. For more information, see [How does Globus ensure my data is secure?](https://docs.globus.org/faq/security/#how_does_globus_ensure_my_data_is_secure) @@ -240,32 +227,34 @@ Alternatively, you can encrypt the files before transfer using any method on you **Note:** Encryption and verification will slow down the data transfer. ## FAQs -### General FAQs: + +### General FAQs: + **1. What is the Eagle file system?** -They are Lustre file systems residing on an HPE ClusterStor E1000 platform equipped with 100 Petabytes of usable capacity across 8480 disk drives. Each ClusterStor platform also provides 160 Object Storage Targets and 40 Metadata Targets with an aggregate data transfer rate of 650GB/s. - -**2. What is the difference between a Guest, Shared, and Mapped collection?** +They are Lustre file systems residing on an HPE ClusterStor E1000 platform equipped with 100 Petabytes of usable capacity across 8480 disk drives. Each ClusterStor platform also provides 160 Object Storage Targets and 40 Metadata Targets with an aggregate data transfer rate of 650GB/s. + +**2. What is the difference between a Guest, Shared, and Mapped collection?** - Guest collections: A Guest collection is a logical construct that a PI sets up on their project directory in Globus that makes it accessible to collaborators. The PI creates a guest collection at or below their project and shares it with the Globus account holders. - Shared collection: A guest collection becomes a shared collection when it is shared with a user/group. - Mapped Collections: Mapped Collections are created by the endpoint administrators. In the case of Eagle, these are created by ALCF. 
- + **3. Who can create Guest collections?** -ONLY a project PI (or project owner) can create guest collections and make them accessible to collaborators. +ONLY a project PI (or project owner) can create guest collections and make them accessible to collaborators. + +Project Proxy (on the POSIX side) or Access Manager (on the Globus side) do not have the ability to create guest collections. -Project Proxy (on the POSIX side) or Access Manager (on the Globus side) do not have the ability to create guest collections. - -**4. Who is an Access Manager?** +**4. Who is an Access Manager?** -Access Manager is someone who can act as a Proxy on behalf of the PI to manage the collection. The Access Manager has the ability to add users, remove users, grant or revoke read/write access privileges for those users on that particular guest collection. However, Access Managers DO NOT have permissions to create guest collections. +Access Manager is someone who can act as a Proxy on behalf of the PI to manage the collection. The Access Manager has the ability to add users, remove users, grant or revoke read/write access privileges for those users on that particular guest collection. However, Access Managers DO NOT have permissions to create guest collections. -**5. What are Groups?** +**5. What are Groups?** -Groups are constructs that enable multi-user data collaboration. A PI (and an Access Manager) can create new groups, add members to them and share a guest collection with a group of collaborators. +Groups are constructs that enable multi-user data collaboration. A PI (and an Access Manager) can create new groups, add members to them, and share a guest collection with a group of collaborators. -**Note** Members of groups do not need to have ALCF accounts. +**Note:** Members of groups do not need to have ALCF accounts. **6. What are some of the Common Errors you see and what do they mean?** @@ -274,47 +263,48 @@ Groups are constructs that enable multi-user data collaboration. 
A PI (and an Ac - PermissionDenied - If you do not have permissions to view or modify the collection on - ServiceUnavailable - If the service is down for maintenance ``` - + ### PI FAQs: -**1. How can a PI request for a data-only, Eagle storage allocation?** -A project PI can request an allocation by filling out the Director’s Discretionary Allocation Request form: [Request an allocation](https://my.alcf.anl.gov/accounts/#/allocationRequests). The allocations committee reviews the proposals and provides its decision in 1-2 weeks. +**1. How can a PI request for a data-only, Eagle storage allocation?** + +A project PI can request an allocation by filling out the Director’s Discretionary Allocation Request form: [Request an allocation](https://my.alcf.anl.gov/accounts/#/allocationRequests). The allocations committee reviews the proposals and provides its decision in 1-2 weeks. **2. Does a PI need to have an ALCF account to create a Globus guest collection?** -Yes. The PI needs to have an 'active' ALCF account in place to create and share guest collections with collaborators. +Yes. The PI needs to have an 'active' ALCF account in place to create and share guest collections with collaborators. + +- PIs with an "Inactive/Deleted" ALCF account should submit a reactivation request by filling out this form: [Re-activation Form](https://my.alcf.anl.gov/accounts/#/accountReactivate) +- PIs without an ALCF account should submit an ALCF account request by filling out this form: [Account Request Form](https://my.alcf.anl.gov/accounts/#/accountRequest) -- PIs with an "Inactive/Deleted" ALCF account, should submit a reactivation request by filling out this form: [Re-activation Form](https://my.alcf.anl.gov/accounts/#/accountReactivate) -- PIs without an ALCF account should submit an ALCF account request by filling out this form: [Account Request Form](https://my.alcf.anl.gov/accounts/#/accountRequest) - **3. 
What endpoint should the PI use?** -```alcf#dtn_eagle``` (project on Eagle) +`alcf#dtn_eagle` (project on Eagle) **4. What are the actions a PI can perform?** - Create and delete guest collections, groups -- Create, delete and share the data with ALCF users and external collaborators +- Create, delete, and share the data with ALCF users and external collaborators - Specify someone as a Proxy (Access Manager) for the guest collections - Transfer data between the guest collection on Eagle and other Globus endpoints/collections **5. How can a PI specify someone as a Proxy on the Globus side?** -Go to alcf#dtn_eagle (or alcf#dtn_eagle) -> collections -> shared collection -> roles -> select 'Access Manager' +Go to alcf#dtn_eagle -> collections -> shared collection -> roles -> select 'Access Manager'
![Roles](files/roles.png){ width="700" }
To specify someone as a Proxy, click on "Roles"
- +
![Proxy](files/proxy.png){ width="700" }
Choose Access Manager and "Add Role"
-**6. What is the high-level workflow for setting up a guest collection?** +**6. What is the high-level workflow for setting up a guest collection?** -1. PI requests a compute or data-only allocation project. +1. PI requests a compute or data-only allocation project. 2. Once the request is approved, ALCF staff sets up a project, unixgroup, and project directory. 3. A Globus sharing policy is created for the project with appropriate access controls, provided the PI has an active ALCF account. 4. PI creates a guest collection for the project, using the Globus mapped collection for the file system (alcf#dtn_eagle) @@ -326,11 +316,11 @@ Go to alcf#dtn_eagle (or alcf#dtn_eagle) -> collections -> shared collection -> **7. How can project members with ALCF accounts access the project directory via Globus?** - Users that have active ALCF accounts and are part of the project in the ALCF Account and Project Management system will automatically have access to the project directory which they can access by browsing the Globus endpoint ```alcf#dtn_eagle``` . If they want to access the files using the Globus guest collection set up by the PI, the PI will need to explicitly give them permissions to that guest collection. The purpose of Globus guest collections is to share the data with collaborators that don't have ALCF accounts or are not part of the project in the ALCF Account and Project Management system. +Users that have active ALCF accounts and are part of the project in the ALCF Account and Project Management system will automatically have access to the project directory which they can access by browsing the Globus endpoint `alcf#dtn_eagle`. If they want to access the files using the Globus guest collection set up by the PI, the PI will need to explicitly give them permissions to that guest collection. 
The purpose of Globus guest collections is to share the data with collaborators that don't have ALCF accounts or are not part of the project in the ALCF Account and Project Management system. **8. Who has the permissions to create a guest collection?** -Only the PI has the ability to create a guest collection. The Access Manager, along with the PI, has permissions to share it with collaborators (R-only or R-W permissions as needed). +Only the PI has the ability to create a guest collection. The Access Manager, along with the PI, has permissions to share it with collaborators (R-only or R-W permissions as needed). **9. I am the project PI. Why do I see a "Permission Denied" error when I try to CREATE a guest collection?** @@ -341,7 +331,8 @@ If you are a PI and you see this error, it could mean that a sharing policy for No, project proxies cannot create guest collections, only the PI can. **11. Who can create groups?** -A PI (and an Access Manager) can create new groups, add members to them and share a guest collection with a group of collaborators. For more information, refer to: [Creating a group](#Creating-a-group) + +A PI (and an Access Manager) can create new groups, add members to them, and share a guest collection with a group of collaborators. For more information, refer to: [Creating a group](#Creating-a-group) **12. What happens when the PI of a project changes? What happens to the guest collection endpoint?** @@ -349,20 +340,21 @@ The new PI will need to create new guest collections and share it with collabora **13. I noticed that I am the owner of all the files that were transferred by external collaborators using the guest collection. Why is that?** -When collaborators read files from or write files to the guest collection, they do so on behalf of the PI. All writes show up as having been carried by the PI. 
Additionally, if the PI does not have permission to read or write to a file or folder in the directory, then the collaborators will not have those permissions either. +When collaborators read files from or write files to the guest collection, they do so on behalf of the PI. All writes show up as having been carried by the PI. Additionally, if the PI does not have permission to read or write to a file or folder in the directory, then the collaborators will not have those permissions either. **14. What happens to the guest collections when the PI's account goes inactive?** -The collections goes inactive and will remain in that state until the PI's account is re-activated. +The collections go inactive and will remain in that state until the PI's account is re-activated. **15. How long does it take for the endpoint to become accessible to collaborators after a PI's account is re-activated?** Right away. The page needs to be refreshed and sometimes you may have to log out and log back in. ### Access Manager FAQs: + **1. What are the actions an Access Manager can perform?** - Access Manager should be able to see the collection under "Shared with you" and "Shareable by you" tabs. They have permissions to add and/or delete collaborators on the shared collection and restrict their R-W access as needed. +Access Manager should be able to see the collection under "Shared with you" and "Shareable by you" tabs. They have permissions to add and/or delete collaborators on the shared collection and restrict their R-W access as needed. **2. Does an Access Manager need to have an ALCF account?** @@ -371,14 +363,14 @@ Not necessary. However, if they need to manage the membership on the POSIX side **3. 
What is the difference between an ALCF project Proxy and a guest collection Access Manager?** An ALCF Project Proxy has permissions to manage project membership on the POSIX side whereas a guest collection Access Manager has permissions to manage the project membership specific to that guest collection, created by the PI, on the Globus side. - -**4. I am an 'Access Manager' on the collection. Why do I see a 'Permission Denied' error when I try to SHARE a guest collection created by the PI?** + +**4. I am an 'Access Manager' on the collection. Why do I see a 'Permission Denied' error when I try to SHARE a guest collection created by the PI?** If you are a non-PI who is able to access the guest collection but unable to share it, it means that your role on this guest collection is limited to a "Member". If you want the ability to share folders and sub-folders from the collections that are shared with you, please talk to the PI. They will need to set your role to an "Access Manager" for the collection within Globus. **5. Can an Access Manager give external collaborators access to the collections that are shared with them?** -Yes, an Access Manager will see "Permissions" tab at the top of the shared collection page and can share it with collaborators and/or a group. +Yes, an Access Manager will see the "Permissions" tab at the top of the shared collection page and can share it with collaborators and/or a group. **6. Can an Access Manager create collections using the shared endpoint?** @@ -386,17 +378,18 @@ No. An access manager cannot create a collection, only a PI can do that. The acc **7. Can an Access Manager leave a globus group or withdraw membership request for collaborators?** -Yes.[Go to alcf#dtn_eagle -> Groups > group_name -> Members -> click on specific user -> Role & Status -> Set the appropriate status] +Yes. [Go to alcf#dtn_eagle -> Groups > group_name -> Members -> click on specific user -> Role & Status -> Set the appropriate status]
![Permission denied](files/roles.png){ width="700" } -
If you get thie error, you do not have read permissions.
+
If you get this error, you do not have read permissions.
**8. Can an Access Manager delete guest collections created by PI?** + No. Access managers cannot delete guest collections. -### Guest Collection Collaborators FAQs: +### Guest Collection Collaborators FAQs: **1. What actions can collaborators perform?** @@ -406,15 +399,13 @@ No. Access managers cannot delete guest collections. *If the PI has read permissions for those files on the POSIX side and the collaborator is given read permissions in Globus for the guest collection. -**If the PI has write permissions for those files on the POSIX side and the collaborator is given write permissions in Globus for the guest collection. +**If the PI has write permissions for those files on the POSIX side and the collaborator is given write permissions in Globus for the guest collection. **2. I am a collaborator. Why do I see a 'Permission Denied' error when I try to ACCESS a guest collection created by the PI?** -If you are a non-PI and you see this error while trying to access the collection, it means that you do not have read permissions to access the quest collection. Please contact the PI for required access. +If you are a non-PI and you see this error while trying to access the collection, it means that you do not have read permissions to access the guest collection. Please contact the PI for required access.
![Permission denied](files/roles.png){ width="1000" } -
If you get thie error, you do not have read permissions.
+
If you get this error, you do not have read permissions.
- - diff --git a/docs/data-management/acdc/transferring-data-to-eagle.md b/docs/data-management/acdc/transferring-data-to-eagle.md index e941812469..278047de17 100644 --- a/docs/data-management/acdc/transferring-data-to-eagle.md +++ b/docs/data-management/acdc/transferring-data-to-eagle.md @@ -1,6 +1,8 @@ # Transferring Data to Eagle + ## Evolution of the Petrel Data Service to the ALCF Community Data Co-Op -The Petrel data service is evolving into a more mature service called the ALCF Community Data Co-Op (ACDC) which will be launched later this year. + +The Petrel data service is evolving into a more mature service called the ALCF Community Data Co-Op (ACDC), which will be launched later this year. In preparation for this shift, all current Petrel project PIs will need to move their project data to ALCF's Eagle filesystem by December 2021. @@ -9,31 +11,38 @@ For detailed instructions on how to move your data, please follow the steps outl If you have any questions, please email: [support@alcf.anl.gov](mailto:support@alcf.anl.gov). ## Transferring data to Eagle + ### 1. Request a DD project on Eagle Filesystem -All Petrel project owners/PIs should request for a Director's Discretionary project on the Eagle filesystem by filling out the form at [https://my.alcf.anl.gov/accounts/#/allocationRequests](https://my.alcf.anl.gov/accounts/#/allocationRequests). Select "New Project" and then "Eagle" as the resource and fill out the rest of the form. In the "Project and Justification Summary" section, along with the requested details you should also state that you are migrating your data from Petrel. -Once the submission is reviewed and approved by the allocations committee, your project will be created on the Eagle filesystem and you will be notified via email. The approval process may take 1-2 weeks. Once the project is approved, proceed to the next step. 
+All Petrel project owners/PIs should request a Director's Discretionary project on the Eagle filesystem by filling out the form at [https://my.alcf.anl.gov/accounts/#/allocationRequests](https://my.alcf.anl.gov/accounts/#/allocationRequests). Select "New Project" and then "Eagle" as the resource and fill out the rest of the form. In the "Project and Justification Summary" section, along with the requested details, you should also state that you are migrating your data from Petrel. + +Once the submission is reviewed and approved by the allocations committee, your project will be created on the Eagle filesystem, and you will be notified via email. The approval process may take 1-2 weeks. Once the project is approved, proceed to the next step. ### 2. Apply for an ALCF account + A project PI will need an active ALCF account to: + - Transfer their data from Petrel to the Eagle filesystem - Enable data sharing on their Eagle project (See section "4 Share your data on Eagle using Globus Guest Collections" for more details) **NOTE:** A collaborator does not need an ALCF account to access data that is shared on Eagle (as a Globus Guest Collection). They can sign into Globus with their institutional identity to access the data. The first time they log in, they will need to accept terms and conditions. #### To apply for an ALCF account: + - Visit [https://my.alcf.anl.gov/](https://my.alcf.anl.gov/) and click on "Request An Account". 
-- When prompted for project name, please select the project on Eagle that was created for your Petrel data as a result of Step 1: Request a DD project on Eagle (you have to wait for your project to be created before you can apply for an account) - - If you don't have one, please follow the directions under "Step 1: Request a DD project on Eagle" (above) - - For more details on the ALCF account request process, visit the webpage Request an account -- Once your account is created and you have the cryptocard/mobile token to login to Eagle, proceed to the next step to transfer the data from Petrel to Eagle +- When prompted for project name, please select the project on Eagle that was created for your Petrel data as a result of Step 1: Request a DD project on Eagle (you have to wait for your project to be created before you can apply for an account). + - If you don't have one, please follow the directions under "Step 1: Request a DD project on Eagle" (above). + - For more details on the ALCF account request process, visit the webpage Request an account. +- Once your account is created and you have the cryptocard/mobile token to log in to Eagle, proceed to the next step to transfer the data from Petrel to Eagle. ### 3. Transfer data from your source endpoint to Eagle using Globus -You can use the Globus web app to transfer data or the CLI. See [Using CLI](#Using-Globus-CLI-tool) for instructions on how to use the CLI to transfer data. The following set of instructions use the Globus web app, using **alcf#dtn_eagle (path /projectname)** as the destination to transfer data from your source endpoint. + +You can use the Globus web app to transfer data or the CLI. See [Using CLI](#Using-Globus-CLI-tool) for instructions on how to use the CLI to transfer data. The following set of instructions uses the Globus web app, using **alcf#dtn_eagle (path /projectname)** as the destination to transfer data from your source endpoint. 
**NOTE:** Anonymous HTTPS read access is enabled on Eagle. -**Step 1:** Log into [https://app.globus.org/file-manager?destination_id=05d2c76a-e867-4f67-aa57-76edeb0beda0](https://app.globus.org/file-manager?destination_id=05d2c76a-e867-4f67-aa57-76edeb0beda0) which opens two panes in the Globus File Manager, with ALCF Eagle on the right-hand side. +**Step 1:** Log into [https://app.globus.org/file-manager?destination_id=05d2c76a-e867-4f67-aa57-76edeb0beda0](https://app.globus.org/file-manager?destination_id=05d2c76a-e867-4f67-aa57-76edeb0beda0), which opens two panes in the Globus File Manager, with ALCF Eagle on the right-hand side. + - Enter the name of your source endpoint in the pane on the left-hand side.
@@ -91,51 +100,56 @@ You can use the Globus web app to transfer data or the CLI. See [Using CLI](#Usi
#### Migrating permissions from Petrel to Eagle: -For PIs who had previously stored data on Petrel, and are migrating to Eagle, the following tool automates the step of copying the permissions set on Petrel to Eagle. The tool, migrate_permissions.py at [https://github.com/globus/globus-tool-examples](https://github.com/globus/globus-tool-examples) takes the source endpoint (your shared endpoint on Petrel in this case), and destination endpoint (the guest collection on Eagle that has the data), and copies over all the permissions. The tool assumes the data was coped over as is from source to destination. -If you have any questions on the tool, or need further support, please contact [support@globus.org](mailto:support@globus.org). +For PIs who had previously stored data on Petrel and are migrating to Eagle, the following tool automates the step of copying the permissions set on Petrel to Eagle. The tool, `migrate_permissions.py` at [https://github.com/globus/globus-tool-examples](https://github.com/globus/globus-tool-examples), takes the source endpoint (your shared endpoint on Petrel in this case) and destination endpoint (the guest collection on Eagle that has the data) and copies over all the permissions. The tool assumes the data was copied over as is from source to destination. + +If you have any questions on the tool or need further support, please contact [support@globus.org](mailto:support@globus.org). ### 4. Share your data on Eagle using Globus Guest Collections -Your data on the Eagle file system can easily be shared with collaborators who are at ALCF or elsewhere. You have full control over which files your collaborator can access, and whether they have read-only or read-write permissions. + +Your data on the Eagle file system can easily be shared with collaborators who are at ALCF or elsewhere. You have full control over which files your collaborator can access and whether they have read-only or read-write permissions. 
See below for step-by-step instructions on how to share data from Eagle using Globus Guest Collections: [https://docs.alcf.anl.gov/data-management/acdc/eagle-data-sharing/](../acdc/eagle-data-sharing.md) -**NOTE:** Guest Collections are tied to the project PI's account so if the PI's account becomes inactive, the Guest Collections will also become inactive. Once the PI's account is reactivated, access to the Guest Collections is restored. +**NOTE:** Guest Collections are tied to the project PI's account, so if the PI's account becomes inactive, the Guest Collections will also become inactive. Once the PI's account is reactivated, access to the Guest Collections is restored. #### Using Globus CLI tool: + To copy data and permissions from a source collection, PIs can use a Globus CLI tool that automates the step of copying the permissions set on the source collection and applies them to the collection on Eagle. This is especially useful for PIs who had previously stored data on Petrel. See [https://github.com/globus/globus-tool-examples](https://github.com/globus/globus-tool-examples) for more information. -The tool, migrate_permissions.py in the github repo takes the source endpoint (the shared endpoint on Petrel for example), and destination endpoint (the guest collection on Eagle that has the data), and copies over all the permissions. The tool assumes the data was coped over as is from source to destination. Note that you need to have a guest collection set up for your project on Eagle to use the CLI command and tool. See this page for instructions on [how to set up guest collections](eagle-data-sharing.md#Creating-a-Guest-Collection). +The tool, `migrate_permissions.py` in the GitHub repo, takes the source endpoint (the shared endpoint on Petrel, for example) and destination endpoint (the guest collection on Eagle that has the data) and copies over all the permissions. The tool assumes the data was copied over as is from source to destination. 
Note that you need to have a guest collection set up for your project on Eagle to use the CLI command and tool. See this page for instructions on [how to set up guest collections](eagle-data-sharing.md#Creating-a-Guest-Collection). -If you have any questions on the tool, or need further support, please contact [support@globus.org](mailto:support@globus.org). +If you have any questions on the tool or need further support, please contact [support@globus.org](mailto:support@globus.org). **Existing data portals:** -To reconfigure and update your existing data portals to point to your guest collections on Eagle, please work directly with developer/maintainer of the portal. + +To reconfigure and update your existing data portals to point to your guest collections on Eagle, please work directly with the developer/maintainer of the portal. ## FAQs for migrating Petrel data to Eagle: + #### 1. Is it important for a Petrel project owner/PI to obtain an ALCF account? Yes, the data from Petrel needs to be moved to an ALCF project directory on the Eagle filesystem. The PI will need an ALCF account to log into Globus and move the data to their Eagle project directory. -#### 2. What is the workflow for migrating data from Petrel and giving access to collaborators on Eagle? +#### 2. What is the workflow for migrating data from Petrel and giving access to collaborators on Eagle? -1. PI requests an Eagle allocation project -2. Allocations Committee reviews and approves requests -3. Once the allocation request is approved, the project is created and associated with a UNIX group and project directory on Eagle -4. PI requests an ALCF account (if they don't have one) -5. Once the ALCF account is created and tied to the project on Eagle, the PI moves the data from Petrel to Eagle using Globus +1. PI requests an Eagle allocation project. +2. Allocations Committee reviews and approves requests. +3. 
Once the allocation request is approved, the project is created and associated with a UNIX group and project directory on Eagle. +4. PI requests an ALCF account (if they don't have one). +5. Once the ALCF account is created and tied to the project on Eagle, the PI moves the data from Petrel to Eagle using Globus. 6. PI creates guest collections for the project on Eagle, using the Globus web app using the mapped collection/endpoint for Eagle (alcf#dtn_eagle). Note that: - - The PI needs to have an active ALCF Account and will need to log in to Globus using their ALCF credentials - - Only the PI (and not a proxy) can create guest collections - - If the PI already has a Globus account, it needs to be linked to their ALCF account -7. PI adds collaborators to the guest collection. - - Added with read-only or read-write permissions. - - **Note:** Anonymous HTTPS write is disabled and only anonymous HTTPS read is allowed. -8. Existing data portals on Petrel should be updated to point to the new guest collection on Eagle. Please work directly with developer/maintainer of the portal. + - The PI needs to have an active ALCF Account and will need to log in to Globus using their ALCF credentials. + - Only the PI (and not a proxy) can create guest collections. + - If the PI already has a Globus account, it needs to be linked to their ALCF account. +7. PI adds collaborators to the guest collection. + - Added with read-only or read-write permissions. + - **Note:** Anonymous HTTPS write is disabled, and only anonymous HTTPS read is allowed. +8. Existing data portals on Petrel should be updated to point to the new guest collection on Eagle. Please work directly with the developer/maintainer of the portal. -#### 3. What endpoints should the PI use to move data from Petrel?** +#### 3. What endpoints should the PI use to move data from Petrel? 
- Source: Globus endpoint on Petrel for the Petrel allocation -- Destination: Globus endpoint on the Eagle filesystem and the path to the directory (alcf#dtn_eagle, path /) OR the name of the guest collection on Eagle +- Destination: Globus endpoint on the Eagle filesystem and the path to the directory (alcf#dtn_eagle, path /) OR the name of the guest collection on Eagle \ No newline at end of file diff --git a/docs/data-management/data-transfer/sftp-scp.md b/docs/data-management/data-transfer/sftp-scp.md index 9977d79ff8..019a865e99 100644 --- a/docs/data-management/data-transfer/sftp-scp.md +++ b/docs/data-management/data-transfer/sftp-scp.md @@ -1,4 +1,5 @@ # SFTP and SCP + These standard utilities are available for local area transfers of small files; they are not recommended for use with large data transfers due to poor performance and excess resource utilization on the login nodes. -See [Globus](using-globus.md) for performing large data transfers. +See [Globus](using-globus.md) for performing large data transfers. \ No newline at end of file diff --git a/docs/data-management/data-transfer/using-globus.md b/docs/data-management/data-transfer/using-globus.md index 610d183b05..6fef48a412 100644 --- a/docs/data-management/data-transfer/using-globus.md +++ b/docs/data-management/data-transfer/using-globus.md @@ -1,28 +1,32 @@ # Using Globus -[Globus](http://www.globus.org/) addresses the challenges faced by researchers in moving, sharing, and archiving large volumes of data among distributed sites. With Globus, you hand off data movement tasks to a hosted service that manages the entire operation. It monitors performance and errors, retries failed transfers, corrects problems automatically whenever possible, and reports status to keep you informed and keep you focused on your research. -Command line and Web-based interfaces are available. 
The command line interface, which requires only ssh to be installed on the client, is the method of choice for script-based workflows. Globus also provides a [REST-style transfer API](https://docs.globus.org/api/transfer/) for advanced-use cases that require scripting and automation. +[Globus](http://www.globus.org/) addresses the challenges faced by researchers in moving, sharing, and archiving large volumes of data among distributed sites. With Globus, you hand off data movement tasks to a hosted service that manages the entire operation. It monitors performance and errors, retries failed transfers, corrects problems automatically whenever possible, and reports status to keep you informed and focused on your research. + +Command line and web-based interfaces are available. The command line interface, which requires only SSH to be installed on the client, is the method of choice for script-based workflows. Globus also provides a [REST-style transfer API](https://docs.globus.org/api/transfer/) for advanced use cases that require scripting and automation. ## Getting Started -Basic documentation for getting started with Globus can be found at the following URL: + +Basic documentation for getting started with Globus can be found at the following URL: [https://docs.globus.org/how-to/](https://docs.globus.org/how-to/) ## Data Transfer Node + Several data transfer nodes (DTNs) for `/home`, Eagle, Grand, and HPSS are available to ALCF users, allowing users to perform wide and local area data transfers. Access to the DTNs is provided via the following Globus endpoints. ## ALCF Globus Endpoints -The Globus endpoint and the path to use depends on where your data resides. If your data is on: -- `/home` which is where your home directory resides: `alcf#dtn_home` for accessing `/home` (i.e. home directories on agile-home filesystem). Use the path `/` +The Globus endpoint and the path to use depend on where your data resides. 
If your data is on: + +- `/home`, which is where your home directory resides: `alcf#dtn_home` for accessing `/home` (i.e., home directories on the agile-home filesystem). Use the path `/` - HPSS: `alcf#dtn_hpss` -- Eagle filesystem: `alcf#dtn_eagle` for accessing /`lus/eagle/projects` or `/eagle` (i.e project directories on Eagle filesystem). Use the path `/eagle/` -- Grand filesystem: `alcf#dtn_grand` for accessing `/lus/grand/projects` or `/grand` (i.e. project directories on Grand filesystem). Use the path `/grand/` +- Eagle filesystem: `alcf#dtn_eagle` for accessing `/lus/eagle/projects` or `/eagle` (i.e., project directories on the Eagle filesystem). Use the path `/eagle/` +- Grand filesystem: `alcf#dtn_grand` for accessing `/lus/grand/projects` or `/grand` (i.e., project directories on the Grand filesystem). Use the path `/grand/` After [registering](https://app.globus.org/), simply use the appropriate ALCF endpoint, as well as other sources or destinations. Use your ALCF credentials (your OTP generated by the CryptoCARD token with PIN or Mobilepass app) to activate the ALCF endpoint. -[Globus Connect Personal](https://www.globus.org/globus-connect-personal) allows users to add laptops or desktops as an endpoint to Globus, in just a few steps. After you set up Globus Connect Personal, Globus can be used to transfer files to and from your computer. +[Globus Connect Personal](https://www.globus.org/globus-connect-personal) allows users to add laptops or desktops as an endpoint to Globus in just a few steps. After you set up Globus Connect Personal, Globus can be used to transfer files to and from your computer. 
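
The endpoint list above can be read as a lookup from path prefix to endpoint name. The following is a minimal sketch of that lookup, not an ALCF-provided tool: the function name `endpoint_for` and the longest-prefix matching rule are assumptions made for illustration, and HPSS is left out because the list gives no path prefix for it.

```python
# Path prefixes and endpoint names transcribed from the endpoint list above.
# The helper name and matching rule are illustrative assumptions.
ENDPOINTS = {
    "/home": "alcf#dtn_home",
    "/lus/eagle/projects": "alcf#dtn_eagle",
    "/eagle": "alcf#dtn_eagle",
    "/lus/grand/projects": "alcf#dtn_grand",
    "/grand": "alcf#dtn_grand",
}


def endpoint_for(path: str) -> str:
    """Return the Globus endpoint name listed for the given absolute path."""
    # Check longer prefixes first so the most specific prefix wins.
    for prefix in sorted(ENDPOINTS, key=len, reverse=True):
        # Match the prefix itself or anything below it, but not "/eaglet".
        if path == prefix or path.startswith(prefix + "/"):
            return ENDPOINTS[prefix]
    raise ValueError(f"no ALCF Globus endpoint listed for {path!r}")
```

For example, `endpoint_for("/eagle/myproject/data")` returns `alcf#dtn_eagle`, while a path outside the listed prefixes raises `ValueError` rather than guessing.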
## References -[Research Data Management with Globus (2019)](https://www.alcf.anl.gov/support-center/training-assets/research-data-management-globus) + +[Research Data Management with Globus (2019)](https://www.alcf.anl.gov/support-center/training-assets/research-data-management-globus) - diff --git a/docs/data-management/filesystem-and-storage/data-storage.md b/docs/data-management/filesystem-and-storage/data-storage.md index 9f2fe27fe3..c0101589e6 100644 --- a/docs/data-management/filesystem-and-storage/data-storage.md +++ b/docs/data-management/filesystem-and-storage/data-storage.md @@ -4,25 +4,25 @@ The ALCF operates a number of file systems that are mounted globally across all of our production systems. ### Home -A Lustre file system residing on a DDN AI-400X NVMe Flash platform. It has 24 NVMe drives with 7 TB each with 123 TB of usable space. It provides 8 Object Storage Targets and 4 Metadata Targets. +A Lustre file system residing on a DDN AI-400X NVMe Flash platform. It has 24 NVMe drives with 7 TB each, providing 123 TB of usable space. It provides 8 Object Storage Targets and 4 Metadata Targets. ### Eagle -A Lustre file system residing on an HPE ClusterStor E1000 platform equipped with 100 Petabytes of usable capacity across 8480 disk drives. This ClusterStor platform provides 160 Object Storage Targets and 40 Metadata Targets with an aggregate data transfer rate of 650GB/s. The primary use of eagle is data sharing with the research community. Eagle has community sharing community capabilities which allow PIs to [share their project data with external collabortors](../acdc/eagle-data-sharing.md) using Globus. Eagle can also be used for compute campaign storage. +A Lustre file system residing on an HPE ClusterStor E1000 platform equipped with 100 Petabytes of usable capacity across 8,480 disk drives. This ClusterStor platform provides 160 Object Storage Targets and 40 Metadata Targets with an aggregate data transfer rate of 650 GB/s. 
The primary use of Eagle is data sharing with the research community. Eagle has community sharing capabilities which allow PIs to [share their project data with external collaborators](../acdc/eagle-data-sharing.md) using Globus. Eagle can also be used for compute campaign storage. -Also see [ALCF Data Policies](../../policies/data-and-software-policies/data-policy.md) and [Data Transfer](../data-transfer/using-globus.md) +Also see [ALCF Data Policies](../../policies/data-and-software-policies/data-policy.md) and [Data Transfer](../data-transfer/using-globus.md). ## Tape Storage -ALCF operates three 10,000 slot Spectralogic tape libraries. We are currently running a combination of LTO6 and LTO8 tape technology. The LTO tape drives have built-in hardware compression which typically achieve compression ratios between 1.25:1 and 2:1 depending on the data yielding an effective capacity of approximately 65PB. +ALCF operates three 10,000-slot Spectralogic tape libraries. We are currently running a combination of LTO6 and LTO8 tape technology. The LTO tape drives have built-in hardware compression which typically achieves compression ratios between 1.25:1 and 2:1 depending on the data, yielding an effective capacity of approximately 65 PB. ## HPSS HPSS is a data archive and retrieval system that manages large amounts of data on disk and robotic tape libraries. It provides hierarchical storage management services that allow it to migrate data between those storage platforms. -HPSS is currently configured with a disk and tape tier. The disk tier has a capacity of 1.2PB on a DataDirect Networks SFA12K-40 storage array. By default, all archived data is initially written to the disk tier. The tape tier consists of 3 SpectraLogic T950 robotic tape libraries containing a total of 72 LTO6 tape drives with total uncompressed capacity 64 PB. Archived data is migrated to the tape tier at regular intervals, then deleted from the disk tier to create space for future archives. 
+HPSS is currently configured with a disk and tape tier. The disk tier has a capacity of 1.2 PB on a DataDirect Networks SFA12K-40 storage array. By default, all archived data is initially written to the disk tier. The tape tier consists of 3 SpectraLogic T950 robotic tape libraries containing a total of 72 LTO6 tape drives with a total uncompressed capacity of 64 PB. Archived data is migrated to the tape tier at regular intervals, then deleted from the disk tier to create space for future archives. -Access to HPSS is provided by various client components. Currently, ALCF supports access through two command-line clients, HSI and HTAR. In order for the client to authenticate with HPSS, the user must have a keytab file that should be located in their home directory under subdirectory .hpss. The file name will be in the format .ktb_. +Access to HPSS is provided by various client components. Currently, ALCF supports access through two command-line clients, HSI and HTAR. In order for the client to authenticate with HPSS, the user must have a keytab file that should be located in their home directory under the subdirectory .hpss. The file name will be in the format .ktb_. ### HSI General Usage -HSI can be invoked by simply entering hsi at your normal shell prompt. Once authenticated, you will enter the hsi command shell environment: +HSI can be invoked by simply entering `hsi` at your normal shell prompt. Once authenticated, you will enter the HSI command shell environment: ``` > hsi @@ -31,16 +31,16 @@ HSI can be invoked by simply entering hsi at your normal shell prompt. Once auth You may enter "help" to display a brief description of available commands. -If archiving from or retrieving to eagle you must disable the Transfer Agent. -T off +If archiving from or retrieving to Eagle, you must disable the Transfer Agent with `-T off`. 
-Example archive +Example archive: ``` [HSI]/home/username-> put mydatafile # same name on HPSS [HSI]/home/username-> put local.file : hpss.file # different name on HPSS [HSI]/home/username-> put -T off mydatafile ``` -Example retrieval +Example retrieval: ``` [HSI]/home/username-> get mydatafile [HSI]/home/username-> get local.file : hpss.file @@ -60,7 +60,7 @@ And organizing your archived files: [HSI]/home/username-> rm dataset1/hpss.file ``` -It may be necessary to use single or double quotes around metacharacters to avoid having the shell prematurely expand them. For example: +It may be necessary to use single or double quotes around metacharacters to avoid having the shell prematurely expand them. For example: ``` [HSI]/home/username-> get *.c @@ -74,10 +74,10 @@ will not work, but will retrieve all files ending in .c. -Following normal shell conventions, other special characters in filenames such as whitespace and semicolon also need to be escaped with "\" (backslash). For example: +Following normal shell conventions, other special characters in filenames such as whitespace and semicolon also need to be escaped with "\" (backslash). For example: ``` - [HSI]/home/username-> get "data\ file\ \;\ version\ 1" +[HSI]/home/username-> get "data\ file\ \;\ version\ 1" ``` retrieves the file named "data file ; version 1". @@ -91,22 +91,21 @@ hsi -O log.file "put local.file" ### HTAR General Usage HTAR is a tar-like utility that creates tar-format archive files directly in HPSS. It can be run as a command line or embedded in a script. -Example archive +Example archive: ``` htar -cf hpssfile.tar localfile1 localfile2 localfile3 ``` -Example retrieval +Example retrieval: ``` htar -xf hpssfile.tar localfile2 ``` -**NOTE:** The current version of HTAR has a 64GB file size limit as well as a path length limit. The recommended client is HSI. +**NOTE:** The current version of HTAR has a 64 GB file size limit as well as a path length limit. The recommended client is HSI. 
### Globus -In addition, HPSS is accessible through the Globus endpoint `alcf#dtn_hpss`. As with HSI and HTAR, you must have a keytab file before using this endpoint. For more information on using Globus, please see [Using Globus]. - +In addition, HPSS is accessible through the Globus endpoint `alcf#dtn_hpss`. As with HSI and HTAR, you must have a keytab file before using this endpoint. For more information on using Globus, please see [Using Globus](../data-transfer/using-globus.md). ## Keytab File Missing If you see an error like this: @@ -117,4 +116,4 @@ If you see an error like this: Error - authentication/initialization failed ``` -it means that your account is not enabled to use the HPSS yet. Please contact support to have it set up. +it means that your account is not enabled to use the HPSS yet. Please contact support to have it set up. \ No newline at end of file diff --git a/docs/data-management/filesystem-and-storage/disk-quota.md b/docs/data-management/filesystem-and-storage/disk-quota.md index 88d8a8e6ef..9a6ca6d8ba 100644 --- a/docs/data-management/filesystem-and-storage/disk-quota.md +++ b/docs/data-management/filesystem-and-storage/disk-quota.md @@ -1,25 +1,36 @@ # Disk Quota + ## Overview -Disk quotas are enabled on project directories. ALCF's HPC systems use the agile-home file system located at `/lus/agile/home` where quotas are also enforced. Details on the home file system are listed in [file systems](file-systems.md). Following are descriptions and examples for the home file system, as well as the Eagle project filesystems. + +Disk quotas are enabled on project directories. ALCF's HPC systems use the agile-home file system located at `/lus/agile/home`, where quotas are also enforced. Details on the home file system are listed in [file systems](file-systems.md). Below are descriptions and examples for the home file system, as well as the Eagle project filesystems. 
## Home Directory Quotas -By default, each home directory is assigned a default of 50GB. File ownership determines disk space usage. + +By default, each home directory is assigned a quota of 50GB. File ownership determines disk space usage. To check the home directory usage, enter this command: + +```bash +myquota +``` + ``` -> myquota Name Type Filesystem Used Quota Grace ========================================================================================================= userX User /lus/agile 44.13G 50.00G none ``` ## Project Directory Quotas -The amount of data stored under `/lus/grand/projects/PROJECT_NAME` cannot exceed the approved project quota limit approved during the allocation period. The total data usage under the project directory is used to calculate the disk quota. + +The amount of data stored under `/lus/grand/projects/PROJECT_NAME` cannot exceed the approved project quota limit set during the allocation period. The total data usage under the project directory is used to calculate the disk quota. To check project quota usage on the file systems, enter this command: + +```bash +myprojectquotas +``` + ``` -> myprojectquotas - Lustre : Current Project Quota information for projects you're a member of: Name Type Filesystem Used Quota Grace @@ -28,9 +39,11 @@ projectX Project eagle 1.87T 10 ``` ## Requesting a New Eagle Allocation -For requesting a new project having an allocation on Eagle (with or without a compute allocation), please make a request by filling out the [Director's Discretionary allocation form](https://my.alcf.anl.gov/accounts/#/allocationRequests). Note that all new compute projects will have the default file system. + +To request a new project with an allocation on Eagle (with or without a compute allocation), please fill out the [Director's Discretionary allocation form](https://my.alcf.anl.gov/accounts/#/allocationRequests). Note that all new compute projects will have the default file system. 
## Quota Increases -If you need a quota increase for Director's Discretionary allocations, please make a request by filling out the [Director's Discretionary allocation form](https://my.alcf.anl.gov/accounts/#/allocationRequests). -If you need a quota increase for your INCITE/ALCC/ALCC/ESP project directory, please send an email to [support@alcf.anl.gov](mailto:support@alcf.anl.gov) with the machine, project name, new quota amount and reason for the increase. +If you need a quota increase for Director's Discretionary allocations, please fill out the [Director's Discretionary allocation form](https://my.alcf.anl.gov/accounts/#/allocationRequests). + +If you need a quota increase for your INCITE/ALCC/ESP project directory, please send an email to [support@alcf.anl.gov](mailto:support@alcf.anl.gov) with the machine, project name, new quota amount, and reason for the increase. \ No newline at end of file diff --git a/docs/data-management/filesystem-and-storage/file-systems.md b/docs/data-management/filesystem-and-storage/file-systems.md index 825ea771cf..96e206fc0c 100644 --- a/docs/data-management/filesystem-and-storage/file-systems.md +++ b/docs/data-management/filesystem-and-storage/file-systems.md @@ -1,63 +1,63 @@ # ALCF File Systems -Our HPC systems store project data in a file system called Eagle. -Eagle is a Lustre file system mounted as` /eagle`. -For more information on the Lustre file system, here is a document on Lustre File Striping Basics. + +Our HPC systems store project data in a file system called Eagle. Eagle is a Lustre file system mounted as `/eagle`. For more information on the Lustre file system, here is a document on Lustre File Striping Basics. 
* [Lustre File Striping Basics](https://www.alcf.anl.gov/support-center/training-assets/file-systems-and-io-performance) For information on the AI Testbed storage systems, refer to the AI Testbed storage page: [https://argonne-lcf.github.io/ai-testbed-userdocs/common/storage/](https://argonne-lcf.github.io/ai-testbed-userdocs/common/storage/) -Our HPC systems also share a Lustre home file system, called agile-home. The home file system is mounted as /home, and should generally be used for small files and any binaries to be run on Polaris. The performance of this file system is reasonable, but using it for intensive I/O from the compute nodes is discouraged because I/O from the compute nodes uses the project data file systems, which are fast parallel systems and have far more storage space and greater I/O performance than the home directory space. +Our HPC systems also share a Lustre home file system, called agile-home. The home file system is mounted as `/home` and should generally be used for small files and any binaries to be run on Polaris. The performance of this file system is reasonable, but using it for intensive I/O from the compute nodes is discouraged because I/O from the compute nodes uses the project data file systems, which are fast parallel systems and have far more storage space and greater I/O performance than the home directory space. The agile-home file system is regularly backed up to tape. The data file system is not backed up. It is the user’s responsibility to ensure that copies of any critical data on the data file system have either been archived to tape or stored elsewhere. 
| Name | Accessible From | Type | Path | Production | Backed-up | Usage |
-|--------------------------------------|----------|--------|---------------------------------------------------------------------------------------|-----------------------------------------------|-----------|------------------------------------------------------------------------|
-| agile-home | Polaris | Lustre | /home or /lus/agile/home | Yes | Yes | General use |
-| Eagle | Polaris | Lustre | /eagle or /lus/eagle/projects | Yes | No | Community sharing via Globus; Intensive job output, large files |
-| Node SSD (Compute node only) | Polaris | xfs | /local/scratch (Polaris) | Yes | No | Local node scratch during run |
+|--------------------------------------|-----------------|--------|---------------------------------------------------------------------------------------|-----------------------------------------------|-----------|------------------------------------------------------------------------|
+| agile-home | Polaris | Lustre | /home or /lus/agile/home | Yes | Yes | General use |
+| Eagle | Polaris | Lustre | /eagle or /lus/eagle/projects | Yes | No | Community sharing via Globus; Intensive job output, large files |
+| Node SSD (Compute node only) | Polaris | xfs | /local/scratch (Polaris) | Yes | No | Local node scratch during run |
## Available Directories + ### Home Directories + - Created when an account is created - Located under /home - Each home directory is subject to a quota based on user file ownership. The default quota is 50 GB #### Sharing Home Directory Files or Subdirectories with Others -If you need to share files or subdirectories (folders) under your home directory with collaborators (other ALCF users), you need to change file permissions from their defaults. You must change permissions of your top-level /home/username directory, even if you only want to share certain files/directories within it. Using normal linux file permissions control is good enough to give access to *all* other users, and is simple. For more fine-grained control over specific users, you need to use linux access control list (ACL) commands. +If you need to share files or subdirectories (folders) under your home directory with collaborators (other ALCF users), you need to change file permissions from their defaults. You must change permissions of your top-level `/home/username` directory, even if you only want to share certain files/directories within it. Using normal Linux file permissions control is good enough to give access to *all* other users and is simple. For more fine-grained control over specific users, you need to use Linux access control list (ACL) commands. ##### Simple Method: Permission to All Users -First, a one-time-only change to your top-level /home/username directory. +First, a one-time-only change to your top-level `/home/username` directory. -``` +```bash chmod o+x /home/username ``` -Then you may permission individual files and/or subdirectories with read access. 
For example, to recursively change permissions on /home/username/subdirectoryname so that all files in that subdirectory and any subdirectory trees within it are world-readable, you would use +Then you may permission individual files and/or subdirectories with read access. For example, to recursively change permissions on `/home/username/subdirectoryname` so that all files in that subdirectory and any subdirectory trees within it are world-readable, you would use -``` +```bash chmod -R o+Xr /home/username/subdirectoryname ``` ##### Refined Method: Use ACL to Give Permission to Specific Users -First, a one-time-only change to your top-level /home/username directory. To share files/directories with user gilgamesh, for example: +First, a one-time-only change to your top-level `/home/username` directory. To share files/directories with user gilgamesh, for example: -``` +```bash setfacl -m u:gilgamesh:X /home/username ``` -Then you may permission individual files and/or subdirectories with read access. For example, to recursively change permissions on /home/username/subdirectoryname so that all files in that subdirectory and any subdirectory trees within it are readable to user gilgamesh, you would use +Then you may permission individual files and/or subdirectories with read access. For example, to recursively change permissions on `/home/username/subdirectoryname` so that all files in that subdirectory and any subdirectory trees within it are readable to user gilgamesh, you would use -``` +```bash setfacl -R -m u:gilgamesh:rX /home/username/subdirectoryname ``` - - ### Project Directories + - Directories on Eagle are created when an allocation (INCITE, ALCC, Discretionary, etc.) is awarded. Eagle directories can be created as stand-alone allocations. Use the [allocation request form](https://my.alcf.anl.gov/accounts/#/allocationRequests) to submit requests for an allocation on Eagle. 
- Directory paths: - Eagle: /eagle or /lus/eagle/projects @@ -65,9 +65,11 @@ setfacl -R -m u:gilgamesh:rX /home/username/subdirectoryname These project spaces do not have user quotas but a directory quota, meaning that ALL files contained within a project directory, regardless of the username, cannot exceed the disk space allocation granted to the project. For more information on quotas, see the [Disk Quota page](disk-quota.md). ## Local Node SSD + Access to SSDs is enabled by default on Polaris. ### SSD Information + - Local scratch SSD storage on compute nodes for running jobs - Completely local non-parallel filesystem - Located at /local/scratch on Polaris computes @@ -80,10 +82,7 @@ Access to SSDs is enabled by default on Polaris. Model PM1725a drives [specifications](https://semiconductor.samsung.com/resources/brochure/Brochure_Samsung_PM1725a_NVMe_SSD_1805.pdf) | Model PM1725a drives | ------- | -| ------ |-----------------| -| Capacity | 1.6 TB | -| Sequential | Read 3300 MB/s | -| Sequential | Write 3300 MB/s | - - - +| ---------------------|-----------------| +| Capacity | 1.6 TB | +| Sequential Read | 3300 MB/s | +| Sequential Write | 3300 MB/s | \ No newline at end of file diff --git a/docs/data-management/filesystem-and-storage/hpss.md b/docs/data-management/filesystem-and-storage/hpss.md index 95f5e52654..6a3917de86 100644 --- a/docs/data-management/filesystem-and-storage/hpss.md +++ b/docs/data-management/filesystem-and-storage/hpss.md @@ -3,14 +3,14 @@ HPSS is a data archive and retrieval system that manages large amounts of data on disk and robotic tape libraries. It provides hierarchical storage management services that allow it to migrate data between those storage platforms. -HPSS is currently configured with a disk and tape tier. The disk tier has a capacity of 1.2PB on a DataDirect Networks SFA12K-40 storage array. By default, all archived data is initially written to the disk tier. 
The tape tier consists of 3 SpectraLogic T950 robotic tape libraries containing a total of 72 LTO6 tape drives with total uncompressed capacity 64 PB. Archived data is migrated to the tape tier at regular intervals, then deleted from the disk tier to create space for future archives. +HPSS is currently configured with a disk and tape tier. The disk tier has a capacity of 1.2PB on a DataDirect Networks SFA12K-40 storage array. By default, all archived data is initially written to the disk tier. The tape tier consists of 3 SpectraLogic T950 robotic tape libraries containing a total of 72 LTO6 tape drives with a total uncompressed capacity of 64 PB. Archived data is migrated to the tape tier at regular intervals, then deleted from the disk tier to create space for future archives. + +Access to HPSS is provided by various client components. Currently, ALCF supports access through two command-line clients: HSI and HTAR. These are installed on the login nodes of Theta, Cooley, and Polaris. In order for the client to authenticate with HPSS, the user must have a keytab file that should be located in their home directory under the subdirectory `.hpss`. The file name will be in the format `.ktb_`. -Access to HPSS is provided by various client components. Currently, ALCF supports access through two command-line clients: HSI and HTAR. These are installed on the login nodes of Theta, Cooley, and Polaris. In order for the client to authenticate with HPSS, the user must have a keytab file that should be located in their home directory under subdirectory `.hpss`. The file name will be in the format `.ktb_`. - ## HSI General Usage -HSI can be invoked by simply entering hsi at your normal shell prompt. Once authenticated, you will enter the hsi command shell environment: -``` +HSI can be invoked by simply entering `hsi` at your normal shell prompt. 
Once authenticated, you will enter the HSI command shell environment: +```bash > hsi [HSI]/home/username-> ``` @@ -18,53 +18,52 @@ HSI can be invoked by simply entering hsi at your normal shell prompt. Once auth You may enter "help" to display a brief description of available commands. Example archive: -``` +```bash [HSI]/home/username-> put mydatafile # same name on HPSS [HSI]/home/username-> put local.file : hpss.file # different name on HPSS ``` Example retrieval: -``` +```bash [HSI]/home/username-> get mydatafile [HSI]/home/username-> get local.file : hpss.file ``` -Most of the usual shell commands will work as expected in the HSI command environment. - +Most of the usual shell commands will work as expected in the HSI command environment. + For example, checking what files are archived: -``` +```bash [HSI]/home/username-> ls -l ``` And organizing your archived files: -``` +```bash [HSI]/home/username-> mkdir dataset1 [HSI]/home/username-> mv hpss.file dataset1 [HSI]/home/username-> ls dataset1 [HSI]/home/username-> rm dataset1/hpss.file ``` -It may be necessary to use single or double quotes around metacharacters to avoid having the shell prematurely expand them. - +It may be necessary to use single or double quotes around metacharacters to avoid having the shell prematurely expand them. + For example: -``` +```bash [HSI]/home/username-> get *.c - +``` will not work, but - +```bash [HSI]/home/username-> get "*.c" - -will retrieve all files ending in .c. ``` +will retrieve all files ending in .c. -Following normal shell conventions, other special characters in filenames such as whitespace and semicolon also need to be escaped with "\" (backslash). For example: - - [HSI]/home/username-> get "data\ file\ \;\ version\ 1" - +Following normal shell conventions, other special characters in filenames such as whitespace and semicolon also need to be escaped with "\\" (backslash). 
For example: +```bash +[HSI]/home/username-> get "data\ file\ \;\ version\ 1" +``` retrieves the file named "data file ; version 1". HSI can also be run as a command line or embedded in a script as follows: -``` +```bash hsi -O log.file "put local.file" ``` @@ -72,21 +71,21 @@ hsi -O log.file "put local.file" HTAR is a tar-like utility that creates tar-format archive files directly in HPSS. It can be run as a command line or embedded in a script. Example archive: -``` +```bash htar -cf hpssfile.tar localfile1 localfile2 localfile3 ``` Example retrieval: -``` +```bash htar -xf hpssfile.tar localfile2 ``` **Note:** -- On Theta you must first load the HSI module to make HSI and HTAR available. "module load hsi" -- The current version of HTAR has a 64GB file size limit as well as a path length limit. The recommended client is HSI +- On Theta, you must first load the HSI module to make HSI and HTAR available. Use `module load hsi`. +- The current version of HTAR has a 64GB file size limit as well as a path length limit. The recommended client is HSI. ### Globus -In addition, HPSS is accessible through the Globus endpoint alcf#dtn_hpss. As with HSI and HTAR, you must have a keytab file before using this endpoint. For more information on using Globus, please see Using Globus. +In addition, HPSS is accessible through the Globus endpoint `alcf#dtn_hpss`. As with HSI and HTAR, you must have a keytab file before using this endpoint. For more information on using Globus, please see [Using Globus](#). ## Common Problems ### Keytab File Missing @@ -95,8 +94,5 @@ If you see an error like this: *** HSI: (KEYTAB auth method) - keytab file missing or inaccessible: / home/username/.hpss/.ktb_username Error - authentication/initialization failed - ``` - it means that your account is not enabled to use the HPSS yet. [Please contact support](mailto:support@alcf.anl.gov) to have it set up. - - - +``` +it means that your account is not enabled to use the HPSS yet. 
[Please contact support](mailto:support@alcf.anl.gov) to have it set up.
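
The keytab check can also be scripted so that a missing file is caught before `hsi` is invoked. The following is a minimal sketch, assuming the keytab follows the path pattern shown in the error message (`.hpss/.ktb_` followed by the username, under the home directory). The function name `check_hpss_keytab` and its two optional arguments are hypothetical and are not part of HSI or HTAR.

```bash
#!/usr/bin/env bash
# Hypothetical helper: report whether the HPSS keytab is present and readable.
# Assumption: the keytab path is <home directory>/.hpss/.ktb_<username>,
# as suggested by the HSI error message.
check_hpss_keytab() {
    # Optional arguments allow checking a different home directory or username.
    local home_dir="${1:-$HOME}"
    local user_name="${2:-${USER:-$(id -un)}}"
    local keytab="${home_dir}/.hpss/.ktb_${user_name}"

    if [ -r "$keytab" ]; then
        echo "keytab found: $keytab"
        return 0
    fi

    # Diagnostics go to stderr so callers can still capture stdout cleanly.
    echo "keytab missing or unreadable: $keytab" >&2
    return 1
}
```

A script could then guard an archive step with `check_hpss_keytab && hsi -O log.file "put local.file"`. A readable keytab does not guarantee that authentication will succeed; it only rules out the missing-file case described above.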