WIP: Installing Pelican/OSDF Cache via RPM #203

Closed
Changes from 16 commits
Commits
20 commits
3964384
Let's Encrypt CA cert docs: use different paths for Pelican/OSDF
matyasselmeci Aug 16, 2024
5474136
WIP: Instructions for installing a Pelican/OSDF Origin via RPM (SOFT…
matyasselmeci Aug 16, 2024
cb627dd
Update hardware requirements for caches; add a note about the cache n…
matyasselmeci Oct 11, 2024
8bd2814
Merge branch 'pr/pelican-osdf-origin.SOFTWARE-5962' into wip/pelican-…
matyasselmeci Oct 11, 2024
e61a450
WIP: data/osdf/install-cache-rpm.md
matyasselmeci Oct 11, 2024
2015290
Let's Encrypt: mention that Pelican needs the full cert chain
matyasselmeci Oct 11, 2024
9d278e6
Merge branch 'pr/pelican-osdf-origin.SOFTWARE-5962' into wip/pelican-…
matyasselmeci Oct 11, 2024
ee75e30
Mention that you need the cert change in install-origin-rpm as well
matyasselmeci Oct 11, 2024
8d97237
Merge branch 'pr/pelican-osdf-origin.SOFTWARE-5962' into wip/pelican-…
matyasselmeci Oct 11, 2024
6bf93ef
WIP: Installing a Pelican/OSDF cache by RPM
matyasselmeci Oct 11, 2024
0026696
Merge branch 'master' into wip/pelican-osdf-cache.SOFTWARE-6013
matyasselmeci Nov 5, 2024
87487d2
Update hardware requirements; add install instructions for OSG 23 and…
matyasselmeci Nov 6, 2024
ab20921
typo
matyasselmeci Nov 6, 2024
a1bd433
Use "Service requirements" instead of "Hardware requirements"
matyasselmeci Nov 6, 2024
e167846
Add a validation test; move the registration step to the end and call…
matyasselmeci Nov 6, 2024
30f790b
fix syntax
matyasselmeci Nov 6, 2024
0e33e2b
Apply suggestions from code review
matyasselmeci Nov 8, 2024
e0ede48
Prefer OSG 24
matyasselmeci Nov 8, 2024
ecefef6
tweaks
matyasselmeci Nov 8, 2024
668b1d4
Tell people to email us with the logs if validation fails
matyasselmeci Nov 8, 2024
171 changes: 171 additions & 0 deletions docs/data/osdf/install-cache-rpm.md
@@ -0,0 +1,171 @@
title: Installing the OSDF Cache by RPM

Installing the OSDF Cache
=========================

This document describes how to install a Pelican-based Open Science Data Federation (OSDF) Cache service via RPMs.
This service allows a site or regional network to cache data frequently used in Open Science Pool jobs,
reducing data transfer over the wide-area network and decreasing access latency.


Before Starting
---------------

Before starting the installation process, consider the following requirements:

* __Operating system:__ RHEL 8, RHEL 9, or a compatible operating system.
* __User IDs:__ If it does not exist already, the installation will create the Linux user named `xrootd` for running daemons.
* __File Systems:__ The cache should have a partition of its own for storing data and metadata.
* __Host certificate:__ Required for authentication. See note below.
* __Network ports:__ The cache service requires the following ports open:
    * Inbound TCP port 8443 for file access via the HTTP(S) and XRoot protocols.
    * (Optional) Inbound TCP port 8444 for access to the web interface for monitoring and configuration;
      if enabled, this should be restricted to the LAN.
* __Service requirements:__
    * A cache serving the OSDF federation as a regional cache should have at least:
        * 8 cores
        * 40 Gbps connectivity
        * 50-200 TB of NVMe disk for the cache partition; you may distribute the disk, e.g., by using an NVMe-backed Ceph pool,
          if you cannot fit that much disk into a single chassis
        * 24 GB of RAM
    * A cache being used to serve data from the OSDF to a single site should have at least:
        * 8 cores
        * 40 Gbps connectivity
        * 2 TB of NVMe disk for the cache partition
        * 24 GB of RAM

    We suggest that several gigabytes of local disk space be available for log files,
    although some logging verbosity can be reduced.
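
The core and memory minimums listed above can be checked before installation with a short script.
The following is only a sketch and not part of the OSG tooling: the helper names `meets_minimum` and `ram_gb` are hypothetical,
and the thresholds are copied from the list above (both cache roles ask for 8 cores and 24 GB of RAM).
It does not check network bandwidth or disk size.

```shell
#!/bin/sh
# Hypothetical pre-flight check; thresholds come from the requirements list above.
MIN_CORES=8
MIN_RAM_GB=24

# meets_minimum ACTUAL REQUIRED: succeeds when ACTUAL >= REQUIRED.
meets_minimum() {
    [ "$1" -ge "$2" ]
}

# Print total memory in whole GiB, as reported by the kernel.
# MemTotal is usually a little lower than the installed RAM, so treat a
# warning on a machine close to the minimum as a prompt to look closer.
ram_gb() {
    awk '/^MemTotal:/ { printf "%d\n", $2 / 1024 / 1024 }' /proc/meminfo
}

cores=$(nproc)
ram=$(ram_gb)

meets_minimum "$cores" "$MIN_CORES" \
    || echo "WARNING: $cores cores found, $MIN_CORES recommended"
meets_minimum "$ram" "$MIN_RAM_GB" \
    || echo "WARNING: ${ram} GB RAM found, $MIN_RAM_GB GB recommended"
```

The script prints nothing when both minimums are met.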

As with all OSG software installations, there are some one-time steps to prepare in advance:

* Obtain root access to the host
* Prepare [the required Yum repositories](../../common/yum.md)


!!! note "OSG 23"
    In OSG 23, the Pelican-based OSDF RPMs are only available in the "osg-upcoming" repositories.

!!! note "Host certificates"
    Caches should use a CA that is accepted by major browsers and operating systems,
    such as InCommon RSA or [Let's Encrypt](../../security/host-certs/lets-encrypt).
    IGTF certs are not recommended because clients are not configured to accept them by default.
    Note that you will need the full certificate chain, not just the certificate.

    The following locations should be used (note that they are in separate directories):

    * **Host Certificate Chain**: `/etc/pki/tls/certs/pelican.crt`
    * **Host Key**: `/etc/pki/tls/private/pelican.key`
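
Because the full chain is required, one quick sanity check is to count the certificates in the chain file.
The sketch below assumes a PEM-encoded file; the helper name `count_pem_certs` is hypothetical.
A count of 1 suggests that only the host certificate is present and the intermediate certificate(s) are missing.

```shell
#!/bin/sh
# count_pem_certs FILE: print the number of PEM certificates in FILE.
count_pem_certs() {
    grep -c -- '-----BEGIN CERTIFICATE-----' "$1"
}

CERT=/etc/pki/tls/certs/pelican.crt

if [ -r "$CERT" ]; then
    # grep exits non-zero when it finds no match, so ignore its status
    # and rely on the printed count instead.
    n=$(count_pem_certs "$CERT" || true)
    if [ "$n" -lt 2 ]; then
        echo "WARNING: $CERT contains $n certificate(s); expected a full chain"
    fi
fi
```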


Review comment (Contributor):

Since we've got a lot (?) of non-Pelican caches out there, it's probably worth having a separate, leading "upgrade section" here that outlines the overview of how to perform an upgrade. Maybe it's something as simple as "follow all the instructions here, turn off the old cache services, then register the new cache"

Or maybe we can't have two cache services running concurrently...

Installing the Cache
--------------------

The cache service is provided by the `osdf-cache` RPM.
Install it using one of the following commands:

OSG 23:
```console
root@host # yum install --enablerepo=osg-upcoming osdf-cache
```

OSG 24:
```console
root@host # yum install osdf-cache
```


Configuring the Cache Server
----------------------------

!!! note "osdf-cache 7.11.1"
    This configuration requires version 7.11.1 or newer of the `osdf-cache`
    and `pelican` RPMs.

Configuration for a Pelican-based OSDF Cache is located in files in `/etc/pelican/config.d`.

You must set the following config options:

In `/etc/pelican/config.d/20-cache.yaml`, set `Cache.LocalRoot`, `Cache.DataLocation` and `Cache.MetaLocation` as follows,
replacing `<CACHE PARTITION>` with the mount point of the partition you will use for the cache.
```
Cache:
  LocalRoot: "<CACHE PARTITION>/namespaces"
  DataLocation: "<CACHE PARTITION>/data"
  MetaLocation: "<CACHE PARTITION>/meta"
```
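
As a worked example of the substitution, the sketch below prints the `Cache` block for a given mount point.
The function name `render_cache_config` and the mount point `/mnt/osdf-cache` are hypothetical and used only for illustration.

```shell
#!/bin/sh
# render_cache_config MOUNTPOINT: print the Cache settings for that partition.
render_cache_config() {
    cat <<EOF
Cache:
  LocalRoot: "$1/namespaces"
  DataLocation: "$1/data"
  MetaLocation: "$1/meta"
EOF
}

# /mnt/osdf-cache is a placeholder; use the mount point of your cache partition.
PARTITION=/mnt/osdf-cache

# The cache should have a partition of its own, so warn if the path
# is not actually a mount point.
mountpoint -q "$PARTITION" \
    || echo "WARNING: $PARTITION is not a mount point" >&2

render_cache_config "$PARTITION"
```

Review the output before copying it into `/etc/pelican/config.d/20-cache.yaml`, so that any other settings already in that file are not lost.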


Preparing for Initial Startup
-----------------------------

1. The cache identifies itself to the federation via public key authentication;
    before starting the cache for the first time, it is recommended to generate a keypair.

        :::console
        root@host$ cd /etc/pelican
        root@host$ pelican generate keygen

    The newly created files, `issuer.jwk` and `issuer-pub.jwks`, are the private and public keys, respectively.
    **Save these files**; if you lose them, your cache will need to be re-approved.
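
One way to save the key files is to copy them to a root-only backup directory.
This is a minimal sketch, not an OSG-provided tool: the helper `backup_keys` and the destination `/root/pelican-key-backup` are hypothetical,
and a copy kept on the same host does not protect against losing the host, so consider storing a second copy elsewhere.

```shell
#!/bin/sh
# backup_keys SRC_DIR DEST_DIR: copy the keypair with restrictive permissions.
backup_keys() {
    src=$1
    dest=$2
    mkdir -p "$dest" || return 1
    chmod 700 "$dest" || return 1
    # The private key must be readable by its owner only.
    install -m 600 "$src/issuer.jwk" "$dest/issuer.jwk" || return 1
    install -m 644 "$src/issuer-pub.jwks" "$dest/issuer-pub.jwks"
}

# Only attempt the backup once the keys have been generated.
if [ -f /etc/pelican/issuer.jwk ] && [ -f /etc/pelican/issuer-pub.jwks ]; then
    backup_keys /etc/pelican /root/pelican-key-backup
fi
```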


Validating the Cache Installation
---------------------------------

Do the following steps to verify that the cache is functional:

1. Start the cache using the following command:

        :::console
        root@host$ systemctl start osdf-cache

1. Download a test file from the OSDF through your cache (replacing `CACHE_HOSTNAME` with the host name of your cache):

        :::console
        root@host$ osdf object get -c CACHE_HOSTNAME:8443 /ospool/uc-shared/public/OSG-Staff/validation/test.txt /tmp/test.txt
        root@host$ cat /tmp/test.txt

        Hello, World!
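
The comparison in the last step can be scripted, and the service logs collected if it fails.
The sketch below assumes the test file contains the line `Hello, World!` as shown above;
the helper `validate_download` and the log path `/tmp/osdf-cache.log` are hypothetical.
If validation fails, send the collected log along with your request for help.

```shell
#!/bin/sh
EXPECTED='Hello, World!'

# validate_download FILE: succeeds when FILE contains the expected line.
validate_download() {
    [ -f "$1" ] && grep -qxF "$EXPECTED" "$1"
}

if validate_download /tmp/test.txt; then
    echo "cache validation OK"
else
    echo "cache validation FAILED" >&2
    # Collect the last hour of service logs from the systemd journal.
    journalctl -u osdf-cache --since "1 hour ago" > /tmp/osdf-cache.log 2>&1 || true
    echo "service logs saved to /tmp/osdf-cache.log" >&2
fi
```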

Review comment (Contributor):

Let's add a step or note here asking them to send logs to us if this doesn't work


Joining the Cache to the Federation
-----------------------------------

The cache must be registered with the OSG prior to joining the data federation.
Send mail to <[email protected]> requesting registration; provide the following information:

* Cache hostname
* Administrative and security contact(s)
* Institution that the cache belongs to

OSG Staff will register the cache and respond with the Resource Name that the cache was registered as.

Once you have that information, edit `/etc/pelican/config.d/15-osdf.yaml`, and set `XRootD.Sitename`:
```
XRootD:
  Sitename: <RESOURCE NAME REGISTERED WITH OSG>
```

Then, restart the cache by running

```console
root@host$ systemctl restart osdf-cache
```


Managing the Cache Service
---------------------------
Use the following systemd commands as root to start, stop, enable, and disable the OSDF Cache.

| To... | Run the command... |
| :--------------------------------------- | :--------------------------------- |
| Start the cache | `systemctl start osdf-cache` |
| Stop the cache | `systemctl stop osdf-cache` |
| Enable the cache to start on boot | `systemctl enable osdf-cache` |
| Disable the cache from starting on boot | `systemctl disable osdf-cache` |


Getting Help
------------
To get assistance, please use [this page](../../common/help.md).
172 changes: 172 additions & 0 deletions docs/data/osdf/install-origin-rpm.md
@@ -0,0 +1,172 @@
title: Installing the OSDF Origin by RPM

Installing the OSDF Origin
==========================

This document describes how to install a Pelican-based Open Science Data Federation (OSDF) Origin service via RPMs.
This service allows an organization to export its data to the Data Federation.

!!! note
    The origin must be registered with the OSG prior to joining the data federation.
    You may start the registration process prior to finishing the installation by [using this link](#registering-the-origin)
    along with information like:

    * Resource name and hostname
    * VO associated with this origin server (which will be used to determine the origin's namespace prefix)
    * Administrative and security contact(s)
    * Who (or what) will be allowed to access the VO's data
    * Which caches will be allowed to cache the VO data


Before Starting
---------------

Before starting the installation process, consider the following requirements:

* __Operating system:__ RHEL 8, RHEL 9, or a compatible operating system.
* __User IDs:__ If it does not exist already, the installation will create the Linux user ID `xrootd` for running daemons.
* __Host certificate:__ Required for authentication. See note below.
* __Network ports:__ The origin service requires the following ports open:
    * Inbound TCP port 8443 for file access via the HTTP(S) and XRoot protocols.
    * (Optional) Inbound TCP port 8444 for access to the web interface for monitoring and configuration;
      if enabled, this should be restricted to the LAN or management network.
* __Hardware requirements:__ We recommend that an origin has at least 1Gbps connectivity and 12GB of RAM.
  We suggest that several gigabytes of local disk space be available for log files,
  although some logging verbosity can be reduced.

As with all OSG software installations, there are some one-time steps to prepare in advance:

* Obtain root access to the host
* Prepare [the required Yum repositories](../../common/yum.md)


!!! note "OSG 23"
    In OSG 23, the Pelican-based OSDF RPMs are only available in the "osg-upcoming" repositories.

!!! note "Host certificates"
    Origins should use a CA that is accepted by major browsers and operating systems,
    such as InCommon RSA or [Let's Encrypt](../../security/host-certs/lets-encrypt).
    IGTF certs are not recommended because clients are not configured to accept them by default.
    Note that you will need the full certificate chain, not just the certificate.

    The following locations should be used (note that they are in separate directories):

    * **Host Certificate Chain**: `/etc/pki/tls/certs/pelican.crt`
    * **Host Key**: `/etc/pki/tls/private/pelican.key`


Installing the Origin
---------------------

The origin service is provided by the `osdf-origin` RPM.
Install it using the following command:


```console
root@host # yum install --enablerepo=osg-upcoming osdf-origin
```


Configuring the Origin Server
-----------------------------

Configuration for a Pelican-based OSDF Origin is located in `/etc/pelican/osdf-origin.yaml`.

You must configure the following:
```
XRootD:
  Sitename: <RESOURCE NAME REGISTERED WITH OSG>
```

In addition, you must tell Pelican which data to export to the federation.
An origin may export one or more directory trees, or one or more S3 buckets -- follow one of the sections below.
(A single origin cannot export both a bucket and a directory tree.)



### Configuring POSIX (directory) export

Set these options to export one or more directory trees to the federation.

```
Origin:
  StorageType: "posix"
  Exports:
    # You may have one or more of the following block:
    - FederationPrefix: "<EXTERNAL OSDF NAMESPACE>"
      StoragePrefix: "<LOCAL FILESYSTEM DIRECTORY>"
      Capabilities:    # Add or remove as desired
        - Reads        # Enable authenticated reading of objects from under the directory tree through a cache
        - PublicReads  # Enable unauthenticated reading of objects from under the directory tree through a cache
        - DirectReads  # Enable reading objects from under the directory tree
                       # without going through a cache
        - Listings     # Enable directory listings
        - Writes       # Enable writing to files in the directory tree
```
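
To make the placeholders concrete, the sketch below prints a filled-in export block for one directory tree.
Everything specific in it is hypothetical: the function `render_posix_export`, the namespace `/example-vo/public`,
the directory `/srv/osdf/example-vo`, and the choice of `PublicReads` plus `Listings` as capabilities.
It also assumes that a federation prefix begins with `/`; use the namespace agreed with OSG Staff for a real origin.

```shell
#!/bin/sh
# render_posix_export FEDERATION_PREFIX STORAGE_DIR: print a single-export config.
render_posix_export() {
    # Assumption: namespace prefixes are absolute, i.e. start with "/".
    case "$1" in
        /*) ;;
        *) echo "federation prefix must start with /: $1" >&2; return 1 ;;
    esac
    cat <<EOF
Origin:
  StorageType: "posix"
  Exports:
    - FederationPrefix: "$1"
      StoragePrefix: "$2"
      Capabilities:
        - PublicReads
        - Listings
EOF
}

# Placeholder values for illustration only.
render_posix_export /example-vo/public /srv/osdf/example-vo
```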

### Configuring S3 export

Set these options to export one or more S3 buckets to the federation.

```
Origin:
  StorageType: "s3"
  S3Region: "<S3 REGION IF APPLICABLE>"
  S3ServiceUrl: "<URL OF S3 SERVER>"
  S3UrlStyle: "path"
  Exports:
    # You may have one or more of the following block:
    - FederationPrefix: "<EXTERNAL OSDF NAMESPACE>"
      S3Bucket: "<NAME OF S3 BUCKET>"
      S3AccessKeyfile: "<PATH TO S3 BUCKET ACCESS KEY>"
      S3SecretKeyfile: "<PATH TO S3 BUCKET SECRET KEY>"
      Capabilities:    # Add or remove as desired
        - Reads        # Enable authenticated reading of objects from the bucket through a cache
        - PublicReads  # Enable unauthenticated reading of objects from the bucket through a cache
        - DirectReads  # Enable reading objects from the bucket
                       # without going through a cache
        - Listings     # Enable listing bucket items
        - Writes       # Enable writing to objects in the bucket
```
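
The access and secret key files hold credentials, so it is prudent to confirm that they are not readable by other users.
The source does not state a required file mode; the sketch below assumes the general practice of granting no group or other permissions.
The helper `is_private` and the two key paths are hypothetical, and the files must still be readable by the account that runs the origin.

```shell
#!/bin/sh
# is_private FILE: succeeds when FILE grants no permissions to group or others.
is_private() {
    mode=$(stat -c '%a' "$1") || return 1
    case "$mode" in
        *00) return 0 ;;
        *) return 1 ;;
    esac
}

# Placeholder paths; substitute the values of S3AccessKeyfile and S3SecretKeyfile.
for keyfile in /etc/pelican/s3-access.key /etc/pelican/s3-secret.key; do
    if [ -f "$keyfile" ] && ! is_private "$keyfile"; then
        echo "WARNING: $keyfile is accessible by group or others" >&2
    fi
done
```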


Preparing for Initial Startup
-----------------------------

1. The origin identifies itself to the federation via public key authentication;
    before starting the origin for the first time, it is recommended to generate a keypair.

        :::console
        root@host$ cd /etc/pelican
        root@host$ pelican generate keygen

    The newly created files, `issuer.jwk` and `issuer-pub.jwks`, are the private and public keys, respectively.
    **Save these files**; if you lose them, you will have to re-register the origin.

1. Contact OSG Staff and let them know that you are about to start your origin,
    and what namespace(s) the origin will serve.
    OSG Staff will need to approve the origin's registration.


Managing the Origin Service
---------------------------
Use the following systemd commands as root to start, stop, enable, and disable the OSDF Origin.

| To... | Run the command... |
| :--------------------------------------- | :--------------------------------- |
| Start the origin | `systemctl start osdf-origin` |
| Stop the origin | `systemctl stop osdf-origin` |
| Enable the origin to start on boot | `systemctl enable osdf-origin` |
| Disable the origin from starting on boot | `systemctl disable osdf-origin` |


Registering the Origin
----------------------
To be part of the Open Science Data Federation, your origin must be
[registered with the OSG](../../common/registration.md). The service type is `Pelican origin`.


Getting Help
------------
To get assistance, please use [this page](../../common/help.md).
16 changes: 14 additions & 2 deletions docs/data/stashcache/install-cache.md
@@ -27,6 +27,7 @@ Before starting the installation process, consider the following requirements:
* __Operating system:__ Ensure the host has [a supported operating system](../../release/supported_platforms.md)
* __User IDs:__ If they do not exist already, the installation will create the Linux user IDs `condor` and
`xrootd`
* __File Systems:__ The cache should have a partition of its own for storing data and metadata.
* __Host certificate:__ Required for authentication.
See our [host certificate documentation](../../security/host-certs.md) for instructions on how to request and install host certificates.
* __Network ports:__ Your host may run a public cache instance (for serving public data only), an authenticated cache instance (for serving protected data), or both.
@@ -38,8 +39,19 @@ Before starting the installation process, consider the following requirements:
* An authenticated cache instance requires the following ports open:
* Inbound TCP port 8443 for authenticated file access via HTTPS
* Outbound UDP port 9930 for reporting to `xrd-report.osgstorage.org` and `xrd-mon.osgstorage.org` for monitoring
* __Hardware requirements:__ We recommend that a cache has at least 10Gbps connectivity, 1TB of
disk space for the cache directory, and 12GB of RAM.
* __Hardware requirements:__
* A cache serving the OSDF federation as a regional cache should have at least:
* 8 cores
* 40 Gbps connectivity
* 50-200 TB of NVMe disk for the cache partition; you may distribute the disk, e.g., by using an NVMe-backed Ceph pool,
if you cannot fit that much disk into a single chassis
* 24 GB of RAM
* A cache being used to serve data from the OSDF to a single site should have at least:
* 8 cores
* 40 Gbps connectivity
* 2 TB of NVMe disk for the cache partition
* 24 GB of RAM


As with all OSG software installations, there are some one-time steps to prepare in advance:
