Create Ansible Playbook for Deploying to macOS #67

Merged
merged 23 commits into from
Feb 5, 2025
Commits
508a9cf
Create initial playbook for mac agents
cailafinn Feb 7, 2023
df064a5
Add playbook
cailafinn Feb 8, 2023
b705c79
Fix all commands being run as root
cailafinn Feb 8, 2023
b77badf
Fix homebrew and git installation
cailafinn Feb 8, 2023
463a0a8
Add privileges setters for new managed macs
cailafinn Feb 8, 2023
e844aec
Add a test run of the slave script to the tasks
cailafinn Feb 8, 2023
f92d3b6
Add documentation for ansible mac deployment
cailafinn Feb 8, 2023
ccbb1de
Remove mistakenly added vscode settings
cailafinn Feb 8, 2023
bd6089f
Fix docs formatting
cailafinn Feb 8, 2023
91f260c
Fix other file's formatting
cailafinn Feb 8, 2023
c68915a
Fix typo in docs
cailafinn Feb 10, 2023
3bbc583
Move and rename script to match other OSs
cailafinn May 15, 2023
fc4fd35
Symlink java11 to make sure it's in the path
cailafinn Jun 6, 2023
5009660
Add extra manual setup steps to docs
cailafinn Jun 6, 2023
57ed93b
Download and place the mac sdk for building with conda
cailafinn Jun 7, 2023
a81bf45
Refactor installation into different files
cailafinn Jun 7, 2023
8281738
Add SSH key step to manual instructions
cailafinn Jul 17, 2023
db75b56
Amend path in docs
cailafinn Aug 22, 2023
412d03d
Reboot after power failure
cailafinn Sep 5, 2024
c1872c4
Add steps for accessing via VNC on new macOS
cailafinn Jan 23, 2025
592ed99
Force updates to requirements
cailafinn Jan 23, 2025
9647325
Use jenkins agent instead of old script
cailafinn Jan 28, 2025
36885e9
Install the create-dmg package
cailafinn Jan 29, 2025
105 changes: 105 additions & 0 deletions macOS/jenkins-node/README.md
@@ -0,0 +1,105 @@
# Jenkins macOS nodes for Mantid

This describes how to deploy a macOS build node. Such a node is able to perform any of the macOS jobs.

## Prerequisites

- Access to the Keeper password manager and the `ISIS Jenkins Nodes` file.
- Access to the [`mantidproject/ansible-linode`](https://github.com/mantidproject/ansible-linode) and [`dockerfiles`](https://github.com/mantidproject/dockerfiles) repositories.
- If the machine has already been set up, you will need your SSH key added to the list so you can connect remotely.


## Manual Setup

There are a few steps that must be taken manually on a brand new machine before Ansible can take over.

- Log in to the provided administrator account.
- Set up a `mantidbuilder` user on the new machine:

  - Open the `System Preferences -> Users & Groups` menu.
  - Press the `+` button below the list of users and add a new administrator account. Use `mantidbuilder` for both the name fields and provide a strong password.

- Enable remote access:

  - Open `System Preferences -> Sharing`. On some macOS versions this is under `System Preferences -> General -> Sharing`.
  - Enable `Remote Login` for all users and allow full disk access.
  - Make a note of the `ssh` login instructions, especially the hostname after the `@`.
  - Store the chosen password and the hostname in the `ISIS Jenkins Nodes` file in Keeper.
  - Enable `Remote Management` for all users.
  - Click the `i` button and enable `VNC viewers may control screen with password`.

- Set security settings to allow for builds and consistent access:

  - Open `System Preferences -> Security & Privacy`.
  - In `General`, untick the `Require password [...] after sleep or screensaver begins` checkbox.
  - In `FileVault` press the button to `Turn Off FileVault`.
    - FileVault encrypts the contents of the disk until the first login. This means that the `ssh` service is not started until someone logs in on the physical machine, which makes the machine hard to reach remotely after a reboot.

- Install Xcode Command Line Tools:

  - Launch a terminal.
  - Run `xcode-select --install`.
  - Wait for the popup to appear and click `Install`.

- Back on the machine you will be deploying from, add your SSH key to the new Mac:

  - `ssh-copy-id mantidbuilder@<HOST>`

## Jenkins Controller Node Creation

- Provision a new node in [Jenkins](https://builds.mantidproject.org/computer) with the following changes:
  - Set *Remote root directory* to `/jenkins_workdir`
  - Set environment variables:
    - `BUILD_THREADS` => set based on system, e.g. number of cores
    - `MANTID_DATA_STORE` => `/mantid_data`
- Once created, make a note of the node's name and secret (the long string of letters and numbers)
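
As a rough illustration of picking a `BUILD_THREADS` value from the number of cores, the sketch below prints the logical core count of the machine it runs on. The function name `detect_build_threads` is hypothetical and not part of this repository; it assumes `sysctl -n hw.ncpu` is available on macOS and falls back to `getconf` elsewhere.

```sh
# Hypothetical helper (not part of this repo): print the number of logical
# cores as a candidate value for BUILD_THREADS.
detect_build_threads() {
    # macOS exposes the logical core count through sysctl.
    cores=$(sysctl -n hw.ncpu 2>/dev/null)
    # If that produced nothing usable, try the getconf fallback.
    case "$cores" in
        ''|*[!0-9]*) cores=$(getconf _NPROCESSORS_ONLN 2>/dev/null) ;;
    esac
    # If neither worked, fall back to a single thread.
    case "$cores" in
        ''|*[!0-9]*) cores=1 ;;
    esac
    printf '%s\n' "$cores"
}
```

Running `detect_build_threads` on the new Mac gives a starting point; you may prefer a lower value if the machine runs other jobs alongside builds.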

## Deploying to the Agent

**We're calling nodes _agents_ from here on out. There's some nuance, but they're mostly interchangeable terms.**

The Ansible playbook sets up the machine and connects it to the Jenkins controller, ready for running builds and other jobs.

### Getting the Right Environment

1. If you already have the `ansible-linode` repo and associated conda environment, activate it and skip to step 4.
2. Clone the [`mantidproject/ansible-linode`](https://github.com/mantidproject/ansible-linode) repo.
3. Navigate to the base of the cloned repo and run:

   - `mamba create --prefix ./condaenv ansible`
   - `mamba activate ./condaenv`
   - Note: You can activate the environment from anywhere by providing the full path to the `condaenv` directory.

4. Clone the [`dockerfiles`](https://github.com/mantidproject/dockerfiles) repo and navigate to `macOS/jenkins-node/ansible`.
5. Install or update the required collections from Ansible Galaxy by running:
   - `ansible-galaxy install -r requirements.yml --force`
6. Time to use that secret you made a note of. Create an `inventory.txt` file with the details of the machines to deploy to (one per line):

```ini
[all]
<IP_ADDRESS_OR_HOSTNAME_1> agent_name=<NAME_OF_AGENT_ON_JENKINS_1> agent_secret=<SECRET_DISPLAYED_ON_CONNECTION_SCREEN_1>
<IP_ADDRESS_OR_HOSTNAME_2> agent_name=<NAME_OF_AGENT_ON_JENKINS_2> agent_secret=<SECRET_DISPLAYED_ON_CONNECTION_SCREEN_2>
```

If you've forgotten the secret, it can be found under `Environment Variables` in the `System Information` section of the agent.
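
The inventory format above can also be generated rather than typed by hand. This is a minimal sketch, assuming one host per call; `inventory_line` is a hypothetical helper name, and the host, agent name, and secret in the usage comment are placeholders rather than real values.

```sh
# Hypothetical helper (not part of this repo): print one inventory entry in
# the "<host> agent_name=<name> agent_secret=<secret>" form used above.
inventory_line() {
    # $1 = IP address or hostname, $2 = agent name on Jenkins, $3 = agent secret
    printf '%s agent_name=%s agent_secret=%s\n' "$1" "$2" "$3"
}

# Usage with placeholder values:
# { echo '[all]'; inventory_line mac-01.example.org mac-01 0123abcd; } > inventory.txt
```

Because the file contains agent secrets, it is worth keeping `inventory.txt` out of version control.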

### Running the Script to Deploy the Agent

1. If you have not already done so, add your SSH key to the host by running `ssh-copy-id mantidbuilder@<HOSTNAME>` in a terminal.
2. Run the playbook to deploy to all the machines defined in your `inventory.txt` file:

```sh
ansible-playbook -i inventory.txt jenkins-agent.yml -u mantidbuilder -K
```

3. When prompted, enter the password for the agent's `mantidbuilder` account that you set earlier. If you weren't the one who set the password, it should be in the `ISIS Jenkins Nodes` file on Keeper.
4. Wait for the play to complete and visit `https://builds.mantidproject.org/computer/NAME_OF_AGENT_ON_JENKINS`. The agent should be connected within five minutes.

- Note: The agent is kept connected to the controller by a crontab entry that runs on every 5th minute. This means that on first setup the agent may not connect until a minute divisible by five has passed.
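
To estimate how long the wait described in the note might be, the sketch below computes the number of seconds until the next minute divisible by five. The helper name `seconds_until_next_cron_run` is hypothetical, and the arithmetic assumes the machine's time-zone offset is a whole multiple of five minutes, so that Unix time and local wall-clock time share the same five-minute boundaries.

```sh
# Hypothetical helper (not part of this repo): seconds from a Unix timestamp
# until the next minute divisible by five, i.e. the next "*/5" cron firing.
seconds_until_next_cron_run() {
    # $1 = Unix timestamp in seconds; defaults to the current time.
    now=${1:-$(date +%s)}
    # Five minutes is 300 seconds; exactly on a boundary, the next run is a
    # full 300 seconds away.
    echo $(( 300 - now % 300 ))
}
```

Calling `seconds_until_next_cron_run` with no argument uses the current time, so the result is always between 1 and 300.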


## Troubleshooting

- You may need to log in manually or by using VNC at least once to allow the Ansible playbook to run. This can be due to FileVault blocking SSH connections until the machine is unlocked.
- To use VNC from a Mac: open Finder and press `Cmd+K`, then enter `vnc://<HOSTNAME>`. Use the `mantidbuilder` login for the machine.
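
If SSH only starts working after someone logs in at the machine, it may be worth confirming whether FileVault is still enabled. The sketch below is an assumption-laden illustration: `filevault_is_on` is a hypothetical helper, and it assumes the macOS `fdesetup status` command reports a line containing `FileVault is On` when encryption is enabled.

```sh
# Hypothetical helper (not part of this repo): succeed if the given
# `fdesetup status` output says FileVault is enabled.
filevault_is_on() {
    # $1 = text printed by `fdesetup status` (output format is an assumption)
    case "$1" in
        *"FileVault is On"*) return 0 ;;
        *) return 1 ;;
    esac
}

# Usage on the Mac itself:
# if filevault_is_on "$(fdesetup status)"; then
#     echo "FileVault is still on; SSH may be unavailable until first login."
# fi
```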
7 changes: 7 additions & 0 deletions macOS/jenkins-node/ansible/jenkins-agent.yml
@@ -0,0 +1,7 @@
- name: Deploy macOS Jenkins agent for Mantid.
  hosts: all

  roles:
    - role: agent
      tags: "agent"

3 changes: 3 additions & 0 deletions macOS/jenkins-node/ansible/requirements.yml
@@ -0,0 +1,3 @@
---
collections:
  - name: geerlingguy.mac
40 changes: 40 additions & 0 deletions macOS/jenkins-node/ansible/roles/agent/tasks/check-connection.sh
@@ -0,0 +1,40 @@
#!/bin/bash
# bash rather than sh: the [[ ... ]] tests below are not POSIX sh.

AGENT_NAME=${1}
AGENT_SECRET=${2}


echo "Check if the secret or name has changed and kill the current java process if it has. "

cron_entry=$(crontab -l | grep jenkins-agent.sh)
cron_name_and_secret=$(echo "$cron_entry" | grep -o "$AGENT_NAME .*")

if [[ "$AGENT_NAME $AGENT_SECRET" != "$cron_name_and_secret" ]]; then
    pgrep java | xargs kill -9
fi


echo "Run the agent startup script in the background. "

"$HOME/jenkins-agent.sh" "$AGENT_NAME" "$AGENT_SECRET" &


echo "Wait for the script to get to its hang point. "

sleep 5


echo "Check that the script has connected the agent to the controller. "

# Quote the URL and pattern so an unusual agent name cannot split or glob.
jenkins_json=$(curl "https://builds.mantidproject.org/manage/computer/$AGENT_NAME/api/json")
is_offline=$(echo "$jenkins_json" | grep '"icon":"symbol-computer-offline"')

if [[ -n "$is_offline" ]]; then
    echo "Agent failed to connect to Jenkins controller. "
    exit 1
fi


echo "Agent connected successfully. "

exit 0
17 changes: 17 additions & 0 deletions macOS/jenkins-node/ansible/roles/agent/tasks/java11.yml
@@ -0,0 +1,17 @@
# Set up Java 11 Installation.

- name: Install Java 11.
  community.general.homebrew:
    name: java11
    state: present

- name: Symlink Java 11.
  shell: ln -sfn /opt/homebrew/opt/openjdk@11/libexec/openjdk.jdk /Library/Java/JavaVirtualMachines/openjdk-11.jdk
  become: true
  become_user: root

- name: Ensure that the java install has been added to the path.
  ansible.builtin.lineinfile:
    path: ~/.zshrc
    line: export PATH="/opt/homebrew/opt/openjdk@11/bin:$PATH"
    create: true
34 changes: 34 additions & 0 deletions macOS/jenkins-node/ansible/roles/agent/tasks/mac-sdk.yml
@@ -0,0 +1,34 @@
# Download and set up the Mac OSX SDK ready for Conda to use.

- name: Install gnu-tar.
  community.general.homebrew:
    name: gnu-tar
    state: present

- name: Ensure that gnu-tar has been added to the path
  ansible.builtin.lineinfile:
    path: ~/.zshenv
    line: export PATH="/opt/homebrew/opt/gnu-tar/libexec/gnubin:$PATH"
    create: true

- name: Download the Mac SDK.
  ansible.builtin.get_url:
    url: https://github.com/phracker/MacOSX-SDKs/releases/download/11.3/MacOSX10.10.sdk.tar.xz
    dest: ~/
    mode: '777'
    force: true

- name: Unarchive the Mac SDK.
  ansible.builtin.unarchive:
    src: ~/MacOSX10.10.sdk.tar.xz
    dest: ~/
    remote_src: yes

- name: Move the Mac SDK into opt
  shell: mv /Users/mantidbuilder/MacOSX10.10.sdk /opt
  become: true

- name: Remove the downloaded Mac SDK Tarball.
  ansible.builtin.file:
    path: ~/MacOSX10.10.sdk.tar.xz
    state: absent
73 changes: 73 additions & 0 deletions macOS/jenkins-node/ansible/roles/agent/tasks/main.yml
@@ -0,0 +1,73 @@
---
# Deploy Jenkins agent on macOS

# Set up the environment

- name: Add user to sudoers on new macs.
  shell: /Applications/Privileges.app/Contents/Resources/PrivilegesCLI --add
  ignore_errors: true # Not all the macs have these, so don't panic if it fails.

# Install Requirements

- name: Install homebrew
  include_role:
    name: geerlingguy.mac.homebrew

- name: Make sure homebrew bin is in the path.
  ansible.builtin.lineinfile:
    path: /etc/paths
    state: present
    line: '/opt/homebrew/bin'
  become: true
  become_user: root

- name: Install git.
  community.general.homebrew:
    name: git
    state: latest

- name: Install and Set up Java 11
  include_tasks: java11.yml

- name: Check for the MacOSX SDK
  stat:
    path: /opt/MacOSX10.10.sdk
  register: sdk_stats

- name: Download and Install MacOSX SDK
  include_tasks: mac-sdk.yml
  when: not sdk_stats.stat.exists

- name: Install node
  community.general.homebrew:
    name: node
    state: present

- name: Install create-dmg
  community.general.npm:
    name: create-dmg
    global: true
    state: latest

# Configure macOS Settings.

- name: Disable screensaver.
  shell: defaults write com.apple.screensaver idleTime 0

- name: Disable saved application states to avoid dialog.
  shell: defaults write org.python.python NSQuitAlwaysKeepsWindows -bool false

- name: Ensure the machine boots back up after a power failure.
  shell: systemsetup -setrestartpowerfailure on
  become: true

# Test and start the agent. Note: Connection will only begin consistently every 5th minute if changes are made.

- name: Start the jenkins agent
  include_tasks: start-jenkins-agent.yml

# Tidy up the environment.

- name: Remove user from sudoers on new macs.
  shell: /Applications/Privileges.app/Contents/Resources/PrivilegesCLI --remove
  ignore_errors: true # Not all the macs have these, so don't panic if it fails.
Original file line number Diff line number Diff line change
@@ -0,0 +1,16 @@
# Test and start the agent. Note: Connection will only begin consistently every 5th minute if changes are made.

- name: Download jenkins agent script.
  shell: curl -o $HOME/jenkins-agent.sh https://raw.githubusercontent.com/mantidproject/mantid/refs/heads/main/buildconfig/Jenkins/jenkins-agent.sh

- name: Make the agent script executable.
  shell: chmod 777 $HOME/jenkins-agent.sh

- name: Check the Jenkins agent connection script.
  script: ./check-connection.sh {{ agent_name }} {{ agent_secret }}

- name: Setup a crontab entry to run the agent script every 5th minute.
  ansible.builtin.cron:
    name: "Run agent script"
    minute: "*/5"
    job: "$HOME/jenkins-agent.sh {{ agent_name }} {{ agent_secret }} >> ~/agentlog.txt 2>&1"