Setting up new server (replica) in an existing system
Return to main: Kerberos
WARNING: This file and the files it refers to are not maintained. They have been tested for Redhat 9, but won't be updated for later releases. The current version is in the Rutgers internal Gitlab.
This uses kdc1.yml, kdc2.yml and kdc-data. They are in clhedrick/kerberos-ansible. The rest of the repo is no longer maintained, but these files are up to date as of Jun 15, 2022, for RHEL 9.
See Issues section at the end to see what went wrong in previous tries. With RHEL 9, things worked quite well.
This page describes setting up a new server where you have access to an existing server with the data. That's called creating a replica. You can completely replace the servers by killing each one and recreating it as a replica. As long as one server with the data is always left you can kill the rest and recreate them.
The process is mostly done using ansible. kdc1.yml starts with a minimal RHEL 9 installation and prepares it. Then you do the normal ipa install commands. kdc2.yml then adds the Rutgers stuff. See https://github.com/clhedrick/kerberos-ansible for the ansible setup. (It is missing our key tables, for obvious reasons. Those will have to come from the production servers.) (Our ansible setup has some extra things you won't need unless you're using our extra services. So look at the yml files to make sure they do what you want.) You'll also need the ansible host file. Here's what ours currently looks like:
    krb1.cs.rutgers.edu
    krb2.cs.rutgers.edu
    krb4.cs.rutgers.edu

    [all:vars]
    kdc_ips=x.x.x.x,y.y.y.y,z.z.z.z

kdc_ips should be the IP addresses of the KDCs. If this is a test setup, it's the IP addresses of the test servers.
I haven't put the IPA commands into Ansible because they depend upon the location of certificate files, and may differ if you're doing a test where DNS points to the production servers rather than your test servers.
This page is out of order. The order to do things:
- Do a normal Redhat server install, set up the network, and get licensing working.
- Preliminary steps
- Ansible kdc1.yml
- IPA install
- Post-install. That has to be done as admin or netid.admin.
- Ansible kdc2.yml
- PKINIT setup. That requires certificates, which is why it's not in Ansible.
- On the old system, before you take it down, do "ipa-replica-manage dnarange-show" and save the results
- subscription-manager unregister
- after you take it down, on another system do "ipa-replica-manage del SERVER"
- get networking to work: nmcli c edit INTERFACE and set up the IP address, gateway and DNS.
- subscription-manager register. user and password for our subscription are in 1password
- I used the RHEL web interface to add subscriptions for RHEL Linux to the server. We have two servers with the same hostname. Doing it with subscription-manager confused things.
- Add linux-admin with password from 1password.
- Install python3 if needed, though it came with RHEL server 9.
- Verify you can do ssh linux-admin@host from where you have the ansible files
- Put the certificate, chain, and key on the machine. Note that Internet 2 gives you a bad chain. I used the first cert from chain.pem and replaced the rest with the current self-signed cert for "C = US, ST = New Jersey, L = Jersey City, O = The USERTRUST Network, CN = USERTrust RSA Certification Authority". There should only be 2 certs in chain.pem. Once chain.pem is right, run "cat cert.pem chain.pem > fullchain.pem".
- Put /etc/krb5.anonymous.keytab and /etc/krb5.tgt.keytab and /etc/scripts.keytab on the machine. They aren't in ansible for security reasons. (I think scripts may no longer be used, but I can't be sure.)
- on krb1 only run the ansibleSetup role, because we need to be able to look at the hosts table for updating the Guacamole data
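The chain.pem check and the fullchain.pem step from the list above can be sketched as a small shell helper. This is only an illustration: the function names `count_pem_certs` and `build_fullchain` are made up for this example, and the expected count of 2 comes from the note above about chain.pem. It checks only the number of certificates, not whether they are the right ones.

```shell
# Count the certificates in a PEM file by counting BEGIN markers.
count_pem_certs() {
    grep -c -e '-----BEGIN CERTIFICATE-----' "$1"
}

# Build fullchain.pem (server cert first, then the chain), but only if
# the chain holds the expected number of certificates (default 2).
# Arguments: cert file, chain file, output file, [expected count].
build_fullchain() {
    local cert="$1" chain="$2" out="$3" expected="${4:-2}"
    local n
    n="$(count_pem_certs "$chain")"
    # String comparison, so an empty count (unreadable file) also fails.
    if [ "$n" != "$expected" ]; then
        echo "chain has '$n' certs, expected $expected" >&2
        return 1
    fi
    cat "$cert" "$chain" > "$out"
}

# Usage, with the file names used in the list above:
#   build_fullchain cert.pem chain.pem fullchain.pem
```

A resulting fullchain.pem should then contain 3 certificates: the server cert plus the 2 in chain.pem.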
ansible-playbook -u linux-admin -k -K kdc1.yml --limit=krb1.cs.rutgers.edu --become-method=su
Now set up IPA. I haven't done this via ansible because the locations of the certs may be different, and it tends to fail and need error recovery.
ansible-playbook -u linux-admin -k -K kdc2.yml --limit=krb1.cs.rutgers.edu --become-method=su
These need to be done when kinit'ed as admin or netid.admin. That's why they're not in ansible.
1) If you're doing all 3 servers, you'll be left with the 3 talking to each other, but just pairwise. You'll need to connect the remaining two. First do this to see what the current setup is
ipa topologysegment-find domain
There should be three agreements, bidirectional between each pair. Probably one is missing. If so, add it:
ipa topologysegment-add domain --leftnode=krb1.cs.rutgers.edu --rightnode=krb4.cs.rutgers.edu
Sometimes the install process creates a one-directional sync. Kill it with "ipa topologysegment-del domain NAME" and recreate it using the previous command. NAME is the name of the segment as shown by "ipa topologysegment-find domain".
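To see which agreements a full mesh should contain, here is a minimal sketch that prints one "ipa topologysegment-add" command per pair of servers. The function name `print_mesh_commands` is hypothetical. It only prints the commands in the form used above and does not run them; compare the pairs against the output of "ipa topologysegment-find domain" and run only the ones that are missing.

```shell
# Print one topologysegment-add command for every unordered pair of
# the hosts given as arguments. Nothing is executed.
print_mesh_commands() {
    local left right
    while [ "$#" -gt 1 ]; do
        left="$1"
        shift
        # Pair the current host with every host after it in the list.
        for right in "$@"; do
            echo "ipa topologysegment-add domain --leftnode=$left --rightnode=$right"
        done
    done
}

print_mesh_commands krb1.cs.rutgers.edu krb2.cs.rutgers.edu krb4.cs.rutgers.edu
```

For three servers this prints three commands, one per pair, which matches the three agreements described above.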
2) In order for credserv to work, you need to skinit as an administrator and do
ipa role-add-member "Rutgers Credserv Service" --services=credserv/krb1.cs.rutgers.edu
for all kdcs.
3) The accounts web app needs to be able to set user passwords. In order to do that, it needs to be defined as a password sync application. Unfortunately this configuration is per-server, so it has to be done on all new servers. As admin do "ldapmodify -Y GSSAPI < sync.ldif", where sync.ldif contains

    dn: cn=ipa_pwd_extop,cn=plugins,cn=config
    changetype: modify
    add: passSyncManagersDNs
    passSyncManagersDNs: krbprincipalname=http/[email protected],cn=services,cn=accounts,dc=cs,dc=rutgers,dc=edu
    passSyncManagersDNs: krbprincipalname=http/[email protected],cn=services,cn=accounts,dc=cs,dc=rutgers,dc=edu
    passSyncManagersDNs: uid=hedrick.admin,cn=users,cn=accounts,dc=cs,dc=rutgers,dc=edu
4) Installing a new replica could leave it without the randomly chosen range for new UIDs and GIDs, though for RHEL 9 it was OK. It will assign one automatically the first time you create a user or group. Hopefully it will pick something reasonable. If not, you may want to put back the old range that you saved from "ipa-replica-manage dnarange-show".
    ipa group-add tempgroup
    ipa group-del tempgroup
    ipa-replica-manage dnarange-show
Make sure it has allocated a range that's close to the other servers. If not, you may want to do ipa-replica-manage dnarange-set.
5) See the section below on PKINIT setup. This is needed for "kinit -n" to work. However, we are currently using "kgetcred -a", so this isn't actually used. I still recommend doing it.
On the new KDC
- kinit with a user that doesn't use two factor.
- skinit with a user that uses two factor
- kinit or skinit with some user and do "kgetcred -l". It's best to do this with a user that has information registered. You can do "kgetcred -r" on one of our servers to do that.
- verify that "kgetcred -a" gives you a certificate as the anonymous user
- after a day, verify that the cron jobs are all working.
Now for IPA commands. If the servers you're setting up are in DNS, it's easy
- make sure there are no vestiges of the old server. For RHEL 9, "ipa host-del SERVER" did it.
- Make sure you don't have any kerberos tickets, "kdestroy -A". If the install fails, do this every time before a reinstall.
- make sure you have added the other servers to /etc/sysconfig/nftables.lcsr
- fullchain is a file with the cert first and then the chain. See above for issues we had. I had to fix up fullchain.pem
    ipa-client-install    ; in a reinstall you may need ipa-client-install --force-join
    ipa-replica-install \
       --dirsrv-cert-file /root/fullchain.pem \
       --dirsrv-cert-file /root/privkey.pem \
       --http-cert-file /root/fullchain.pem \
       --http-cert-file /root/privkey.pem \
       --no-pkinit [possibly --skip-conncheck]

You'll be prompted for keys twice. Currently just hit CR, but it's possible you'd need to use the "kerberos / ldap key for certs" in 1Password.
Obviously you should use the actual file names of your certificates. For the first file, I used the combined file, i.e. a file that starts with the system's cert, and then the intermediate certs.
Next
- Post-install section
- ansible kdc2.yml
- PKINIT setup
If you find this section confusing, you can skip it. We're not currently using PKINIT for anything. It can be done right after ipa-replica-install, or you can wait and do it later.
- mkdir /var/kerberos/certs
- cp cert.pem /var/kerberos/certs/kdc.crt
- cp privkey.pem /var/kerberos/certs/kdc.key
- cp chain.pem /var/kerberos/certs/cacert.pem
    ; pkinit_identity = FILE:/var/kerberos/krb5kdc/kdc.crt,/var/kerberos/krb5kdc/kdc.key
    pkinit_identity = FILE:/var/kerberos/certs/kdc.crt,/var/kerberos/certs/kdc.key
    pkinit_anchors = FILE:/var/kerberos/krb5kdc/kdc.crt
    ; pkinit_anchors = FILE:/var/kerberos/krb5kdc/cacert.pem
    pkinit_anchors = FILE:/var/kerberos/certs/cacert.pem
Then "systemctl restart krb5kdc". To use this data, the client krb5.conf needs the following:
    pkinit_anchors = DIR:/etc/ssl/certs
    pkinit_eku_checking = kpServerAuth
    pkinit_kdc_hostname = krb2.cs.rutgers.edu

On the KDC, the hostname should be the local hostname. On a real client you'd have three pkinit_kdc_hostname lines, one for each of the 3 servers.
To test it, do "kinit -n".
I set up a test copy of our 3 servers by duplicating snapshots of the production servers. The issue here is that the hostnames of the KDCs are built into a lot of the LDAP data. There's no practical way to rename a KDC. So instead I set up the test systems to think they're the actual KDC hosts.
If you're testing this process for one KDC, this is easy. You just add your IP address with the name of the host you're pretending to be in /etc/hosts. But if your information isn't correct in DNS, you'll need to use a special DNS server that defines krb1, krb2 and krb4 to be the test systems. Our current ubuntu systems use systemd-resolved. To use such a system as a fake DNS server, add to /etc/systemd/resolved.conf
    ReadEtcHosts=yes
    DNSStubListener=yes
    DNSStubListenerExtra=128.6.26.16

where 128.6.26.16 is the address of that system. Then add the fake krb1, 2, and 4 to /etc/hosts.
Restart systemd-resolved.
- Obviously you have to configure networking to have the IP addresses of the fake servers.
- Make sure to change /etc/sysconfig/nftables.conf to have the right IP address for the other servers or you won't be able to talk between the servers.
- Sometimes when starting from a snapshot, /etc/dirsrv/slapd-CS-RUTGERS-EDU/dse.ldif is missing. This is the top-level data file for LDAP, so LDAP won't start if it's missing. I have a cron job that saves this file in /var/lib/ipa/backup/dse.ldif. So if it's missing you can restore it from that copy. In one case I restored it by taking it from the corresponding production server.
- If you're starting from a snapshot, kinit as an admin user, and use "ipa-replica-manage list -v krbX.cs.rutgers.edu" for all 3 servers to verify that replication is in sync. There's a good chance you'll have to resync by doing "ipa-replica-manage re-initialize --from ...". You may need to do it a few times, until you're in sync, because it's not always clear which server to use.
When you're doing testing you may need to delete a server and recreate it. You may also need to recover from failure.
Generally "ipa-replica-manage del SERVER" will delete the info in LDAP, including replication agreements. But I'd look at /etc/dirsrv/slapd-CS-RUTGERS-EDU/dse.ldif. Look for all occurrences of krbx. If there are any replication agreements left, here's how to delete one:
    ldapmodify -ZZ -x -D "cn=Directory Manager" -W -H ldap://localhost -f delreplication

    dn: cn=meTokrb2.cs.rutgers.edu,cn=replica,cn=dc\3Dcs\2Cdc\3Drutgers\2Cdc\3Dedu ,cn=mapping tree,cn=config
    changetype: delete

That will leave ruvs. The following should show them:

    ipa-replica-manage list-ruv

There are several options for dealing with them. The simplest is

    ipa-replica-manage clean-dangling-ruv

Here are manual versions of these things:
Find out the replication id by looking at /etc/dirsrv/slapd-CS-RUTGERS-EDU/dse.ldif. Here's an example:
    nsds50ruv: {replica 4 ldap://krb1.cs.rutgers.edu:389} 588f7d78000100040000 5da

That's id 4. If there's nothing there, don't bother. Do these things on both remaining servers. There are builtin jobs to clean these up. The following cleans up replication ID 55:

    ldapmodify -a -D "cn=Directory Manager" -W -p 389 -h krb1.cs.rutgers.edu -x -f cleanruv

    dn: cn=clean 55, cn=cleanallruv, cn=tasks, cn=config
    objectclass: extensibleObject
    replica-base-dn: dc=cs,dc=rutgers,dc=edu
    replica-id: 55
    cn: clean 55
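The two manual steps above (reading the replication id out of dse.ldif, then writing the cleanruv file) can be sketched as shell helpers. The function names `replica_ids_for_host` and `cleanruv_ldif` are invented for this example. The sketch assumes the nsds50ruv line looks like the example above and that the id and URL are not split by LDIF line folding. Nothing here talks to the directory server; it only produces text to review before feeding it to ldapmodify.

```shell
# Read dse.ldif-style text on stdin and print the replica IDs recorded
# for one host. Dots in the host name match any character, which is
# loose but harmless for a quick check.
replica_ids_for_host() {
    sed -n "s|^nsds50ruv: {replica \([0-9][0-9]*\) ldap://$1:[0-9]*}.*|\1|p" \
        | sort -un
}

# Print the cleanallruv task entry for one replica ID, in the same
# layout as the cleanruv file shown above.
cleanruv_ldif() {
    cat <<EOF
dn: cn=clean $1, cn=cleanallruv, cn=tasks, cn=config
objectclass: extensibleObject
replica-base-dn: dc=cs,dc=rutgers,dc=edu
replica-id: $1
cn: clean $1
EOF
}

# Usage:
#   replica_ids_for_host krb1.cs.rutgers.edu < /etc/dirsrv/slapd-CS-RUTGERS-EDU/dse.ldif
#   cleanruv_ldif 55 > cleanruv
```

Writing the file with cleanruv_ldif and then running the ldapmodify command shown above keeps the id consistent in all three places where it appears in the entry.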
1) Updating from 7, first had to disable CA service on krb1, or the replica wouldn't accept certs. This shouldn't be an issue in the future, since none of the new servers is set up as a CA. on krb1:
    ldapmodify -Y GSSAPI < noca

    dn: cn=CA,cn=krb1.cs.rutgers.edu,cn=masters,cn=ipa,cn=etc,dc=cs,dc=rutgers,dc=edu
    changetype:modify
    delete:ipaConfigString
    ipaConfigString: enabledService
    ipaConfigString: caRenewalMaster
2) The key tables aren't on config.lcsr for obvious reasons, so kdc2.yml failed. Transferred the key tables by hand and temporarily removed them from the yml.
3) Somehow ended up with a one-way replication agreement.
ipa topologysegment-find domain
will show them. To kill the bad one:
ipa topologysegment-del domain krb4.cs.rutgers.edu-to-krb2.cs.rutgers.edu
To add it back
ipa topologysegment-add domain --leftnode=krb2.cs.rutgers.edu --rightnode=krb4.cs.rutgers.edu
When installing krb4, several problems.
1) When old remnants of krb4 weren't out of the database, there were lots of permission failures. Make sure the host is deleted, and that ldap/krbx.cs.rutgers.edu and hosts/krbx.cs.rutgers.edu don't exist.
2) The biggie is a failure
    Restart of krb5kdc.service complete
    Waiting up to 300 seconds to see our keys appear on host ldap://krb1.cs.rutgers.edu
    Starting new HTTPS connection (1): krb1.cs.rutgers.edu:443
    https://krb1.cs.rutgers.edu:443 "GET /ipa/keys/dm/DMHash?xxxxx HTTP/1.1" 502 415
    Your system may be partly configured.
    Run /usr/sbin/ipa-server-install --uninstall to clean up.
      File "/usr/lib/python3.6/site-packages/ipapython/admintool.py", line 179, in execute
        return_value = self.run()
      File "/usr/lib/python3.6/site-packages/ipapython/install/cli.py", line 340, in run
        return cfgr.run()
      File "/usr/lib/python3.6/site-packages/ipapython/install/core.py", line 360, in run
        return self.execute()
      File "/usr/lib/python3.6/site-packages/ipapython/install/core.py", line 386, in execute
        for rval in self._executor():
      File "/usr/lib/python3.6/site-packages/ipapython/install/core.py", line 431, in __runner
        exc_handler(exc_info)
      File "/usr/lib/python3.6/site-packages/ipapython/install/core.py", line 460, in _handle_execute_exception
        self._handle_exception(exc_info)
      File "/usr/lib/python3.6/site-packages/ipapython/install/core.py", line 450, in _handle_exception
        six.reraise(*exc_info)
      File "/usr/lib/python3.6/site-packages/six.py", line 693, in reraise
        raise value
      File "/usr/lib/python3.6/site-packages/ipapython/install/core.py", line 421, in __runner
        step()
      File "/usr/lib/python3.6/site-packages/ipapython/install/core.py", line 418, in <lambda>
        step = lambda: next(self.__gen)
      File "/usr/lib/python3.6/site-packages/ipapython/install/util.py", line 81, in run_generator_with_yield_from
        six.reraise(*exc_info)
      File "/usr/lib/python3.6/site-packages/six.py", line 693, in reraise
        raise value
      File "/usr/lib/python3.6/site-packages/ipapython/install/util.py", line 59, in run_generator_with_yield_from
        value = gen.send(prev_value)
      File "/usr/lib/python3.6/site-packages/ipapython/install/core.py", line 655, in _configure
        next(executor)
      File "/usr/lib/python3.6/site-packages/ipapython/install/core.py", line 431, in __runner
        exc_handler(exc_info)
      File "/usr/lib/python3.6/site-packages/ipapython/install/core.py", line 460, in _handle_execute_exception
        self._handle_exception(exc_info)
      File "/usr/lib/python3.6/site-packages/ipapython/install/core.py", line 518, in _handle_exception
        self.__parent._handle_exception(exc_info)
      File "/usr/lib/python3.6/site-packages/ipapython/install/core.py", line 450, in _handle_exception
        six.reraise(*exc_info)
      File "/usr/lib/python3.6/site-packages/six.py", line 693, in reraise
        raise value
      File "/usr/lib/python3.6/site-packages/ipapython/install/core.py", line 515, in _handle_exception
        super(ComponentBase, self)._handle_exception(exc_info)
      File "/usr/lib/python3.6/site-packages/ipapython/install/core.py", line 450, in _handle_exception
        six.reraise(*exc_info)
      File "/usr/lib/python3.6/site-packages/six.py", line 693, in reraise
        raise value
      File "/usr/lib/python3.6/site-packages/ipapython/install/core.py", line 421, in __runner
        step()
      File "/usr/lib/python3.6/site-packages/ipapython/install/core.py", line 418, in <lambda>
        step = lambda: next(self.__gen)
      File "/usr/lib/python3.6/site-packages/ipapython/install/util.py", line 81, in run_generator_with_yield_from
        six.reraise(*exc_info)
      File "/usr/lib/python3.6/site-packages/six.py", line 693, in reraise
        raise value
      File "/usr/lib/python3.6/site-packages/ipapython/install/util.py", line 59, in run_generator_with_yield_from
        value = gen.send(prev_value)
      File "/usr/lib/python3.6/site-packages/ipapython/install/common.py", line 65, in _install
        for unused in self._installer(self.parent):
      File "/usr/lib/python3.6/site-packages/ipaserver/install/server/__init__.py", line 590, in main
        replica_install(self)
      File "/usr/lib/python3.6/site-packages/ipaserver/install/server/replicainstall.py", line 402, in decorated
        func(installer)
      File "/usr/lib/python3.6/site-packages/ipaserver/install/server/replicainstall.py", line 1298, in install
        custodia.import_dm_password()
      File "/usr/lib/python3.6/site-packages/ipaserver/install/custodiainstance.py", line 211, in import_dm_password
        cli.fetch_key('dm/DMHash')
      File "/usr/lib/python3.6/site-packages/ipaserver/secrets/client.py", line 120, in fetch_key
        r.raise_for_status()
      File "/usr/lib/python3.6/site-packages/requests/models.py", line 940, in raise_for_status
        raise HTTPError(http_error_msg, response=self)
    The ipa-replica-install command failed, exception: HTTPError: 502 Server Error: Proxy Error for url: https://krb1.cs.rutgers.edu/ipa/keys/dm/DMHash?xxxx
    502 Server Error: Proxy Error for url: https://krb1.cs.rutgers.edu/ipa/keys/dm/DMHash?ccc
It appears that this happens only when using commercial certs. It's trying to fetch the Directory Manager password (encrypted) from the primary to put it in the new system. I commented out custodiainstance.py:211,

    def import_dm_password(self):
        cli = self._get_custodia_client()
        # cli.fetch_key('dm/DMHash') <

and copied it manually.
On the primary, open /etc/dirsrv/slapd-CS-RUTGERS-EDU/dse.ldif. Look for
    nsslapd-rootpw: {SSHA}

It should be under cn=config. Now shut down ipa on the new server (ipactl stop), edit /etc/dirsrv/slapd-CS-RUTGERS-EDU/dse.ldif, and replace that line with the one you copied from the original server. Restart ipa.
3) After the system went into production, we found that the ipa command failed for some systems. It turned out that these systems were talking to krb4 for the IPA command, but getting authentication from a different system. During installation, the Kerberos data for the principal HTTP/krb4.cs.rutgers.edu had failed to propagate to the other systems, probably because they already had entries for that principal that hadn't been deleted when the original server was deleted. To fix it, do
    ldapsearch -ZZ -x -D "cn=Directory Manager" -W -H ldap://localhost krbprincipalname=HTTP/krb4.cs.rutgers.edu@CS.RUTGERS.EDU krbprincipalkey

on the system that is giving errors for ipa, in this case krb4. Do the same command on a different system. If this is the problem, you'll get different values for krbprincipalkey. You need to put the right value, which is the one on the system itself (krb4 in this case), on one of the other systems. It will propagate automatically to all of them.

    ldapmodify -ZZ -x -D "cn=Directory Manager" -W -H ldap://localhost -f fixhttp

    dn: krbprincipalname=HTTP/krb4.cs.rutgers.edu@CS.RUTGERS.EDU,cn=services,cn=accounts,dc=cs,dc=rutgers,dc=edu
    changetype:modify
    replace:krbprincipalkey
    krbprincipalkey:: xxxx
Of course adjust the hostname in the file. Both commands will prompt for the Directory Manager password.
Note that krbprincipalkey is binary data. The ldapsearch and ldapmodify commands give it in base64. The "::" after the attribute name indicates that the value is base64.
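Comparing two long base64 values by eye is error-prone, so here is a minimal sketch for comparing saved ldapsearch output from two servers. The function names and the file names in the usage comment are hypothetical. It assumes each output contains a single krbprincipalkey value, matches the attribute name case-insensitively, and joins LDIF continuation lines (lines starting with a space) before comparing.

```shell
# Print the base64 krbprincipalkey value found in LDIF on stdin,
# joining folded continuation lines into one string.
extract_principal_key() {
    awk '
        tolower($0) ~ /^krbprincipalkey:: / {
            key = substr($0, 19)      # skip "krbprincipalkey:: " (18 chars)
            collecting = 1
            next
        }
        collecting && /^ / { key = key substr($0, 2); next }
        { collecting = 0 }
        END { if (key != "") print key }
    '
}

# Succeed only if both files contain a key and the two keys are equal.
keys_match() {
    local a b
    a="$(extract_principal_key < "$1")"
    b="$(extract_principal_key < "$2")"
    [ -n "$a" ] && [ "$a" = "$b" ]
}

# Usage, after saving the ldapsearch output from each server:
#   keys_match krb4.out krb1.out && echo "same key" || echo "keys differ"
```

The value printed by extract_principal_key on the correct server is what goes after "krbprincipalkey:: " in the fixhttp file.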