BBR backup CredHub fails due to hostname verification failure with internal postgres server's certificate #229

Open
peterhaochen47 opened this issue Feb 11, 2021 · 2 comments


peterhaochen47 commented Feb 11, 2021

Steps to reproduce:

  1. Deploy concourse with internal postgres server:
bosh -d my-deployment deploy ./cluster/concourse.yml \
-l variables.yml \
-l versions.yml  \
-o ./cluster/operations/basic-auth.yml \
-o ./cluster/operations/backup-atc-colocated-web.yml \
-o ./cluster/operations/tls-vars.yml \
-o ./cluster/operations/tls.yml \
-o ./cluster/operations/privileged-https.yml \
-o ./cluster/operations/uaa.yml \
-o ./cluster/operations/credhub-colocated.yml \
-o ./cluster/operations/secure-internal-postgres.yml \
-o ./cluster/operations/secure-internal-postgres-uaa.yml \
-o ./cluster/operations/secure-internal-postgres-bbr.yml \
-o ./cluster/operations/secure-internal-postgres-credhub.yml \
-o ./cluster/operations/backup-credhub-web.yml
  2. Run bbr:
bbr deployment \
--target SOME-TARGET-IP \
--deployment my-deployment \
--username bbr_client \
--password MY-PASSWORD \
--ca-cert root_ca_certificate \
backup
  3. See that BBR of CredHub failed with an error like this:
    psql: server certificate for "q-s0.db.concourse.concourse.bosh" does not match host name "192.168.1.152".
    And see that CredHub produces a bbr config JSON like the following (note that the host is an IP address):
{
  "username": "credhub",
  "password": "xxxx",
  "database": "credhub",
  "adapter": "postgres",
  "host": "192.168.1.152",
  "port": 5432,
  "tls": {
    "cert": {
      "ca": "-----BEGIN CERTIFICATE-----xxxxE-----\n"
    }
  }
}

Diagnosis

  • The internal postgres server will be deployed with a certificate generated by the config specified in cluster/operations/secure-internal-postgres.yml. We verified that the generated server cert will have the DNS address as its Common Name and SAN:
Common Name: q-s0.db.infra.concourse-colocated.bosh
Subject Alternative Names: q-s0.db.infra.concourse-colocated.bosh
  • The CredHub BBR job generates the bbr config JSON from a bosh link of type database provided by the internal postgres job, reading the host from DATABASE-LINK.instances[0].address.
  • Depending on the bosh director version, DATABASE-LINK.instances[0].address may return an IP address (eg: 192.168.1.152) or a DNS address (eg: q-s0.db.concourse.concourse.bosh) (see bosh link doc and DNS link). In this case, it returns an IP address.
  • BBR receives a bbr config JSON containing the DB's IP address and reaches out to that IP with hostname verification turned on (see the psql doc's explanation of the verify-full sslmode). But since the postgres server cert only has the DNS address (q-s0.db.infra.concourse-colocated.bosh) as its Common Name and SAN, the hostname verification fails. (The openssl sketch below shows one way to confirm the cert's CN and SANs.)
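
One way to confirm the cert's CN and SANs during diagnosis (a sketch, assuming OpenSSL 1.1.1+ and network access to the db VM; the IP and port are taken from the error above):

# Fetch the postgres server certificate via postgres' STARTTLS-style
# negotiation, then print the cert's subject and SANs:
openssl s_client -connect 192.168.1.152:5432 -starttls postgres </dev/null 2>/dev/null \
  | openssl x509 -noout -subject -ext subjectAltName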

Related

Related issue reported by another user here

Potential Fixes:

  1. Integrate or document the features.use_dns_addresses deployment manifest property (see doc) so that DATABASE-LINK.instances[0].address returns a DNS address.
  2. (If possible) configure BBR to disable hostname verification when talking to an internal postgres server. Per the psql doc, the verify-full sslmode, which includes the hostname verification, is not required when using a local CA or self-signed certs (see the sketch after this list).
  3. Update cluster/operations/secure-internal-postgres.yml so that the internal postgres server's cert has both its DNS address and its IP address as SANs; hostname verification would then succeed regardless of whether BBR talks to the postgres server by DNS address or by IP address.
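
To illustrate the premise of fix 2, the difference between the two psql sslmodes can be shown directly (a sketch; the connection parameters are placeholders taken from the config JSON above, with the CA cert written out to ca.pem):

# verify-full validates the CA chain AND that the host name matches the
# cert's CN/SAN -- this is the check that fails when the host is a bare IP:
psql "host=192.168.1.152 port=5432 dbname=credhub user=credhub sslmode=verify-full sslrootcert=ca.pem"

# verify-ca validates the CA chain only, so the IP-vs-SAN mismatch is
# ignored; the psql docs consider this sufficient with a local/private CA:
psql "host=192.168.1.152 port=5432 dbname=credhub user=credhub sslmode=verify-ca sslrootcert=ca.pem"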

cc: @bruce-ricard


mjenk664 commented Feb 26, 2021

Hey @peterhaochen47,

I was able to find a workaround; see my notes below on Potential Fixes 1) and 2).

Potential Fix 1)

This suggestion did not work for me. After adding

features:
  use_dns_addresses: true

to my concourse manifest, re-deploying, and performing the backup - I received the following error:
[screenshot: concoruse-err — backup error output]

In the above screenshot, a BOSH DNS name is now used, but it is the full DNS name, which still does not match the Common Name on the postgres server certificate.

There is another property in the bosh docs, features.use_short_dns_addresses, which is stated to be used for certificate common names, so I tried that as well. However, it also did not work.

When adding this property to the manifest and re-deploying Concourse, the deployment fails when starting the uaa job on the web VM with the following error:

[screenshot: image (1) — uaa job start failure]
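
For reference, the manifest changes attempted above amount to the following (a sketch; both are documented director features, though as noted neither resolved the mismatch in this deployment):

features:
  use_dns_addresses: true        # link .address fields return DNS names instead of IPs
  use_short_dns_addresses: true  # short-form DNS names, documented for certificate common names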


Potential Fix 2)

I was able to get Potential Fix 2) working using the bbr-postgres-db job from the postgres bosh release.

The bbr-postgres-db job has a property to disable SSL hostname verification:

[screenshot: postgres-ssl-verify — the ssl_verify_hostname property in the bbr-postgres-db job spec]

It appears no such option exists in the bbr-credhubdb job, which is why the backup fails with the hostname verification error when consuming that job from the credhub release.
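
One way to check which properties a job exposes is to read its spec file in the release source (a sketch; the path assumes the standard bosh release layout and that the release lives at this GitHub location):

# List the configurable properties of the bbr-credhubdb job:
git clone https://github.com/pivotal/credhub-release
cat credhub-release/jobs/bbr-credhubdb/spec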

Below are the steps I implemented:

  1. Created a new operations file called backup-postgres.yml with the following config:
# Add release for backup-and-restore-sdk
- type: replace
  path: /releases/name=backup-and-restore-sdk?
  value:
    name: backup-and-restore-sdk
    version: ((bbr_sdk_version))
    url: https://bosh.io/d/github.com/cloudfoundry-incubator/backup-and-restore-sdk-release?v=((bbr_sdk_version))

# Add the database-backup-restorer job to the db VM
- type: replace
  path: /instance_groups/name=db/jobs/-
  value:
    release: backup-and-restore-sdk
    name: database-backup-restorer

# Add the bbr-postgres-db job to enable BBR backups for the postgres db
# NOTE: When TLS is enabled for postgres, the BBR backup fails with a hostname verification error.
#       Therefore, we must set the ssl_verify_hostname property to false (it defaults to true)
#---------------------------------
- type: replace
  path: /instance_groups/name=db/jobs/-
  value:
    name: bbr-postgres-db
    release: postgres
    properties:
      postgres:
        databases:
          - credhub
          - atc
          - uaa
        ssl_verify_hostname: false
  2. Removed the backup-atc.yml from my deploy script and added my newly created backup-postgres.yml ops file:
#!/bin/bash

bosh deploy \
  -d concourse ./cluster/concourse.yml \
  -l vars.yml \
  -l versions.yml \
  -o ./cluster/operations/basic-auth.yml \
  -o ./cluster/operations/privileged-https.yml \
  -o ./cluster/operations/tls.yml \
  -o ./cluster/operations/encryption.yml \
  -o ./cluster/operations/uaa.yml \
  -o ./cluster/operations/credhub-colocated.yml \
  -o ./cluster/operations/secure-internal-postgres.yml \
  -o ./cluster/operations/secure-internal-postgres-bbr.yml \
  -o ./cluster/operations/secure-internal-postgres-uaa.yml \
  -o ./cluster/operations/secure-internal-postgres-credhub.yml \
  -o ./backup-postgres.yml
  3. Re-deployed Concourse

  4. Once Concourse was re-deployed, I ran my backup-concourse job and was able to successfully take a backup of the postgres database containing the ATC, CredHub, and UAA databases:

[screenshot: concourse-postgres-bbr — successful backup job output]

After taking the backup, I untarred the tgz file and could see the .sql files for each database.

root@3333a357-0504-458c-51f0-a1c4dac8a6f1:/tmp/build/c03f8100# ls -l backup/
total 160
drwx------ 1 root root      0 Feb 26 04:50 concourse_20210226T045031Z
-rw-r--r-- 1 root root 163236 Feb 26 04:50 product_concourse_2021-02-26-04-50-54.tgz

root@3333a357-0504-458c-51f0-a1c4dac8a6f1:/tmp/build/c03f8100# cd backup/
root@3333a357-0504-458c-51f0-a1c4dac8a6f1:/tmp/build/c03f8100/backup# tar -xvf product_concourse_2021-02-26-04-50-54.tgz 
concourse_20210226T045031Z/db-0-bbr-postgres-db.tar
concourse_20210226T045031Z/manifest.yml
concourse_20210226T045031Z/metadata

root@3333a357-0504-458c-51f0-a1c4dac8a6f1:/tmp/build/c03f8100/backup# ls -l concourse_20210226T045031Z/
total 368
-rw-r--r-- 1 root root 358400 Feb 26 04:50 db-0-bbr-postgres-db.tar
-rw-r--r-- 1 root root   8276 Feb 26 04:50 manifest.yml
-rw-r--r-- 1 root root    460 Feb 26 04:50 metadata

root@3333a357-0504-458c-51f0-a1c4dac8a6f1:/tmp/build/c03f8100/backup# cd concourse_20210226T045031Z/
root@3333a357-0504-458c-51f0-a1c4dac8a6f1:/tmp/build/c03f8100/backup/concourse_20210226T045031Z# tar -xvf db-0-bbr-postgres-db.tar 
./
./postgres_uaa.sql
./postgres_credhub.sql
./postgres_atc.sql

root@3333a357-0504-458c-51f0-a1c4dac8a6f1:/tmp/build/c03f8100/backup/concourse_20210226T045031Z# ls -l
total 712
-rw-r--r-- 1 root root 358400 Feb 26 04:50 db-0-bbr-postgres-db.tar
-rw-r--r-- 1 root root   8276 Feb 26 04:50 manifest.yml
-rw-r--r-- 1 root root    460 Feb 26 04:50 metadata
-rw-r--r-- 1 root root 261771 Feb 26 04:50 postgres_atc.sql
-rw-r--r-- 1 root root  30200 Feb 26 04:50 postgres_credhub.sql
-rw-r--r-- 1 root root  53704 Feb 26 04:50 postgres_uaa.sql
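
For completeness, restoring from an artifact produced this way uses bbr's restore subcommand pointed at the extracted artifact directory (a sketch; the flags reuse the placeholders from the original report and the directory name shown above):

bbr deployment \
  --target SOME-TARGET-IP \
  --deployment my-deployment \
  --username bbr_client \
  --password MY-PASSWORD \
  --ca-cert root_ca_certificate \
  restore --artifact-path concourse_20210226T045031Z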

Let me know if you have any questions on this workaround. I hope this helps!

xtremerui (Contributor) commented

After speaking with the BOSH team: solution #3 is not feasible, as bosh links don't support returning an IP and a DNS hostname at the same time.

So it seems @mjenk664's potential solution #2 is the best so far. At least you can keep the backup-postgres.yml ops file for the next re-deployment.
