RFC-2136
The goal of this document is to provide a configuration overview of the various
facilities required to deploy cert-manager against a RFC2136 compliant DNS
server such as BIND named
. This capability is also commonly known as “dynamic
DNS”.
Unlike the peer of other cert-manager DNS integrations, named
is a bit of a
“Swiss Army Knife” of domain name servers. Over the years, it has been highly
optimized to provide maximal vertical scalability for a single node, as well as
horizontal scalability with service provider interfaces. This flexibility makes
it impossible to go into every possible named
deployment that a user may run
in to though. Instead, this document will try to make sure your server is ready
to accept requests from cert-manager using command line tools, then get on to
the making the two work together.
Transaction Signatures ⇒ TSIG
Dynamic DNS updates are essentially server queries which otherwise might return resource records (RRs). Since DNS servers are commonly exposed to the public internet, being able to push an unauthenticated update to any server that responds to queries would be immediately untenable.
In the eyes of the named
architects, the generic solution to this problem
space was twofold. The first is to require manual enablement of updates at a
zone level, such as example.com
. In a naive network, there is no requirement
that zone updates have any security to them, and clients can be configured such
that they can provide updates without any authentication. An example of where
this is useful is for machines booting using DHCP, in this case the machines
know about themselves and the DNS server can be configured to accept updates
when they come from the address being configured.
This clearly has limitations in situations such as cert-manager and the DNS01
challenge. In this environment, a TXT RR must be created after coordination with
the ACME server. After negotiating with the ACME server, a the TXT RR that is
published on the domain validates that the domain is legitimately engaged with
the process of creating a certificate for it. In the bigger picture of DNS, this
means that an arbitrary actor (cert-manager, in this case) must be able to add
one of these KV mappings to the domain and delete it after the certificate has
been issued. cert-manager
does not have a convenient physical characteristic
such as a DHCP allocation to validate it’s requests.
For cases like this, we need to be able to sign a request that is being sent to the DNS server. We do that through TSIGs, or Transaction SIGnatures.
Configuration Step 1 - Set up your DNS server for secure dynamic updates
There are many excellent tutorials on the net that walk through
preparing a basic named
server for dynamic updates:
- https://www.cyberciti.biz/faq/unix-linux-bind-named-configuring-tsig/
- https://tomthorp.me/blog/using-tsig-enable-secure-zone-transfers-between-bind-9x-servers
More complex name
deployments will not use text files, but rather may use LDAP
or SQL for a database for resource records. An additional wrinkle is metadata
configuration, such as for zone metadata like enabling dynamic updates or access
control lists (ACLs) for a zone. There are too many configurations to go into
here, but you should be able to find the documentation to do so.
Whatever your deployment is, the goal at this stage has nothing to do with
cert-manager and everything to do with a tool called nsupdate
generating
updates signed with TSIG. Once this is out of the way, you can attack the
cert-manager configuration with far greater confidence.
Using nsupdate
Most paths to configuring BIND named
will go through using dnssec-keygen
.
This command-line tool generates a named private key that is used for signing
TSIG requests. When a request is signed, both the signature and the name of the
private key are attached to the request in an unencrypted form. In this manner,
when the request is received, the name of the private key can be used to by the
recipient to find the private key itself, build a new signature with it, and
compare the two for acceptance.
Since there are dozens of ways to have your named
server misconfigured, we’ll
use nsupdate
to test that the server behaves as expected before we get there.
https://debian-administration.org/article/591/Using_the_dynamic_DNS_editor_nsupdate
is a solid breakdown of how to use the tool.
To get started, we’ll simply run nsupdate -k <keyID>
where keyID
is the value
returned from dnssec-keygen
. This will read the key from disk and provide a
command prompt to issue commands. In general, we want to write a simple TXT RR
and make sure we can delete it.
$ nsupdate -k <keyID>
update add www1.example.com txt testing
send
… test here with `nslookup`
update delete www1.example.com txt
send
… test here with `nslookup`
Any failures to write, read or delete the record will mean that cert-manager will not be able to do so either, no matter how well it is configured.
Configuration Step 2 - Set up cert-manager
Now we get to the fun stuff, seeing everything work. Remember that we need to
set up the ACME DNS01 issuer and challenge mechanism as well as the rfc2136
provider. Since the documentation covers the other parts sufficiently, let’s
focus on the provider here.
apiVersion: cert-manager.io/v1
kind: Issuer
metadata:
name: example-issuer
spec:
acme:
...
solvers:
- dns01:
rfc2136:
nameserver: <address of authoritative nameserver configured above>
tsigKeyName: <key name used in `dnssec-keygen`, use something semantically meaningful in both environments>
tsigAlgorithm: HMACSHA512 // should be matched to the algo you chose in `dnssec-keygen`
tsigSecretSecretRef:
name: <the name of the k8s secret holding the TSIG key.. not the key itself!>
key: <name of the key *inside* the secret>
For example:
rfc2136:
nameserver: 1.2.3.4:53
tsigKeyName: example-com-secret
tsigAlgorithm: HMACSHA512
tsigSecretSecretRef:
name: tsig-secret
key: tsig-secret-key
For this example configuration, we’ll need the following two commands. The
first, on your named
server generates the key. Note how example-com-secret
is both in the tsigKeyName
above and the dnssec-keygen
command that follows.
$ dnssec-keygen -r /dev/urandom -a HMAC-SHA512 -b 512 -n HOST example-com-secret
Also note how the tsigAlgorithm
is provided in both the configuration and the
keygen
command. They are listed at
https://github.com/miekg/dns/blob/v1.0.12/tsig.go#L18-L23
.
The second bit of configuration you need on the Kubernetes side is to create a
secret. Pulling the secret key string from the <key>.private
file generated
above, use the secret in the placeholder below:
$ kubectl -n cert-manager create secret generic tsig-secret --from-literal=tsig-secret-key=<somesecret>
Note how the tsig-secret
and tsig-secret-key
match the configuration in the
tsigSecretSecretRef
above.
Rate Limits
The rfc2136
provider waits until all nameservers to in your domain’s SOA RR
respond with the same result before it contacts Let’s Encrypt to complete the
challenge process. This is because the challenge server contacts a
non-authoritative DNS server that does a recursive query (a query for records it
does not maintain locally). If the servers in the SOA do not contain the correct
values, it’s likely that the non-authoritative server will have bad information
as well, causing the request to go against rate limits and eventually locking
the process out.
This process is in place to protect users from server misconfiguration creating a more subtle lockout that persists after the server configuration has been repaired.
As documented elsewhere, it is prudent to fully debug configurations using the ACME staging servers before using the production servers. The staging servers have less aggressive rate limits, but the certificates they issue are not signed with a root certificate trusted by browsers.
What’s next?
This configuration so far will actually do nothing. You still have to request a certificate as described here. Once a certificate is requested, the provider will begin processing the request.
Troubleshooting
- Be sure that you have fully tested the DNS server updates using
nsupdate
first. Ideally, this is done from a pod in the same namespace as therfc2136
provider to ensure there are no firewall issues. - The logs for the
cert-manager
pod are your friend. Additional logs can be generated by adding the--v=5
argument to the container launch. - The TSIG key is encoded with
base64
, but the Kubernetes API server also expects that key literals will be decoded before they are stored. In some cases, a key must be double-encoded. (If you’ve tested usingnsupdate
, it’s pretty easy to spot when you are running into this.) - Pay attention to the refresh time of the zone you are working with. For zones
with low traffic, it will not make a significant difference to reduce the
refresh time down to about five minutes while getting initial certificates.
Once the process is working, the beauty of
cert-manager
is it doesn’t matter if a renewal takes hours due to refresh times, it’s all automated! - Compared to the other providers that often use REST APIs to modify DNS RRs,
this provider can take a little longer. You can
watch kubectl certificate yourcert
to get a display of what’s going on. It’s not uncommon for the process to take five minutes in total.