Certificate Transparency & Alternative Name Disclosure

Maybe you’ve heard of Certificate Transparency and its logs. Citing Wikipedia: “Certificate Transparency (CT) is an Internet security standard and open source framework for monitoring and auditing digital certificates.” Basically, it lets anyone look up any public certificate that has been issued. Besides its advantages, I thought of one possible problem: it discloses to the public every FQDN for which you obtain a TLS certificate, for example from Let’s Encrypt.

A similar problem might arise when using a single X.509 certificate with several DNS names (subject alternative names, SANs), one of which should be kept “private”. It will be publicly known as well.
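To see which names a given certificate exposes, you can list its SAN entries. This is a minimal sketch, assuming the text dump of a certificate (as printed by `openssl x509 -noout -text`) is piped in; the helper name `list_san_names` is made up for illustration:

```shell
# Hypothetical helper: print one DNS name per line from the text dump of a
# certificate (e.g. the output of `openssl x509 -noout -text`), read on stdin.
list_san_names() {
  grep -o 'DNS:[^,[:space:]]*' | sed 's/^DNS://'
}

# Sample input mimicking the SAN section of the second test certificate.
printf '%s\n' \
  '            X509v3 Subject Alternative Name:' \
  '                DNS:blog.webernetz.net, DNS:weberblog.net, DNS:xd524olksc.ib.weberdns.de' \
  | list_san_names
```

With a real certificate file (the name `cert.pem` is a placeholder) this would be `openssl x509 -in cert.pem -noout -text | list_san_names`. Every name printed this way is visible to anyone who connects to the server or looks at the CT log.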

Hence I ran an experiment of my own: I generated two certificates with random names and monitored the authoritative DNS servers as well as the IPv6 addresses of those names in order to check who is resolving or connecting to these otherwise unknown hostnames. Here we go:

TL;DR: Over a test period of 8 months, (1) my completely hidden FQDN was resolved by about 700 DNS queries and received 73 IPv6 HTTPS connections. (2) My second FQDN (a SAN on my blog’s certificate) was queried about 500 times and received 273 IPv6 connections, split between ports 80 and 443. That is: both completely unused hostnames were “leaked” and scanned by the public, just by appearing in X.509 certificates!

The Setup

I used randomly generated hostnames and IPv6 addresses:

A Let’s Encrypt certificate with a single DNS name which I have never used anywhere. I didn’t even google for it or the like. Its name is 7qftpqiw5m.ib.weberdns.de, with a single AAAA record pointing to 2001:470:765b:0:747d:36b6:0d74:f26a. Issued on 2019-10-30 14:49 UTC, CT log here: https://crt.sh/?id=2053680948.
Another LE certificate with a DNS name of xd524olksc.ib.weberdns.de (with an AAAA record pointing to 2001:470:765b:0:d302:1962:0df9:91eb) along with weberblog.net and blog.webernetz.net (my old hostname for the blog). In fact, I used this certificate for 2.5 months on my blog itself, giving many entities the possibility to see the hostname. Issued on 2019-10-30 15:09 UTC, CT log here: https://crt.sh/?id=2053730087. It was active on my blog until 2020-01-15 13:55 UTC.
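For anyone who wants to repeat the experiment, here is one way to come up with such random names and addresses. This is only a sketch and not necessarily how the values above were created; the function names are made up, and 2001:db8:0:0 is the documentation prefix standing in for your own /64:

```shell
# Random 10-character label from [a-z0-9], similar to the test hostnames.
random_label() {
  LC_ALL=C tr -dc 'a-z0-9' < /dev/urandom | head -c 10
  echo
}

# Random 64-bit interface ID appended to a /64 prefix given as four hextets.
# The default prefix is a placeholder (IPv6 documentation range).
random_ipv6() {
  prefix=${1:-2001:db8:0:0}
  iid=$(od -An -N8 -tx1 /dev/urandom | tr -dc '0-9a-f' |
        sed 's/\(....\)\(....\)\(....\)\(....\)/\1:\2:\3:\4/')
  echo "$prefix:$iid"
}

random_label
random_ipv6
```

The label then becomes the leftmost part of the FQDN, and the address goes into its AAAA record.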

As both certificates were issued by Let’s Encrypt, their validity expired after 90 days.

Since I am controlling the authoritative DNS servers for *.ib.weberdns.de (Infoblox VMs with query logging enabled) as well as the IPv6 space 2001:470:765b::/64 (a HE tunnel broker connection through my Palo Alto Networks firewall) I was able to catch every single DNS query and IP connection attempt. ;) However, all connections were blocked completely. No service was listening on those IPv6 addresses at all.

Many DNS Queries and IP Connections

After the two certificates were issued, I put the second one (with the SAN) on my blog. For both hostnames, I immediately saw a couple of DNS requests on the DNS servers, as well as incoming connections to the IPv6 addresses:

The TLS certificate on my blog during the test period.

Live watching the very first DNS queries on the two DNS servers right after the certs were issued.

Blocking (but logging) incoming IPv6 connections to the test hostnames.

Here is a screenshot from the Palo Alto, showing the two blocking rules and their hits:

Note that I analyzed *only* incoming IPv6 connections. Yes, this is not optimal, but it makes it very likely that all those attempts came from systems that had explicitly resolved my hostnames rather than from ordinary port scans. (Port scans hitting random IPv6 addresses are unlikely since a /64 contains 2^64 possible host IDs. Refer to my post about the Internet’s Noise.)
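To get a feeling for the size of that host ID space, here is a rough back-of-the-envelope calculation. The rate of one million probes per second is purely an assumption for illustration:

```shell
# Years needed to probe every address in one /64 at an assumed rate of
# one million probes per second (the rate is an illustrative assumption).
years_to_scan_64() {
  awk 'BEGIN { printf "%.0f\n", 2^64 / 1e6 / 86400 / 365 }'
}

years_to_scan_64
```

That is on the order of 585,000 years for a single /64, which is why a hit on one specific random address almost certainly comes from someone who learned the hostname first.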

My Analysis

I grepped through the log files from the authoritative DNS servers as well as from the firewall. As expected, there are hundreds of connection attempts to both hostnames:

Metric	Cert 1 (CT only)	Cert 2 (CT + blog SAN)
DNS queries for A (unique sources)	642 (307)	357 (228)
DNS queries for AAAA (unique sources)	37 (32)	117 (99)
DNS queries for CAA (unique sources)	3 (3)	24 (14)
All queried RRs	642 A, 37 AAAA, 22 MX, 6 DS, 5 NS, 3 TXT, 3 SOA, 3 CAA, 2 CNAME	357 A, 117 AAAA, 29 MX, 24 CAA, 8 TXT, 8 NS, 7 SOA, 4 DS, 3 CNAME, 2 DNSKEY
IPv6 connections (unique sources) [unique /32 prefixes]	73 (15) [2]	273 (46) [14]
Destination ports	73x 443	154x 80, 122x 443
Source ASes	14 DigitalOcean, LLC; 1 Google LLC	14 DigitalOcean, LLC; 8 Quintex Alliance Consulting; 6 Emerald Onion; 4 Google LLC; 4 F3 Netze e.V.; 2 Zwiebelfreunde e.V.; 2 Nexeon Technologies, Inc.; 1 OVH Ltd; 1 Keyweb AG; 1 Joey Julian Koenig; 1 Hydra Communications Ltd; 1 Hurricane Electric LLC; 1 Alec Larsen

Q.E.D.

While the first FQDN got more DNS queries, the second one received many more connection attempts.

I have not looked into every single IPv6 source address, but some of them are interesting. Of course, there is Google. DigitalOcean shows up quite often, though it is probably just the hosting provider for whatever is doing the scanning. (Does anyone know where Let’s Encrypt itself is hosted?) Furthermore, at least one Tor exit node is present according to the whois search. Technische Universitaet Muenchen queried the CAA records of the hostname just a few seconds after it appeared in the CT log, probably as part of some research. The same goes for Cloudflare, which only queried the CAA record.

Further Analysis

As always, you could do so much more with this data. For a start, some nice graphs would make the data easier to understand than the raw values. ;) But also some more analysis such as:

a timeline of the queries <- this would probably reveal many more queries in the first few hours compared to the rest of the test period
a correlation of the DNS clients (that sent the DNS queries) and the IPv6 addresses that sourced actual connections to the servers <- probably no big correlation here, since the former are recursive DNS resolvers while the latter are the actual crawlers such as Google’s
a deeper look at the source IPv6 addresses <- more details about search engine crawlers, university projects, Tor exit nodes, and random private clicks by someone who saw the weird hostname on my blog’s certificate.
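The first two ideas could be sketched roughly as follows. Both functions are hypothetical helpers, and the sample log lines are invented; they only mimic the field layout that the commands in the appendix rely on (classic syslog timestamp in the first fields):

```shell
# Count query lines per day. Assumes classic syslog timestamps
# ("Oct 30 14:50:01 ...") so that fields 1 and 2 form the date.
queries_per_day() {
  awk '/query/ {
         day = $1 " " $2
         if (!(day in n)) order[++k] = day   # remember first-seen order
         n[day]++
       }
       END { for (i = 1; i <= k; i++) print order[i], n[order[i]] }' "$@"
}

# Print every address listed in both files (one address per line each).
# The comparison is textual, so both files must spell addresses the same way.
common_sources() {
  grep -Fxf "$1" "$2" | sort -u
}

# Invented sample lines, shaped after the field positions used in the appendix.
queries_per_day <<'EOF'
Oct 30 14:50:01 ns1 named[1]: client @0x1 2001:db8::1#40000 (h.example): query: h.example IN A + (2001:db8::53)
Oct 30 14:50:02 ns1 named[1]: client @0x1 2001:db8::2#40001 (h.example): query: h.example IN AAAA + (2001:db8::53)
Oct 31 08:00:00 ns1 named[1]: client @0x1 2001:db8::1#40002 (h.example): query: h.example IN A + (2001:db8::53)
EOF
```

`queries_per_day` also accepts file names instead of standard input. `common_sources` expects two files with one address per line, for example the unique DNS clients and the unique firewall sources.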

Conclusion

Don’t expect that you can get publicly trusted X.509 certificates for hostnames you want to keep private. Every single FQDN is immediately publicized in the CT logs and will be scanned! Keep that in mind.

My analysis only covered the IPv6 addresses behind those hostnames. Since the query rate for A records was much higher than for AAAA records, it is likely that every hostname that ever appeared in the public CT logs, or that is just a subject alternative name on a publicly used certificate, receives a couple of hundred connection attempts.

Appendix

If you want to have a look at the raw logs, here we go:

I used the following shell commands for my analysis (refer to Logfile Parsing):

#both DNS names from both authoritative DNS servers

cat /var/log/firewalls/2001:470:765b::d031:53/*/*/* | grep 7qftpqiw5m > ~/cert1-dns-ib1

cat /var/log/firewalls/2001:470:765b::d031:53/*/*/* | grep xd524olksc > ~/cert2-dns-ib1

cat /var/log/firewalls/2001:470:1f0b:16b0::d032:53/*/*/* | grep 7qftpqiw5m > ~/cert1-dns-ib2

cat /var/log/firewalls/2001:470:1f0b:16b0::d032:53/*/*/* | grep xd524olksc > ~/cert2-dns-ib2

#both policy logs

cat /var/log/firewalls/2001:470:765b::1/*/*/* | grep 2867f699-8191-4169-98ae-d45d7f349c29 > ~/cert1-palo

cat /var/log/firewalls/2001:470:765b::1/*/*/* | grep 151bf66d-2f75-43ab-b8ec-9dcc14f51fac > ~/cert2-palo

#both authoritative DNS logs into one file

cat cert1-dns-ib1 cert1-dns-ib2 > cert1-dns-all

cat cert2-dns-ib1 cert2-dns-ib2 > cert2-dns-all

#omitting DNSSEC zone updates

cat cert1-dns-all | grep -v ZRQ > cert1-dns-all-cleaned

cat cert2-dns-all | grep -v ZRQ > cert2-dns-all-cleaned

#DNS queries for A records (total, list of unique sources, number of unique sources)

cat cert1-dns-all | grep query | grep "IN A " | wc -l

cat cert1-dns-all | grep query | grep "IN A " | awk '{print $8}' | sed 's/#.*//' | sort -g | uniq

cat cert1-dns-all | grep query | grep "IN A " | awk '{print $8}' | sed 's/#.*//' | sort -g | uniq | wc -l

cat cert2-dns-all | grep query | grep "IN A " | wc -l

cat cert2-dns-all | grep query | grep "IN A " | awk '{print $8}' | sed 's/#.*//' | sort -g | uniq

cat cert2-dns-all | grep query | grep "IN A " | awk '{print $8}' | sed 's/#.*//' | sort -g | uniq | wc -l

#DNS queries for AAAA records (total, list of unique sources, number of unique sources)

cat cert1-dns-all | grep query | grep "IN AAAA" | wc -l

cat cert1-dns-all | grep query | grep "IN AAAA" | awk '{print $8}' | sed 's/#.*//' | sort -g | uniq

cat cert1-dns-all | grep query | grep "IN AAAA" | awk '{print $8}' | sed 's/#.*//' | sort -g | uniq | wc -l

cat cert2-dns-all | grep query | grep "IN AAAA" | wc -l

cat cert2-dns-all | grep query | grep "IN AAAA" | awk '{print $8}' | sed 's/#.*//' | sort -g | uniq

cat cert2-dns-all | grep query | grep "IN AAAA" | awk '{print $8}' | sed 's/#.*//' | sort -g | uniq | wc -l

#list of queried RRs

cat cert1-dns-all | grep query | awk '{print $13}' | sort | uniq -c | sort -hr

cat cert2-dns-all | grep query | awk '{print $13}' | sort | uniq -c | sort -hr

#incoming connections

cat cert1-palo | wc -l

cat cert2-palo | wc -l

#unique source addresses

cat cert1-palo | cut -d ',' -f 8 | sort | uniq | wc -l

cat cert2-palo | cut -d ',' -f 8 | sort | uniq | wc -l

#connected destination ports

cat cert1-palo | cut -d ',' -f 26 | sort | uniq -c

cat cert2-palo | cut -d ',' -f 26 | sort | uniq -c | sort -r

#whois queries for all unique source addresses

for i in $(cut -d ',' -f 8 cert1-palo | sort -u) ; do whois "$i" ; done | grep -e org-name -e OrgName

for i in $(cut -d ',' -f 8 cert2-palo | sort -u) ; do whois "$i" ; done | grep -e org-name -e OrgName
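The repeated DNS pipelines above could also be folded into one small helper. This is just a sketch: the function name `rr_stats` is made up, and it assumes the same log layout as the commands above (client "address#port" in field 8, record type in field 13). Unlike the `grep "IN A "` approach, it matches the record type field exactly, so the numbers may differ slightly.

```shell
# Usage: rr_stats FILE TYPE
# Prints "<queries> <unique sources>" for one record type, assuming the
# client "address#port" is field 8 and the record type is field 13.
rr_stats() {
  awk -v rr="$2" '
    /query/ && $13 == rr {
      total++
      src = $8
      sub(/#.*/, "", src)          # strip the "#port" suffix
      if (!(src in seen)) { seen[src] = 1; sources++ }
    }
    END { print total + 0, sources + 0 }' "$1"
}
```

For example, `rr_stats cert1-dns-all A` and `rr_stats cert2-dns-all AAAA` would print the query count followed by the number of unique sources for that record type.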

God’s blessing!

Photo by AbsolutVision on Unsplash.

5 thoughts on “Certificate Transparency & Alternative Name Disclosure”

Corey Bonnell says:

2020-07-01 at 17:49

Nice write-up! The queries for CAA records shortly after CT logging are likely coming from this research project: https://caastudy.github.io

Peter Wu says:

2020-07-01 at 19:06

Some of the queries came from the following Cloudflare IP ranges (these are listed on https://cloudflare.com/ips):
108.162.192.0/18
141.101.64.0/18
162.158.0.0/15
2400:cb00::/32

They appear to be recursive queries sent by the 1.1.1.1 resolver on behalf of its users. I expect Google to be in a similar situation, with queries going to 8.8.8.8.

Sunny says:

2020-07-31 at 08:38

I was planning to take advantage of the CT logs to monitor for new hosts in my company that skip our security approval. I’m planning to write a small tool that keeps checking the CT logs for unauthenticated endpoints under a subscribed domain.

DonkeyKong says:

2023-02-24 at 03:09

This is the reason why I only use wildcard certificates for my private stuff. No CT privacy issues anymore. Cheers to the DNS-01 ACME challenge and Let’s Encrypt.
