Some readers will probably be aware that we are amongst the proponents of a quite strict stance when it comes to filtering IPv6 packets with (certain) Extension Headers and/or fragmentation, because those can be the source of many security problems (as laid out here, here or here). Actually I still think it was a very good idea of, amongst others, Randy Bush and Ron Bonica to suggest the deprecation of IPv6 fragmentation in the IETF.
On the other hand there are voices arguing that fragmented IPv6 packets will be needed in some cases, namely DNS[SEC]-related ones.
In this post I will discuss some details of this debate (taking place in many circles, incl. this thread on the ipv6-hackers mailing list which, btw, you should subscribe to).
For that purpose let’s dig right into the intricacies of DNS. When a client (from the protocol’s point of view this can also be a “DNS server” operating recursively, that is, querying other, usually “authoritative”, servers for DNS records) performs a lookup, it has two choices of transport protocol: UDP or TCP. UDP is considered the default approach, not least because this is what the initial DNS specification (RFC 1035) suggests. According to the same RFC, a response over UDP was expected not to exceed 512 bytes. Once it did (e.g. in the case of a zone transfer), TCP was to be used.
However, according to subsequent specifications, the client can additionally signal that it supports certain extensions, the most important of which (for our discussion) is EDNS0 (RFC 6891). By using this option the client indicates that it is able and willing to receive a UDP response exceeding the traditional 512 bytes, and it states the maximum size (in bytes) it can accept.
Common values for the “UDP payload size” (which I’ll also call just “buffer [size]” in the following) are in the 1280 to 1410 byte range (for reasons to be found below, see also RFC 6891, sect. 6.2.3.: “Choosing between 1280 and 1410 bytes for IP (v4 or v6) over Ethernet would be reasonable.”), or 4096 (which is generally supposed to fulfill all length needs induced by DNSSEC, see also the “DNS Response Size Considerations” section in this document).
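To make the buffer size signaling a bit more tangible, here is a minimal sketch (standard library only) of how a query carrying an EDNS0 OPT record could be put together by hand. The function names `build_query` and `advertised_bufsize` are made up for this illustration, and the parser simply assumes that the OPT record is the last thing in the message and carries no options, which holds for queries built here but not in general.

```python
import struct
from typing import Optional


def build_query(name: str, qtype: int = 1,
                bufsize: Optional[int] = None, txid: int = 0x1234) -> bytes:
    """Build a DNS query; if bufsize is given, append an EDNS0 OPT record."""
    arcount = 1 if bufsize is not None else 0
    # Header: ID, flags (only RD set), QDCOUNT, ANCOUNT, NSCOUNT, ARCOUNT
    header = struct.pack("!HHHHHH", txid, 0x0100, 1, 0, 0, arcount)
    # QNAME: length-prefixed labels, terminated by the zero-length root label
    qname = b"".join(
        bytes([len(label)]) + label.encode("ascii")
        for label in name.rstrip(".").split(".")
    ) + b"\x00"
    question = qname + struct.pack("!HH", qtype, 1)  # QTYPE, QCLASS=IN
    if bufsize is None:
        return header + question
    # OPT pseudo-RR: root name, TYPE=41, CLASS=UDP payload size,
    # TTL=extended RCODE/version/flags (all zero here), RDLEN=0
    opt = b"\x00" + struct.pack("!HHIH", 41, bufsize, 0, 0)
    return header + question + opt


def advertised_bufsize(query: bytes) -> int:
    """Return the signaled UDP payload size, or 512 without EDNS0.

    Assumption for this sketch: a present OPT record is the final
    11 bytes of the message (as produced by build_query above).
    """
    arcount = struct.unpack("!H", query[10:12])[0]
    if arcount == 0:
        return 512
    rtype, rclass = struct.unpack("!HH", query[-10:-6])
    return rclass if rtype == 41 else 512
```

The point to take away is that the “UDP payload size” travels in the CLASS field of the OPT record, so it is just a 16-bit number chosen by the client.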
According to the current interpretation of the relevant specs, a client is also allowed to use TCP from the beginning, but in practice this does not happen very often, for several reasons not to be discussed at the moment.
Now let’s have a quick look at some aspects of potential behavior on the server side. Here, the main means a server has to signal something to the client is the “truncated response” [TC] flag, which indicates that the requested information did not fit into the client’s buffer size (512 bytes if EDNS0 is not used, $BUFFER once it is). Setting TC essentially says: “dear client, the capabilities you offer are insufficient for using UDP for our communication, so please use TCP instead”.
Furthermore, for our discussion we have to keep in mind that middleboxes (e.g. firewalls) can come into play, in particular as I’ll focus on enterprise networks.
Mainly, these middleboxes can perform one or several of the following things (for more potential failure cases see here):
- do not allow DNS over TCP (I saw this a lot 10-15 years ago, but from my observations it is quite uncommon nowadays in the enterprise space).
- do not allow DNS “with options”, namely/including EDNS0. Again, this can (or could) be seen here and there, but I do not consider it very common in our customer networks today either.
- drop DNS over UDP packets larger than 512 bytes. Once more, I’m not aware of many firewalls currently doing this.
- drop all fragments (well, this is in the middle of the debate 😉 and apparently this happens in quite some cases, see the measurement results in this paper. At this point it might further be noteworthy that, in the context of DNS, fragmentation itself might be the source of specific attacks as Amir Herzberg and Haya Shulman laid out in their “Fragmentation Considered Poisonous” paper, see also subsequent discussion on the dns-operations mailing list).
Last but not least there’s another factor to be considered: UDP, in contrast to TCP, does not provide a way of segmenting traffic (TCP can do this based on the MSS), so IP layer fragmentation has to be used once a datagram’s size exceeds the (local) link’s MTU. This behavior is, again, at the heart of our debate here.
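A quick back-of-the-envelope calculation shows where the thresholds sit. The sketch below assumes plain IPv6 (40-byte fixed header, 8-byte UDP header, 8-byte Fragment header) with no other extension headers; the function names are made up for this illustration.

```python
IPV6_HEADER = 40   # fixed IPv6 header
UDP_HEADER = 8
FRAG_HEADER = 8    # IPv6 Fragment extension header


def needs_fragmentation(dns_payload: int, mtu: int = 1500) -> bool:
    """True if a DNS message of this size does not fit into one packet."""
    return IPV6_HEADER + UDP_HEADER + dns_payload > mtu


def fragment_count(dns_payload: int, mtu: int = 1500) -> int:
    """Number of IPv6 packets needed to carry the DNS message over UDP."""
    if not needs_fragmentation(dns_payload, mtu):
        return 1
    # The UDP header is part of the fragmentable data.
    fragmentable = UDP_HEADER + dns_payload
    # Every fragment except the last carries a multiple of 8 bytes.
    per_fragment = (mtu - IPV6_HEADER - FRAG_HEADER) // 8 * 8
    return -(-fragmentable // per_fragment)  # ceiling division
```

With an MTU of 1500 the largest DNS message that still travels unfragmented is 1452 bytes, and with the IPv6 minimum MTU of 1280 it is 1232 bytes. A full 4096-byte response over a 1500-byte link would be split into three fragments under these assumptions.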
Looking at the overall picture, in real life the most common cases might be the following:
Case 1: Server receives query (over UDP without EDNS0 set) which can be answered within 512 bytes: send response over UDP, all good.
Case 2: Server receives query (over UDP without EDNS0 set) which cannot be answered within 512 bytes: respond with TC=1, and hope client will come back using TCP (which may or may not work, depending on $MIDDLEBOXES).
Case 3: Server receives query (over UDP with EDNS0 set), but the response does not fit into client-signaled buffer: respond with TC=1, and hope client will come back using TCP (if that works, see above).
Case 4: Server receives query (over UDP with EDNS0 set), the response fits into client-signaled $BUFFER and the response is smaller than local MTU: send response in a single UDP packet, all good.
Case 5: Server receives query (over UDP with EDNS0 set), the response fits into client-signaled $BUFFER but the response _is bigger_ than the local MTU: send response in several fragments (which may or may not make it through to the client, depending on $MIDDLEBOXES). Please keep in mind though that quite a few clients already avoid that scenario as they just signal $BUFFER sizes below 1500 (about 30% of queries with EDNS0 set, according to this source).
Case 6: Server receives query over TCP and responds, using several segments if needed.
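The six cases can be condensed into a small decision function. This is only a sketch of the logic described above, not how any particular DNS server implements it; `server_action` and its parameters are hypothetical names, and the MTU check assumes IPv6 with a 40-byte header plus the 8-byte UDP header and no extension headers.

```python
from typing import Optional, Tuple

HEADERS = 40 + 8  # assumed: IPv6 fixed header + UDP header


def server_action(transport: str, response_size: int,
                  edns_bufsize: Optional[int], mtu: int = 1500) -> Tuple[int, str]:
    """Map a query to (case number, action) following the list above.

    edns_bufsize is None when the query carried no EDNS0 OPT record.
    """
    if transport == "tcp":
        return 6, "respond over TCP, using several segments if needed"
    if edns_bufsize is None:
        if response_size <= 512:
            return 1, "respond over UDP"
        return 2, "respond with TC=1"
    if response_size > edns_bufsize:
        return 3, "respond with TC=1"
    if HEADERS + response_size <= mtu:
        return 4, "respond in a single UDP packet"
    return 5, "respond over UDP in several IP fragments"
```

For example, `server_action("udp", 3000, 4096)` lands in Case 5, while the same response for a client signaling a buffer of 1232 bytes ends up in Case 3 and hence on TCP.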
Thinking about a client’s options in the “Case 5” scenario, there’s really one (easy) option, that is sending the query via TCP. There’s a fair chance that
- the query actually gets through $MIDDLEBOXES.
- the query is processed by the server (Standards Track RFC 5966 “DNS Transport over TCP – Implementation Requirements” comes to mind).
- the response gets back to the client (as it now consists of TCP segments as opposed to IP fragments).
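A client-side fallback along these lines could look roughly like the sketch below. It uses only the standard library; `resolve`, `udp_transport` and `tcp_transport` are names invented for this example, the server is expected to be given as an IPv6 address literal for the UDP case, and retrying over TCP after a UDP timeout (which is what dropped fragments look like from the client’s side) is a design assumption of this sketch rather than something every resolver does.

```python
import socket
import struct
from typing import Callable, Tuple

Transport = Callable[[bytes], bytes]


def is_truncated(response: bytes) -> bool:
    """TC is bit 0x0200 of the 16-bit flags field at offset 2."""
    (flags,) = struct.unpack("!H", response[2:4])
    return bool(flags & 0x0200)


def udp_transport(server: str, port: int = 53, timeout: float = 2.0) -> Transport:
    def send(query: bytes) -> bytes:
        with socket.socket(socket.AF_INET6, socket.SOCK_DGRAM) as s:
            s.settimeout(timeout)
            s.sendto(query, (server, port))
            data, _ = s.recvfrom(65535)
            return data
    return send


def _recv_exact(sock: socket.socket, n: int) -> bytes:
    buf = b""
    while len(buf) < n:
        chunk = sock.recv(n - len(buf))
        if not chunk:
            raise ConnectionError("connection closed early")
        buf += chunk
    return buf


def tcp_transport(server: str, port: int = 53, timeout: float = 2.0) -> Transport:
    def send(query: bytes) -> bytes:
        with socket.create_connection((server, port), timeout=timeout) as s:
            # DNS over TCP prefixes each message with a 2-byte length field.
            s.sendall(struct.pack("!H", len(query)) + query)
            (length,) = struct.unpack("!H", _recv_exact(s, 2))
            return _recv_exact(s, length)
    return send


def resolve(query: bytes, via_udp: Transport, via_tcp: Transport) -> Tuple[bytes, str]:
    """Try UDP first; fall back to TCP on TC=1 or when nothing comes back."""
    try:
        response = via_udp(query)
    except socket.timeout:
        return via_tcp(query), "tcp"
    if is_truncated(response):
        return via_tcp(query), "tcp"
    return response, "udp"
```

Passing the transports in as callables keeps the fallback logic itself free of network code, so it can be exercised with stand-in functions before pointing it at a real server, e.g. `resolve(query, udp_transport("2001:db8::53"), tcp_transport("2001:db8::53"))` (a documentation address, used here purely as a placeholder).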
You may (rightfully) be tempted to ask: what’s the point then at all? Actually there’s one caveat, that is potentially increased load on the DNS server processing the query, given TCP requires more resources than UDP.
At the end of the day we have a classic tragedy of the commons situation here: individual organizations (dropping IPv6 fragments at their network borders) increase their own security posture by implementing a measure which in turn increases the load on systems providing Internet infrastructure services. Are those prepared to handle the additional load? You bet they had better be (although there are DNS-savvy people who think that “most people are capable of writing the one-line perl script that will put a dns responder into tcp exhaustion”. I don’t expect that, though).
Let me summarize:
- in the vast majority of situations the above “Case 5” will not happen at all, for one reason or another.
- in cases where it does happen, client-side TCP fallback will solve the problem, admittedly at the expense of some additional CPU cycles and memory on some servers. Trying to be a good netizen, can this be considered a reasonable risk-reward situation? Yes, I think so.
- pursuing a mid-term strategy of (only) using DNS over TCP might be a good idea anyway, for obvious reasons.
To further conclude I’d like to point our readers to the (as usual: very sound) study Geoff Huston recently performed on “Measuring DNS Behaviour”, involving several million queries for DNSSEC (read: large) records. In his conclusion he states:
“the number of users who are unable to resolve a DNS name when DNS responses of this size are involved appear to be no more than 1% of all users, and likely to be significantly lower.”
So, just maybe, from an end-user perspective, there is not even a problem…
In any case we see no reason to deviate from our advice to drop all IPv6 fragments at network borders of enterprise networks, and we’ll happily discuss the implications here (by means of comments) or on other channels ;-).
Thank you for reading so far, everybody have a great day
Enno
Very good article, thanks, Enno!
If you have any more statistics on how many organisations are blocking IPv6 fragments vs. how many DNSSEC use-cases have problems, please blog about it again!