
I Don’t Have Any Neighbors – A Deep Dive into DHCPv6, Part 1

Probably due to the “secondary” role it has historically been assigned within the IPv6 universe, DHCPv6 is a protocol which is very different from its IPv4 counterpart. Some of the differences and similarities have been discussed recently (e.g. see Scott Hogg’s article on “High Availability DHCPv6”). This post aims at covering a fundamental, yet widely unknown or misunderstood difference: the properties of DHCPv6 addresses and their behavior on the local link.

Let’s assume that an organization wants to use DHCPv6 in a way mostly consistent with their IPv4 networks and the associated operations model. Hence, besides providing some technical background and explanation, the main question of this post can be formulated as follows: how can one implement DHCPv6 in a way that leads to network behavior (=> network operations model, incl. troubleshooting) similar to that in IPv4 networks?

In IPv4, once a node (say, “Alice”) wants to communicate with another node (“Bob”), the first node will use its subnet mask (like a kind of glasses) to find out if the other node is a neighbor (read: within the same subnet). In case it is, the initiator will perform an ARP broadcast (“who has?”) to determine the destination’s MAC address.
In contrast, and put very briefly, IPv6 does not have a concept of a “subnet mask” for that (or any other) purpose; instead, the so-called “on-link” flag of an address/prefix is used for this determination. RFC 5942, IPv6 Subnet Model: The Relationship between Links and Subnet Prefixes, which is the most important RFC for our discussion, states:

“IPv4 implementations typically associate a netmask with an address when an IPv4 address is assigned to an interface. That netmask together with the IPv4 address designates an on-link prefix.
[…]
The behavior of IPv6 as specified in Neighbor Discovery (ND) [RFC4861] is quite different. The on-link determination is separate from the address assignment. A host can have IPv6 addresses without any related on-link prefixes or can have on-link prefixes that are not related to any IPv6 addresses that are assigned to the host. Any assigned address on an interface should initially be considered as having no internal structure as shown in [RFC4291].
In IPv6, by default, a host treats only the link-local prefix as on-link.
The reception of a Prefix Information Option (PIO) with the L-bit set [RFC4861] and a non-zero valid lifetime creates (or updates) an entry in the Prefix List. All prefixes on a host’s Prefix List (i.e., those prefixes that have not yet timed out) are considered to be on-link by that host.”
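To make the quoted rules a bit more tangible, here is a minimal Python sketch of the two models. It is a toy illustration under my own assumptions, not how any real stack is implemented: the class Ipv6Host, its method names and the example addresses are hypothetical, and lifetimes are not aged out.

```python
import ipaddress

LINK_LOCAL = ipaddress.ip_network("fe80::/10")


def ipv4_is_on_link(own_interface: str, destination: str) -> bool:
    """IPv4: address plus netmask implicitly define the on-link prefix."""
    iface = ipaddress.ip_interface(own_interface)
    return ipaddress.ip_address(destination) in iface.network


class Ipv6Host:
    """Toy model of the RFC 5942 host rules (hypothetical, not a real stack)."""

    def __init__(self):
        self.addresses = []              # assigned addresses, no prefix implied
        self.prefix_list = [LINK_LOCAL]  # by default only link-local is on-link

    def assign_address(self, address: str) -> None:
        # SLAAC, DHCPv6 or manual assignment: the Prefix List is not touched.
        self.addresses.append(ipaddress.ip_address(address))

    def receive_pio(self, prefix: str, l_bit: bool, valid_lifetime: int) -> None:
        # Only a PIO with the L-bit set and a non-zero valid lifetime
        # creates an on-link prefix.
        if l_bit and valid_lifetime > 0:
            self.prefix_list.append(ipaddress.ip_network(prefix))

    def is_on_link(self, destination: str) -> bool:
        dest = ipaddress.ip_address(destination)
        return any(dest in prefix for prefix in self.prefix_list)


host = Ipv6Host()
host.assign_address("2001:db8:6:6::100")               # e.g. handed out via DHCPv6
print(ipv4_is_on_link("192.0.2.10/24", "192.0.2.20"))  # True
print(host.is_on_link("2001:db8:6:6::200"))            # False
host.receive_pio("2001:db8:6:6::/64", l_bit=True, valid_lifetime=2592000)
print(host.is_on_link("2001:db8:6:6::200"))            # True
```

The point of the sketch: in the IPv6 model, having an address within a prefix says nothing about who your neighbors are; only the Prefix List does.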

Let’s perform some experiments to find out what this means in practice, why it is so crucial (e.g. for troubleshooting) and how all this can potentially be adjusted/fine-tuned.
We start with a very simple network including a (here Cisco-based) router and two nodes, all sitting on the same local link. One node runs Windows and the other one Linux, but that’s not really important for the overall behavior, and hence for the purpose and message of this post. The relevant part of the router’s config looks like this:

interface Vlan100
no ip address
ipv6 address FE80::1 link-local
ipv6 address 2001:DB8:5:5::1/64
ipv6 enable

Please note that, for illustration purposes, we use a minimal IPv6 config on the VLAN interface here (as opposed to a typical production config which usually includes all types of ND related stuff like “ipv6 nd router-preference high”; more on this in another post).
In our lab, the above config leads to the following configuration of the two adjacent nodes:

Windows:

IPv6 Address. . . . . . . . . . . : 2001:db8:5:5:f9ca:18b6:7015:cca6
Link-local IPv6 Address . . . . . : fe80::f9ca:18b6:7015:cca6%13
Default Gateway . . . . . . . . . : fe80::1%13

Linux:

enno@mobile32:~$ ip -6 addr sh

eth0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qlen 1000
inet6 2001:db8:5:5:3aea:a7ff:fe85:c926/64 scope global dynamic
valid_lft 2591863sec preferred_lft 604663sec
inet6 fe80::3aea:a7ff:fe85:c926/64 scope link
valid_lft forever preferred_lft forever

[Figure: network_basic]

Quickly checking they can communicate with each other:

D:\>ping 2001:db8:5:5:3aea:a7ff:fe85:c926

Pinging 2001:db8:5:5:3aea:a7ff:fe85:c926 with 32 bytes of data:
Reply from 2001:db8:5:5:3aea:a7ff:fe85:c926: time=1ms
Reply from 2001:db8:5:5:3aea:a7ff:fe85:c926: time<1ms
Reply from 2001:db8:5:5:3aea:a7ff:fe85:c926: time<1ms
Reply from 2001:db8:5:5:3aea:a7ff:fe85:c926: time<1ms

Now let’s bring (“managed” variant) DHCPv6 into the picture. To keep things simple we use the Cisco device as the DHCPv6 server. This is not necessarily a good idea for a production environment, given that the Cisco “built-in” DHCPv6 server lacks quite a few configuration knobs and logging capabilities compared with e.g. ISC dhcpd, but it serves well for illustration purposes.

We will first use a prefix different from the SLAAC provided one, which – of course – is not a brilliant network design 😉 but will help to clarify things:

Router(config)#ipv6 dhcp pool Clients
Router(config-dhcpv6)#add prefix 2001:db8:6:6::/64
Router(config-dhcpv6)#exi
Router(config)#int vlan100
Router(config-if)#ipv6 dhcp server Clients
Router(config-if)#ipv6 nd managed-config-flag
Router(config-if)#exi

Here’s the situation/configuration this creates on the nodes within that subnet:

Windows:

IPv6 Address. . . . . . . . . . . : 2001:db8:5:5:f9ca:18b6:7015:cca6
IPv6 Address. . . . . . . . . . . : 2001:db8:6:6:40f5:46c0:f0ed:4168
Link-local IPv6 Address . . . . . : fe80::f9ca:18b6:7015:cca6%13
Default Gateway . . . . . . . . . : fe80::1%13

Linux:
eth0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qlen 1000
inet6 2001:db8:6:6:c56c:aade:e363:ad79/64 scope global
valid_lft forever preferred_lft forever
inet6 2001:db8:5:5:3aea:a7ff:fe85:c926/64 scope global dynamic
valid_lft 2591873sec preferred_lft 604673sec
inet6 fe80::3aea:a7ff:fe85:c926/64 scope link
valid_lft forever preferred_lft forever

Somebody with an IPv4-centric mindset might be surprised here that on both systems there are SLAAC generated and DHCPv6 assigned addresses at the same time. Suffice it to say that having multiple addresses, from potentially different sources, is fully ok in the IPv6 world (and even foreseen for a number of scenarios). So for those readers with an IPv4 background here’s a quick first take-away: the fact that a system gets a DHCPv6 provided address usually does NOT mean that no other addresses (from other sources) can exist in parallel. On some OSs (e.g. Windows) even static addresses, SLAAC generated ones and DHCPv6 provided ones can happily co-exist at the same time. We will get back to this later.

So far so good, now let the fun begin.
What if Alice (the Windows system) pings Bob (Linux) as above, but this time using his DHCPv6 provided address instead of the SLAAC address? Thinking in terms of IPv4 this should easily be doable (Bob sits in the same subnet as Alice, so just ARP/perform ND for his address and so on), right?
Here’s what actually happens:

D:\>ping 2001:db8:6:6:c56c:aade:e363:ad79

Pinging 2001:db8:6:6:c56c:aade:e363:ad79 with 32 bytes of data:
Destination net unreachable.
Destination net unreachable.
Destination net unreachable.
Destination net unreachable.

What the … is this?
Maybe we mistyped the destination IPv6 address? => double-checked, it’s ok [well, I copy+pasted from above].
Maybe Bob’s link is down or something? Let’s check if Alice can still ping Bob’s SLAAC address.

D:\>ping 2001:db8:5:5:3aea:a7ff:fe85:c926

Pinging 2001:db8:5:5:3aea:a7ff:fe85:c926 with 32 bytes of data:
Reply from 2001:db8:5:5:3aea:a7ff:fe85:c926: time<1ms
Reply from 2001:db8:5:5:3aea:a7ff:fe85:c926: time<1ms
Reply from 2001:db8:5:5:3aea:a7ff:fe85:c926: time<1ms
Reply from 2001:db8:5:5:3aea:a7ff:fe85:c926: time<1ms

In German there’s a saying (from the realm of football) that goes “die Wahrheit liegt auf dem Platz”, which can be roughly translated as “the truth lies on the [football] ground”. As a networking guy let me adapt this to “the truth is in the packets”. In other words: what does Wireshark show for this situation? Let’s take a look:

[Figure: unreachable1_excerpt3 (Wireshark capture)]

So, apparently

– Alice chooses her SLAAC provided address as source address for the outgoing ping.
– Alice sends the packet to the router (take a careful look at the destination MAC address of the first packet!). At the same time she does NOT perform ND for Bob’s address.
– The router sends back an ICMPv6 type 1 packet (“Destination unreachable”).

The following questions come to mind:

a) why does Alice choose her SLAAC address as the source address?
Readers thinking in terms of IPv4 (and even some with a basic understanding of RFC 6724 Default Address Selection for Internet Protocol Version 6 (IPv6)) might have expected that she takes an address from the 2001:db8:6:6::/64 prefix to reach another address with the same prefix.

b) why does this stuff happen in this way at all?

c) and, of course, how can all this be “cured” in the sense of our overall guiding question “we just want DHCPv6 to behave like DHCPv4”?

Let me try to answer those one by one.

First, please remember that a DHCP provided address is NOT, by itself, considered to be on-link. In this regard, RFC 5942 states in section 4 on “Host Rules”:

“A correctly implemented IPv6 host MUST adhere to the following rules:

1. The assignment of an IPv6 address — whether through IPv6
stateless address autoconfiguration [RFC4862], DHCPv6 [RFC3315],
or manual configuration — MUST NOT implicitly cause a prefix
derived from that address to be treated as on-link and added to
the Prefix List. A host considers a prefix to be on-link only
through explicit means, such as those specified in the on-link
definition in the Terminology section of [RFC4861] (as modified
by this document) or via manual configuration.”

[still, please keep in mind, that the reception of a router advertisement with prefix information “usually” – more on this below – will lead to the SLAAC address being considered on-link; see the RFC 5942 quote at the beginning of this post]

Hence the Windows system considers the (DHCP provided) 2001:db8:6:6::/64 prefix/network to be one “without any neighbors” (like in a dial-up network), so the destination addresses within that one can only be reached through a router.
As for the source address selection: probably (I actually don’t have any better explanation) Rule 5.5 of RFC 6724 kicks in. This is a Windows Server 2012 system, which follows RFC 6724, in contrast to, for example, Server 2008, which follows RFC 3484. The rule states:

“Rule 5.5: Prefer addresses in a prefix advertised by the next-hop.

If SA or SA’s prefix is assigned by the selected next-hop that will
be used to send to D and SB or SB’s prefix is assigned by a different
next-hop, then prefer SA.”
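If that explanation is right, the selection can be sketched as follows. This is a strongly simplified Python illustration under my own assumptions: it only models Rule 5.5 and Rule 8 (longest matching prefix) of RFC 6724, ignores all other rules as well as the prefix-length cap in the RFC's CommonPrefixLen definition, and the names Candidate, advertised_by and pick_source are hypothetical.

```python
import ipaddress
from dataclasses import dataclass
from typing import Optional


@dataclass
class Candidate:
    address: str
    advertised_by: Optional[str]  # router whose RA announced the prefix, if any


def common_prefix_len(a: str, b: str) -> int:
    """Number of leading bits two IPv6 addresses share."""
    diff = int(ipaddress.IPv6Address(a)) ^ int(ipaddress.IPv6Address(b))
    return 128 - diff.bit_length()


def pick_source(candidates, destination, next_hop, use_rule_5_5=True):
    """Simplified: Rule 5.5 first, then Rule 8 as the tie-breaker."""
    def key(c):
        rule_5_5 = use_rule_5_5 and c.advertised_by == next_hop
        return (rule_5_5, common_prefix_len(c.address, destination))
    return max(candidates, key=key).address


# Alice's two global addresses in the first lab setting
alice = [
    Candidate("2001:db8:5:5:f9ca:18b6:7015:cca6", advertised_by="fe80::1"),  # SLAAC
    Candidate("2001:db8:6:6:40f5:46c0:f0ed:4168", advertised_by=None),       # DHCPv6
]
bob_dhcp = "2001:db8:6:6:c56c:aade:e363:ad79"

print(pick_source(alice, bob_dhcp, next_hop="fe80::1"))
print(pick_source(alice, bob_dhcp, next_hop="fe80::1", use_rule_5_5=False))
```

With Rule 5.5 in place the sketch picks the SLAAC address, which is what the capture shows. With the rule switched off, the longest matching prefix would favor the DHCPv6 address, i.e. the result many readers would have expected.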

For the same reason the system sends the ICMP echo request to the router. What else could it do, considering there are no perceived neighbors due to the missing on-link flag?

The router (serving as DHCPv6 server at the same time) has no interface configured with an address from the 2001:db8:6:6::/64 prefix, and subsequently no route to that destination/prefix exists (in the present lab setting the router does not have an IPv6 default route). That’s why it sends back an ICMP unreachable (“Dear sender, I have no idea how to get to that destination”).
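The router's side of the story can be sketched in a few lines of Python as well. Again this is only an illustration under assumptions: the ROUTES table is a hypothetical stand-in for the lab router's IPv6 routing table (just the connected /64, no default route), and lookup and forward are made-up helper names.

```python
import ipaddress

# Hypothetical routing table of the lab router: only the connected /64,
# no default route (as stated for this lab).
ROUTES = {
    ipaddress.ip_network("2001:db8:5:5::/64"): "Vlan100 (connected)",
}


def lookup(destination, routes=ROUTES):
    """Longest-prefix match; returns the egress description or None."""
    dest = ipaddress.ip_address(destination)
    matches = [net for net in routes if dest in net]
    if not matches:
        return None
    return routes[max(matches, key=lambda net: net.prefixlen)]


def forward(destination, routes=ROUTES):
    egress = lookup(destination, routes)
    if egress is None:
        # ICMPv6 type 1 = Destination Unreachable, code 0 = no route to destination
        return "ICMPv6 type 1 code 0 (no route to destination)"
    return f"forward via {egress}"


print(forward("2001:db8:5:5:3aea:a7ff:fe85:c926"))  # Bob's SLAAC address
print(forward("2001:db8:6:6:c56c:aade:e363:ad79"))  # Bob's DHCPv6 address
```

Handing out addresses from a DHCPv6 pool does not put a route for that prefix into the table, so the second lookup finds nothing and the router can only answer with an unreachable message.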

Now some of you might argue: “well, that setting with a router having an interface within 2001:db8:5:5::/64 and DHCPv6 addresses being distributed from 2001:db8:6:6::/64 is an unrealistic one (in real life)” and of course you’re right. Let’s change this then to a “more proper setting” and move the router’s node-facing interface to the 2001:db8:6:6::/64 prefix/network:

Router(config)#int vlan100
Router(config-if)#no ipv6 add 2001:db8:5:5::1/64
Router(config-if)#ipv6 add 2001:db8:6:6::1/64
Router(config-if)#exi

After an interface down/up both on the Windows and the Linux systems this gives:

Windows

IPv6 Address. . . . . . . . . . . : 2001:db8:6:6:40f5:46c0:f0ed:4168
IPv6 Address. . . . . . . . . . . : 2001:db8:6:6:f9ca:18b6:7015:cca6
Default Gateway . . . . . . . . . : fe80::1%13

Linux

eth0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qlen 1000
inet6 2001:db8:6:6:94e:e384:7d17:4ed4/64 scope global
valid_lft forever preferred_lft forever
inet6 2001:db8:6:6:3aea:a7ff:fe85:c926/64 scope global dynamic
valid_lft 2591970sec preferred_lft 604770sec
inet6 fe80::3aea:a7ff:fe85:c926/64 scope link
valid_lft forever preferred_lft forever

So both systems now have two addresses with the same prefix, a SLAAC generated one and a DHCPv6 provided one. A closer look at the Windows box shows:

D:\>netsh int ipv6 sh add

Interface 13: Ethernet

Addr Type  DAD State   Valid Life    Pref. Life   Address
---------  ----------  ------------  -----------  --------------------------------
Dhcp       Preferred   1d23h32m40s   23h32m40s    2001:db8:6:6:40f5:46c0:f0ed:4168
Public     Preferred   29d23h59m25s  6d23h59m25s  2001:db8:6:6:f9ca:18b6:7015:cca6
Other      Preferred   infinite      infinite     fe80::f9ca:18b6:7015:cca6%13

What happens now once Alice pings Bob via his DHCP address?

D:\>ping 2001:db8:6:6:94e:e384:7d17:4ed4

Pinging 2001:db8:6:6:94e:e384:7d17:4ed4 with 32 bytes of data:
Reply from 2001:db8:6:6:94e:e384:7d17:4ed4: time=1ms
Reply from 2001:db8:6:6:94e:e384:7d17:4ed4: time<1ms
Reply from 2001:db8:6:6:94e:e384:7d17:4ed4: time<1ms
Reply from 2001:db8:6:6:94e:e384:7d17:4ed4: time<1ms

So, this works and the Windows system even uses its own DHCPv6 provided address for the task:

[Figure: ping_working_two_addresses_excerpt]

Mission accomplished then?

Well, unfortunately, no. Alas not even close, as we’ll see.
First we now have -two- global IPv6 addresses on both systems with only one of them being provided from a “centrally managed mechanism” (DHCPv6) which is probably not a desired state from a network & security management perspective.
Furthermore this can lead to decision problems (source address selection!), with the associated “solution space” (RFCs 3484/6724 and their actual stack implementations) being quite immature/unreliable in our experience from various networks.

To make matters worse, for non-local communication the SLAAC generated address is chosen:

[Figure: ping_Remote_excerpt]

which then leads directly into troubleshooting hell for the poor sods responsible for diagnosing client-related connection problems.

So we’re not there yet. We have to get rid of the SLAAC generated address, in order to have “a clean state with only one global IPv6 address on the clients, and ideally that one being the DHCPv6 provided one”.

How to achieve this will be covered in the second part of this post to follow very soon.
For today I hope to have shed light on some subtle but important differences between DHCPv6 and DHCPv4. We’re happy to receive any comments from your real-life DHCPv6 deployment,
best

Enno

Comments

  1. Hi Enno,

    A really enlightening article with very good points (especially the v4 versus v6 on-link considerations).

    Question: After configuring the router with the additional IPv6 address in the prefix 2001:db8:6:6::/64, does it automatically include PIOs for this prefix – with the L-bit set – in its sending RAs, or not?

    Thanks!

    Antonios

    1. Hi Antonios,

      thanks for the feedback. As for your question: yes, assigning the address to the router interface automatically leads to router advertisements containing PIO for the prefix, with the L-bit set.
      That’s the way it works according to the specs (namely RFC 4861). Depending on the actual capabilities and configuration knobs of the respective L3 infrastructure, this can be adjusted => see the second part.
      But, in absence of anything else, it’s: routing entity has an address => emits RAs, includes related PIO in RAs, which in turn have L-bit set.
      best

      Enno

  2. Hi Enno,

    Very nice as usual. Another interesting aspect of this is exploring the on-link/off-link options. As you know, the ipv6 nd prefix tells the router which prefixes are on/off-link. Of course, for explicit addresses, this is automatic. Where there are additional prefixes on-link and the router doesn’t have an explicit address, these must be configured. The default for an ipv6 nd prefix is on-link. The on-link flag can be cleared for a Private VLAN or where you want all traffic to flow through the router for policy/control (e.g. broadband/cable network). Furthermore you can use the off-link option if you don’t want the router to add the prefix to the local routing table (or you want it removed). Lots of interesting combinations/possibilities.

    –Jim

  3. Enno,

    Question – I’ve heard you mention this sentiment many times:
    …this can lead to decision problems (source address selection!) with the associated “solution space” (RFCs 3484/6724 and their actual stack implementations) being quite immature/unreliable as from our experience in various networks.

    I’m not sure I would call address selection unreliable. I think the problem is that each O/S has implementation quirks (doesn’t strictly adhere to the RFCs). However, I do agree that this could be better documented. I have noticed for example different behavior between Windows and Linux for choosing ULAs/GUAs as a source even when the prefix policies are the same. Let me know if you want to collaborate on an article. This is something I’ve been meaning to do for a while and last I checked Gert said he’d peer review. Let me know.

    –Jim

    1. Jim,

      thorough documentation of the actual behavior of different OSs and the caveats wrt source address selection is a huge desideratum in the IPv6 community, or so it seems to me at least.
      So, yes, definitely let’s collaborate on this. Will contact you by PM.
      have a great weekend
      Enno

  4. Enno,

    I especially liked how you show that on-link determination is separate from DHCPv6 addressing. This made me curious as I’ve read about it in the RFCs but never really dug in. On Windows 7 at least, you can tell which prefixes are advertised as on-link by examining the routing table (route print -6). If the prefix is advertised with the on-link flag set, you will see the /64 in the routing table listed as on-link. If the prefix isn’t advertised or has the on-link flag cleared, you won’t see it in the routing table. If you set/add a prefix with the on-link flag set, it shows up quickly. However, if you remove the on-link flag or stop advertising a prefix it persists until the valid lifetime expires. I think you may be able to influence this down but I believe there’s a lower limit to prevent DoS – this may be in the RFC but I couldn’t find it in 4861 with a quick look.

    Have a great weekend,
    –Jim
