today I’m going to suspend the “Developing an Enterprise IPv6 Security Strategy” series for a moment and discuss some other aspects of IPv6 deployment.
We’ve been involved in a number of IPv6 projects in large organizations in the past few years and in many of those there was a planning phase in which several documents were created (often these include a road map, an address concept/plan and a security concept).
Point is: at some point it’s getting real ;-), read: IPv6 is actually enabled on some systems. Pretty much all enterprise customers we know start(ed) their IPv6 deployment “at the perimeter”, enabling IPv6 (usually in dual-stack mode) on some systems/services facing the Internet and/or external parties.
Unfortunately there’s a number of (seemingly small) things that can go wrong in this phase and “little errors” made today are probably meant to stay for a long time (in German we have the nice phrase “Nichts ist so dauerhaft wie ein Provisorium”, and I’m sure people with an IT operations background will understand this even without a translator…).
In this post I will hence lay out some things to consider when you enable IPv6 on perimeter elements for the first time.
To start I’ll provide three examples of such “small misses with long-term impact” from different organizations:
In one organization it was planned to pursue an approach of selective route propagation/selective announcements (see here for an overview what this means) and hence to propagate only some /48s (“DMZ networks”). Alas one of their uplink providers performed “strict prefix filtering” (and the other just had no clue 😉 ) and required that they propagate their full covering aggregate (a /32). Which in their specific setup meant that outbound network traffic could potentially pass another firewall than the inbound return traffic… which in turn their stateful firewalls didn’t like too much…
Lesson learned: ask your uplinks first if they’re willing to accept /48 routes without covering aggregate (better yet: make this a mandatory requirement in your periodic tender for Internet circuits) and potentially perform some tests with a dummy /48, observing relevant looking glasses over a certain period of time (RIPEstat Widgets can be very useful here).
In another organization Cisco 7200s with NPE-G1 did a solid job in a certain infrastructure role (this was 2014, before software maintenance ended) in their MPLS network. When they deployed IPv6 in a part of that infrastructure it turned out that VRF-aware OSPFv3 was not supported on any image of the 72K platform => some ugly hacks including redistribution of static routes (!) were needed to realize a specific setup.
Lesson learned: have an IPv6 test lab including main elements of the production networks. Don’t be overoptimistic wrt IPv6 support on legacy platforms (and don’t trust “we fully support IPv6 on those devices” claims from the vendors, but you knew that already, right?).
In yet another environment, in the transit networks between their border routers and their Internet providers’ BGP peers, addresses from the organization’s (prefix) space were used for one uplink, whereas on the second uplink/in the second transit network address space from the respective carrier was used. This is not only inconsistent (and hence violates the simplicity principle) but more importantly this also means that in case the second uplink provider is changed/replaced one day (you know, those periodic tender processes…) will most probably require some additional manual steps (adjusting ACLs on the border router’s external interface, re-configuration of external/cloud-based monitoring services etc.).
Lesson learned: discuss beforehand the addressing approach on transit networks with the respective carriers involved (again, see the checklist in this post). For the record: in general we recommend to go with own address space wherever possible (keep in mind that you’re customer, read: the party issuing the monthly cheque…).
Now I’d like to discuss an actual case study which includes several typical elements to clarify before configuring the first IPv6 address on a device in a perimeter network. In the respective environment back in 2014 major preparatory and IPv6 planning steps were performed, but due to some re-org activities not much happened last year. Meanwhile there’s some momentum again (btw, you’ve probably noticed that global IPv6 deployment has reached 10%) and just recently one of their guys approached me: “Enno, we plan to enable IPv6 on the first services/devices now. As you were involved in the planning phase, any recommendations as for the addresses/prefixes to use for our remote access gateways?”.
[It should be noted that “enabling IPv6 for remote access services” is a quite common use case in Germany as increasingly issues (remote users using VPN technologies having connection problems) can be observed given Dual-Stack Lite is used in some main cable networks here.]
My reply was along the lines: “hold on, we’ll have to discuss and decide some points first”. These include:
Background: like a few other very large enterprise organizations at the time they have opted to apply for IPv6 address space allocations at several RIRs (in this case RIPE NCC, ARIN, APNIC and LACNIC), following advice like the one given here by Ivan Pepelnjak or just “to be on the safe side”. While this doesn’t hurt in the planning & preparations stage, it becomes quite relevant & technically interesting once prefixes have to be assigned to three (groups of) remote access devices located each in a datacenter in the EMEA, APAC and North America region.
Main Approaches/Alternatives: one can either assign prefixes from the respective region (‘s allocation by $REGION’s RIR) for each segment holding VPN gateways – let’s call this the “multiple address space approach” – or, for the moment, just assign prefixes from one RIR’s allocation (say: from RIPE space) and (try to) perform out of region announcements (I discussed those here). Let’s call the latter the “cohesive address space approach” (I don’t like to use the term “contiguous” in this context).
Advantages/Disadvantages of the “multiple address space approach” include:
- Pro: It’s consistent with the initial, “conservative” mindset/strategy.
- Pro: It could be helpful in the long term (e.g. to avoid local providers refusing to accept out of region announcements, in particular such with prefix lengths between /33 and /48), but this is exactly the core of the debate about this approach (we don’t have a crystal ball if this will ever become a problem).
- Con: the creation of respective route6 objects in different RIRs can be cumbersome or tricky (once the allocations and the corresponding aut-num/AS objects are not within the same RIR context), but this is mostly just a matter of effort.
- Con: in the long term following this path will lead to fragmented address space within the organization’s global network (subsidiaries in the US use a different address space than those in APAC which in turn is different from that in EMEA etc.) which is certainly not desirable from an operations perspective (again, RFC 3439 comes to mind).
Current proposal (barring approval of IPv6 task force/steering committee): we’ll probably go with a “cohesive address space approach”, for the following reasons:
- the organization has one global network services provider, which probably facilitates handling.
- gain experience wrt out of region announcements. Overall the people involved do not expect major issues here. I’ll keep you posted ;-).
- avoid “fragmented” internal addressing.
- if needed we can (somewhat) easily change to the other strategy (DNS is your friend…).
Route Propagation Strategy
Background: due to the complexity of their organization (and a certain volatility as for mergers & acquisitions, associated [network operation] responsibilities etc.) they had already decided that they plan to go with a “null-route specific not-to-be-reachable segments” strategy (instead of “selective route propagation”; again, see this post for a more detailed discussion of these strategies).
Main Approaches/Alternatives: at this (early) point of time, the question comes up: implement the above strategy from the very beginning or, just maybe, go with selective announcements now and monitor the situation? Advantages/Disadvantages of the latter (try selective propagation first) include:
- Pro: we can gain experience with the approach and find out if “strict IPv6 prefix filtering” is (still) really a problem. One might note that currently ~45% of the IPv6 routes in the DFZ are /48s and the majority of those is without covering aggregate.
- Pro: We don’t get all the “usual noise” (network traffic from bots and the like) for a full /32 from the very beginning.
- Con: it’s not aligned with the long term strategy (which still might change though).
Current proposal (barring approval of IPv6 task force/steering committee): we’ll probably go with a “for the moment just announce segments with the VPN gateways” approach, for the following reasons:
- Gain experience (not least as for the global network provider’s maturity when it comes to route filtering & propagation) and avoid noise (see above).
- If needed we can easily propagate the covering aggregate(s). Stateful firewalls shouldn’t be affected here due to the overall network design.
Support of VPN Solutions Involved for Planned Setup
Background: IPv6 will only be enabled for the connectivity to the VPN devices, but no IPv6 address pools will be configured on those (=> no IPv6 addressing within the tunnel, as this would require many more steps incl. IPv6 deployment within the corporate network, and it’s not required to mitigate the issues related to Dual-Stack Lite). BUT: depending on the technologies involved other issues may occur once a client has a local IPv6 environment but does not use IPv6 in the tunnel, like the ones Johannes Weber described in this post or those described in Cisco Bug CSCur82067 (unfortunately in this organization there’s a huge number of old[er] AnyConnect versions out there).
Current proposal: deploy chosen approach (no IPv6 in the tunnel) and monitor situation. Upgrade individual clients to a newer AnyConnect version where needed (they have an internal “IT support for VIPs” team… 😉 ). For the record: we have another customer running a similar setup with AnyConnect v3.1.05187 and they don’t see any problems.
How the VPN Devices Get Their Default Route(s)
This is another particularly interesting one. Background: while the VPN gateways will be configured with static addresses, this does not necessarily mean that their default route is also configured in a static manner. I mean the basic IPv6 architecture foresees ICMPv6 Router Advertisements for this purpose, right?
Main Approaches/Alternatives: essentially we’ll have to decide between “configure everything incl. default gateway” in a static manner (like suggested in our IPv6 hardening guides, pls keep this in mind also) or “configure the addresses in a static manner but receive default route via RAs”. The placement of the VPN devices will be behind the border routers but in front of the firewalls (for certain reasons there’s different firewalls & vendors in parallel [not in row/behind each other]) so they would receive their RAs from the internal interfaces of the border routers (with the A-flag in the PIO cleared, of course. in this specific case it could even make sense to clear the L-flag but we’ll discuss this in another post…).
Advantages/Disadvantages of a “static address but default route via RAs” approach include:
- Pro: It’s the natural way within the IPv6 (architecture) universe.
- Pro: going with a static default route usually means one has to disable local RA processing on a system (to avoid undesired interactions if there’s still RAs turning up). This can be operationally expensive (and not an easy task on an appliance anyway) and is “against the nature of IPv6”. In short: avoid this.
- Pro: RA-based strategy allows for more flexible handling in the future.
- Con: human nature likes “being in control”. Going with an “everything is configured statically” satisfies this need…
- Con: having the border routers emit RAs on their internal interfaces means that all other devices in the segment (the firewalls and other elements) receive those, too. One has to test if this causes any undesired behavior/interactions and maybe one has to adjust logging (“drop these without logging them”).
Current proposal: we’ll go with a “static address plus default route via RA” configuration approach on the VPN devices. I’ll keep you posted as for our the things we see…
I hope the above could make clear that there’s a number of things to consider in the course of seemingly small steps while on the journey to IPv6. As always we’re happy to receive feedback of any kind.
If you want to discuss IPv6 stuff in person with us/me you might join the Troopers IPv6 Security Summit (I’ll give a talk on “Remote Access and Business Partner Connections in IPv6 Networks” there…) or you could attend my “IPv6 in Enterprise Networks” training in April.
Have a great week everybody