A Regex for IPv6 Addresses
Stephen Ryan at Dartware has produced a regular expression (regex) that can be used to match any legal format of an IPv6 address. This is useful for determining whether a particular string is, in fact, a legal IPv6 address.
A quick Google search for "IPv6 regex" or "regex for IPv6" turns up plenty of candidates, many of which handle the common cases. It's vastly harder to make one that works for all legal addresses.
The regex for IPv6 below differs from those others in that it handles all the cases specified by RFC 4291, section 2.2, "Text Representation of Addresses". In particular, it recognizes an IPv4 dotted quad at the end of the address.
Here's Stephen's regex for IPv6 addresses. (Note: this should all be on one line.)
Code: |
/^\s*((([0-9A-Fa-f]{1,4}:){7}([0-9A-Fa-f]{1,4}|:))|(([0-9A-Fa-f]{1,4}:){6}(:[0-9A-Fa-f]{1,4}|((25[0-5]|2[0-4]\d|1\d\d|[1-9]?\d)(\.(25[0-5]|2[0-4]\d|1\d\d|[1-9]?\d)){3})|:))|(([0-9A-Fa-f]{1,4}:){5}(((:[0-9A-Fa-f]{1,4}){1,2})|:((25[0-5]|2[0-4]\d|1\d\d|[1-9]?\d)(\.(25[0-5]|2[0-4]\d|1\d\d|[1-9]?\d)){3})|:))|(([0-9A-Fa-f]{1,4}:){4}(((:[0-9A-Fa-f]{1,4}){1,3})|((:[0-9A-Fa-f]{1,4})?:((25[0-5]|2[0-4]\d|1\d\d|[1-9]?\d)(\.(25[0-5]|2[0-4]\d|1\d\d|[1-9]?\d)){3}))|:))|(([0-9A-Fa-f]{1,4}:){3}(((:[0-9A-Fa-f]{1,4}){1,4})|((:[0-9A-Fa-f]{1,4}){0,2}:((25[0-5]|2[0-4]\d|1\d\d|[1-9]?\d)(\.(25[0-5]|2[0-4]\d|1\d\d|[1-9]?\d)){3}))|:))|(([0-9A-Fa-f]{1,4}:){2}(((:[0-9A-Fa-f]{1,4}){1,5})|((:[0-9A-Fa-f]{1,4}){0,3}:((25[0-5]|2[0-4]\d|1\d\d|[1-9]?\d)(\.(25[0-5]|2[0-4]\d|1\d\d|[1-9]?\d)){3}))|:))|(([0-9A-Fa-f]{1,4}:){1}(((:[0-9A-Fa-f]{1,4}){1,6})|((:[0-9A-Fa-f]{1,4}){0,4}:((25[0-5]|2[0-4]\d|1\d\d|[1-9]?\d)(\.(25[0-5]|2[0-4]\d|1\d\d|[1-9]?\d)){3}))|:))|(:(((:[0-9A-Fa-f]{1,4}){1,7})|((:[0-9A-Fa-f]{1,4}){0,5}:((25[0-5]|2[0-4]\d|1\d\d|[1-9]?\d)(\.(25[0-5]|2[0-4]\d|1\d\d|[1-9]?\d)){3}))|:)))(%.+)?\s*$/ |
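The one-line expression is hard to read, so it may help to see how a pattern of the same shape can be assembled from named pieces. The sketch below is an illustration, not Stephen's code: the names H, OCTET, IPV4, buildIPv6Pattern and isIPv6 are made up for this example, and the result has not been run against the full test suite. Each alternative corresponds to a fixed number of hex groups before the "::" (or to the full eight-group form).

```javascript
// Building blocks (names are hypothetical, chosen for this sketch).
const H = "[0-9A-Fa-f]{1,4}";                            // one 16-bit hex group
const OCTET = "(25[0-5]|2[0-4]\\d|1\\d\\d|[1-9]?\\d)";   // decimal 0 to 255
const IPV4 = `${OCTET}(\\.${OCTET}){3}`;                 // dotted quad

function buildIPv6Pattern() {
  const alternatives = [
    // 7 leading groups: the full 8-group form, or a trailing "::"
    `(${H}:){7}(${H}|:)`,
    // 6 leading groups: "::" plus one group, a dotted quad, or a trailing "::"
    `(${H}:){6}(:${H}|${IPV4}|:)`,
  ];
  // 5 down to 1 leading groups: "::" followed by more hex groups,
  // by optional hex groups and a dotted quad, or by nothing at all.
  for (let k = 5; k >= 1; k--) {
    alternatives.push(
      `(${H}:){${k}}((:${H}){1,${7 - k}}|(:${H}){0,${5 - k}}:${IPV4}|:)`
    );
  }
  // 0 leading groups: the address starts with "::"
  alternatives.push(`:((:${H}){1,7}|(:${H}){0,5}:${IPV4}|:)`);

  const body = alternatives.map((a) => `(${a})`).join("|");
  // Optional surrounding whitespace and an optional "%zone" suffix,
  // as in the one-line expression.
  return new RegExp(`^\\s*(${body})(%.+)?\\s*$`);
}

const IPV6_RE = buildIPv6Pattern();

function isIPv6(text) {
  return IPV6_RE.test(text);
}

console.log(isIPv6("fe80::204:61ff:254.157.241.86")); // true
console.log(isIPv6("1:2:3:4:5:6:7"));                 // false (only seven groups, no "::")
```

The pattern is built without the g flag, so repeated calls to test() carry no state between inputs.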
The regex matches the following IPv6 address forms. Note that these are all the same address:
fe80:0000:0000:0000:0204:61ff:fe9d:f156 // full form of IPv6
fe80:0:0:0:204:61ff:fe9d:f156 // drop leading zeroes
fe80::204:61ff:fe9d:f156 // collapse multiple zeroes to :: in the IPv6 address
fe80:0000:0000:0000:0204:61ff:254.157.241.86 // IPv4 dotted quad at the end
fe80:0:0:0:0204:61ff:254.157.241.86 // drop leading zeroes, IPv4 dotted quad at the end
fe80::204:61ff:254.157.241.86 // dotted quad at the end, multiple zeroes collapsed
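To see that the six forms above really are one address, each can be expanded to the full eight-group notation and compared. The following is a minimal sketch with a hypothetical helper named expandIPv6; it assumes the input has already passed validation and carries no "%zone" suffix, and it is not the code from the repository.

```javascript
// Expand an already-validated IPv6 address to eight 4-digit hex groups.
// Hypothetical helper for illustration; it does no validation of its own.
function expandIPv6(address) {
  let text = address.trim();

  // Rewrite a trailing IPv4 dotted quad as two hex groups.
  const quad = text.match(/(\d+)\.(\d+)\.(\d+)\.(\d+)$/);
  if (quad) {
    const [a, b, c, d] = quad.slice(1).map(Number);
    const high = ((a << 8) | b).toString(16);
    const low = ((c << 8) | d).toString(16);
    text = text.slice(0, quad.index) + high + ":" + low;
  }

  // A valid address contains at most one "::".
  const [head, tail] = text.split("::");
  const left = head ? head.split(":") : [];
  const right = tail ? tail.split(":") : [];

  // "::" stands for however many zero groups are needed to reach eight.
  const missing = text.includes("::") ? 8 - left.length - right.length : 0;
  const groups = [...left, ...Array(missing).fill("0"), ...right];

  return groups.map((g) => g.toLowerCase().padStart(4, "0")).join(":");
}

const forms = [
  "fe80:0000:0000:0000:0204:61ff:fe9d:f156",
  "fe80:0:0:0:204:61ff:fe9d:f156",
  "fe80::204:61ff:fe9d:f156",
  "fe80:0000:0000:0000:0204:61ff:254.157.241.86",
  "fe80:0:0:0:0204:61ff:254.157.241.86",
  "fe80::204:61ff:254.157.241.86",
];
for (const form of forms) {
  console.log(expandIPv6(form)); // fe80:0000:0000:0000:0204:61ff:fe9d:f156
}
```

The dotted quad 254.157.241.86 becomes fe9d:f156 because 254 = 0xfe, 157 = 0x9d, 241 = 0xf1 and 86 = 0x56.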
In addition, the regular expression matches these IPv6 forms:
::1 // localhost
fe80:: // link-local prefix
2001:: // global unicast prefix
The attached Perl script tests the regular expression against sample IPv6 addresses, both good and bad. The program also runs each address through two other IPv6 regexes I found via a quick Google search, one of which came from somebody complaining about errors in other regular expressions found via quick Google searches. It prints a dot for each potential address that is handled correctly (matched when it should be, rejected when it should not), 1 if one of the other regexes got it wrong, or 2 if both did. It also prints a big nasty failure message if Stephen's regex matches when it shouldn't, or vice versa.
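The reporting scheme described above can be sketched in a few lines. This is one reading of that description, not the Perl script itself: the function name runTests is invented here, and the three patterns are short stand-ins so the example runs on its own. In real use the primary pattern would be the full IPv6 regex and the others would be the regexes being compared.

```javascript
// cases: array of [address, shouldMatch] pairs.
// primary: the regex under test; others: regexes to compare against.
// None of the regexes should carry the g flag, since test() would then keep state.
function runTests(cases, primary, others) {
  let progress = "";
  const failures = [];
  for (const [address, expected] of cases) {
    if (primary.test(address) !== expected) {
      // The primary regex got it wrong: record a loud failure message.
      failures.push(`FAILED: "${address}" should ${expected ? "" : "not "}match`);
      continue;
    }
    // Count how many comparison regexes disagree with the expected result.
    const wrong = others.filter((re) => re.test(address) !== expected).length;
    progress += wrong === 0 ? "." : String(wrong);
  }
  return { progress, failures };
}

// Stand-in patterns (assumptions for this sketch only).
const primary = /^[0-9A-Fa-f]{1,4}(:[0-9A-Fa-f]{1,4}){7}$/;   // full form only
const tooLoose = /^[0-9a-f:]+$/;                               // lowercase, any shape
const lowercaseOnly = /^([0-9a-f]{1,4}:){7}[0-9a-f]{1,4}$/;    // rejects uppercase

const result = runTests(
  [
    ["1:2:3:4:5:6:7:8", true],
    ["1:2:3:4:5:6:7", false],
    ["FE80:0:0:0:0:0:0:1", true],
  ],
  primary,
  [tooLoose, lowercaseOnly]
);
console.log(result.progress); // .12
console.log(result.failures); // []
```

Returning the progress string and the failure list, instead of printing as it goes, keeps the sketch easy to check; a script meant for the terminal could write each character as soon as it is computed.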
Of course, it's possible that we've missed a case here, so if you find a good counterexample, please send us a note at [email protected]
Get the code:
We have set up a public Mercurial (hg) repository that has the files associated with the project. You can retrieve them from our BitBucket.org repository. The repository includes:
- The JavaScript that does the heavy lifting of comparing to the regex and creating the best representation of the address.
- Test cases in both Perl and JavaScript. These contain nearly 500 valid and invalid addresses for developing your own regex.
Try it now: The IPv6 Address Validation page lets you test your IPv6 addresses against the validator right away.
License
IPv6 Regex by Dartware, LLC is licensed under a Creative Commons Attribution-ShareAlike 3.0 Unported License.