|Title||Clusters in the Expanse: Understanding and Unbiasing IPv6 Hitlists|
|Publication Type||Conference Proceedings|
|Year of Conference||2018|
|Authors||Gasser O, Scheitle Q, Foremski P, Lone Q, Korczynski M, Strowes SD, Hendriks L, Carle G|
|Conference Name||ACM Internet Measurement Conference 2018|
|Conference Location||Boston, MA, USA|
Network measurements are an important tool in understanding the Internet. Due to the expanse of the IPv6 address space, exhaustive scans, as performed in IPv4, are not possible for IPv6. In recent years, several studies have proposed using target lists of IPv6 addresses, called hitlists. In this paper, we show that addresses in IPv6 hitlists are heavily clustered. We present novel techniques that move IPv6 hitlists from quantity to quality. We perform a longitudinal active measurement study over 6 months, targeting more than 50 M addresses. We develop a rigorous method to detect aliased prefixes, which identifies 1.5% of our prefixes as aliased, pertaining to about half of our target addresses. Using entropy clustering, we group the entire hitlist into just 6 distinct addressing schemes. Furthermore, we perform client measurements by leveraging crowdsourcing. To encourage reproducibility in network measurement research and serve as a starting point for future IPv6 studies, we publish source code, analysis tools, and data.
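To give a rough sense of what an entropy-based view of addresses can look like, the sketch below computes a normalized Shannon entropy for each of the 32 hex nybbles across a set of IPv6 addresses. It is only an illustration under assumptions, not the paper's published tooling: the function names `nybbles` and `nybble_entropy` and the sample addresses (taken from the 2001:db8::/32 documentation range) are hypothetical, and the abstract does not specify how the clustering features are actually built.

```python
import ipaddress
import math
from collections import Counter


def nybbles(addr: str) -> str:
    """Return the 32 hex nybbles of an IPv6 address as one string."""
    return format(int(ipaddress.IPv6Address(addr)), "032x")


def nybble_entropy(addresses):
    """Normalized Shannon entropy (0..1) per nybble position.

    Hypothetical helper for illustration; not taken from the paper's code.
    """
    rows = [nybbles(a) for a in addresses]
    n = len(rows)
    profile = []
    for pos in range(32):
        counts = Counter(row[pos] for row in rows)
        # Shannon entropy in bits for this nybble position
        h = sum((c / n) * math.log2(n / c) for c in counts.values())
        profile.append(h / 4.0)  # a nybble carries at most 4 bits
    return profile


# Assumed sample input: two addresses differing only in the last nybble
profile = nybble_entropy(["2001:db8::1", "2001:db8::2"])
print(profile[31])  # entropy of the last nybble position
```

Positions with entropy near 0 are constant across the set (for example, a shared prefix), while positions near 1 look uniformly random. Profiles of this kind could then serve as input vectors to a standard clustering algorithm, which is one plausible way to group prefixes by addressing scheme.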