This is the code for the paper:
@inproceedings{toposcope,
title={TopoScope: AS Relationship Inference with Ensemble Learning and Bayesian Networks},
author={Jin, Zitong and Li, Zhenyu and Liu, Yong and Li, Zhen and Zhang, Zhi-Li},
booktitle={Proceedings of the Internet Measurement Conference},
year={2020}
}
The source code was originally forked from the TopoScope repository by Zitong Jin.
I am maintaining this repository for my own research purposes, making modifications as needed.
TopoScope is an AS relationship inference algorithm that combines ensemble learning and Bayesian networks. It also supports hidden link inference. You can learn more about TopoScope in the IMC 2020 paper cited above.
TopoScope runs with Python 3.6.8. For reproducibility, I use Docker with an Ubuntu 18.04 image so that the code runs on the older Python version it was written for.
$ docker build -t research/toposcope .
$ docker run -it --rm -v $(pwd):/app -w /app research/toposcope bash
Install Python dependencies
$ pip install --user -r requirements.txt
Download AS to Organization Mapping Dataset from CAIDA
https://www.caida.org/data/as-organizations/
Download PeeringDB Dataset from CAIDA
Before March 2016: http://data.caida.org/datasets/peeringdb-v1/
After March 2016: http://data.caida.org/datasets/peeringdb-v2/
Download Prefix2AS Dataset from CAIDA
http://data.caida.org/datasets/routing/routeviews-prefix2as/
__Prepare BGP paths__
Use BGP RIB dumps to prepare the BGP paths. I used PCH, RIPE RIS and RouteViews.
The ASes on each BGP path should be delimited by '|' on each line, for example, AS1|AS2|AS3.
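The path format above can be produced with a few lines of Python. This is a minimal sketch, not part of the repository: it assumes the raw AS paths are space-separated AS numbers (as printed by common RIB dump parsers) and that the placeholders AS1, AS2, AS3 stand for plain AS numbers. The function name `to_pipe_path`, the collapsing of prepended ASes, and the skipping of paths that contain AS sets are all assumptions of this sketch; `uniquePath.py` may already do its own cleaning.

```python
from typing import List, Optional


def to_pipe_path(as_path: str) -> Optional[str]:
    """Turn a space-separated AS path into the '|'-delimited form.

    Returns None for empty paths and for paths with non-numeric hops,
    such as AS sets like '{64497,64498}' (an assumption of this sketch).
    """
    hops = as_path.split()
    if not hops or any(not hop.isdigit() for hop in hops):
        return None
    collapsed = []  # type: List[str]
    for hop in hops:
        # Drop consecutive repeats caused by AS-path prepending.
        if not collapsed or collapsed[-1] != hop:
            collapsed.append(hop)
    return "|".join(collapsed)


if __name__ == "__main__":
    # AS numbers from the documentation range, used only for illustration.
    for raw in ["64496 64497 64497 64498", "64496 {64497,64498}"]:
        print(repr(raw), "->", to_pipe_path(raw))
```

Writing one converted path per line gives a file that can be passed to `uniquePath.py` with `-i`.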
Parse downloaded BGP paths
$ python3 uniquePath.py -i=<aspaths file> -p=<peeringdb file>
# e.g. python3 uniquePath.py -i=aspaths_2019.txt -p=peeringdb_2019.json
# Output is written to 'aspaths.txt'.
Run AS-Rank algorithm to bootstrap TopoScope
$ perl asrank.pl aspaths.txt > asrel.txt
Run TopoScope
$ python3 toposcope.py -o=<ASorg file> -p=<peeringdb file> -d=<temporary storage folder name>
# e.g. python3 toposcope.py -o=asorg_2019.txt -p=peeringdb_2019.json -d=tmp/
# Output is written to 'asrel_toposcope.txt'.
Output data format
<provider-as>|<customer-as>|-1
<peer-as>|<peer-as>|0
<sibling-as>|<sibling-as>|1
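The output format above is straightforward to load for further analysis. The following is a minimal sketch under stated assumptions, not code from the repository: the helper names `parse_asrel` and `count_relationships` are hypothetical, and skipping lines that start with `#` and ignoring any fields after the third are assumptions about the file, not documented behaviour.

```python
from collections import Counter
from typing import Iterable, Iterator, Tuple

# Relationship codes as listed in the output data format.
REL_NAMES = {-1: "provider-customer", 0: "peer-peer", 1: "sibling"}


def parse_asrel(lines: Iterable[str]) -> Iterator[Tuple[str, str, int]]:
    """Yield (as1, as2, relationship) tuples from '<as1>|<as2>|<rel>' lines."""
    for line in lines:
        line = line.strip()
        # Assumption: blank lines and '#' comment lines carry no links.
        if not line or line.startswith("#"):
            continue
        # Assumption: any fields after the third can be ignored.
        as1, as2, rel = line.split("|")[:3]
        yield as1, as2, int(rel)


def count_relationships(lines: Iterable[str]) -> Counter:
    """Count links per relationship type."""
    return Counter(REL_NAMES.get(rel, "unknown") for _, _, rel in parse_asrel(lines))


if __name__ == "__main__":
    with open("asrel_toposcope.txt") as handle:
        for name, total in count_relationships(handle).items():
            print(name, total)
```

For a `-1` line the first AS is the provider and the second is the customer, so the order of the first two fields matters when building a graph from these tuples.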
Hidden link inference
The ASes on each BGP path should be delimited by '|' on each line, followed by '&' and prefix, for example, AS1|AS2|AS3&prefix.
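A line in this format can be split into its path and prefix as sketched below. This is an illustrative sketch, not code from the repository: the names `parse_path_prefix` and `prefixes_by_origin` are hypothetical, the prefix is treated as an opaque string because the README does not state its exact notation, and taking the last AS on the path as the origin is the usual BGP convention rather than something the README confirms.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Set, Tuple


def parse_path_prefix(line: str) -> Tuple[List[str], str]:
    """Split 'AS1|AS2|AS3&prefix' into (['AS1', 'AS2', 'AS3'], 'prefix')."""
    path_part, sep, prefix = line.strip().partition("&")
    if not sep or not path_part or not prefix:
        raise ValueError("expected 'AS1|AS2|AS3&prefix', got: {!r}".format(line))
    return path_part.split("|"), prefix


def prefixes_by_origin(lines: Iterable[str]) -> Dict[str, Set[str]]:
    """Group prefixes by the last AS on each path (assumed to be the origin)."""
    origins = defaultdict(set)  # type: Dict[str, Set[str]]
    for line in lines:
        if not line.strip():
            continue  # skip blank lines
        path, prefix = parse_path_prefix(line)
        origins[path[-1]].add(prefix)
    return dict(origins)


if __name__ == "__main__":
    # Documentation AS numbers and prefixes, used only for illustration.
    demo = ["64496|64497|64498&192.0.2.0/24", "64499|64498&198.51.100.0/24"]
    print(prefixes_by_origin(demo))
```

A check like this can catch malformed lines before they are passed to `cleanPrefix.py`.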
Parse downloaded BGP paths
$ python3 cleanPrefix.py -i=<asprefix file> -p=<peeringdb file>
# e.g. python3 cleanPrefix.py -i=asprefix_2019.txt -p=peeringdb_2019.json
# Output is written to 'fullVP.txt', 'aspaths0.txt', 'aspaths1.txt', 'asprefix0.txt', 'asprefix1.txt', 'chooseVP0.txt' and 'chooseVP1.txt'.
Run AS-Rank algorithm to bootstrap TopoScope
$ perl asrank.pl aspaths0.txt > asrel0.txt
$ perl asrank.pl aspaths1.txt > asrel1.txt
You can also use the basic inference result of TopoScope instead of AS-Rank for this step.
Find missing edges and choose ASes similar to full VPs
$ python3 getMissEdge.py
# Output is written to 'triplet_miss0.txt' and 'triplet_miss1.txt'.
$ python3 chooseAS.py
# Output is written to 'chooseAS.txt'.
Run TopoScope to find hidden links
$ python3 newlink.py -f=<prefix2AS file>
# e.g. python3 newlink.py -f=pfx2as_2019.txt
# Output is written to 'futher0.txt' and 'futher1.txt'.
Infer AS relationships of hidden links
$ python3 linkRel.py
# Output is written to 'asrel_hidden.txt'.
Output data format
<provider-as>|<customer-as>|-1
<peer-as>|<peer-as>|0
You can contact me at thdmar002[at]myuct.ac.za if you have any questions.