Active Geolocation Project

Command-line Measurement

In addition to the web-based demo and measurement client that you may already have seen, we have a command-line tool for taking measurements. It is more accurate than the browser-based measurement, and it can be used on computers that you only have remote shell access to. However, it doesn't work on Windows (if you know how to time TCP handshakes with high accuracy on Windows, we’d be glad to take your patches), and it is less user-friendly.

When you run the tool, you need to tell it the location of the computer it’s running on, and whether it’s using a network proxy. It will communicate with this server, and with roughly 200 “landmark” servers around the world (the “anchor” servers of the RIPE Atlas). We record all the information you provide, the round-trip time measurements, and the public IP address of the computer running the tool (if you are using a proxy, this will be the proxy’s address).

CMU’s Office of Research Integrity and Compliance requires us to warn you that the data we collect is considered “personally identifiable information” under US law, and that the university is legally required to store all research data for at least three years. We take care to protect it.

However, participating in this study could still harm you, especially if our database is stolen. The thief would be able to associate your IP address with your location, more precisely than they could if they only knew your IP address. If there’s someone specific who must not find out where you live, we recommend you don’t send us data collected by a computer in your house.

Also, for legal reasons, please don’t send us any data if you are younger than 18 years old.

If you want to ask questions about this study, before or after sending us data, please contact Zachary Weinberg at zackw@cmu.edu, or Nicolas Christin at nicolasc@cmu.edu.

If you have concerns about this study, or questions about your rights as a research participant, you can contact the Office of Research Integrity and Compliance directly, by email at irb-review@andrew.cmu.edu or by phone at 412-268-1901 or 412-268-5460.

Determining the computer’s location

The best way to get the computer’s location is with GPS; most smartphones can take a GPS reading. The iPhone ships with a “Compass” utility that, among other things, shows you your latitude and longitude (in degrees, minutes, and seconds; you will have to convert). For Android, you need a third-party app: we suggest “My GPS Coordinates”. If you don’t have a GPS-capable phone or dedicated receiver, or you can’t go to where the computer is and take a GPS reading, the next best option is to look up the postal address of the building in an address-to-location service, such as latlong.net.
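The conversion from degrees, minutes, and seconds to decimal degrees is simple arithmetic. Here is a minimal Python sketch of it. The function name `dms_to_decimal` and the sample coordinates are made up for illustration; they are not part of the measurement tool.

```python
def dms_to_decimal(degrees, minutes, seconds, hemisphere):
    """Convert degrees/minutes/seconds plus a hemisphere letter
    ("N", "S", "E" or "W") to signed decimal degrees."""
    if hemisphere not in ("N", "S", "E", "W"):
        raise ValueError("hemisphere must be one of N, S, E, W")
    if not (0 <= minutes < 60 and 0 <= seconds < 60):
        raise ValueError("minutes and seconds must be in the range [0, 60)")
    value = abs(degrees) + minutes / 60.0 + seconds / 3600.0
    # South latitudes and west longitudes are negative.
    return -value if hemisphere in ("S", "W") else value


if __name__ == "__main__":
    # Hypothetical reading: 40° 26' 36" N, 79° 56' 42" W
    lat = dms_to_decimal(40, 26, 36, "N")
    lon = dms_to_decimal(79, 56, 42, "W")
    print("latitude %.5f, longitude %.5f" % (lat, lon))
```

The hemisphere letter, rather than a sign on the degrees, decides the sign of the result, which matches how most phone apps display a reading.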

Don’t look the computer’s IP address up in a geolocation service to get its latitude and longitude, because one of the goals of this project is to audit the accuracy of those services.

Running the command-line tool

The measurement software is only available as a Git checkout. It has two components, one written in Python and the other in C. The Python component has no dependencies outside the standard library, and is known to work with versions 2.7, 3.4, and 3.5 of the interpreter. The C component is self-contained (not a Python module), depends only on standard ISO C and POSIX interfaces, and is known to work on recent versions of Linux, FreeBSD, and OSX. It should work on any modern Unix. (If you know how to time TCP handshakes with high accuracy on Windows, we’d be glad to take your patches.) You are encouraged to read the source code of both components before running them.
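To give a rough idea of what “timing a TCP handshake” means, here is a short Python sketch that measures how long a single connect() call takes. It is an illustration only, not the tool’s actual method: the real measurements are made by the C component, and the function name `time_tcp_connect` and the loopback demonstration are assumptions made for this example.

```python
import socket
import time


def time_tcp_connect(host, port, timeout=5.0):
    """Return the seconds taken by connect() to host:port, or None on failure."""
    # Prefer a high-resolution clock (Python 3.3+); fall back on Python 2.7.
    clock = getattr(time, "perf_counter", time.time)
    # Resolve the name first so that DNS lookup time is not measured.
    info = socket.getaddrinfo(host, port, socket.AF_UNSPEC, socket.SOCK_STREAM)
    family, socktype, proto, _, sockaddr = info[0]
    sock = socket.socket(family, socktype, proto)
    sock.settimeout(timeout)
    try:
        start = clock()
        sock.connect(sockaddr)  # returns once the handshake has completed
        return clock() - start
    except (socket.error, socket.timeout):
        return None
    finally:
        sock.close()


if __name__ == "__main__":
    # Demonstrate against a throwaway listener on the loopback interface.
    listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    listener.bind(("127.0.0.1", 0))
    listener.listen(1)
    elapsed = time_tcp_connect("127.0.0.1", listener.getsockname()[1])
    listener.close()
    print("loopback connect took %.6f seconds" % elapsed)
```

A timing taken this way includes interpreter and system-call overhead, so it should be read as an upper bound on the network round trip.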

Once you have the latitude and longitude of the computer, open a shell window and type the following commands:

git clone https://github.com/zackw/active-geolocator
cd active-geolocator/measurement-client
./configure
make
./probe --latitude=<LATITUDE> --longitude=<LONGITUDE>

where <LATITUDE> and <LONGITUDE> are the latitude and longitude you looked up earlier, in decimal degrees. Use negative numbers for latitudes south of the equator and longitudes west of Greenwich. If you don’t want the data you submit to be included in any future publication of a redacted version of our database, append --no-publication to the probe command.

The final probe command may take as much as an hour to run, but 5 to 20 minutes is more typical. It reports its progress once a minute. The results are automatically uploaded to the project website, and are also written to a file probe-result-YYYY-MM-DD-N.json in the working directory.
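If you want to check what was written before closing the shell, a sketch like the following lists the result files in a directory and parses each one. The layout of the JSON is not described on this page, so the sketch only reports the top-level type of each file; the function name `load_probe_results` is invented for this example.

```python
import glob
import json
import os


def load_probe_results(directory="."):
    """Return {filename: parsed JSON} for every probe-result-*.json in directory."""
    results = {}
    pattern = os.path.join(directory, "probe-result-*.json")
    for path in sorted(glob.glob(pattern)):
        with open(path) as f:
            results[os.path.basename(path)] = json.load(f)
    return results


if __name__ == "__main__":
    for name, data in sorted(load_probe_results().items()):
        # No assumptions about the contents: report only the top-level type.
        print("%s: top-level JSON type is %s" % (name, type(data).__name__))
```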

If any of the above commands fail, please file an issue. We will need to see the unedited, complete output of the commands up to the point where they failed, and we will also need to know which operating system and compiler you are using. If a file named config.log exists in the working directory, please attach it to the issue (you may have to rename it config.txt first, because GitHub only accepts attachments with certain file extensions). Do not attach any probe-result-*.json files to GitHub issues; this would link your physical location to your GitHub identity.

If you have access to computers in several cities, please do run the software on all of them. It’s not as helpful to run it on more than one computer in the same city, unless their routes to the Internet backbone are very different (for instance, if one gets its connectivity from a residential ISP, and another from a cellphone system).

If you have access to VPN or SOCKSv5 proxies, and you can find out the accurate location of the proxy as well as the client host, please do run the measurement through the proxy. For VPNs, activate the VPN as the default route, then do:

./probe --latitude=<CLIENT LAT> --longitude=<CLIENT LONG> \
        --proxy-latitude=<PROXY LAT> --proxy-longitude=<PROXY LONG>

For SOCKS, you must also give the proxy’s address explicitly with --socks5. Authentication is not supported.

./probe --latitude=<CLIENT LAT> --longitude=<CLIENT LONG> \
        --proxy-latitude=<PROXY LAT> --proxy-longitude=<PROXY LONG> \
        --socks5=<HOST:PORT>
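If you plan to run the tool on several machines, it can help to assemble the command line in a script so that a mistyped coordinate is caught before a long run starts. The sketch below uses only the options shown above. The helper name `build_probe_command`, its range checks, and its insistence on proxy coordinates whenever a SOCKS address is given are assumptions of this example (the last one mirrors the commands above; the tool itself may behave differently).

```python
def build_probe_command(lat, lon, proxy_lat=None, proxy_lon=None,
                        socks5=None, no_publication=False):
    """Return the ./probe command as a list of arguments."""
    def check(value, limit, name):
        if not -limit <= value <= limit:
            raise ValueError("%s must be between -%d and %d" % (name, limit, limit))
        return "%.6f" % value

    args = ["./probe",
            "--latitude=" + check(lat, 90, "latitude"),
            "--longitude=" + check(lon, 180, "longitude")]
    if (proxy_lat is None) != (proxy_lon is None):
        raise ValueError("give both proxy coordinates or neither")
    if proxy_lat is not None:
        args.append("--proxy-latitude=" + check(proxy_lat, 90, "proxy latitude"))
        args.append("--proxy-longitude=" + check(proxy_lon, 180, "proxy longitude"))
    if socks5 is not None:
        if proxy_lat is None:
            # Assumption of this sketch, following the examples above.
            raise ValueError("a SOCKS proxy needs proxy coordinates")
        args.append("--socks5=" + socks5)
    if no_publication:
        args.append("--no-publication")
    return args


if __name__ == "__main__":
    # Hypothetical coordinates and proxy address, for illustration only.
    cmd = build_probe_command(40.443333, -79.945, 52.37, 4.89,
                              socks5="proxy.example:1080")
    print(" ".join(cmd))
```

The returned list can be passed directly to subprocess.call from the measurement-client directory, which avoids shell quoting problems.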