Geolocation using Python
By Kalyani Rajalingham, published 09/02/2021 in Tutorials
Geolocating is the process of retrieving location-related information about a given IP address. And yes! It can be done using Python! So, let’s get right to it.
The first thing to do is to fetch the HTML code of the web page using the following:
import requests
from bs4 import BeautifulSoup
url = 'https://tools.keycdn.com/geo'
# .text holds the raw HTML of the lookup page as a string
html = requests.get(url).text
I have chosen this particular website because its lookup form uses a GET request, so the IP address can be passed directly in the URL and this code will work. If you choose another website that uses a POST request, the code will have to be modified. Now, we need an input from the user: what is the IP address that they wish to look up?
# named "ip" rather than "input" so the built-in input() is not overwritten
ip = input("What IP do you want to enter?: ")
Next, we create a new URL. On the website, when you enter an IP address such as 8.8.8.8, it sends a GET request of the following form:
https://tools.keycdn.com/geo?host=8.8.8.8
Now, if we look at this address, what we see is that the IP address comes right after “host” and the equals sign. So let’s build the same URL from the IP address the user just entered:
url_new = "https://tools.keycdn.com/geo?host=" + ip
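Plain string concatenation works when the user types a well-formed address, but anything else ends up in the URL as well. The following is a minimal sketch, not part of the original script, of checking the input first with the standard library. The names BASE_URL and build_lookup_url are made up for illustration.

```python
import ipaddress
from urllib.parse import urlencode

BASE_URL = "https://tools.keycdn.com/geo"  # lookup page used in this tutorial


def build_lookup_url(raw_ip):
    """Return the lookup URL for raw_ip, or raise ValueError if it is not an IP."""
    # ip_address() accepts IPv4 and IPv6 text and raises ValueError otherwise.
    ip = ipaddress.ip_address(raw_ip.strip())
    # urlencode() escapes any character that is not safe in a query string.
    return BASE_URL + "?" + urlencode({"host": str(ip)})
```

Calling build_lookup_url("8.8.8.8") gives the same address as the concatenation above, while build_lookup_url("hello") raises ValueError before any request is sent.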
OK, so now we’ve got the new URL. Why not send a GET request? To do that, we do the following:
session = requests.Session()
val = session.get(url_new).text
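The two lines above assume the request succeeds. As a hedged sketch, the same call could be wrapped in a small helper that sets a timeout and stops on HTTP errors. The function name fetch_html and the 10-second default are assumptions for illustration, not something the website requires.

```python
import requests


def fetch_html(url, session=None, timeout=10):
    """Download url and return the response body as text."""
    # Reuse a caller-supplied session if there is one; otherwise create one.
    if session is None:
        session = requests.Session()
    # Without a timeout, requests may wait indefinitely for a slow server.
    response = session.get(url, timeout=timeout)
    # Raise requests.HTTPError on a 4xx/5xx status instead of returning an error page.
    response.raise_for_status()
    return response.text
```

With this helper, the assignment would read val = fetch_html(url_new), and a failed lookup raises an exception instead of handing an error page to the parser.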
At this point, we’ve got a whole HTML response page stored in “val”. What does this mean? It means that we’ve got all the information about the latitude, longitude, and location stored in “val”. We just need to retrieve it.
How do we retrieve it?
fin = BeautifulSoup(val, "html5lib")
fin2 = fin.find_all("dd", attrs={'class':'col-8 text-monospace'})
In this case, all location-related information is stored in “dd” tags whose class attribute is “col-8 text-monospace”. So, I’m asking BeautifulSoup to retrieve them all for me!
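Passing the whole class string to find_all matches the attribute value exactly as written, so it depends on the two class names appearing in that order. A CSS selector is one alternative that does not care about the order. The sketch below runs on a small hand-written sample rather than the live page, so SAMPLE_HTML and find_result_cells are assumptions for illustration.

```python
from bs4 import BeautifulSoup

# Hand-written sample in the shape of the lookup page's markup (an assumption).
SAMPLE_HTML = """
<dl>
  <dd class="col-8 text-monospace">United States (US)</dd>
  <dd class="text-monospace col-8">North America (NA)</dd>
  <dd class="col-8">not a result cell</dd>
</dl>
"""


def find_result_cells(html):
    """Return every <dd> that carries both the col-8 and text-monospace classes."""
    # "html.parser" ships with Python, so this sketch does not need html5lib.
    soup = BeautifulSoup(html, "html.parser")
    # The selector matches the two classes in either order.
    return soup.select("dd.col-8.text-monospace")
```

On the sample, find_result_cells(SAMPLE_HTML) returns the first two tags and skips the third, which has only one of the two classes.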
BeautifulSoup has retrieved the tags, but it has stored them as a list, as follows:
[<dd class="col-8 text-monospace">United States (US)</dd>,
<dd class="col-8 text-monospace">North America (NA)</dd>,
<dd class="col-8 text-monospace">37.751 (lat) / -97.822 (long)</dd>,
<dd class="col-8 text-monospace">2021-02-04 13:23:59 (America/Chicago)</dd>,
<dd class="col-8 text-monospace">8.8.8.8</dd>,
<dd class="col-8 text-monospace">dns.google</dd>,
<dd class="col-8 text-monospace">GOOGLE</dd>,
<dd class="col-8 text-monospace">15169</dd>]
Now that’s great! But how do we get the values inside of the tags? We can use the get_text() method of each tag.
for value in fin2:
    value = value.get_text()
    print(value)
For 8.8.8.8, this loop prints the following:
United States (US)
North America (NA)
37.751 (lat) / -97.822 (long)
2021-02-04 13:26:23 (America/Chicago)
8.8.8.8
dns.google
GOOGLE
15169
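The coordinates arrive as one string, such as "37.751 (lat) / -97.822 (long)". If you need them as numbers, a regular expression can pull them out. This is a minimal sketch that assumes the text keeps exactly the shape shown in the output above. COORD_PATTERN and parse_coordinates are names made up for illustration.

```python
import re

# Matches strings such as "37.751 (lat) / -97.822 (long)".
COORD_PATTERN = re.compile(
    r"(-?\d+(?:\.\d+)?)\s*\(lat\)\s*/\s*(-?\d+(?:\.\d+)?)\s*\(long\)"
)


def parse_coordinates(text):
    """Return (latitude, longitude) as floats, or None if text has another shape."""
    match = COORD_PATTERN.search(text)
    if match is None:
        return None
    return float(match.group(1)), float(match.group(2))
```

Lines that are not coordinates, such as "dns.google", return None, so the function can be called on every printed value and the single match kept.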
And that’s how it’s done!
Happy Coding!