So I have a list of UK postcodes and I want to find out the latitude and longitude of that postcode to plot it on to a map. There are numerous ways to do this, including:
1. Visiting Google Maps
2. Entering the postcode in to the search box
3. Right-clicking on the little point on the map and selecting “What’s here?”
4. Copying and pasting the lat and long that magically appears in the search box.
That’s nice, quick and easy BUT what if I’ve got hundreds or thousands of postcodes to search? I can’t spend the whole sunny weekend doing the same procedure can I?
I’m self-learning Tableau at the moment and it seems to have great location support…if you provide it with US addresses (gnrrr), but I wanted to find a way of plotting a series of points on a map of the UK derived using their addresses. A bit of Google searching led me to UK Postcodes (set up by Stuart Harrison and based on Ordnance Survey OpenData) , a site that lets you enter a UK postcode and returns lots of information about that location (e.g. long, lat, county and right down to district and ward) and what made me excited was that the site had an API allowing you to pass it a postcode via URL (e.g. http://www.uk-postcodes.com/postcode/NW18TQ.json) and it would output the meta data it in either XML, CSV, JSON or RDF for you. PERFECT!
After a further read around the site, I found that Edd Robinson had created a library acting as a wrapper for the API which I could import in to my own Python project. And so, without further ado here is my Python code:
from postcodes import PostCoder
f = open(‘Extracted_Data_from_Postcodes.csv’, ‘w’)
i = 0
pc = PostCoder()
loc_lat = ""
loc_long = ""
with open(‘Postcodes_to_Extract.csv’, ‘rU’) as g:
reader = csv.reader(g)
for row in reader:
#Col 0 = ID :: Col 1 = Postcode
result = pc.get(str(row))
if str(result) != "None":
location = result[‘geo’]
for key, value in location.items():
if (key == "lat"):
loc_lat = value
if (key == "lng"):
loc_long = value
#ID, Postcode, Lat, Long
write_to_file = str(row) + ", " + str(row) + ", " + str(loc_lat) + ", " + str(loc_long) + "n"
#Add the iteration count to output to screen to see how far we are up to.
print str(i + 1) + ", " + write_to_file
#If there is a problem translating the postcode, output "BROKEN" to file to manually check.
write_to_file = "BROKEN, " + str(result) + ", " + str(row) + ", " + str(row) + "n"
print str(i + 1) + ", BROKEN, " + str(row) + ", " + str(row)
i = i + 1
My input file looked like this:
109484, SG5 4PF
109486, MK44 2DB
109487, LU4 9UJ
109488, LU6 1RE
109489, MK43 8DY
109490, MK45 5JH
109491, MK44 3QD
109492, MK45 3BX
109493, MK17 9QL
109494, MK43 9JT
And my screen output looked like this:
1, BROKEN, 109484, SG5 4PF
2, 109486, MK44 2DB, 52.214741, -0.461977
3, 109487, LU4 9UJ, 51.927696, -0.500824
4, 109488, LU6 1RE, 51.879322, -0.563452
5, 109489, MK43 8DY, 52.164539, -0.623209
6, 109490, MK45 5JH, 51.982376, -0.495111
7, 109491, MK44 3QD, 52.137440, -0.377085
8, 109492, MK45 3BX, 52.080341, -0.446214
9, 109493, MK17 9QL, 51.989906, -0.619803
10, 109494, MK43 9JT, 52.095955, -0.528752
Afterwards, I was then able to manually check out why SG5 4PF failed, actually, I’m not sure why it failed but I was able to correct the lat & long via a Google Map search. Exactly what I needed with minimal effort and a chance to flex my muscles with Python again. I then imported the CSV in to Tableau, but that’s the subject of a future blog post…