With the opening up of UK Postcode data on 1st April, I've taken the time to create a new zipcodes.uk.mysql file which includes all UK postcode, "states", latitude, longitude,tz,dst and country data. City data is provided where this can be deduced from postal districts, but in the vast majority of cases this data is not available.

However, the zipped file is 12mb, so how do I send it to the maintainers?

Sources are detailed in the header, which I've included below.

---------------------------
-- This data is made available through the licensing terms of OS OpenData.
-- http://www.ordnancesurvey.co.uk/oswebsite/opendata/licence/docs/licence.pdf
-- Contains Ordnance Survey data (c) Crown copyright and database right 2010
-- Contains Royal Mail data (c) Royal Mail copyright and database right 2010

-- UK postcodes and Transverse Mercator Projection coordinates from OS OpenData.
-- Equations fo conversion of Transverse Mercator Projection to latitude, longitude from equations in http://www.ordnancesurvey.co.uk/oswebsite/gps/docs/convertingcoordinates...

-- Postal districts to "state" (i.e. UK county) info from http://en.wikipedia.org/wiki/List_of_postcode_districts_in_the_United_Ki...
-- Additional Place/Postal Districts from www.geonames.org

-- City has been provided where a Postal District is used by just one place.
-- Additional city data not possible as any 1 postal district may cover more than 1 place. E.g. WD4 is a Watford postcode used for 4 nearby villages. You'd have to know for each WD4 postcode, what village the address was in.

-- Created by Duncan Worrell / Webgreen Systems Limited / www.webgreen.co.uk

Support from Acquia helps fund testing for Drupal Acquia logo

Comments

geodaniel’s picture

It's great that this has already been made available, good work.

That said, instead of including the whole file in the module download it would probably make sense to point to a geocoding service (e.g. UK Postcodes) that can host this data centrally. Perhaps this full list could then be made available as a separate package that people could download and import into their Drupal site if they needed it?

dunx’s picture

Sure for some, the current UK import for the Location module based on postal districts may be close enough. I've contacted bdragon to see what the maintainers want to do.

There's lots of geocoding services and Google's works really well, but that's not the same as having the data available direct to Location. I do agree, this should be an optional download though!

I still don't have a source for "the place where the postcode is". UK-Postcodes doesn't have it and even Google gets it wrong. Try both for WD4 9BS, which is in Chipperfield. UK-P has it down as in the ward of "Bovingdon, Flaunden and Chipperfield" and maps.google has it down as "Kings Langley".

I've also looked at the Gazetteer data released under OpenData and that doesn't contain the necessary info either.

martinbutt’s picture

Hi dunx. Has there been any progress with this?

Peng.Pif’s picture

Subscribe.

ludo1960’s picture

Subscribe +1

hutch’s picture

Before this thread dies altogether, @dunx if you want, contact me via my Drupal contact form, I want to get the UK data sorted, perhaps we can work out a way to get the data to me so I can at least fix the existing 'first part only' entries.
I am considering wether to create another module eg 'location_extras' or somesuch for things like this or if that is not feasible perhaps hosting the data or at least shrinking it without losing data. Something must be done ;-)
see #830352: postcode database files for au & be have lat/lon reversed. #7

DanielJohnston’s picture

Subscribing. Might be worth posting the file on your own website or some file hosting site somewhere? Thanks for your work on this.

ludo1960’s picture

Yeah, I agree completely! Double thanks for your efforts!

martinbutt’s picture

I've redone the conversion and zipped up the sql file in this torrent: http://www.getaheadweb.co.uk/torrents/zipcodes.uk.mysql.zip.torrent Please download and seed!

hutch’s picture

OK, I am attempting to set up my server so it can join in, no gui of course so I'm looking at btdownloadheadless program
Not very familiar with this protocol

ludo1960’s picture

Maybe better off with Vuze, get it here http://www.vuze.com/ and many thanks for your work Martin!

Good comparison here http://en.wikipedia.org/wiki/Comparison_of_BitTorrent_clients

hutch’s picture

I managed to download the file and have had a look, there is a lot of data there but the second part of the postcodes are not there, without them the search won't work ;-(

martinbutt’s picture

hutch’s picture

Fetched the file in #13 and inserted it into an empty copy of zipcodes table, went in without error.
There are now 1,750,752 records, Wow!

There are some tweaks needed, here is what I've figured out so far:

Some zipcodes have spaces, some don't, it needs to be consistent,
I vote for no spaces and user input should be filtered accordingly
with something like

$zip = preg_replace("/\s+/", '', $zip);

Not sure right now where that code would go.

The old Counties of Montgomeryshire, Radnorshire and Brecknockshire are now part of Powys
UPDATE `zipcodes` SET `state`='Powys' WHERE `state`='Montgomeryshire'
UPDATE `zipcodes` SET `state`='Powys' WHERE `state`='Radnorshire'
UPDATE `zipcodes` SET `state`='Powys' WHERE `state`='Brecknockshire'

There may be others, I just happen to know about these.
eg Avon versus Somerset and West Midlands versus Warwickshire

There are some empties
# cat zipcodes.uk.mysql | grep "''" | wc -l
60341

But a lot of good uns too ;-)
# cat zipcodes.uk.mysql | grep -v "''" | wc -l
1690414

So about 3.5% empty, not so bad!

I ran
SELECT DISTINCT(`state`) FROM `zipcodes` ORDER BY `state`

Got 132 unique entries
Comparing this to the province list in location.uk.inc which has 232 entries so a lot of reconciling to do there.

province list includes Northern Ireland, Scilly, Wight and Jersey

Damn good start though :-)

martinbutt’s picture

Hi hutch,

Thanks for the feedback. These problems would be best fixed in my script.

For the counties, there seems to be a bit of an issue around which list to use. See: http://www.abcounties.co.uk/counties/confusion.htm . I'm not sure if we can have multiple entries per postcode... If we can, then I can put a line in for historical, administrative and ceremonial counties.

Most of the empties can be fixed with some tweeking, which will also help the deficit between location.uk.inc too.

Cheers

hutch’s picture

I just had a look at the abcounties link, what a mess! I can't immediately see where we could put alternative names, the province_list needs the three-letter code to be unique and need to be what works with geocoding.

Anyway if you can get the gaps filled that would be a good step forward ;-)

martinbutt’s picture

hutch’s picture

Wahey, no empties anywhere!
This is the one to go for I reckon.

alanpeart’s picture

I just wanted to express my thanks for this script. It took me an embarrassingly long time to figure out that some of my problems implementing a UK postcode search solution on a website was due to limited postcode data shipping with Location. If you want to make the new script a torrent I'll be happy to seed it.

wozischra’s picture

I just made a line count

wozischra:~/codepoint-open/data/CSV# wc *

on the current Code-Point Open CSV files (august 2010) and I got 1.696.432 lines as result.
In the zipcodes.uk.mysql.20100715.gz there are 1.747.822 lines. How could that be possible? I think the amount can increase in newer versions of Code-Point Open because Royal Mail adds new postcodes from time to time but it shouldn't decrease.

Apart from that many thanks for your work :)

gargoyle’s picture

FileSize
1.78 KB
1.52 KB

For anyone else that is interested in having a look at this kind of data, I have attached two scripts I used when I wrote a simple geo-coding routine a little while ago. You can use these to work on the Code Point datasets you get direct from OS.

The import.txt is a PHP script that will import the Code Point data from the given path (See first few lines). It only imports the postcode (into two columns) and keeps the location info as easting and northings, since I am only using it for a very rough "Find places within xxx miles of this place" type of lookups.

The second file (class.php) is my lookup code. If a full postcode is not supplied, then it does the lookup by picking the center of the supplied area.

Oh, and the $postcode = $filter->filter($postcode) line just splits the postcode into a prefix and suffix by the rules given here:- http://www.cabinetoffice.gov.uk/govtalk/schemasstandards/e-gif/datastand... Basically, the second half of the postcode is always 3 characters.

Just sharing - hope someone finds it useful.

gittosj’s picture

Version: 6.x-3.0 » 6.x-3.x-dev
Component: Data update » Code
FileSize
9.49 KB

Many thanks for putting the database together - a terrific achievement. I've had to tweak my location/supported/location.uk.inc to get everything working correctly - I've stripped the Google geocode logic and taken parts of the file from others and then added in hutch's preg_replace to strip whitespace from the postcode as its compared to the database list. I was stumbling across an intermittent bug where certain postcodes worked with space, eg OX2 7EJ, whereas one 400 yards away like OX2 7NN had to be submitted stripped as OX27NN to work.

I've tweaked the code in the same way a schoolboy might repair an aircraft engine: lots of interest and enthusiasm but almost no understanding of what I'm doing and although its now sputtered into life, it would be good to have a properly qualified mechanic check it before is allowed to take to the air - attached below:

hutch’s picture

There is a stray 'e' on line 63, I will go through it more thoroughly as soon as I can.

hutch’s picture

FileSize
4.92 KB

Attached is an update for supported/location.uk.inc
I have been using a similar file for some months, the only differences is some tidyup and some sanity checks to minimalise WSODs.

martinbutt’s picture

Sorry I have been absent for a little while. I definitely see this as a triumph for the open source community, it's nice to give something back.

wozischra, I appended to original dataset to the beginning of the new one, so that it would work with partial postcodes, e.g. "SW1". I don't know if I needed to do this, but it seemed to do no harm.

To everyone else that is keeping this going, thank you.

mos2710’s picture

Can you please describe how to implement this in laymen's terms? I know I have a need of it, but no idea how to implement it in practice.

Thanks
Mos2710

hutch’s picture

Well, you can either learn how to use patch, see http://drupal.org/patch/apply or you can work it out manually by downloading supported/location.uk.inc from location module and http://drupal.org/files/issues/location-uk-765564-24.patch and opening them in a decent text editor and applying the patch by hand.
Hint: the lines prepended with '-' are to be removed and the lines prepended with '+' are to be added.

gittosj’s picture

Could the patch here: http://drupal.org/node/765564#comment-3573794 be committed to dev?

Not sure if it will require some config options to switch from Google geocoding to postcode database but its a huge improvement on Google - much faster and more reliable. I'm afraid I don't have the expertise to hack / adapt it. Does anyone have the time?

ankur’s picture

The database directory, at last check, is currently around 8 MB in size. My preference would be to avoid adding another 12 MB and to maybe host this somewhere else, with documentation on an admin screen (or even just the location module project page) about where additional .mysql files can be downloaded.

It might also help us work around having to investigate the issue of whether or not the licensing for this data is compatible with it being on Drupal.org's contrib module repository hosting.

hutch’s picture

In my experience geocoding on postcodes is better when done direct to Google, their support for UK postcodes has improved significantly over the last year or so.

Luttappi’s picture

Title: Full UK zipcodes mysql file » Full UK zipcodes mysql file using Google
Version: 6.x-3.x-dev » 7.x-3.x-dev

Hi Hutch,

Could you please explain, what you meant by "doing direct with Google"? I have got geocoding working, but only for few post codes and I can't get any of the links listed here (torrent as well as dropbox) working to download the rest of the post codes. My zipcode table only contains less than 3,000 entries. I would love to give this task to google, if there is an option available.

Many Thanks,

hutch’s picture

I have a development instance of location 7.x-3.x-dev installed without the zipcodes table filled in at all, I am getting good geocoding on UK addresses. I was just stating my opinion that UK postcode geocoding works better without any entries than with them. This might not be true of other countries.
I have of course enabled geocoding for UK under Geocoding Options.

podarok’s picture

Status: Needs review » Active

#1931088: [META] Fixing tests tests were broken, so triggering to active

podarok’s picture

Assigned: dunx » Unassigned
Status: Active » Needs review

bot

Status: Needs review » Needs work

The last submitted patch, location-uk-765564-24.patch, failed testing.

kevster’s picture

In order to fix the issue with whether a user enters a UK postcode with or without a space (really ugly UX to throw up an error message when they dont add a space) I updated the location.uk.inc here around line 267:

//$result = db_query("SELECT latitude, longitude FROM {zipcodes} WHERE country = '%s' AND zip = '%s'", $location['country'],preg_replace('/\s+/', '',$location['postal_code']));
  $result = db_query("SELECT latitude, longitude FROM {zipcodes} WHERE zip = '%s'",truncate_utf8($location['postal_code'],3));

and then around 302

//$res = db_query("SELECT * FROM {zipcodes} where country = '%s' AND zip = '%s'", $location['country'],preg_replace('/\s+/', '',$location['postal_code']));
  $res = db_query("SELECT * FROM {zipcodes} where country = '%s' AND zip = '%s'", $location['country'],truncate_utf8($location['postal_code'],3));

This just takes the outcode part and ignores anything after first 3 chars.

And then in handlers\location_views_handler_filter_proximity.inc around line 21

//$result = db_query("SELECT zip FROM {zipcodes} WHERE zip = :zip LIMIT 1", array(":zip" => $form[$key]['postal_code']["#value"]));
$postcode_outcode = truncate_utf8($form[$key]['postal_code']["#value"],3);
      $result = db_query("SELECT zip FROM {zipcodes} WHERE zip = :zip LIMIT 1", array(":zip" => $postcode_outcode));

This works for me as I don't need it super accurate. All the postcodes seem to be stored with spaces in.

andersi’s picture

I am using Location 7.x-3.7

I find myself feeling some confusion with regards to obtaining lat / long information from postcodes for GB in Location.

For my project I am not interested in using any geocoding services; the postcode database file has all the information within it that I require so there is from what I can see no need to use Google's Geocode service.

If I have "none" selected within:

Admin > Config > Content Authoring > Location

for UK I find that I have the lat / long is set to 0.00000 and that I then have to manually go in and select "autofill empty locations" from

Admin > Config > Web services

Upon doing this lat / long is correctly populated.

That doesn't help with 'real time' population of the lat / long data.

I've followed other posts that suggest that one needs to select a geocode service; when I do this the lat and long are correctly populated in 'real time' however I have no way to tell if I am using the details from the postcodes imported from the location sub folder or whether the lat / long data is being obtained from Google. If it's coming from Google then I and others that may be using it may be in breach of Googles Terms and Conditions; yes a link to the T&C is kindly included but as stated before; I'm wanting the lookups to be done straight from the table; not from Google. It doesn't follow to me that I should be selecing Geocoding from a Google when what I'm actually wanting to do is "Geocode from the imported postcode file".

I would have logically presumed that within:

Admin > Config > Content Authoring > Location

There would be some way to specify from the imported database file for those countries that have postcode lookup file included with Location.