import osmnames to postgres

download links & more info on osmnames: https://github.com/osmnames/osmnames

Data import:

create table if not exists geonames (
	name text,
	alternative_names text,
	osm_type text,
	osm_id bigint,
	class text,
	type text,
	lon numeric,
	lat numeric,
	place_rank integer,
	importance numeric,
	street text,
	city text,
	county text,
	state text,
	country text,
	country_code text,
	display_name text,
	west numeric,
	south numeric,
	east numeric,
	north numeric,
	wikidata text,
	wikipedia text,
	housenumbers text
);
delete from geonames;

create table if not exists housenumbers (
	osm_id bigint,
	street_id bigint,
	street text,
	housenumber text,
	lon numeric,
	lat numeric
);
delete from housenumbers;

COPY geonames FROM 'S:/ome/location/planet-latest_geonames.tsv' DELIMITER E'\t' CSV HEADER;
COPY housenumbers FROM 'S:/ome/location/planet-latest_housenumbers.tsv' DELIMITER E'\t' CSV HEADER;

Note that with v2 of osmnames there may be duplicates (https://github.com/OSMNames/OSMNames/issues/162)

therefore it is worth cleaning it up:

select distinct on (osm_id) * into geonames_clean from geonames;
select distinct on (osm_id) * into housenumbers_clean from housenumbers; 

Add some pks & indexes, so we can speed stuff up a bit:

ALTER TABLE public.geonames_clean
    ADD PRIMARY KEY (osm_id);
ALTER TABLE public.housenumbers_clean
    ADD PRIMARY KEY (osm_id);
CREATE INDEX geonames_clean_country_code_idx
    ON public.geonames_clean USING btree
    (country_code COLLATE pg_catalog."default");

Data extract for a country & glue house numbers:

select
	gn.*,
	hn.osm_id as hn_osm_id,
	hn.street_id as hn_street_id,
	hn.street as hn_street,
	hn.housenumber as hn_housenumber,
	hn.lon as hn_lon,
	hn.lat as hn_lat
into
	nl_geonames_complete
from
	geonames_clean gn 
	left outer join housenumbers_clean hn on gn.osm_id = hn.street_id
where
	gn.country_code = 'nl';

Finally output the extract to a tab separated file:

copy (select * from nl_geonames_complete) to 'S:/ome/location/nl_geonames_complete.tsv' with CSV HEADER DELIMITER E'\t';
blog comments powered by Disqus