pgsql - list tables with column names

Recently I had to automate a postgis export via ogr2ogr. In order to list all the spatial tables, the following query came in handy:

select
	t.table_name,
	c.column_name
from
	information_schema.tables t
	left outer join information_schema.columns c on t.table_name = c.table_name
where
	t.table_schema = 'public' and
	t.table_name like 'vw_%' and
	c.column_name = 'wkb_geometry'

And the actual command generation:

select
	'ogr2ogr -f "ESRI Shapefile" d:\temp\' || t.table_name || '.shp PG:"host=myhost user=myloginname dbname=mydbname password=mypassword" "' || t.table_name || '"'
from
	information_schema.tables t
	left outer join information_schema.columns c on t.table_name = c.table_name
where
	t.table_schema = 'public' and
	t.table_name like 'vw_%' and
	c.column_name = 'wkb_geometry'
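
If quoting ever becomes an issue, format() can build the same command while handling identifier quoting for you (a quick sketch with a hypothetical table name and the same placeholder connection string as above):

select format(
	'ogr2ogr -f "ESRI Shapefile" d:\temp\%s.shp PG:"host=myhost user=myloginname dbname=mydbname password=mypassword" %I',
	'vw_example',
	'vw_example'
);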

expose postgres to the outside world

firewall

Whatever you need for your firewall...

pg_hba.conf

host    all             all             XXX.XXX.XXX.XXX/32          md5    # /32 matches just this address; /0 would match any IP

postgresql.conf

listen_addresses = 'localhost,XXX.XXX.XXX.XXX'
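
Note that changing listen_addresses requires a full server restart, while pg_hba.conf edits only need a reload:

select pg_reload_conf();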

import osmnames to postgres

download links & more info on osmnames: https://github.com/osmnames/osmnames

Data import:

create table if not exists geonames (
	name text,
	alternative_names text,
	osm_type text,
	osm_id bigint,
	class text,
	type text,
	lon numeric,
	lat numeric,
	place_rank integer,
	importance numeric,
	street text,
	city text,
	county text,
	state text,
	country text,
	country_code text,
	display_name text,
	west numeric,
	south numeric,
	east numeric,
	north numeric,
	wikidata text,
	wikipedia text,
	housenumbers text
);
delete from geonames;

create table if not exists housenumbers (
	osm_id bigint,
	street_id bigint,
	street text,
	housenumber text,
	lon numeric,
	lat numeric
);
delete from housenumbers;

COPY geonames FROM 'S:/ome/location/planet-latest_geonames.tsv' DELIMITER E'\t' CSV HEADER;
COPY housenumbers FROM 'S:/ome/location/planet-latest_housenumbers.tsv' DELIMITER E'\t' CSV HEADER;

Note that with v2 of osmnames there may be duplicates (https://github.com/OSMNames/OSMNames/issues/162), therefore it is worth cleaning them up:

select distinct on (osm_id) * into geonames_clean from geonames;
select distinct on (osm_id) * into housenumbers_clean from housenumbers; 
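
distinct on without an order by keeps an arbitrary row per osm_id; if it matters which duplicate survives, the first statement above can be replaced with an ordered variant (a sketch that keeps the row with the highest importance), followed by a quick sanity check of the counts:

select distinct on (osm_id) *
into geonames_clean
from geonames
order by osm_id, importance desc;

-- sanity check the dedup
select
	(select count(*) from geonames) as raw_rows,
	(select count(*) from geonames_clean) as clean_rows;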

Add some pks & indexes, so we can speed stuff up a bit:

ALTER TABLE public.geonames_clean
    ADD PRIMARY KEY (osm_id);
ALTER TABLE public.housenumbers_clean
    ADD PRIMARY KEY (osm_id);
CREATE INDEX geonames_clean_country_code_idx
    ON public.geonames_clean USING btree
    (country_code COLLATE pg_catalog."default");
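
After a bulk load like this it may also be worth refreshing the planner statistics:

vacuum analyze geonames_clean;
vacuum analyze housenumbers_clean;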

Extract the data for a country & glue on the house numbers:

select
	gn.*,
	hn.osm_id as hn_osm_id,
	hn.street_id as hn_street_id,
	hn.street as hn_street,
	hn.housenumber as hn_housenumber,
	hn.lon as hn_lon,
	hn.lat as hn_lat
into
	nl_geonames_complete
from
	geonames_clean gn 
	left outer join housenumbers_clean hn on gn.osm_id = hn.street_id
where
	gn.country_code = 'nl';

Finally, output the extract to a tab-separated file:

copy (select * from nl_geonames_complete) to 'S:/ome/location/nl_geonames_complete.tsv' with CSV HEADER DELIMITER E'\t';

pgsql - update jsonb with behavior similar to mongodb $inc operator

I was looking into upserting data into a pgsql jsonb field, but with behavior similar to the mongodb $inc operator.

The upsert itself was solved by using:

insert ...  
on conflict(cols) do update set ...


This snippet is just a quick-and-dirty reminder of how jsonb can be updated with new keys:

select
  d || jsonb_build_object(
    'x', coalesce(cast(d->>'x' as int), 0) + 40,
    'y', coalesce(cast(d->>'y' as int), 0) + 40
  )
from (
  select '{"x":5000, "z":100}'::jsonb as d
) as data
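
To mimic $inc for every key of an incoming document instead of hardcoding 'x' and 'y', the union of keys can be walked and re-aggregated (a sketch, assuming all values are integers):

select jsonb_object_agg(
	key,
	coalesce((a ->> key)::int, 0) + coalesce((b ->> key)::int, 0)
)
from
	(select '{"x":5000, "z":100}'::jsonb as a, '{"x":40, "y":40}'::jsonb as b) as docs,
	jsonb_object_keys(a || b) as key;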

And a more complete example:

CREATE TABLE monitoring.usage_stats
(
  id serial NOT NULL,
  date timestamp without time zone NOT NULL,
  application_id uuid NOT NULL,
  organization_id uuid NOT NULL,
  user_id uuid,
  token_id uuid,
  referrer text,
  id_1 text,
  id_2 text,
  id_3 text,
  id_4 text,
  id_5 text,
  counters jsonb NOT NULL DEFAULT '{}'::jsonb,
  CONSTRAINT "PK_monitoring.usage_stats" PRIMARY KEY (id)
);

CREATE UNIQUE INDEX usage_stats_uq_usage_stats_composite_key
  ON monitoring.usage_stats
  USING btree
  (
    date,
    application_id,
    organization_id,
    user_id,
    token_id,
    referrer COLLATE pg_catalog."default",
    id_1 COLLATE pg_catalog."default",
    id_2 COLLATE pg_catalog."default",
    id_3 COLLATE pg_catalog."default",
    id_4 COLLATE pg_catalog."default",
    id_5 COLLATE pg_catalog."default"
  );

INSERT INTO monitoring.usage_stats (
    date,
    application_id,
    organization_id,
    user_id,
    token_id,
    referrer,
    id_1,
    id_2,
    id_3,
    id_4,
    id_5,
    counters
)
VALUES (
    @date,
    @applicationId,
    @organizationId,
    @userId,
    @tokenId,
    @referrer,
    @id1,
    @id2,
    @id3,
    @id4,
    @id5,
    @counters::jsonb
)
ON CONFLICT (date,application_id,organization_id,user_id,token_id,referrer,id_1,id_2,id_3,id_4,id_5)
DO UPDATE SET
    counters = usage_stats.counters || jsonb_build_object('test',coalesce(cast(usage_stats.counters->>'test' as int), 0) + 2);
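
One gotcha worth remembering with this setup: several columns in the unique index (user_id, token_id, referrer, id_1..id_5) are nullable, and since nulls never compare equal in a unique index, rows with nulls in those columns will never trigger the ON CONFLICT branch and will keep inserting as new rows.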

Postgres - logical replication

Master DB:

1. Create a publication

CREATE PUBLICATION alltables FOR ALL TABLES;
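
If replicating everything is too much, a publication can also cover just selected tables (a sketch with hypothetical table names):

CREATE PUBLICATION sometables FOR TABLE some_schema.table_a, some_schema.table_b;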

2. Create a replication user

create role replication_user with login password 'some-pass' replication; 
--GRANT whatever permissions are required
grant usage on schema some_schema to replication_user;
grant select on all tables in schema some_schema to replication_user;

Note: with logical replication the user does not actually seem to need the replication rights; those are required with standard streaming replication, though

3. Modify postgresql.conf to allow connections to the machine via its IP

listen_addresses = 'localhost,server_ip'

and set the proper WAL level

wal_level = logical			# minimal, replica, or logical
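
Changing wal_level requires a server restart; the value currently in effect can be checked with:

show wal_level;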

4. Modify pg_hba.conf to allow connections from specified slaves

host    all     replication_user         slave_ip/32                md5

5. Dump db structure

pg_dump.exe --host localhost --port 5432 --username "some-user-name" --no-password  --format plain --schema-only --verbose --file "C:\temp\db_schema.sql" --schema "schema-name" "db-name"

6. Review publication and replication slots (when a subscription is created)

select * from pg_publication;
select * from pg_replication_slots; --when subscription is created
--when you need to drop a replication slot
select pg_drop_replication_slot('slot_name');

7. Check replication status

select * from pg_stat_replication;


Replica DB:

1. Recreate the db structure - this is required in order for the subscription to work

createdb --host localhost --port 5432 --username user-name  db-name
psql --host localhost --port 5432 --username user-name -w --dbname db-name --command "create extension if not exists postgis"
psql --host localhost --port 5434 --username postgres -w --dbname db-name --file "c:\temp\db_schema.sql"

Note: installing postgis is of course only required if the source db actually uses it

2. Create a subscription

create subscription my_sub 
connection 'host=master-ip dbname=db-to-replicate user=replication_user password=some-password port=5432' 
publication alltables;
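
By default the subscription also copies the existing data over; if the tables on the replica were already seeded, the initial copy can be skipped (a sketch):

create subscription my_sub
connection 'host=master-ip dbname=db-to-replicate user=replication_user password=some-password port=5432'
publication alltables
with (copy_data = false);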

3. Review subscription

select * from pg_subscription;

4. Check replication status

select * from pg_stat_subscription;

More info here: https://www.postgresql.org/docs/10/logical-replication.html