
Open Source Posts

Jim Mlodgenski: PostgreSQL Studio 2.0 is out

From Planet PostgreSQL. Published on Dec 01, 2015.

PostgreSQL Studio is a fully open source, web-based UI for PostgreSQL that is ideal for Postgres in the cloud. With the impending release of PostgreSQL 9.5, PostgreSQL Studio 2.0 has just been released, supporting some of the new features in Postgres. The big new feature that requires UI changes is support for Row Level Security. With Row Level Security, policies need to be created on each table for the role accessing the data. These policies can become rather complex, so seeing them tied to the table can greatly help with understanding the security placed on your data.
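As a rough illustration of what this looks like in 9.5 (the table, role, and policy names below are hypothetical, not taken from PostgreSQL Studio itself):

-- enable row level security on a table and attach a policy for one role
ALTER TABLE accounts ENABLE ROW LEVEL SECURITY;
CREATE POLICY accounts_owner_policy ON accounts
    FOR SELECT
    TO app_user
    USING (owner = current_user);

Each additional role that needs access gets its own policy on the table, which is exactly the kind of per-table detail the new UI surfaces.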


In addition to the new features, we also updated the versions of the libraries and stopped support for JDK 6.

You can download PostgreSQL Studio at:

And while I have your attention…
The call for presentations for PGConf US 2016 is now open! Submissions will be allowed through January 31st, 2016. All speakers will be notified by February 15, 2016. Please note that as with our past conferences, there will be no extensions of the submission deadline. For more information and submission details, please visit

Cyber Monday: 50% off Django book and videos

By Caktus Consulting Group from Django community aggregator: Community blog posts. Published on Nov 30, 2015.

Are you looking for a good gift for a current or future Django developer? Check out Caktus technical director Mark Lavin's work for O'Reilly:

  • Lightweight Django (book co-written with Julia Elman)
  • Intermediate Django: Build modern, scalable, and maintainable web applications (videos)

Both are now 50% off at the O'Reilly site with the discount code CYBER15.

Comics v2.2.0 released with Django 1.5 support

By Stein Magnus Jodal from Django community aggregator: Community blog posts. Published on Nov 30, 2015.

Version 2.2.0 of my comics aggregator is now released. It features a general upgrade of dependencies, including the move from Django 1.4 to Django 1.5, and a lot of updates to comic crawlers. The Django upgrade was completed months ago and it’s been running my Comics instance since, so it’s about time to get it released before Django 1.6 arrives in a month or two. Regarding the crawler updates, it’s a bit sad to see that many of the crawlers have been broken for months without me or anybody else noticing, but it’s hard to catch some content lacking in the middle of a firehose of similar content. I guess I’ll have to make it a monthly task to look through the crawler status page of my Comics instance and do patch releases with updated crawlers.

Check out the project docs for more information about Comics and this release in particular.

Comics v2.3.0 released with better mobile support

By Stein Magnus Jodal from Django community aggregator: Community blog posts. Published on Nov 30, 2015.

Version 2.3.0 of my comics aggregator is now released. As always, dependencies have been updated, including the routine Django upgrade from Django 1.5 to 1.6, just in time for the upcoming 1.7 release. The largest change this time around is the move from Bootstrap 2 to 3, which includes a refreshed, flatter design and lots of tweaking to make Comics both look good and work nicely on mobile devices, something it didn’t use to do.

The dependency overview at has been a great motivation for doing some maintenance work on Comics. The only dependency left outdated is django-registration, as 1.0 moves to class-based views, which requires some larger changes to my customizations. Thus, the upgrade of django-registration has been deferred together with the related upgrade of my vendorized copy of django-invitation.

Most, if not all, of the other dependencies seems to support Python 3 now, though some lack the right Trove classifiers in their PyPI registration, so they are not correctly labeled by I found an open pull request for cssmin and a recent commit for django-compressor adding the missing Trove classifiers. I’ve also done my part to improve the Python 3 metadata by sending a pull request to django-bootstrap-form.

Check out the Comics project docs for more information about Comics and this release in particular.

Craig Kerstiens: Node, Postgres, MassiveJS - A better database experience

From Planet PostgreSQL. Published on Nov 30, 2015.

First some background: I've always had a bit of a love-hate relationship with ORMs. ORMs are great for basic CRUD applications, which inevitably happens at some point for an app. The two main problems I have with ORMs are:

  1. They treat all databases as equal (yes, this is a little overgeneralized but typically true). They claim to do this for database portability, but in reality an app still can’t just up and move from one to another.
  2. They don’t handle complex queries well at all. As someone that sees SQL as a very powerful language, taking away all the power just leaves me with pain.

Of course these aren't the only issues with them, just the two I personally run into over and over.

While playing with Node I was eager to explore Massive.JS, as it seems to buck the trend of just imitating all other ORMs. My initial impression: it makes me want to do more with Node just for this library. After all, the power of a language is the ecosystem of libraries around it, not just the core language. So let's take a quick tour through a few highlights of what makes it really great.

Getting setup

Without further ado, here's a quick tour around it.

First, let's pull down the example database from PostgresGuide.

Then let's set up our Node app:

$ npm init
$ npm install massive --save

Connecting and querying

Now let's try to connect and, say, query a user from within our database. Create the following as an index.js file, then run it with node index.js:

var massive = require("massive");
var connectionString = "postgres://:@localhost/example";

var db = massive.connectSync({connectionString : connectionString});

db.users.find(1, function(err,res){
  console.log(res);
});

On first run, if you're like me and use the PostgresGuide example database (which I now need to go back and tidy up), you'll get the following:

db.users.find(1, function(err,res){
TypeError: Cannot read property 'find' of undefined

I can't describe how awesome it is to see this. What's happening is that when Massive loads up, it connects to your database and checks what tables you have. In this case, because we don't have a proper primary key defined, it doesn't load them. It could of course treat id as some magical field, like Rails used to, and ignore the need for an index, but instead it not only encourages good database design but requires it.

So let’s go back and create our index in our database:

$ psql example
example=# alter table users add primary key (id);

Alright now let’s run our script again with node index.js and see what we have:

{ id: 1,
  email: '',
  created_at: Thu Sep 24 2015 03:42:52 GMT-0700 (PDT),
  deleted_at: null }

Perfect! Now we're all connected and it even queried our database for us. Now let's take a look at a few more of the operators.

Running an arbitrary query with db.run will let me run any arbitrary SQL. For example, db.run("select 'hello'") will produce [ { '?column?': 'hello' } ].

This makes it nice and easy for us to break out of the standard ORM model and just run SQL.

Find for quick look ups

Similar to so many other database tools, find offers the most common quick lookups:

db.users.find({email: ''}, function(err, res){console.log(res)});
db.users.find({'created_at >': '2015-09-24'}, function(err, res){console.log(res)});

And of course there’s a where operator for multiple conditions.

Structuring queries in your application

While I'll dig deep into JSON in the next post, this is perhaps my favorite feature of Massive: its support for pulling out queries into individual SQL files. Simply create a db folder and put your SQL in there. Let's take the most basic example of our user email lookup and put it in user_lookup.sql:

SELECT *
FROM users
WHERE email = $1

Now back in our application we can run this and pass in a parameter to it:

db.user_lookup([''], function(err,res){
  console.log(res);
});

This separation of our queries from our code makes it easier to track them, view diffs, and even more so create very readable SQL.

Up next

So sure, you can connect to a database and query some things. But there were a couple of small yet more novel things that we blew through here. First is the fact that I didn't have to define my schema; Massive just discovered it, as it really should. The separation of the SQL queries you'll custom write into files is simple, but will make for much more maintainable applications over the long term. And best of all is the JSON support, which I'll get to soon…

Amit Kapila: Parallel Sequential Scans in play

From Planet PostgreSQL. Published on Nov 29, 2015.

Parallelism is now a reality in PostgreSQL.  With 9.6, I hope we will see many
different forms of queries that can use parallelism to execute.  For now, I will
limit this discussion to what we can already do, which is Parallel Sequential
Scans.

Parallel Sequential Scans are used to scan a relation in parallel with the help of
background workers, which in turn improves the performance of such scans.  I
will discuss the scenarios where users can expect a performance boost from
this feature later in this blog, but first let us understand the basic feature
and how it works.  Three new GUC parameters have been added to tune the
usage of this feature.

max_parallel_degree - This is used to set the maximum number of workers that
can be used for an individual parallel operation.  It is quite possible that the
requested number of workers is not available at execution time.  Parallel workers
are taken from the pool of processes established by max_worker_processes, which
means that the value of max_parallel_degree should be less than max_worker_processes.
It is generally not useful to set this parameter higher than the number of CPUs
on your system.

parallel_tuple_cost - This is used by the planner to estimate the cost of transferring a
tuple from a parallel worker process to the master backend.  The default is 0.1.  The more
tuples that need to be passed from worker processes to the master backend, the higher
this cost becomes, and the higher the overall cost of the parallel sequential scan plan.

parallel_setup_cost - This is used by the planner to estimate the cost of launching parallel
worker processes and setting up the dynamic shared memory used for communication.
The default is 1000.
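As a quick illustration, all three parameters can be adjusted per session before running a query (a minimal sketch using the parameter names described above; max_parallel_degree was the name used during the 9.6 development cycle):
 set max_parallel_degree = 4;
set parallel_tuple_cost = 0.1;
set parallel_setup_cost = 1000;
show max_worker_processes;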

Now let us look at a simple example to demonstrate how a parallel sequential scan works:
 create table tbl_parallel_test(c1 int, c2 char(1000));   
insert into tbl_parallel_test values(generate_series(1,1000000),'aaaaa');
Analyze tbl_parallel_test;
Explain analyze select * from tbl_parallel_test where c1 < 10000 and
c2 like '%bb%';
Seq Scan on tbl_parallel_test
(cost=0.00..157858.09 rows=1 width=1008)
(actual time=378.414..378.414 rows=0 loops=1)
Filter: ((c1 < 10000) AND (c2 ~~ '%bb%'::text))
Rows Removed by Filter: 1000000
Planning time: 0.075 ms
Execution time: 378.431 ms
(5 rows)

Set the max parallel degree to enable the use of parallelism in queries.
 set max_parallel_degree = 6;  
Explain analyze select * from tbl_parallel_test where c1 < 10000
and c2 like '%bb%';
Gather (cost=1000.00..29701.57 rows=1 width=1008)
(actual time=182.708..182.708 rows=0 loops=1)
Number of Workers: 5
-> Parallel Seq Scan on tbl_parallel_test
(cost=0.00..28701.47 rows=1 width=1008)
(actual time=179.496..1081.120 rows=0 loops=1)
Filter: ((c1 < 10000) AND (c2 ~~ '%bb%'::text))
Rows Removed by Filter: 1000000
Planning time: 0.078 ms
Execution time: 200.610 ms
(7 rows)

Here, we can see how setting max_parallel_degree allows parallel workers to be used
to perform parallel sequential scans.  Notice in the above example that even though we
set max_parallel_degree to 6, only 5 workers are used; the reason is that currently the
number of parallel workers is chosen based on the size of the relation.
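Since the choice is driven by relation size, a quick way to check that size is with the standard size functions (a small sketch):
 select pg_size_pretty(pg_relation_size('tbl_parallel_test'));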

Next, let us discuss the usage of functions in parallel queries. A new clause, PARALLEL,
has been added to the CREATE FUNCTION statement.  There are three valid values that can be
used with this clause.

1. PARALLEL Unsafe - This indicates that the function can't be executed in parallel mode,
and the presence of such a function in a SQL statement forces a serial execution plan.
2. PARALLEL Restricted - This indicates that the function can be executed in parallel mode,
but the execution is restricted to the parallel group leader.  As of now, if the qualification for any
particular relation has anything that is parallel restricted, that relation won't be chosen for
parallelism.
3. PARALLEL Safe - This indicates that the function is safe to run in parallel mode without
restriction.

The default value for a function is PARALLEL Unsafe.

Now let us see the impact of using parallel safe and unsafe functions in queries.  I will
continue using the query from the previous example to explain the concept.

Create a Parallel Safe function
 create or replace function calc_factorial(a integer, fact_val integer)  
returns integer
as $$
begin
perform (fact_val)!;
return a;
end;
$$ language plpgsql PARALLEL Safe;
Use it in query
 Explain analyze select * from tbl_parallel_test where  
c1 < calc_factorial(10000, 10)
and c2 like '%bb%';
Gather (cost=1000.00..75154.99 rows=1 width=1008)
(actual time=120566.456..120566.456 rows=0 loops=1)
Number of Workers: 5
-> Parallel Seq Scan on tbl_parallel_test
(cost=0.00..74154.89 rows=1 width=1008)
(actual time=119635.421..359721.498 rows=0 loops=1)
Filter: ((c2 ~~ '%bb%'::text) AND (c1 < calc_factorial(10000, 10)))
Rows Removed by Filter: 1000000
Planning time: 54.904 ms
Execution time: 120622.631 ms
(7 rows)

Here we can see that Parallel Plan is chosen and the parallel safe function
is pushed to workers for evaluation of quals.

Now let's mark that function as PARALLEL Unsafe and see how the above
query behaves.

  Alter Function calc_factorial(integer, integer) PARALLEL Unsafe;   
Explain analyze select * from tbl_parallel_test where
c1 < calc_factorial(10000, 10)
and c2 like '%bb%';
Seq Scan on tbl_parallel_test
(cost=0.00..407851.91 rows=1 width=1008)
(actual time=33166.138..33166.138 rows=0 loops=1)
Filter: ((c2 ~~ '%bb%'::text) AND (c1 < calc_factorial(10000, 10)))
Rows Removed by Filter: 1000000
Planning time: 0.162 ms
Execution time: 33166.208 ms
(5 rows)

So using parallel unsafe functions in queries would lead to serial plans.

Next, let us look at the performance characteristics of parallelism:

Non-default settings used to collect performance data:
 shared_buffers=32GB; min_wal_size=5GB; max_wal_size=10GB  
checkpoint_timeout =30min; max_connections=300;

Test setup
 create table tbl_perf(c1 int, c2 char(1000));  
insert into tbl_perf values(generate_series(1,30000000),'aaaaa');
Explain analyze select c1 from tbl_perf where
c1 > calc_factorial($1,10) and
c2 like '%aa%';
The function calc_factorial is the same as used in the previous example, and the values passed
to it are chosen so that the desired percentage of rows is selected.  For example:
 -- to select 1% of rows, the query below can be used  
Explain analyze select c1 from tbl_perf where
c1 > calc_factorial(29700000,10) and
c2 like '%aa%';
-- to select 10% of rows, the query below can be used
Explain analyze select c1 from tbl_perf where
c1 > calc_factorial(27000000,10) and
c2 like '%aa%';
-- to select 25% of rows, the query below can be used
Explain analyze select c1 from tbl_perf where
c1 > calc_factorial(22500000,10) and
c2 like '%aa%';
Performance Data -

1. With an increase in the degree of parallelism (more parallel workers), the time to complete
the execution decreases.
2. Along with the workers, the master backend also participates in the execution, which is why you
can see a greater time reduction in some cases.
3. After a certain point, increasing the max parallel degree won't help.

The cases we have seen in this blog are mostly cases where parallel query helps by
using the workers; however, there are some cases, such as when the qualification is very cheap,
where it hurts or doesn't help even when employing more workers.  More investigation is
needed to make sure that the planner won't choose such plans for parallelism.

Paul Ramsey: PostGIS Gotchas @ PgConfSV 2015

From Planet PostgreSQL. Published on Nov 28, 2015.

I attended PgConf Silicon Valley a couple weeks ago and gave a new talk about aspects of PostGIS that come as a surprise to new users. Folks in Silicon Valley arrive at PostGIS with lots of technical chops, but often little experience with geospatial concepts, which can lead to fun misunderstandings. Also, PostGIS just has a lot of historical behaviours we've kept in place for backwards compatibility over the years.

Thanks to everyone who turned out to attend!

solaimurugan vellaipandian: RPostgresql : how to pass dynamic parameter to dbGetQuery statement

From Planet PostgreSQL. Published on Nov 27, 2015.

RPostgresql : R and PostgreSQL Database

Working with RPostgreSQL package

How to pass a dynamic / runtime parameter to dbGetQuery in RPostgreSQL?
# use stri_paste to form a query and pass it into dbGetQuery
icd = 'A09'
require(stringi)
qry <- stri_paste("SELECT * FROM visualisation.ipd_disease_datamart WHERE icd ='", icd, "'", collapse="")
rs1 <- dbGetQuery(con, qry)
Error in postgresqlNewConnection(drv, ...) :
RS-DBI driver: (cannot allocate a new connection --
maximum of 16 connections already opened)
library(RPostgreSQL)
drv <- dbDriver("PostgreSQL")
con <- dbConnect(drv, dbname="DBName", host="", port=5432, user="yes", password="yes")
Remember to close the connection: a maximum of 16 connections can be established from R to PostgreSQL, and exceeding this limit will throw the error above.
## list all the connections
dbListConnections(drv)

## Closes the connection
dbDisconnect(con)

## Frees all the resources on the driver
dbUnloadDriver(drv)
# OR on.exit(dbUnloadDriver(drv), add = TRUE)

How to close/drop all the connections in a PostgreSQL session?
We can terminate PostgreSQL connections using the "pg_terminate_backend" SQL function.
In my case I had opened 16 connections using RPostgreSQL and unfortunately forgot to release them, so I ended up exceeding the maximum connection limit.
SELECT pg_terminate_backend(pid) FROM pg_stat_activity WHERE client_addr = '' AND pid > 20613 AND pid <> pg_backend_pid();
In the above query, pg_stat_activity returns a list of all the active connections.
I terminated only the connections from the R session made from that (client_addr) IP.

Hans-Juergen Schoenig: Flexible grouping: Some dirty SQL trickery

From Planet PostgreSQL. Published on Nov 27, 2015.

While doing PostgreSQL consulting for a German client, I stumbled over an interesting issue this week, which might be worth sharing with some folks out on the Internet, it’s all about grouping. Suppose you are measuring the same thing various times on different sensors every, say, 15 minutes. Maybe some temperature, some air pressure or […]

The post Flexible grouping: Some dirty SQL trickery appeared first on Cybertec - The PostgreSQL Database Company.

Vasilis Ventirozos: Backups Restores and PITR nowdays

From Planet PostgreSQL. Published on Nov 24, 2015.

It's been almost a year since 9.4 was released. It included many nice features, but the one that changed my day-to-day administration was replication slots.

At OmniTI, we use something called OmniPITR a lot. OmniPITR could be called a WAL management suite: it can ship WALs, replay WALs, clean up WALs when they are not needed, and take backups. Usually the way I use it is to first set up streaming replication to a replica, and on top of streaming replication set up WAL shipping replication. The reason is simple: I want to have all the WALs around in case I lose the slave, and I didn't really like the wal_keep_segments approach.

Backups are usually being taken from a slave in order to reduce load from the master.

Briefly, OmniPITR does this by sending pg_start/stop_backup to the master, putting a pause-removal file on the slave so WALs are kept, and making two tar.gz files, one for the base backup and one for the WALs.

Usually I have a backup server keeping all the WAL files and the backups, which means a second (direct or indirect) WAL shipping destination.
All this is nice and works really well, but with replication slots it could be simpler and more efficient.

Recently, I had to set up backups for a one-master, two-slave setup on 9.4.

The requirements of the backup would be the following:

  • backup will be taken from the slave.
  • backup will be incremental allowing PITR
  • backup will be stored remotely on another machine.
  • minimum possible wal shipping targets.
  • replicas should be identical and adding a new replica should be trivial.
  • backup should be initiated from the server that keeps backups and not from a slave.
  • backups should be tested by restoration every week.

Pretty straight forward stuff really.

I set up archiving to the backup server with a simple rsync command,
archive_command = 'rsync -a %p postgres@<Backup server IP>:/data/walarchive/%f'

I created a backup user that will use .pgpass
touch ~/.pgpass ; chmod 0600 ~/.pgpass

added :

<master ip>:5432:template1:backup:pgbackup
<slave1 ip>:5432:template1:backup:pgbackup
<slave2 ip>:5432:template1:backup:pgbackup
<slave-x ip>:5432:template1:backup:pgbackup

I allowed my backup user in pg_hba.conf (on all DB servers)
and then simply used pg_basebackup like this:

rm -rf /data/base/* && pg_basebackup -D /data/base/ -h <slave ip> -U backup -Ft -z -U backup && mv /data/base/base.tar.gz /data/backups/basebackup_`date +"%Y-%m-%d"`.tar.gz

I would like it if pg_basebackup could customize the backup name and if it used replication slots (coming in 9.5), but none of that is really a problem when it comes to backups.

I added a recovery.conf that looks like this :
restore_command = '<PATH/TO/pg_standby> -t /data/backups/ -l /data/walarchive %f %p %r'
trigger_file = '/data/backups/'
recovery_end_command = 'rm /data/backups/'
recovery_target = 'immediate'

The parameter “recovery_target” specifies that recovery should end as soon as a consistent state is reached, i.e. as early as possible. When restoring from an online backup, this means the point where taking the backup ended.

NOTE that a recovery.conf file will already exist if the backup was taken from a slave; always remember to edit it and replace its entries with the ones above.

A few notes: before you start the database, remove all logs from pg_log and verify that:

archive_command ='/bin/true'
synchronous_standby_names = ''

At this moment all you have to do is to start the database and monitor the log.
The log file of a freshly restored database should look like this :

postgres@backup:/data/backups/pg_log$ tail -f postgresql-2015-11-13_020746.log
2015-11-13 02:07:46 EET [] [2344]: [1-1] user=,db=,e=00000 LOG:  database system was interrupted; last known up at 2015-11-13 02:06:20 EET
2015-11-13 02:08:10 EET [] [2344]: [2-1] user=,db=,e=00000 LOG:  starting point-in-time recovery to earliest consistent point
2015-11-13 02:08:10 EET [] [2344]: [3-1] user=,db=,e=00000 LOG:  restored log file "00000002.history" from archive
2015-11-13 02:08:10 EET [] [2344]: [4-1] user=,db=,e=00000 LOG:  restored log file "000000020000000200000054" from archive
2015-11-13 02:08:10 EET [] [2344]: [5-1] user=,db=,e=00000 LOG:  redo starts at 2/54000028
2015-11-13 02:08:10 EET [] [2344]: [6-1] user=,db=,e=00000 LOG:  consistent recovery state reached at 2/540000F0
2015-11-13 02:08:10 EET [] [2344]: [7-1] user=,db=,e=00000 LOG:  recovery stopping after reaching consistency
2015-11-13 02:08:10 EET [] [2344]: [8-1] user=,db=,e=00000 LOG:  recovery has paused
2015-11-13 02:08:10 EET [] [2344]: [9-1] user=,db=,e=00000 HINT:  Execute pg_xlog_replay_resume() to continue.
2015-11-13 02:08:10 EET [] [2342]: [3-1] user=,db=,e=00000 LOG:  database system is ready to accept read only connections

At this point, run:

psql -c "select pg_xlog_replay_resume()" template1

You should see the following in the log file:

2015-11-13 02:10:08 EET [] [2344]: [13-1] user=,db=,e=00000 LOG:  archive recovery complete
2015-11-13 02:10:09 EET [] [2344]: [14-1] user=,db=,e=00000 LOG:  MultiXact member wraparound protections are now enabled
2015-11-13 02:10:09 EET [] [2342]: [4-1] user=,db=,e=00000 LOG:  database system is ready to accept connections
2015-11-13 02:10:09 EET [] [2394]: [1-1] user=,db=,e=00000 LOG:  autovacuum launcher started

Now, assuming that you want to perform PITR.

Follow exactly the same restore procedure as previously described but this time the recovery.conf should look like this :

restore_command = '<PATH/TO/pg_standby> -t /data/backups/ -l /data/walarchive %f %p %r'
#recovery_target_time = '2015-11-13 00:09:00'

# or
#recovery_target_xid = '1966'

trigger_file = '/data/backups/'
#recovery_target_inclusive = 'true'
recovery_end_command = 'rm /data/backups/'

recovery_target_time: This parameter specifies the time stamp up to which recovery will proceed.

recovery_target_xid: This parameter specifies the transaction ID up to which recovery will proceed. Keep in mind that while transaction IDs are assigned sequentially at transaction start, transactions can complete in a different numeric order. The transactions that will be recovered are those that committed before (and optionally including) the specified one. The precise stopping point is also influenced by recovery_target_inclusive.

recovery_target_inclusive: Specifies whether to stop just after the specified recovery target (true), or just before the recovery target (false). Applies when either recovery_target_time or recovery_target_xid is specified. This setting controls whether transactions having exactly the target commit time or ID, respectively, will be included in the recovery. Default is true.

The rest of this procedure should be identical as previously described.

Automate the restore procedure on the backup server, set up some maintenance crontab entries that delete old backups and WALs, and you have yourself a very simple but efficient backup strategy.

Remember that testing your backups is just as important as taking them!

Thanks for reading

Federico Campoli: Meetup live hangout

From Planet PostgreSQL. Published on Nov 24, 2015.

Three days to go for the next Brighton PostgreSQL meetup.
I'll run a live hangout of the talk.

You can join the event there.

The recording will become available on YouTube shortly after the talk ends.

Alexey Lesovsky: Introduction to pg_stat_activity.

From Planet PostgreSQL. Published on Nov 24, 2015.

The PostgreSQL RDBMS has very informative built-in tools to track the state of the database. One such tool is pg_stat_activity, a system view that lets you monitor database processes in real time. This view is comparable to the system command top; it is one of the first places a database administrator can start an investigation when some problem occurs. In this post I show some useful examples of how this view can be used to detect abnormal activity. To start, we just need any PostgreSQL client, psql for example.

As a first step you can use an easy query:
# SELECT * FROM pg_stat_activity;

It is not important which database you are connected to: pg_stat_activity is shared among databases.
pg_stat_activity shows a single line for each database connection, and there is a dedicated UNIX process for each connection. For each connection there are several attributes:
pid - process ID which handles the connection. In the PostgreSQL terminology this process is called backend or worker.
datid, datname - ID and name of the database to which client connects.
usesysid, usename - ID and name which was used to connect.
client_addr, client_port, client_hostname - network settings of the client, network address, port and hostname.
application_name - an arbitrary name that can be specified when the client connects, or set via a session variable.
backend_start, xact_start, query_start, state_change - timestamps, which indicate when the process (backend), transaction or query within a transaction has been started, and the last time the state of the process was changed.
state, waiting - state of the process, and a flag which indicates if the process is waiting for another process.
backend_xid, backend_xmin - running transaction ID (xid) and transaction ID (xmin) which determines visibility scope for currently running transaction.
query - text of the query currently being run by the worker, or the query most recently executed in this connection.

Since each line describes a single connection, we can easily find out how many client connections are established to the database.
# SELECT count(*) AS total_conns FROM pg_stat_activity;

Next, we can use the WHERE clause to add filter options. For example, we can see how many connections are established from a particular host address:
# SELECT count(*) FROM pg_stat_activity WHERE client_addr = '';

Of course, we can use different conditions and filter other fields, including and combining them with AND and OR.
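For example, a minimal sketch combining two filters (the database name here is only illustrative):
# SELECT count(*) FROM pg_stat_activity WHERE datname = 'sales_db' AND state = 'active';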
Very important attributes are the transaction and query start times. Using the current time, we can calculate the duration of transactions and queries. This is very useful for detecting long transactions.
# SELECT client_addr, usename, datname,
clock_timestamp() - xact_start AS xact_age,
clock_timestamp() - query_start AS query_age,
state, query
FROM pg_stat_activity
ORDER BY coalesce(xact_start, query_start);

-[ RECORD 1 ]-------------------------------------------------------
client_addr |
usename     | james
datname     | sales_db
xact_age    |
query_age   | 12 days 05:52:09.181345
state       | idle
query       | <query text>

As we can see in the output, we have a query that has been sitting there for 12 days. This is not an active transaction, because the xact_age field is empty. Using the state field we can figure out the connection state - currently it's an idle connection. Most likely James executed some query on Friday, didn't disconnect from the database, and went on vacation for two weeks.
As mentioned above, in this example we use the state field. Let us examine it in more detail. The state field determines the current status of the connection and can be in one of several states:
active - the process is executing a query, that is, doing useful work.
idle - the process is idling; no useful work is being done.
idle in transaction - the process is idling inside an open transaction. This means that the application opened the transaction and is doing nothing more. Long-running transactions (more than one hour) with this status are harmful to the database and should be forced to close, and the causes of such behavior at the application level should be removed.
idle in transaction (aborted) - a broken transaction: at least one of the queries inside the transaction failed, and further queries will be ignored until the transaction aborts.
fastpath function call - the backend is executing a fast-path function.
disabled - this is a dummy state; it is shown only if the track_activities option is disabled.
Now that we have an idea of the connection statuses, we can look at the total activity in the database.
# SELECT client_addr, usename, datname, state, count(*)
FROM pg_stat_activity
GROUP BY 1, 2, 3, 4 ORDER BY 5 DESC;
 client_addr | usename  | datname  |        state        | count
-------------+----------+----------+---------------------+-------
             | app_user | sales_db | idle                |    28
             | app_user | sales_db | active              |    15
             | app_user | sales_db | idle in transaction |     3
             | bg_user  | sales_db | active              |     6
             | james    | sales_db | idle                |     2
             | helen    | shop_db  | active              |     1

Note that in the example above most of the connections are idle. If the total number of idle connections reaches dozens or several hundred, you should definitely start thinking about using pgbouncer to reduce the number of idle processes. We can also see idle in transaction processes; it is important to monitor them and close them with pg_terminate_backend() if they hang for too long. Determining the age of such connections is not a problem, because we already know how to do it.
# SELECT client_addr, usename, datname,
clock_timestamp() - xact_start AS xact_age,
clock_timestamp() - query_start AS query_age,
state, query
FROM pg_stat_activity
WHERE state = 'idle in transaction'
ORDER BY coalesce(xact_start,query_start);

client_addr | usename  | datname  |    xact_age     |    query_age    |        state        |   query
------------+----------+----------+-----------------+-----------------+---------------------+-------------
            | app_user | sales_db | 00:00:06.001629 | 00:00:00.002542 | idle in transaction | <query text>
            | app_user | sales_db | 00:00:05.006710 | 00:00:00.003561 | idle in transaction | <query text>
            | app_user | sales_db | 00:00:00.009004 | 00:00:00.001629 | idle in transaction | <query text>
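Once found, such sessions can be closed with pg_terminate_backend(); a minimal sketch, using an arbitrary one-hour threshold:
# SELECT pg_terminate_backend(pid)
FROM pg_stat_activity
WHERE state = 'idle in transaction'
AND (clock_timestamp() - xact_start) > '01:00:00'::interval;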

As we can see, the ages of these transactions are relatively small, so we can filter out the short queries. We also filter out idle connections, because they are not so interesting. Let's filter out everything faster than 10 seconds.
# SELECT client_addr, usename, datname,
now() - xact_start AS xact_age,
now() - query_start AS query_age,
state, query
FROM pg_stat_activity
WHERE (now() - xact_start) > '00:00:10'::interval OR
(now() - query_start) > '00:00:10'::interval AND
state <> 'idle'
ORDER BY coalesce(xact_start, query_start);
client_addr | usename  | datname  |    xact_age     |    query_age    |        state        |   query
------------+----------+----------+-----------------+-----------------+---------------------+-------------
            | app_user | sales_db | 00:00:12.013319 | 00:00:05.002151 | active              | <query text>
            | app_user | sales_db | 00:00:10.083718 | 00:00:10.083718 | idle in transaction | <query text>

But that's not all: there can be situations when a query is blocked by another query or transaction. To identify such connections the waiting attribute can be used (true if the process is waiting on a lock, false if not).
# SELECT client_addr, usename, datname,
now() - xact_start AS xact_age,
now() - query_start AS query_age,
state, waiting, query
FROM pg_stat_activity
WHERE waiting
ORDER BY coalesce(xact_start, query_start);
client_addr | usename  | datname  |    xact_age     |    query_age    | state  | waiting |   query
------------+----------+----------+-----------------+-----------------+--------+---------+-------------
            | app_user | sales_db | 00:00:16.736127 | 00:00:02.839100 | active | t       | <query text>

The presence of waiting processes is a bad sign; it is usually evidence of a poorly designed application. Locking occurs when two or more concurrent transactions (or queries) try to access the same resource, such as a set of rows in a table. A simple example: transaction A updates a set of rows M; when transaction B tries to update the same set of rows, it will wait until transaction A is either committed or aborted.
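To make this concrete, here is a minimal sketch of that scenario in two psql sessions, using the products table from the log example below:

-- session A
BEGIN;
UPDATE products SET price = 10 WHERE id = 1;  -- takes a row lock and keeps it until commit/abort

-- session B: blocks here, and shows up in pg_stat_activity with waiting = t
UPDATE products SET price = 20 WHERE id = 1;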
Besides pg_stat_activity, such situations can be tracked down in the PostgreSQL log when log_lock_waits is enabled.

[UPDATE waiting] LOG: process 29054 still waiting for ShareLock on transaction 2034635 after 1000.160 ms
[UPDATE waiting] DETAIL: Process holding the lock: 29030. Wait queue: 29054.
[UPDATE waiting] CONTEXT: while updating tuple (0,68) in relation "products"
[UPDATE waiting] STATEMENT: update products set price = 20 where id = 1;
[UPDATE waiting] LOG: process 29054 acquired ShareLock on transaction 2034635 after 9049.000 ms
[UPDATE waiting] CONTEXT: while updating tuple (0,68) in relation "products"
[UPDATE waiting] STATEMENT: update products set price = 20 where id = 1;

What's going on here?
  1. The process with PID 29054 is blocked and is waiting. Process with PID 29030 holds the lock.
  2. The text of the waiting query was logged too.
  3. After about 9 seconds the process with PID 29054 obtained the lock and was able to complete the query.
A deadlock situation is also possible. That happens when one transaction needs to acquire a lock on a resource currently held by another transaction, which in turn can release that lock only after obtaining a lock on a resource currently held by the first transaction. When this occurs, the Postgres deadlock detection mechanism terminates one of the transactions, allowing the other to proceed. The default deadlock check timeout is 1 second and can be configured through deadlock_timeout. Such situations are also logged in the PostgreSQL log.
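A minimal sketch of how such a deadlock arises, again with the products table from the log excerpt below:

-- session A
BEGIN;
UPDATE products SET price = 20 WHERE id = 1;

-- session B
BEGIN;
UPDATE products SET price = 20 WHERE id = 2;

-- session A: blocks, waiting for session B's lock on id = 2
UPDATE products SET price = 20 WHERE id = 2;

-- session B: tries to update id = 1, closing the cycle; the deadlock detector
-- then aborts one of the two transactions
UPDATE products SET price = 20 WHERE id = 1;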

[UPDATE] ERROR: deadlock detected
[UPDATE] DETAIL: Process 29054 waits for ShareLock on transaction 2034637; blocked by process 29030.
Process 29030 waits for ShareLock on transaction 2034638; blocked by process 29054.
Process 29054: update products set price = 20 where id = 1;
Process 29030: update products set price = 20 where id = 2;
[UPDATE] HINT: See server log for query details.
[UPDATE] CONTEXT: while updating tuple (0,68) in relation "products"
[UPDATE] STATEMENT: update products set price = 20 where id = 1;
[UPDATE waiting] LOG: process 29030 acquired ShareLock on transaction 2034638 after 2924.914 ms
[UPDATE waiting] CONTEXT: while updating tuple (0,69) in relation "products"
[UPDATE waiting] STATEMENT: update products set price = 20 where id = 2;

What happened here?
  1. The process with PID 29054 waits, as it was blocked by a process with PID 29030.
  2. In turn, the process with PID 29030 waits too, as it was blocked by a process with PID 29054.
  3. We can also see the queries that caused the deadlock.
  4. The process with PID 29030 was unblocked, so the query in process 29054 was forced to cancel and its transaction moved to the "idle in transaction (aborted)" state.
Anyway, waiting processes and deadlocks are bad; such incidents must be investigated and eliminated. Maybe someone will ask: why are they so bad? The answer: while a process is blocked, the application that sent the query also waits, and the end user may perceive the system as slow and become frustrated. Nobody likes long response times.
To wrap up, you now know how to use pg_stat_activity effectively and can create your own view. It will be your first mate when something in the database goes out of control. Here is an example of a view which shows abnormal activity in the database.

# CREATE VIEW db_activity AS
SELECT
pid, client_addr, client_port,
datname, usename,
clock_timestamp() - pg_stat_activity.xact_start AS ts_age,
clock_timestamp() - pg_stat_activity.query_start AS query_age,
clock_timestamp() - pg_stat_activity.state_change AS change_age,
waiting, state, query
FROM pg_stat_activity
WHERE (
(clock_timestamp() - xact_start) > '00:00:00.1'::interval OR
(clock_timestamp() - query_start) > '00:00:00.1'::interval
AND state IN ('idle in transaction (aborted)', 'idle in transaction')
) AND pid <> pg_backend_pid()
ORDER BY COALESCE (xact_start, query_start);

That is all. Thanks!

Moya Template Language

By Will McGugan from Django community aggregator: Community blog posts. Published on Nov 23, 2015.

Moya's template language is a spiritual successor to Django and Jinja templates. It borrows a number of constructs from both, and adds a few of its own.

There is a slight difference in the syntax in that variables are substituted with ${variable} rather than {{ variable }}, but tags still use the familiar {% tag %} syntax.

Template languages are generally quite extensible, so there is probably nothing that Moya templates can do that Django / Jinja can't (via a plugin or extension), but there are a few things which I think are more elegant in Moya's template language (and built in). I'm going to talk about a few of them in this post.

The Attrib Tag

How often have you written code like this?

<div id="{{ article_id }}"
    class="{% if type %}{{ type }}{% endif %}{% if active %} active{% endif %}{% if highlight %} highlight{% endif %}">
    {{ content }}
</div>

This code renders a div with an id attribute and a few optional classes. It is easy to follow, but verbose. It's also a little error prone to write; we have to be careful about the whitespace between the classes, or we could end up generating the nonsense class activehighlight.

Moya offers the {% attrib %} template tag, which is a shortcut to generate a sequence of attributes. Here is the equivalent Moya code using {% attrib %}:

<div {% attrib (id=article_id, class=[type, active and 'active', highlight and 'highlight']) %}>

The attrib tag takes a dictionary and generates attributes based on the keys and values. If the value is a list, Moya joins the strings with a space, ignoring any value that evaluates to false. It will also omit any attribute that would evaluate to an empty string (compared to the original, which could potentially render a superfluous class="").

The attrib tag is also faster, since there is less template logic to run.

The Markup Tag

It is not uncommon to have to write copy in templates. But writing copy in HTML is a pain, and somewhat error prone. You can easily break the entire page with a typo.

Moya has a {% markup-block %} tag which renders the enclosed text in the markup of your choice, so you can embed markdown (for example) directly in to your template. Here's an example:

{% markup-block as 'markdown' %}
# Our Coffees
All our coffees are *organically produced* and from [sustainable sources](/where-we-get-our-coffee/) only. 
{% end-markup-block %}

This produces the following markup:

<h1>Our Coffees</h1>
<p>All our coffees are <em>organically produced</em> and from <a href="/where-we-get-our-coffee/">sustainable sources</a> only.</p>

Another way of rendering markups is with the {% markup %} tag which renders a template variable rather than enclosed text. This is more suitable for rendering content stored in the database. Here's an example:

{% markup post.content as 'markdown' %}

This will render the string content in an object called post as markdown.

The Sanitize Tag

Web developers will no doubt be familiar with the dangers of rendering untrusted markup (i.e. from comments). Since Moya has a batteries-included attitude to templates, there is a built-in tag for this.

The {% sanitize %} tag escapes any potentially dangerous markup based on a set of rules. The defaults are quite conservative and only permit basic formatting markup. Here's an example:

{% sanitize %}
My <b>exploit</b>!
<script>alert('how annoying')</script>
{% end-sanitize %}

This generates the following markup:

My <b>exploit</b>!
&lt;script&gt;alert(&#39;how annoying&#39;)&lt;/script&gt;

The script tag has been escaped, so it will display as text (you can also opt to remove untrusted markup entirely).

This tag uses the excellent bleach library to do the sanitizing.

More Information

To read more about the Moya template language, see the docs. The docs are currently in a catch-up phase after 3 months of development, so they are missing a few recently added template tags. Feel free to ask me about those, but I will be writing them up in the coming weeks.

Fixing SSL certificate chains

By Reinout van Rees' weblog from Django community aggregator: Community blog posts. Published on Nov 22, 2015.

This blog post applies when the following two cases are true:

  • Your browser does not complain about your https site. Everything seems fine.
  • Some other tool does complain about not finding your certificate or not finding intermediate certificates. What is the problem?

So: your browser doesn't complain. Let's see a screenshot:

Browser address bar with a nice green closed lock, so ssl is fine

Examples of the errors you can see

Some examples of complaining tools. First curl:

$ curl https://api.letsgxxxxxxx
curl: (60) SSL certificate problem: Invalid certificate chain
More details here:

curl performs SSL certificate verification by default, using a "bundle"
 of Certificate Authority (CA) public keys (CA certs). If the default
 bundle file isn't adequate, you can specify an alternate file
 using the --cacert option.
If this HTTPS server uses a certificate signed by a CA represented in
 the bundle, the certificate verification probably failed due to a
 problem with the certificate (it might be expired, or the name might
 not match the domain name in the URL).
If you'd like to turn off curl's verification of the certificate, use
 the -k (or --insecure) option.

curl has the right error message: Invalid certificate chain.

Let us look at wget:

$ wget https://api.letsgxxxxxx
--2015-11-23 10:54:28--  https://api.letsgxxxxx
Resolving api.letsgxxxxxx...
Connecting to api.letsgxxxxxx||:443... connected.
ERROR: cannot verify api.letsgxxxxxx's certificate, issued by 'CN=COMODO RSA
  Domain Validation Secure Server CA,O=COMODO CA Limited,L=Salford,ST=Greater Manchester,C=GB':
  Self-signed certificate encountered.
To connect to api.letsgxxxxxx insecurely, use `--no-check-certificate'.

wget is right that it cannot verify .... certificate. But its conclusion Self-signed certificate encountered is less helpful. The certificate is not self-signed, it is just that wget has to treat it that way because the certificate chain is incorrect.

If you talk to such an https URL with java, you can see an error like this:
PKIX path building failed:
unable to find valid certification path to requested target

This looks quite cryptic, but the cause is the same. SunCertPathBuilderException: CertPath sure sounds like a path to a certificate that it cannot find.

A final example is with the python requests library:

>>> import requests
>>> requests.get('https://api.letsgxxxxxx')
Traceback (most recent call last):
  File "<console>", line 1, in <module>
  File "/requests/", line 69, in get
    return request('get', url, params=params, **kwargs)
  File ".../requests/", line 50, in request
    response = session.request(method=method, url=url, **kwargs)
  File ".../requests/", line 465, in request
    resp = self.send(prep, **send_kwargs)
  File ".../requests/", line 573, in send
    r = adapter.send(request, **kwargs)
  File ".../requests/", line 431, in send
    raise SSLError(e, request=request)
SSLError: [SSL: CERTIFICATE_VERIFY_FAILED] certificate verify failed (_ssl.c:590)

How to determine what's wrong

So... you yourself discover the problem. Or a customer calls that he's getting an error like this. Even though everything seems right if you test the https site in the browser.

Solution: go to

If that site says everything is completely right, then you're done. If it still complains about something, you've got work to do.

Most of the checkmarks are probably green:

Green checkmarks in front of many common SSL checks

In cases like this, the problem is in the certificate chain at the bottom of the page. Here's an example of one of our own sites from a few months ago:

Broken chain icon indicating the exact problem spot

Note the "broken chain" icon halfway. Just follow the chain from top to bottom. Everything has to be perfect. We start with the * which is issued by GeoTrust SSL CA - G2.

The certificate GeoTrust SSL CA - G2 in turn is issued by GeoTrust Global CA.

The problem: the next certificate in the chain is not about GeoTrust Global CA, but about GeoTrust SSL CA, which is different. Here the chain breaks. It does not matter that the fourth certificate is about the GeoTrust Global CA we were looking for. The chain is broken. The order in which the certificates are placed must be perfect.

After fixing the order of the certificates in our certificate file, the problem was fixed:

Chain icons indicating that the chain is unbroken

Why is a chain needed?

There are lots of certificates in the wild. All the browsers (and java, and your OS and...) often only store a handful (well, 20+) "root certificates". All the other certificates have to trace their origin back to one of those root certificates.

That is where the intermediate certificates come in: they're a cryptographically signed way to trace the validity of your certificate back to one of the known-good root certificates.

How to fix it

  • If you're handling certificates yourself, you ought to know which files to edit. The main problem will be getting the right intermediary certificates from the issuing party. Often you only get "your" certificate, not the intermediary ones. Ask about it or google for it.

  • Often you won't maintain those certificates yourself. So you have to get your hosting service to fix it.

    If you let someone else take care of the certificate, point them at and tell them to make sure that page is completely happy.

    In my experience (=three times in the last two years!) they'll mail back with "everything works now". But it still won't work. Then you'll have to mail them again and tell them to really check and probably provide screenshots.

Good luck!

Leo Hsu and Regina Obe: PostGIS 2.2.0 windows bundle for PostgreSQL 9.5beta2 32 and 64 bit

From Planet PostgreSQL. Published on Nov 21, 2015.

We just pushed out installers for PostGIS 2.2.0 for PostgreSQL 9.5beta2 on Windows, both 32-bit and 64-bit, on Application Stackbuilder. These installers are also available as standalone downloads listed on the PostGIS windows page. This is the first PostGIS 2.2.0 release for PostgreSQL 9.5 32-bit and a re-release for PostgreSQL 9.5 64-bit (this time compiled against beta2 instead of beta1).

On quick testing the PostGIS 2.2 beta1 release and pgRouting 2.1.0 worked fine on 9.5beta2, however you may want to reinstall anyway just to be safe. You can just reinstall over your existing install, no need to uninstall first. Similarly just upgrading a PostgreSQL 9.5beta1 to 9.5beta2 seemed to not require pg_upgrade or dump/restore, so safe to just upgrade from 9.5beta1 to 9.5beta2. Other notes about this 9.5beta2 PostGIS 2.2.0 release:

  • The FDW API changed between PostgreSQL 9.5beta1 and PostgreSQL 9.5beta2, so the OGR_FDW, if you don't reinstall the bundle, will crash and burn in PostgreSQL 9.5beta2 (using PostGIS 2.2. beta1 executables). Similarly this newly compiled OGR_FDW will not work on PostgreSQL 9.5beta1 (so upgrade to 9.5beta2 first).
  • The PostgreSQL 9.5betas (that includes both beta1 and beta2), are compiled against the pointcloud 1.1 master branch. This was required because the released pointcloud 1.0.1, does not compile against PostgreSQL 9.5
  • The PostgreSQL 9.5beta2 PostGIS 2.2.0 release comes packaged with SFCGAL 1.2.2 (instead of 1.2.0 like the others versions) which fixes a crasher with ST_StraightSkeleton as noted in ticket - Newer SFCGAL will be packaged with upcoming PostGIS 2.2.1, but if you are on an older edition and are using SFCGAL, you can always copy latest SFCGAL.dll binaries from the 2.2.1dev packages on PostGIS windows page

Andrew Dunstan: Announcing Release 4.16 of the PostgreSQL Buildfarm client

From Planet PostgreSQL. Published on Nov 21, 2015.

I have just released version 4.16 of the PostgreSQL Buildfarm client

It can be downloaded from

Several bugs have been fixed, and there are these non-trivial changes:

  • capturing of TAP test logs
  • bin check enabled on Windows
  • rm_worktrees feature
  • an experimental module to run Sepgsql tests
  • try database shutdown following a start failure
  • report all PGBuild module versions to server

rm_worktrees is enabled by putting this in your config file:

rm_worktrees => 1,

The effect is that at the end of a run the checked-out work tree is removed, leaving just the git repository, which in turn might be mostly linked to your HEAD branch if you use the git_use_workdirs option. Before a run, the work tree will be checked out again. The net effect is a significant saving in space used. With these two options, the additional space for each branch (except when it's actually building) is reduced to less than 30Mb. On crake, the git mirror, branch trees and cache now come in at about 1.5Gb. That's a lot less than it used to be. The additional cost of checking out the worktree each time is very modest.

Shutdown after a start failure tries to remedy a situation where we try to start the server, and don't detect that it has started, but it has in fact started. So now if we get a failure we try to shut down any server that might have started, just in case. This change is possibly redundant given the recent change where postgres detects that its data directory has disappeared and shuts down when it has, but it's probably worth having anyway.

How To Create Installable Django Packages

By pydanny's blog from Django community aggregator: Community blog posts. Published on Nov 21, 2015.

Django Package Ecosystem: cookiecutter-djangopackage

What I mean by an "installable Django package": a reusable component that can be shared across Django projects, allowing us to combine our own efforts with others. Some examples include:

Ever want to quickly create a similarly installable Django package to submit to PyPI and Django Packages? One that goes beyond the basics described in the Django tutorial? Specifically, a package that includes:

  • Test runner so you don't need an example/test project (although those can be useful).
  • The important configuration in place: Travis, editorconfig, gitignore, etc.
  • The important documentation in place: Readme, License, Read the Docs-ready Sphinx docs, etc.
  • Static files ready to go.
  • A base DTL/Jinja2 template ready to go.
  • All those other fiddly bits not included in startapp that are hard to remember.

Well, here's how I do it.

Introducing cookiecutter-djangopackage

First, get Cookiecutter. Trust me, it's awesome:

$ pip install cookiecutter

Now run it against this repo:

$ cookiecutter

You'll be prompted to enter some values. Enter them. Then an installable Django package will be built for you.

Warning: app_name must be a valid Python module name or you will have issues on imports.

Enter the new package (in my case, I called it 'newpackage') and look around. Open up the AUTHORS.rst,, or README.rst files and you'll see your input inserted into the appropriate locations.

Speaking of the README.rst, that file includes instructions for putting the new package on PyPI and Django Packages.

├── .editorconfig
├── .gitignore
├── .travis.yml
├── AUTHORS.rst
├── HISTORY.rst
├── Makefile
├── README.rst
├── newpackage
│   ├──
│   ├──
│   ├── static
│   │   ├── css
│   │   │   └── newpackage.css
│   │   ├── img
│   │   │   └── .gitignore
│   │   └── js
│   │       └── newpackage.js
│   └── templates
│       └── cheese
│           └── base.html
├── docs
│   ├── Makefile
│   ├── authors.rst
│   ├──
│   ├── contributing.rst
│   ├── history.rst
│   ├── index.rst
│   ├── installation.rst
│   ├── make.bat
│   ├── readme.rst
│   └── usage.rst
├── requirements-test.txt
├── requirements.txt
├── requirements_dev.txt
├── setup.cfg
├── tests
│   ├──
│   └──
└── tox.ini

Now, instead of monkeying around for a while doing copy/paste package setup, I'm immediately ready to write code.


cookiecutter-djangopackage does a lot, but even with its tight focus on package creation it could do more. Some of the things I would love to see included in the future:

  • Option for Appveyor CI support
  • Option to replace django.test with py.test.
  • Generation of model boilerplate, admin, and CRUD views.
  • More in the issue tracker.

Try it out and let me know what you think. I'm open to new ideas and receiving pull requests.

Shaun M. Thomas: PG Phriday: Cluster Control

From Planet PostgreSQL. Published on Nov 20, 2015.

It has occurred to me that I may have been spending a bit too much time being excited about new Postgres features and developments in the community. One of the intents of this weekly article was for educational purposes, so this week, let’s get back to basics. To that end, the topic for this week boils down to the tools available for managing Postgres instances, and how to use them. Surprisingly, it’s not as straight-forward as you might think.

Having used Postgres for almost 15 years now, it’s easy to forget (or mentally block!) the early struggles. With new tools and wrappers always being developed, it’s not always clear what the best practices for managing a Postgres cluster actually are. Indeed, it often depends on how Postgres is installed.

Let’s start with a basic source-code installation. I won’t cover that process, but we can assume that after the smoke clears, the binaries are somewhere in /usr/bin or /usr/local/bin, and are thus in our execution path. Given that, let’s say we have an available mount point at /data and want to create a new cluster there. Here’s how that might go:

sudo mkdir /data/pgsql
sudo chown postgres:postgres /data/pgsql
sudo su - postgres
initdb -D /data/pgsql/my_db
pg_ctl -D /data/pgsql/my_db start

We now have an active Postgres instance at /data/pgsql/my_db. It’s extremely common to use the postgres OS user for this kind of thing, hence all of our sudo commands to prepare. It’s entirely possible to do this as a regular user, but I usually don’t recommend that approach.

In any case, this type of installation essentially depends on the pg_ctl command-line tool. It does everything related to controlling a Postgres instance. But it’s also annoying to use the -D parameter all the time when using Postgres tools, so there are several environment variables that can also do the job. This lets us prime our environment with .bashrc, for example. Let’s stop the instance:

export PGDATA=/data/pgsql/my_db
pg_ctl stop -m fast

Why the -m fast part? By default, Postgres is exceedingly polite. If we had simply asked it to stop, it would patiently wait for any pending transactions to complete before stopping. But if there are several users connected, some may leave transactions idle, or a really long process may be running, or maybe we’re just in a hurry. Using a fast shutdown tells Postgres to abort any transactions and stop as soon as it can—safely, of course. There’s no data loss, but a few users or applications may grumble a bit.

At the system level, it’s very common for a global service management system to wrap much of this process. To start or stop Postgres on CentOS or RedHat, we’d do something like this:

sudo service postgresql-9.4 start
sudo service postgresql-9.4 stop

On a Debian or Ubuntu system, it would be this:

sudo service postgresql start
sudo service postgresql stop

And this is where things start to get complicated. These two major Linux flavors can’t even agree on what to name the service control mechanism, and both have extremely differing views on handling multiple versions. Both however, go to great lengths to “hide” the Postgres-provided binaries so that only the known wrappers can access them unless we cheat and add the binary location directly to our PATH variable.

By and large, that isn’t actually necessary, but it does add a major complication: standardization. As in, there isn’t any. Depending on distribution, wrappers will take different parameters, install Postgres in varying locations, and have divergent control methods. Let’s dive a little deeper into Debian/Ubuntu and their slightly more comprehensive wrappers.

Ubuntu assumes any number of Postgres versions may be installed at any one time. As such, starting and stopping the postgresql service on an Ubuntu system will cascade that operation to any configured instances hosted on the machine. What if we only want to manage a single instance? First, we need to know which instances are even available. This is where the cluster tools come in.

Because a single Postgres instance can represent several individual databases, it’s not uncommon to refer to it as a cluster. Debian and thus Ubuntu took that concept and ran with it and made something that’s actually pretty handy. Let’s get a look at a test system with multiple running clusters:

sudo su - postgres
pg_lsclusters
Ver Cluster Port Status Owner    Data directory               Log file
9.4 main    5432 online postgres /var/lib/postgresql/9.4/main /var/log/postgresql/postgresql-9.4-main.log

Neat, eh? This tool gives us a lot of information that’s immediately useful. With multiple clusters running on one system, we need to know which port each is running on. Knowing where the data resides, and where to find logs has obvious merit. And to differentiate each, the clusters are named. Let’s repeat that instance creation from above using a couple more wrapper tools:

pg_createcluster 9.4 my_db -D /data/pgsql/my_db
pg_ctlcluster 9.4 my_db start
pg_lsclusters
Ver Cluster Port Status Owner    Data directory               Log file
9.4 main    5432 online postgres /var/lib/postgresql/9.4/main /var/log/postgresql/postgresql-9.4-main.log
9.4 my_db   5433 online postgres /data/pgsql/my_db            /var/log/postgresql/postgresql-9.4-my_db.log

The pg_createcluster tool takes the place of initdb on Debian-based systems like Ubuntu. It’s aware of the other wrappers and recommended settings, locations, and so on. It also did us the favor of automatically incrementing the port so the clusters can run simultaneously. And pg_ctlcluster takes the place of pg_ctl for similar reasons. It should be fairly clear the tools are inherently designed such that the version is a required parameter. They’re pretty dedicated to the concept of hosting multiple clusters and Postgres versions on a single server.

Since they’re wrappers, they also accept many of the same parameters. If we wanted to stop the my_db cluster quickly, we could use a similar command as before:

pg_ctlcluster 9.4 my_db stop -m fast

Note that there’s one major omission here: we didn’t specify the data directory. Of course, the wrappers know where all of the clusters live, so we don’t need to. The version and name uniquely identify all clusters under this architecture, removing the necessity of memorizing cluster locations or hacking our environment with variables.

All of this comes with something of a caveat related to configuration. Normally, Postgres configuration files reside within the data directory itself, making the database a self-contained entity that is not tied to any one server. If the data is on a SAN for instance, it could be mounted on a replacement server and started with no modifications. But to have wrappers, the wrappers need a standard location to find information regarding the clusters they can control. On systems that use these tools, that’s the /etc/postgresql directory.

Here’s what the configuration directories look like on our test system:

find /etc/postgresql -type d

It’s a simple organization to remember, and closely resembles the version + name format all of the tools require. All of the configuration files the Postgres documentation or online articles refer to, actually live here instead. Most UNIX systems expect configuration files somewhere in the /etc directory, so that’s not unexpected. But it does introduce a rather unusual complication.

Some Postgres tools unrelated to this organization expect configuration files to be in the Postgres data directory. If we were to restore a backup of the data directory on an unrelated server, we would be unable to start it, and we would have no idea who should be allowed to connect. To compound the issue, the cluster tools themselves would have no way to administer it, because the /etc files would be missing.

The easy solution would be to include the corresponding /etc/postgresql folder in the backup process itself. We could (and should) also use configuration management to distribute these files to replacement or restoration servers. But that doesn’t fix external tools that look for config files in the data directory, so I tend to also do a bit of file linking. For example:

for x in /etc/postgresql/9.4/my_db/*.conf; do
  ln -s $x /data/pgsql/my_db
done

This way we don’t have to worry; the necessary files are all there and we don’t lose any benefits provided by the cluster tools.

In any case, I hope any developers reading this are now better equipped to experiment with local instances. It’s unfortunate there’s not enough standardization across the distributions in this regard, but at least now you have a starting point and some insight into how it got this way. Enjoy!

Marco Slot: pg_paxos: High Availability Data Replication for PostgreSQL

From Planet PostgreSQL. Published on Nov 19, 2015.

The Paxos algorithm is a powerful building block for building highly available distributed systems. Paxos can be seen as a function paxos(k,v) that returns the same value on all servers in a group for a certain key (k), and the value is one of the inputs (v). Paxos is most commonly used to implement log replication through a technique called Multi-Paxos. In Multi-Paxos, nodes call paxos using the indexes in a log as keys and state changes (e.g. 'set primary to node2') as values. Using this technique, data can be kept consistent and available across multiple servers, even if some of them fail.
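To make the Multi-Paxos idea above more concrete, here is a toy sketch in Python. The paxos() stub is not pg_paxos's API and not a real consensus implementation (it merely simulates, within a single process, the value a group would settle on for a key); it only illustrates how agreeing on one value per log index yields an identical replicated log on every node:

_agreed = {}  # key -> value the (simulated) group settled on

def paxos(key, value):
    # A real implementation would run prepare/accept rounds across the group;
    # here we simply keep the first proposal for a key and return it to everyone.
    return _agreed.setdefault(key, value)

class ReplicatedLog:
    def __init__(self):
        self.entries = []

    def append(self, state_change):
        # Multi-Paxos: propose our change for the next free log index;
        # whatever value wins that slot is what every replica appends.
        index = len(self.entries)
        winner = paxos(index, state_change)
        self.entries.append(winner)
        return winner

log_a, log_b = ReplicatedLog(), ReplicatedLog()
log_a.append("set primary to node2")
log_b.append("set primary to node3")  # loses slot 0 to the earlier proposal
assert log_a.entries == log_b.entries == ["set primary to node2"]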

We recently started exploring Paxos as a technique for doing automated fail-over of PostgreSQL servers and wrote a simple implementation of Paxos and Multi-Paxos including basic support for membership changes in about 1000 lines of PL/pgSQL. We found PL/pgSQL to be an excellent tool for implementing Paxos, because the necessary transactional semantics are a natural part of the language. Other consensus algorithms would have been harder to implement in this way, because they rely on timers and other background tasks. 

The pg_paxos extension demonstrates how the Paxos functions can be used for fault-tolerant, consistent table replication in PostgreSQL. While a replicated table has high write and read latencies due to the network round-trips involved, it can be very useful for applications such as automated fail-over. We plan to add optimisations such as the ones applied in Google Megastore to reduce the overhead and develop the extension further to be able to replace components like etcd or Zookeeper.

Tomas Vondra: TRIM facepalm

From Planet PostgreSQL. Published on Nov 19, 2015.

While discussing the results of various filesystem benchmarks - both here and in talks given at conferences - one of the things I've been questioning was the negligible impact of the TRIM/DISCARD mount option.

I was speculating that maybe TRIM only really matters when the drive gets almost full, because until then the controller has enough room to write the incoming data without getting the internal garbage collection under pressure. But I recently did a few pgbench runs aiming to test exactly this, using a rather large scale (filling more than 90GB of the 100GB drive), yet still saw no difference.

Another speculation was that maybe this particular SSD is really good and there's so much additional overprovisioning that TRIM still makes no difference. After all, it's a good SSD from Intel (S3700) meant for write-intensive data center applications, so I wasn't entirely surprised by that.

But then it dawned upon me ...


The TRIM does not matter because pgbench does not discard any pages; it simply overwrites the pages in place, so there's nothing to TRIM.

What pgbench does is (a) write new data by adding new pages and (b) update those pages by rewriting them in place. None of that involves discarding pages - the controller knows which page of the file it's overwriting, and therefore knows which SSD pages are dirty and need to be erased. The same goes for WAL segments, which are recycled and overwritten in place.

Maybe the TRIM does matter for other workloads (e.g. creating a lot of temporary files on the SSD device), or in databases where tables are often dropped or truncated. But for the simple pgbench workloads, TRIM does not matter at all.

Introducing Moya Techblog, a Blogging Engine for Coders and Photographers

By Will McGugan from Django community aggregator: Community blog posts. Published on Nov 18, 2015.

I've had a blog on my vanity domain for nearly a decade now. For the last 6 years, it has been powered by Django Techblog, which I wrote out of my frustration with syntax highlighting plugins in Wordpress. Techblog has happily served my blog those 6 years, with only minor hiccups when I half-heartedly ported the code to the latest Django.

There was nothing terribly wrong with Django Techblog; it had multiple blogs, tags, search etc., but there were some annoyances that I didn't have the motivation to do much about. The custom markup system I used was clever, but I would often forget the syntax! The support for images was rudimentary and the single-threaded comment system wouldn't notify me of replies.

So fast forward 6 years and the time is right to build a new blog system to fix the old niggles. This time around I had the same requirements for syntax highlighting in both posts and comments, with a new requirement to be able to manage and display photographs, while supporting the same multi-blog and feed URLs.

Moya Tech Blog

The new blog engine running my blog is Moya Tech Blog. Posts and comments use markdown (by default), which is practically a lingua franca for software developers.

The syntax highlighting supports a number of languages. Here's an example (my new favorite sort algorithm):

from time import sleep
from threading import Timer

def sleepsort(values):
    sleepsort.result = []
    def add1(x):
        sleepsort.result.append(x)   # each timer appends its value when it fires
    mx = values[0]
    for v in values:
        if mx < v: mx = v
        Timer(v, add1, [v]).start()  # a timer for value v fires after v seconds
    sleep(mx + 1)                    # wait for the slowest timer before returning
    return sleepsort.result
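Purely for illustration (this call is mine, not from the original post), it can be used like so; note that it blocks for roughly as many seconds as the largest value:

print(sleepsort([3, 1, 4, 1, 5]))  # prints [1, 1, 3, 4, 5] after about 6 seconds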

Post Editor

The editor in the old Techblog was undeniably clunky; just a big text-area. It did support drafts, and previews in a new window, but it wasn't at all elegant.

Moya Techblog's editor is friendlier. Drafts are auto-saved as you type, including any images you add, and there is an automatically updating preview that renders the markdown.

Photography Features

The original Techblog did nothing more than provide a basic UI for uploading images. The new, friendlier, interface manages uploads and does some fairly sophisticated image processing so that photos are displayed in a resolution that fits the display. If you upload large enough photos, it will display them in pin-sharp 4K resolution on high dpi and retina displays. I don't think that even 500px supports that.

Moya Techblog Image manager

A feature that I think photographers will appreciate is that Moya Techblog will extract camera and lens information from the exif data (here's an example). I've always found this information invaluable, especially when I was learning photography.

Installing Moya Techblog

Moya Techblog is super easy to install. First install the following with pip (requests_oauthlib is for the 'Sign-In with...' buttons):

pip install moya requests_oauthlib

Next run the following to get the code and set up the database:

git clone
cd moya-techblog/site
moya init

From the same directory, enter the following to run the development server:

moya runserver

The development server uses sqlite and stores media in the project directory, which is convenient for testing/development. See the docs on deployment if you want to permanently host Moya Techblog.

Nginx proxying to nginx: getting gzip compression to work

By Reinout van Rees' weblog from Django community aggregator: Community blog posts. Published on Nov 18, 2015.

At work we use gunicorn as our wsgi runner. Like many, gunicorn advises you to run the nginx webserver in front of it. So on every server we have one or more websites with gunicorn. And an nginx in front.

Nginx takes care, of course, of serving the static files like css and javascript. Some gzipping of the results is a very, very good idea:

server {
    listen 80;
    gzip on;
    gzip_proxied any;


Two notes:

  • The default is to only gzip html output. We also want javascript and json. So you need to configure gzip_types.

    (I copy-pasted this from one of my config files, apparently I needed three different javascript mimetypes... Perhaps some further research could strip that number down.)

  • gzip_proxied any tells nginx that gzipping is fine even for proxied requests.

Proxied requests? Yes, because we have a lot of servers and all external traffic first hits our main nginx proxy. So: we have one central server with nginx that proxies requests to the actual servers. So: nginx behind nginx:

server {
    listen   443;
    location / {
        proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
        proxy_set_header Host $http_host;
        proxy_redirect off;
        proxy_pass http://some-internal-server-name/;
    ssl on;
    ssl_certificate ...

Pretty standard "I listen on 443/https and proxy it on port 80 to some internal server" setup.

Works like a charm. Only drawback: gzipping does not work.

The reason? nginx defaults, in this case.

  • The gzip module has a gzip_http_version configuration parameter with a default of 1.1.

    Which means that http 1.0 requests are not gzipped, only 1.1.

  • The proxy module has a proxy_http_version configuration parameter with a default of 1.0.

    Which means that proxied requests are sent from the main proxy to the actual webserver with http 1.0.

These two don't match. There are two solutions:

  • Set gzip_http_version 1.0 in the nginx configs on your webservers. This switches on gzip for the http 1.0 connections coming from the proxy.
  • Set proxy_http_version 1.1 on the main proxy so that it sends http 1.1 connections to the webservers.

My choice originally was to do the first one. But a bug report came in for another site and now I've switched it on on the main proxy so that all the sites get the benefit.

Note: you might want to make different choices. Perhaps you have a caching proxy halfway? Perhaps you want the main nginx on the proxy to do the gzipping for you? Etcetera. Check whether the above tips apply to your situation :-)

Christophe Pettus: “The PCI-Compliant Database” at PGConfSV

From Planet PostgreSQL. Published on Nov 18, 2015.

I’ll be speaking about “The PCI-Compliant Database” at PGConf Silicon Valley!

Pavel Golub: Easy PostgreSQL docs for Sublime Text

From Planet PostgreSQL. Published on Nov 18, 2015.

I’m using this perfect tool called Sublime Text 3 for a bunch of tasks. One of them is viewing SQL scripts from time to time. ST3 has perfect SQL highlighting, but what I miss the most is context help functionality, e.g. I select “LEFT JOIN” and hit the F1 hot key. But that’s not a problem since ST3 has a lot of packages. To solve my problem I need the GotoDocumentation package and some tuning. Here are my settings for GotoDocumentation:

"docs": {
// obj containing the docs for each scope
// these are merged with the default ones
// the key value pair represent scope -> doc url
// supported placeholders:
// - %(query)s the selected text/word
// - %(scope)s the current scope
"sql": ""

"pascal": "",
// if we have no docs for the current scope
// we will try using the fallback one,
// to disable set to false
"fallback_scope": "google"

ST3 context help for pg sql


Marko Tiikkaja: LISTENing connections aren't cheap

From Planet PostgreSQL. Published on Nov 17, 2015.

Recently I used perf to look into what could be the cause for our increased CPU usage on the PostgreSQL server (encouraged by Andres' great talk at  I was somewhat surprised to find that thirty percent of the CPU time used by postgres was spent spinning on spinlocks, i.e. doing no actual useful work.  Digging into the profile a bit more, most of these were coming from a function called

Hubert 'depesz' Lubaczewski: Waiting for 9.6 – Generate parallel sequential scan plans in simple cases.

From Planet PostgreSQL. Published on Nov 17, 2015.

On 11th of November, Robert Haas committed patch: Generate parallel sequential scan plans in simple cases.   Add a new flag, consider_parallel, to each RelOptInfo, indicating whether a plan for that relation could conceivably be run inside of a parallel worker. Right now, we're pretty conservative: for example, it might be possible to defer applying […]

Hubert 'depesz' Lubaczewski: Waiting for 9.6 – Implement lookbehind constraints in our regular-expression engine.

From Planet PostgreSQL. Published on Nov 17, 2015.

On 30th of October, Tom Lane committed patch: Implement lookbehind constraints in our regular-expression engine.   A lookbehind constraint is like a lookahead constraint in that it consumes no text; but it checks for existence (or nonexistence) of a match *ending* at the current point in the string, rather than one *starting* at the current […]

Looking for an experienced Python Developer

By Will McGugan from Django community aggregator: Community blog posts. Published on Nov 16, 2015.

My client, WildFoundry, is looking for an experienced Python developer to join us. We work in the field of IoT, and you will get to work with a variety of really cool technologies. Here's the job description:

Job Description

WildFoundry is seeking a senior Python web application developer on an 8 month contract to help us in the development of the dataplicity Internet-of-Things platform ( You would be expected to work from your home or own office most of the time, and very high quality candidates based in Slovakia, Poland and the United Kingdom will be considered.

This is an unbeatable opportunity to work from home, earn excellent rates and join in with fast growing projects.

Our web application runs on a combination of Python and django, and we’re really pushing the boundaries of what we can do with both of them. You will be challenged. You’ll be playing with everything from Amazon and Linode services to embedded PCs, and a bunch of new web technologies. You’ll be writing daemons, UI for django apps, code for embedded systems, websockets.

We are seeking a top calibre developer well regarded in the community. If you’ve contributed code to OSS apps or have just built something awesome, be sure to mention it when applying. We have new requirements coming on stream almost every other week and thus it’s likely excellent devs will see their contracts extended further.

Skills & Requirements

  • Excellent Python developer (5+ years)
  • Familiarity with web protocols (HTTP, websockets)
  • Familiarity with Javascript, CSS, HTML, XML, JSON
  • Linux (Ubuntu)

Please note we do direct agreements only and are not open to agencies.


WildFoundry is an R&D group focussing exclusively on IoT devices. Some of our technology is already on the market via our distribution partners (eg and, but most of it is still under wraps.

We are based in Oxford, UK, but we’ve got developers and hardware engineers in the UK, Slovakia and Australia.

As a matter of policy, we do direct agreements only and are not open to agencies. There are no exceptions.


If this sounds like you, please apply here.

Tomas Vondra: PostgreSQL on SSD - 4kB or 8kB pages?

From Planet PostgreSQL. Published on Nov 16, 2015.

As you may be aware, PostgreSQL splits data files into 8kB pages by default. It's possible to use different page sizes (up to 32kB), but that requires compiling custom packages so almost no one does that. It's quite rare to come across an installation using different page size at a customer, for example.

Why 8kB data pages and not a different size? Clearly, smaller or larger pages might be more efficient in some cases, and indeed some other databases use different page sizes. For example InnoDB uses 16KB pages by default, and some databases even allow a mix of page sizes. So there seems to be a gain in using different page sizes, otherwise engineers working on those products would not waste time implementing it.

So what factors determine the optimal page size? And how do SSD disks change the results?

One of the fundamental rules about page size, called Five Minute Rule, was formulated by Jim Gray and G. F. Putzolu in 1985, and approaches the problem as an economic question "Given disk and RAM prices, how long should a page of given size be kept in memory?" The equation expressing the rule is essentially this

break_even_interval = (pages_per_MB / IOPS_per_disk) * (price_per_disk / price_per_MB_RAM)

and for values common in 1985 (1kB pages and disk/RAM prices in 1985), the break even interval came out as roughly 5 minutes (hence the name of the rule).

However since 1985 a lot has changed - the prices of RAM and disks plummeted, performance of rotational storage slightly improved and so on. Despite that, the 1997 revision of the paper came to about the same conclusion - the break even interval is still about 5 minutes, however assuming 8kB page size (not 1kB as used in the 1985 paper).
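As a quick illustration of how the equation is applied, here is the arithmetic in Python with ballpark figures for late-90s hardware (illustrative assumptions, not exact values from the paper):

pages_per_MB = 1024 // 8         # 128 pages of 8kB per MB of RAM
IOPS_per_disk = 64               # random accesses per second of a late-90s disk (assumed)
price_per_disk = 2000.0          # USD per disk drive (assumed)
price_per_MB_RAM = 15.0          # USD per MB of RAM (assumed)

break_even_interval = (pages_per_MB / IOPS_per_disk) * (price_per_disk / price_per_MB_RAM)
print(break_even_interval / 60)  # ~4.4 minutes, i.e. roughly the five minute mark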

This is probably also the reason why PostgreSQL uses 8kB pages - Postgres95 (as it was called at that time) was released shortly before the paper was published, but it had to follow the same reasoning.

But again, a lot has changed since 1997 - the RAM/disk prices continued to plummet and, probably more importantly, flash storage is becoming the de facto storage standard in the last few years. And once again, there were updates to the paper in 2008 in ACM Queue, and in 2010 in The Register.

The general conclusion of both those updates is that the "break even" interval for 8kB pages significantly increases (up to tens of minutes or hours) and that to get back to the ~5 minute interval, the page size needs to significantly increase (to 64kB or even more).

So why does PostgreSQL still use 8kB pages?

The original paper, formulated in 1985, was aiming to answer a particular question:

Given a page size and costs for memory and disks, when does it become economical to buy memory instead of adding disks?

But it's interesting to realize that we may choose which of the variables are fixed and solve the equation for a different one, and that those variables are not entirely independent. And of course, the equation is just a simplified model of the problem as stated in 1985, thus ignoring costs that were negligible on rotational storage but may have become significant on flash.

What does it mean that the variables are not entirely independent? Firstly, it's naive to assume that disks can do the same number of IOPS for all page sizes - for rotational devices the difference is not that significant, as the most expensive part is the seek. Once you move the heads to the proper location, it does not make much difference how much data you actually read (up to some point, of course).

For example, this review of an enterprise 10k drive from Seagate shows it can do 400 IOPS for both 4kB and 8kB random reads.

But on flash storage this is no longer true - it's quite common to see significant performance differences with varying request sizes. For example for Intel S3700 drives, the data sheet says that with 4kB pages the drive can do up to 75000 IOPS, but with 8kB it's just 47500 IOPS (i.e. a 40% drop in performance).

The other difference that comes with flash storage is that the price for reads and writes is not the same. For example the Intel S3700 can do 75000 IOPS for 4kB reads, but only 36000 IOPS for writes (i.e. less than 50%). On rotational devices, the difference is much less significant.

Another thing not reflected in the equation is the impact of page size on caching. The page size directly determines the number of "slots" in a cache - given 1GB of cache you may split it into 262,144 slots (4kB) or 131,072 slots (8kB), which of course impacts the adaptivity of the cache. For example, assume you frequently access only 200,000 entries, but that those entries are scattered through a much larger data set, i.e. each of those interesting entries is on a different page. With 4kB pages it's possible to keep all those pages in cache at once; with 8kB pages that's not possible.

So while the recent reviews of the "5 minute rule" suggest increasing the page size, these costs present a natural pressure against such increase, both on rotational and (especially) flash storage.

PostgreSQL with 4kB pages on SSD

To illustrate the effect of using a smaller page size on SSD, I've done a bunch of pgbench tests with 4kB and 8kB page sizes, and I'll present a simple comparison here. The test was done on a rather small system (i5-2500k, 8GB of RAM, S3700 SSD) on a ~75GB data set (scale=5000). The database was reasonably tuned, most importantly

  • shared_buffers = 1GB
  • min_wal_size = 1GB
  • max_wal_size = 8GB

The results of a long read-write pgbench run (4 hours, 16 clients) look like this:


So using a smaller data page resulted in a ~30% increase in performance, which is a significant improvement. It gets even more interesting when we look at the amount of data written to the device during the pgbench run (4 hours), as tracked by Host_Writes_32MiB - one of the SMART counters.


That is, with 8kB pages the database wrote about 962 GB of data, while with 4kB pages the amount of data is just 805 GB. But this comparison is not entirely fair, because the 4kB database also did more work, as it operated at a higher transaction rate - let's compensate for this difference by reducing the 4kB results accordingly (as if it operated at the same transaction rate).


So given the same transaction rate the difference is about 25%, which is quite significant - it may have impact on service life of the SSD device, which is closely related to the amount of writes.


The "Five minute rule" is useful and embodies a very interesting view approach to building database systems, based on simple economic reasoning. While I believe it's fundamentally right, it's just a simplified model and it's crucial to realize which aspects it neglects - in this post I've mentioned several important aspects of flash storage.

Does this mean that the smaller the page the better? Certainly not - the 4kB page was not chosen incidentally, but because that's the size of pages used by the S3700 SSD internally. In this case it eliminates write amplification, i.e. needlessly "dirtying" both 4kB half-pages when writing a single 8kB page. But lowering the page size further would not help, at least not on this particular SSD. Not all SSD drives use 4kB pages, though - some SSDs use smaller or larger pages (e.g. Samsung is using 8kB in some drives).

Dimitri Fontaine: All Your Base Conference 2015

From Planet PostgreSQL. Published on Nov 16, 2015.

I had the pleasure to be invited to speak at All Your Base Conference 2015 about PostgreSQL (of course). The conference gathers together lots of user experience around data management and database products, either in the now classic meaning of the word (I mean relational database management systems here) or the newer set of trade-offs represented by the NoSQL set of tools.

The conference was very nicely organized and ran smoothly, and I got the unique chance to hear a series of back-from-the-trenches stories about people's tools of choice and the reasons why they do things their way. It has been very enlightening!

If you ever get the chance to attend All Your Base, take my word for it and just go!

Vladimir Borodin: Wait interface in PostgreSQL

From Planet PostgreSQL. Published on Nov 16, 2015.

People with experience of commercial RDBMSs are used to having the ability to answer the question “What is a particular session doing right now?” Or even “What was that session waiting on 5 minutes ago?” For a long time PostgreSQL did not have such diagnostic tools, and DBAs got by with workarounds of varying sophistication. I gave a talk (in Russian) about how we do it. This was a joint talk with Ildus Kurbangaliev from PostgresPro, and Ildus was speaking precisely about a tool that allows answering the questions above.

Strictly speaking, it is not the first attempt to implement what people usually call a wait [event] interface, but all previous attempts were never brought to a reasonable state and died as proof-of-concept patches. pg_stat_wait, however, is currently available as a set of patches for the current stable 9.4 branch and for 9.6, which is currently in development (look for the actual versions on pgsql-hackers@).

After quite long testing and fixing bugs we even deployed them to production.


Until it all becomes part of core PostgreSQL, you need to recompile postgres. Describing the rebuild as ./configure && make && sudo make install would be of little use here; it is much better to look into the documentation.

After that you should add pg_stat_wait to shared_preload_libraries. Additionally, you can add the following options to postgresql.conf:

  • waits_monitoring = on - turns the functionality on,
  • pg_stat_wait.history = on - stores a history of wait events,
  • pg_stat_wait.history_size = 1000000 - the number of recent events to keep in the history,
  • pg_stat_wait.history_period = 1000 - how often wait events should be sampled into the history (ms).

After that you should restart PostgreSQL and run CREATE EXTENSION pg_stat_wait. Then everything will start working.


What exactly will start to work? First you may look at what is inside the extension:

rpopdb01g/postgres M # \dxS+ pg_stat_wait
           Objects in extension "pg_stat_wait"
                   Object Description
 function pg_is_in_trace(integer)
 function pg_start_trace(integer,cstring)
 function pg_stat_wait_get_current(integer)
 function pg_stat_wait_get_history()
 function pg_stat_wait_get_profile(integer,boolean)
 function pg_stat_wait_make_test_lwlock(integer,integer)
 function pg_stat_wait_reset_profile()
 function pg_stop_trace(integer)
 function pg_wait_class_list()
 function pg_wait_event_list()
 view pg_stat_wait_current
 view pg_stat_wait_history
 view pg_stat_wait_profile
 view pg_wait_class
 view pg_wait_event
 view pg_wait_events
(16 rows)

rpopdb01g/postgres M #

Let’s see what wait events pg_stat_wait is able to monitor:

rpopdb01g/postgres M # SELECT version();
 PostgreSQL 9.4.5 on x86_64-unknown-linux-gnu, compiled by gcc (GCC) 4.4.7 20120313 (Red Hat 4.4.7-11), 64-bit
(1 row)

rpopdb01g/postgres M # SELECT class_name, count(event_name)
FROM pg_wait_events GROUP BY 1 ORDER BY 2 DESC;
 class_name | count
 LWLocks    |    52
 Storage    |     9
 Locks      |     9
 Network    |     3
 Latch      |     1
 CPU        |     1
(6 rows)

rpopdb01g/postgres M #

You can see that waits monitoring for 9.4 knows about 52 LWLocks, and for disk, for example, it can track the following things:

rpopdb01g/postgres M # SELECT * FROM pg_wait_events WHERE class_id = 3;
 class_id | class_name | event_id | event_name
        3 | Storage    |        0 | SMGR_READ
        3 | Storage    |        1 | SMGR_WRITE
        3 | Storage    |        2 | SMGR_FSYNC
        3 | Storage    |        3 | XLOG_READ
        3 | Storage    |        4 | XLOG_WRITE
        3 | Storage    |        5 | XLOG_FSYNC
        3 | Storage    |        6 | SLRU_READ
        3 | Storage    |        7 | SLRU_WRITE
        3 | Storage    |        8 | SLRU_FSYNC
(9 rows)

rpopdb01g/postgres M #

By “can track” the following is meant:

  • What and how long a particular process is waiting right now?
  • How many times has a particular process waited on each event type, and how much time has it spent waiting?
  • What was a particular process waiting some time ago?

To answer these questions there are pg_stat_wait_current, pg_stat_wait_profile and pg_stat_wait_history respectively. This is best seen with examples.


rpopdb01g/postgres M # SELECT pid, class_name, event_name, wait_time
FROM pg_stat_wait_current WHERE class_id NOT IN (4, 5)
ORDER BY wait_time DESC;
  pid  | class_name |  event_name   | wait_time
 23510 | LWLocks    | BufferLWLocks |     17184
 23537 | LWLocks    | BufferLWLocks |      9367
 23628 | LWLocks    | BufferLWLocks |      9366
 23502 | LWLocks    | BufferLWLocks |      3215
 23504 | LWLocks    | BufferLWLocks |      2846
 23533 | LWLocks    | BufferLWLocks |      2788
 23514 | LWLocks    | BufferLWLocks |      2658
 23517 | LWLocks    | BufferLWLocks |      2658
 23532 | LWLocks    | BufferLWLocks |      2641
 23527 | LWLocks    | BufferLWLocks |      2507
 23952 | Storage    | SMGR_READ     |      2502
 23518 | Storage    | XLOG_FSYNC    |      1576
 23524 | LWLocks    | WALWriteLock  |      1027
(13 rows)

rpopdb01g/postgres M #

We remove waits of the ‘Network’ and ‘Latch’ classes because their waiting time is usually several orders of magnitude longer than waits of other classes. And the columns listed above are not all of the columns that exist in the view:

smcdb01d/postgres M # SELECT * FROM pg_stat_wait_current
WHERE class_id IN (2, 3) LIMIT 2;
-[ RECORD 1 ]-----------------------------
pid        | 12107
sample_ts  | 2015-11-16 10:36:59.598562+03
class_id   | 2
class_name | Locks
event_id   | 4
event_name | Transaction
wait_time  | 24334
p1         | 5
p2         | 255593733
p3         | 0
p4         | 0
p5         | 0
-[ RECORD 2 ]-----------------------------
pid        | 1266
sample_ts  | 2015-11-16 10:36:59.598562+03
class_id   | 3
class_name | Storage
event_id   | 0
event_name | SMGR_READ
wait_time  | 1710
p1         | 1663
p2         | 16400
p3         | 20508
p4         | 0
p5         | 220036

smcdb01d/postgres M #

Parameters p1-p5 are text fields. For example, for heavy-weight locks they give approximately the same information you can see in the pg_locks view, and for disk I/O waits they tell you which DB, relation and block the process was waiting on while reading.


For example, you can see how much time the DB spent in each class of waits:

rpopdb01g/postgres M # SELECT class_name, sum(wait_time) AS wait_time,
sum(wait_count) AS wait_count FROM pg_stat_wait_profile
GROUP BY class_name ORDER BY wait_time DESC;
 class_name |  wait_time   | wait_count
 Network    | 144196945815 |   11877848
 Latch      |  90164921148 |    3521073
 LWLocks    |   2648490737 |   10501900
 Storage    |    977430136 |   36444251
 CPU        |     68890774 |  365699457
 Locks      |           74 |          1
(6 rows)

rpopdb01g/postgres M #

Or which LWLocks are the hottest in the system:

rpopdb01g/postgres M # SELECT event_name, sum(wait_time) AS wait_time,
sum(wait_count) AS wait_count FROM pg_stat_wait_profile
WHERE class_id = 1 AND wait_time != 0 AND wait_count != 0
GROUP BY event_name ORDER BY wait_time DESC;
      event_name      | wait_time  | wait_count
 LockMgrLWLocks       | 1873294341 |    3870685
 WALWriteLock         | 1039279117 |     859101
 BufferLWLocks        |  299153931 |    7356555
 BufFreelistLock      |    7466923 |      75484
 ProcArrayLock        |    2321769 |      34355
 CLogControlLock      |     778148 |      21286
 WALInsertLocks       |     456224 |       7451
 BufferMgrLocks       |     107374 |       8447
 XidGenLock           |      84914 |       2506
 UserDefinedLocks     |       1875 |          7
 CLogBufferLocks      |        868 |         80
 SInvalWriteLock      |         11 |          3
 CheckpointerCommLock |          1 |          1
(13 rows)

Time: 29.388 ms
rpopdb01g/postgres M #

These two examples show that waiting time does not always correlate with wait event counts. That’s why sampling without accounting for waiting time may not give the whole picture.


This view allows you to see what a particular process was waiting on in the past. The storage depth and sampling interval can be configured as shown above.

xivadb01e/postgres M # SELECT sample_ts, class_name, event_name, wait_time
FROM pg_stat_wait_history WHERE pid = 29585 ORDER BY sample_ts DESC LIMIT 10;
           sample_ts           | class_name |   event_name    | wait_time
 2015-11-16 10:56:28.544052+03 | LWLocks    | BufferMgrLocks  |    983997
 2015-11-16 10:56:27.542938+03 | LWLocks    | CLogControlLock |    655975
 2015-11-16 10:56:26.850302+03 | LWLocks    | WALInsertLocks  |    979516
 2015-11-16 10:56:25.849207+03 | LWLocks    | WALInsertLocks  |    207418
 2015-11-16 10:56:24.848059+03 | LWLocks    | WALInsertLocks  |    923916
 2015-11-16 10:56:23.846909+03 | LWLocks    | WALInsertLocks  |    753185
 2015-11-16 10:56:22.845808+03 | LWLocks    | WALInsertLocks  |    877707
 2015-11-16 10:56:21.844718+03 | LWLocks    | WALInsertLocks  |    778897
 2015-11-16 10:56:20.843562+03 | LWLocks    | CLogControlLock |    991267
 2015-11-16 10:56:19.842464+03 | LWLocks    | CLogControlLock |   1001059
(10 rows)

xivadb01e/postgres M #

Session tracing

All of the views described above are designed to be always turned on; their performance overhead is minimal. But there are cases when sampling once per pg_stat_wait.history_period is not enough and you need to see all waits of a particular process. In that case you should use the tracing functions, for example:

rpopdb01g/postgres M # SELECT pg_backend_pid();
(1 row)

rpopdb01g/postgres M # SELECT pg_start_trace(5399, '/tmp/5399.trace');
INFO:  00000: Trace was started to: /tmp/5399.trace
LOCATION:  StartWait, wait.c:259

(1 row)

rpopdb01g/postgres M # SELECT pg_is_in_trace(5399);
(1 row)

-- some activity

rpopdb01g/postgres M # SELECT pg_stop_trace(5399);
INFO:  00000: Trace was stopped
LOCATION:  StartWait, wait.c:265

(1 row)

rpopdb01g/postgres M #

A simple text file is created, with two lines for each wait event, for example:

start 2015-11-16 11:17:26.831686+03 CPU MemAllocation 0 0 0 0 0
stop 2015-11-16 11:17:26.831695+03 CPU
start 2015-11-16 11:17:26.831705+03 LWLocks BufferLWLocks 122 1 0 0 0
stop 2015-11-16 11:17:26.831715+03 LWLocks
start 2015-11-16 11:17:26.831738+03 Network WRITE 0 0 0 0 0
stop 2015-11-16 11:17:26.831749+03 Network
start 2015-11-16 11:17:26.831795+03 Network READ 0 0 0 0 0
stop 2015-11-16 11:17:26.831808+03 Network
start 2015-11-16 11:17:26.831825+03 Storage SMGR_READ 1663 13003 12763 0 13
stop 2015-11-16 11:17:26.831844+03 Storage

Instead of conclusion

The wait interface is a long-awaited feature in PostgreSQL which significantly improves the understanding of what is happening inside the database. Right now this functionality is being pushed into core PostgreSQL so that, starting from 9.6, you would not need to recompile postgres.

Incidentally, shortly before Ildus submitted his implementation to pgsql-hackers@, Robert Haas proposed the same idea and lots of people supported it. To make it happen, a couple of preparatory patches have already been committed, for example the refactoring of LWLock tranches.

I really hope that it will become part of PostgreSQL in 9.6.

A real Python "wat"

By James Bennett from Django community aggregator: Community blog posts. Published on Nov 15, 2015.

A few weeks ago I went through and explained the various items in a list of “Python wats” — behaviors of Python which seemingly made no sense. Calling them “wats” is a bit of a stretch in most cases, though, because most of them were simply consequences of fairly reasonable design decisions in how Python or its standard libraries work, but presented in ways which obscured what was actually going on.

Lest I be accused of ...

Read full entry

Buildout 2.5.0 has much nicer version conflict reporting

By Reinout van Rees' weblog from Django community aggregator: Community blog posts. Published on Nov 15, 2015.

We use buildout for all our django projects. Nothing wrong with pip, but buildout has extension possibilities built-in (for creating directories, installing user crontabs, local development checkouts and many more) that are quite helpful. And it works better when you need to use system packages (gdal, mapnik, etc).

One area where buildout could use some improvement was the version conflict reporting. Let's say you have pinned django to 1.6.6 (an old project that I'll upgrade to 1.8 this week) and you add the django debug toolbar. This is the error you get:

The constraint, 1.6.6, is not consistent with the requirement, 'Django>=1.7'.
  Updating django.
Error: Bad constraint 1.6.6 Django>=1.7

First things first. An easy one is to improve the wording of the message:

  Installing django.
Error: The requirement ('Django>=1.7') is not allowed by
your [versions] constraint (1.6.6)

Now... so there is some package that requires at least django 1.7. But which one? Buildout did not tell you. Which would mean you'd have to grep in all your requirements' sub-requirements for which package actually requires the offending "django>=1.7"...

I've now added some internal logging that stores which package required which dependency. After an error occurs, the list is searched for possible matches.

With this change you'll get a much more helpful output right before the error:

Installing django.
version and requirements information containing django:
  [versions] constraint on django: 1.6.6
  Base installation request: 'sso', 'djangorecipe'
  Requirement of djangorecipe==1.10: Django
  Requirement of djangorecipe==1.10: zc.recipe.egg
  Requirement of djangorecipe==1.10: zc.buildout
  Requirement of sso: django-nose
  Requirement of sso: django-mama-cas
  Requirement of sso: django-debug-toolbar
  Requirement of sso: django-auth-ldap
  Requirement of sso: Django<1.7,>=1.4.2
  Requirement of lizard-auth-server: django-nose
  Requirement of lizard-auth-server: django-extensions
  Requirement of lizard-auth-server: Django<1.7,>=1.6
  Requirement of django-nose: Django>=1.2
  Requirement of django-nose: nose>=1.2.1
  Requirement of django-mama-cas: requests==1.1.0
  Requirement of django-debug-toolbar: sqlparse
  Requirement of django-debug-toolbar: Django>=1.7
  Requirement of django-auth-ldap: python-ldap>=2.0
  Requirement of django-auth-ldap: django>=1.1
  Requirement of translations: Django>=1.4
  Requirement of django-extensions: six>=1.2
  Installing django.
Error: The requirement ('Django>=1.7') is not allowed by
your [versions] constraint (1.6.6)

This makes it much easier to spot the cause (in this case django-debug-toolbar).

There are some unrelated packages in here because I'm doing a textual comparison. The advantage is that it is very robust. And extracting the right package name from requirements without messing things up is harder to get right and takes more code.
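As a rough sketch of that trade-off (this is not buildout's actual code; the lines below are hypothetical stand-ins for the logged requirement information), a textual comparison is just a substring check, which is also why near-misses like django-nose show up in the output:

requirement_lines = [
    "Requirement of djangorecipe==1.10: Django",
    "Requirement of django-debug-toolbar: Django>=1.7",
    "Requirement of django-nose: nose>=1.2.1",
]

def lines_mentioning(package, lines):
    # Textual comparison: robust and simple, but also matches 'django-nose' etc.
    return [line for line in lines if package.lower() in line.lower()]

print(lines_mentioning("django", requirement_lines))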

So... if you use buildout, give version 2.5.0 a try!

Ember application structure

By Piotr Maliński from Django community aggregator: Community blog posts. Published on Nov 15, 2015.

Ember.js applications are divided into multiple files and folders. All of which makes sense when we get to know what is where, so let's take a look at the Ember application structure.

Andreas Scherbaum: FOSDEM PGDay and Devroom 2016 - Announcement & Call for Papers

From Planet PostgreSQL. Published on Nov 15, 2015.

Andreas 'ads' Scherbaum

FOSDEM PGDay is a one day conference that will be held ahead of FOSDEM in Brussels, Belgium, on Jan 29th, 2016. This will be a one-day focused PostgreSQL event, with a single track of talks. Registration is required to attend, the registration will open soon. Since we have a limited number of seats available for this event, we urge everybody to register as soon as possible once open.

PostgreSQL Europe will also have our regular devroom at FOSDEM on Sunday the 31st, which will be held at the main FOSDEM venue at ULB. This day will, of course, continue to be free of charge and open to all FOSDEM entrants. No registration is required to attend this day.

For full details about the conference, venue and hotel, see

The call for papers is now open for both these events. We are looking for talks to fill both these days with content for both insiders and new users. Please see for details and submission information.

The deadline for submissions is December 7th, 2015, but we may as usual pre-approve some talks, so get your submissions in soon!

We will provide a special rate with the Brussels Marriott Hotel. For details, see

Setting up ember-cli development environment with ember 2.1

By Piotr Maliński from Django community aggregator: Community blog posts. Published on Nov 15, 2015.

In a series of tutorials starting with this one I'll try to showcase the Ember.js framework for building frontend web applications. As a backend there will be Django REST Framework and more.

As times change and JavaScript frameworks don't just get downloaded to your static folder, I'll start with setting up an Ember.js development environment with ember-cli.

Peter Geoghegan: Suggesting a corrected column name/spelling in the event of a column misspelling

From Planet PostgreSQL. Published on Nov 14, 2015.

One small PostgreSQL 9.5 feature I worked on is the new hinting mechanism, which sometimes hints, based on a score, at what you might have meant to type when you misspell a column name in an SQL query. The score heavily weighs Levenshtein distance. A HINT message is sent to the client, which psql and other client tools will display by default.
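As a rough illustration of that kind of scoring (a sketch only, not the server's actual code), ranking candidate column names by Levenshtein distance might look like this:

def levenshtein(a, b):
    # classic dynamic-programming edit distance
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution
        prev = curr
    return prev[-1]

# hypothetical columns of an "orders" table like the one in the examples below
candidates = ['orderid', 'orderdate', 'customerid', 'netamount']
print(min(candidates, key=lambda c: levenshtein('order_id', c)))  # orderid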

It's common to not quite recall offhand if a column name is pluralized, or where underscores are used to break up words that make up the name of a column. This feature is targeted at that problem, providing guidance that allows the user to quickly adjust their query without mental context switching. For example:

postgres=# select * from orders where order_id = 5;
ERROR:  42703: column "order_id" does not exist
LINE 1: select * from orders where order_id = 5;
HINT:  Perhaps you meant to reference the column "orders"."orderid".

You may also see a hint in the case of two possible matches, provided both matches have the same score, and the score crosses a certain threshold of assumed usefulness:

postgres=# select * from orders o join orderlines ol on o.orderid = ol.orderid where order_id = 5;
ERROR:  42703: column "order_id" does not exist
LINE 1: ...oin orderlines ol on o.orderid = ol.orderid where order_id =...
HINT:  Perhaps you meant to reference the column "o"."orderid" or the column "ol"."orderid".

If an alias was used here (which this query must have anyway), the hint becomes more specific:

postgres=# select * from orders o join orderlines ol on o.orderid = ol.orderid where o.order_id = 5;
ERROR:  42703: column o.order_id does not exist
LINE 1: ...oin orderlines ol on o.orderid = ol.orderid where o.order_id...
HINT:  Perhaps you meant to reference the column "o"."orderid".

This feature should make writing queries interactively in psql a bit more pleasant. Mental context switching to figure these incidental details out has a tendency to slow things down.

We need less powerful languages

By Luke Plant from Django community aggregator: Community blog posts. Published on Nov 14, 2015.

Many systems boast of being ‘powerful’, and it sounds difficult to argue that this is a bad thing. Almost everyone who uses the word assumes that it is always a good thing.

The thesis of this post is that in many cases we need less powerful languages and systems.

Before I get going, I should say first of all that very little in this post is original. The train of thought behind it was set off by reading Hofstadter's book Gödel, Escher, Bach — an Eternal Golden Braid which helped me pull together various things in my own thinking where I've seen the principle in action. Philip Wadler's post on the rule of least power was also formative, and most of all I've also taken a lot from the content of this video from a Scala conference about everything that is wrong with Scala, which makes the following fairly central point:

Every increase in expressiveness brings an increased burden on all who care to understand the message.

My aim is simply to illustrate this point using examples that might be more accessible to the Python community than the internals of a Scala compiler.

I also need a word about definitions. What do we mean by “more powerful” or “less powerful” languages? In this article, I mean something roughly like this: “the freedom and ability to do whatever you want to do”, seen mainly from the perspective of the human author entering data or code into the system. This roughly aligns with the concept of “expressiveness”.

The problem with this kind of freedom is that every bit of power you insist on having when writing in the language corresponds to power you must give up at other points of the process — when ‘consuming’ what you have written. I'll illustrate this with various examples which range beyond what might be described as programming, but have the same principle at heart.

We'll also need to ask “Does this matter?” It matters, of course, to the extent that you need to be able to ‘consume’ the output of your system. Different players who might ‘consume’ the message are software maintainers, compilers and other development tools, which means you almost always care — this has implications both for performance and correctness as well as human concerns.

Databases and schema

Starting at the low end of the scale in terms of expressiveness, there is what you might call data rather than language. But the principle applies here.

In my years of software development, I've found that clients and users often ask for “free text” fields. A free text field is maximally powerful as far as the end user is concerned — they can put whatever they like in. In this sense, this is the “most useful” field — you can use it for anything.

But precisely because of this, it is also the least useful, because it is the least structured. Even search doesn't work reliably because of typos and alternative ways of expressing the same thing. The longer I do software development involving databases, the more I want to tightly constrain everything as much as possible. When I do so, the data I end up with is massively more useful. I can do powerful things when consuming the data only when I severely limit the power (i.e. the freedom) of the agents putting data into the system.

In terms of database technologies, the same point can be made. Databases that are “schema-less” give you great flexibility and power when putting data in, and are extremely unhelpful when getting it out. A key-value store is a more technical version of "free text", with the same drawbacks — it is pretty unhelpful when you want to extract info or do anything with the data, since you cannot guarantee that any specific keys will be there.


The success of the web has been partly due to the fact that some of the core technologies, HTML and CSS, have been deliberately limited in power. Indeed, you probably wouldn't call them programming languages, but markup languages. This, however, was not an accident, but a deliberate design principle on the part of Tim Berners-Lee. I can't do better than to quote that page at length:

Computer Science in the 1960s to 80s spent a lot of effort making languages which were as powerful as possible. Nowadays we have to appreciate the reasons for picking not the most powerful solution but the least powerful. The reason for this is that the less powerful the language, the more you can do with the data stored in that language. If you write it in a simple declarative form, anyone can write a program to analyze it in many ways. The Semantic Web is an attempt, largely, to map large quantities of existing data onto a common language so that the data can be analyzed in ways never dreamed of by its creators. If, for example, a web page with weather data has RDF describing that data, a user can retrieve it as a table, perhaps average it, plot it, deduce things from it in combination with other information. At the other end of the scale is the weather information portrayed by the cunning Java applet. While this might allow a very cool user interface, it cannot be analyzed at all. The search engine finding the page will have no idea of what the data is or what it is about. The only way to find out what a Java applet means is to set it running in front of a person.

This has become a W3C principle:

Good Practice: Use the least powerful language suitable for expressing information, constraints or programs on the World Wide Web.

Note that this is almost exactly the opposite of Paul Graham's advice (with the caveat that 'power' is often too informally defined to compare):

if you have a choice of several languages, it is, all other things being equal, a mistake to program in anything but the most powerful one.

Python MANIFEST.in file

Moving up towards ‘proper’ programming languages, I came across this example — the MANIFEST.in file format used by distutils/setuptools. If you have had to create a package for a Python library, you may well have used it.

The MANIFEST.in format is essentially a very small language for defining what files should be included in your Python package (relative to the directory containing it, which we'll call the working directory from now on). It might look something like this:

include README.rst
recursive-include foo *.py
recursive-include tests *
global-exclude *~
global-exclude *.pyc
prune .DS_Store

There are two types of directive: include type directives (include, recursive-include, global-include and graft), and exclude type directives (exclude, recursive-exclude, global-exclude and prune).

There comes a question — how are these directives to be interpreted?

You could interpret them in this way:

A file from the working directory (or sub-directories) should be included in the package if it matches at least one include type directive, and does not match any exclude type directive.

This would make it a declarative language.

Unfortunately, that is not how the language is defined. The distutils docs for MANIFEST.in are specific about this — the directives are to be understood as follows (my paraphrase):

  1. Start with an empty list of files to include in the package (or technically, a default list of files).
  2. Go down the directives in the MANIFEST.in in order.
  3. For every include type directive, copy any matching files from the working directory to the list for the package.
  4. For every exclude type directive, remove any matching files from the list for the package.

As you can see, this interpretation defines a language that is imperative in nature - each line of MANIFEST.in is a command that implies an action with side effects.
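
To make the semantics concrete, here is a minimal sketch of the imperative interpretation (a simplification: only the directives used in this post, plain fnmatch globs, '/' path separators, and no default file list):

import fnmatch

def evaluate(directives, all_files):
    # Interpret (directive, args) pairs in order, as commands that add
    # files to or remove files from the output list.
    included = []
    for directive, args in directives:
        if directive == 'include':
            included += [f for f in all_files
                         if any(fnmatch.fnmatch(f, p) for p in args)]
        elif directive == 'recursive-include':
            dir_, patterns = args[0], args[1:]
            included += [f for f in all_files
                         if f.startswith(dir_ + '/')
                         and any(fnmatch.fnmatch(f.rsplit('/', 1)[-1], p)
                                 for p in patterns)]
        elif directive == 'recursive-exclude':
            dir_, patterns = args[0], args[1:]
            included = [f for f in included
                        if not (f.startswith(dir_ + '/')
                                and any(fnmatch.fnmatch(f.rsplit('/', 1)[-1], p)
                                        for p in patterns))]
        elif directive == 'global-exclude':
            included = [f for f in included
                        if not any(fnmatch.fnmatch(f.rsplit('/', 1)[-1], p)
                                   for p in args)]
    return sorted(set(included))  # duplicates and ordering are irrelevant in the result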

The point to note is that this makes the language more powerful than my speculative declarative version above. For example, consider the following:

recursive-include foo *
recursive-exclude foo/bar *
recursive-include foo *.png

The end result of the above commands is that .png files that are below foo/bar are included, but all other files below foo/bar are not. If I'm thinking straight, to replicate the same result using the declarative language is harder — you would have to do something like the following, which is obviously sub-optimal:

recursive-include foo *
recursive-exclude foo/bar *.txt *.rst *.gif *.jpeg *.py ...

So, because the imperative language is more powerful, there is a temptation to prefer that one. However, the imperative version comes with significant drawbacks:

  1. It is much harder to optimise.

     When it comes to interpreting the MANIFEST.in and building a list of files to include in the package, one fairly efficient solution for a typical case is to first build an immutable list of all files in the directory and its sub-directories, and then apply the rules: addition rules involve copying from the full list to an output list, and subtraction rules involve removing from the output list. This is how the Python implementation currently does it.

    This works OK, unless you have many thousands of files in the full list, most of which are going to get pruned or not included, in which case you can spend a lot of time building up the full list, only to ignore most of it.

    An obvious shortcut is to not recurse into directories that would be excluded by some exclude directive. However, you can only do that if the exclude directives come after all include directives.

     This is not a theoretical problem — I've found that doing sdist and other commands can take 10 minutes to run, due to the large number of files in the working directory if you use the tool tox, for instance. This means that runs of tox itself (which uses sdist) become very slow. I am currently attempting to fix this issue, but it is looking like it will be really hard.

    Adding the optimised case might not look that hard (you can shortcut the file system traversal using any exclude directives that come after all include directives), but it adds sufficiently to the complexity that a patch is unlikely to be accepted — it increases the number of code paths and the chances of mistakes to the point of it not being worth it.

     It might be that the only practical solution is to avoid MANIFEST.in altogether and optimise only the case when it is completely empty.

  2. The power has a second cost — MANIFEST.in files are harder to understand.

    First, in understanding how the language works — the docs for this are considerably longer than for the declarative version I imagined.

    Second, in analysing a specific file — you have to execute the commands in your head in order to work out what the result will be, rather than being able to take each line on its own, or in any order that makes sense to you.

    This actually results in packaging bugs. For instance, it would be easy to believe that a directive like:

     global-exclude *~

    at the top of a file would result in any file name ending in ~ (temporary files created by some editors) being excluded from the package. In reality it does nothing at all, and the files will be erroneously included if other commands include them.

    Examples I've found of this mistake (exclude directives that don't function as intended or are useless) include:

    • hgview (exclude directives at the top do nothing)
    • django-mailer (global-exclude at the top does nothing)
  3. Another result is that you cannot group lines in the file in any way you please, for clarity, since re-ordering changes the meaning of the file.

In addition, virtually no-one will actually use the additional power. I'm willing to bet that 99.99% of MANIFEST.in files do not make use of the additional power of the imperative language (I downloaded 250 MANIFEST.in files and haven't found any that do). So we could have been served much better by a declarative language here instead of an imperative one. But backwards compatibility forces us to stick with this.

URL reversing

One core piece of the Django web framework is URL routing. This is the component that parses URLs and dispatches them to the handler for that URL, possibly passing some components extracted from the URL.

In Django, this is done using regular expressions. For an app that displays information about kittens, you might have a kittens/urls.py with the following:

from django.conf.urls import url

from kittens import views

urlpatterns = [
    url(r'^kittens/$', views.list_kittens, name="kittens_list_kittens"),
    url(r'^kittens/(?P<id>\d+)/$', views.show_kitten, name="kittens_show_kitten"),
]

The corresponding views.py file looks like:

def list_kittens(request):
    # ...

def show_kitten(request, id=None):
    # ...

Regular expressions have a capture facility built in, which is used to capture parameters that are passed to the view functions. So, for example, a URL like /kittens/23/ results in calling the Python code show_kitten(request, id=23).

Now, as well as being able to route URLs to specific functions, web apps almost always need to generate URLs. For example, the kitten list page will need to include links to the individual kitten page i.e. show_kitten. Obviously we would like to do this in a DRY way, re-using the URL routing configuration.

However, we would be using the URL routing configuration in the opposite direction. When doing URL routing, we are doing:

URL path -> (handler function, arguments)

In URL reversing, we know the handler function and arguments we want the user to arrive at, and want to generate a URL that will take the user there, after going through the URL routing:

(handler function, arguments) -> URL path

In order to do this, we essentially have to predict the behaviour of the URL routing mechanism. We are asking “given a certain output, what is the input?”

In the very early days Django did not include this URL reversing facility, but it was found that with most URLs, it was possible to 'reverse' the URL pattern. The regex can be parsed looking for the static elements and the capture elements.
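
As a toy illustration of the idea (this is not Django's actual implementation), a pattern whose only wild cards live inside named capture groups can be reversed by substituting the arguments back into those groups and stripping the anchors:

import re

GROUP_RE = re.compile(r'\(\?P<(\w+)>[^)]*\)')  # finds (?P<name>...) groups, assuming no nesting

def naive_reverse(pattern, **kwargs):
    # Substitute keyword arguments back into a simple URL regex.
    path = GROUP_RE.sub(lambda m: str(kwargs[m.group(1)]), pattern)
    return path.lstrip('^').rstrip('$')

print(naive_reverse(r'^kittens/(?P<id>\d+)/$', id=23))  # kittens/23/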

Note, first of all, that this is only possible at all because the language being used to define URL routes is a limited one — regular expressions. We could easily have defined URL routes using a more powerful language. For example, we could have defined them using functions that:

  • take a URL path as input
  • raise NoMatch if they do not match
  • return a truncated URL and an optional set of captures if they do match.

Our kittens urls.py would look something like this:

from django.conf.urls import url, NoMatch

def match_kitten(path):
    KITTEN = 'kitten/'
    if path.startswith(KITTEN):
        return path[len(KITTEN):], {}
    raise NoMatch()

def capture_id(path):
    part = path.split('/')[0]
    try:
        id = int(part)
    except ValueError:
        raise NoMatch()
    return path[len(part)+1:], {'id': id}

urlpatterns = [
    url([match_kitten], views.list_kittens, name='kittens_list_kittens'),
    url([match_kitten, capture_id], views.show_kitten, name="kittens_show_kitten"),
]

Of course, we could provide helpers that make things like match_kitten and capture_id much more concise:

from django.conf.urls import url, m, c

urlpatterns = [
    url([m('kitten/')], views.list_kittens, name='kittens_list_kittens'),
    url([m('kitten/'), c(int)], views.show_kitten, name="kittens_show_kitten"),
]
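
Here m and c are not existing Django APIs but hypothetical helpers (NoMatch being the same hypothetical import as before). They might be implemented along these lines, with c defaulting the capture name to 'id' as a simplification:

def m(prefix):
    # Return a matcher that consumes a literal prefix of the path.
    def matcher(path):
        if path.startswith(prefix):
            return path[len(prefix):], {}
        raise NoMatch()
    return matcher

def c(converter, name='id'):
    # Return a matcher that captures one path segment via `converter`.
    def matcher(path):
        part = path.split('/')[0]
        try:
            value = converter(part)
        except ValueError:
            raise NoMatch()
        return path[len(part) + 1:], {name: value}
    return matcher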

Now, this language for URL routing is actually a lot more powerful than our regex based one, assuming that m and c are returning functions as above. The interface for matching and capturing is not limited to the capabilities of regexes — for instance, we could do database lookups for the IDs, or many other things.

The downside, however, is that URL reversing would be entirely impossible. For general, Turing complete languages, you cannot ask “given this output, what is the input?”. We could potentially inspect the source code of the function and look for known patterns, but it quickly becomes totally impractical.

With regular expressions, however, the limited nature of the language gives us more options. In general, URL configuration based on regexes is not reversible — a regex as simple as . cannot be reversed uniquely. (Since we want to generate canonical URLs normally, a unique solution is important. As it happens, for this wild card, Django currently picks an arbitrary character, but other wild cards are not supported). But as long as wild cards of any sort are only found within capture groups (and possibly some other constraints), the regex can be reversed.

So, if we want to be able to reliably reverse the URL routes, we actually want a language less powerful than regular expressions. Regular expressions were presumably chosen because they were powerful enough, without realising that they were too powerful.

Additionally, in Python defining mini-languages for this kind of thing is quite hard, and requires a fair amount of boilerplate and verbosity both for implementation and usage — much more than when using a string based language like regexes. In languages like Haskell, relatively simple features like easy definitions of algebraic data types and pattern matching make these things much easier.

Regular expressions

The mention of regexes as used in Django's URL routing reminds me of another problem:

Many usages of regexes are relatively simple, but whenever you invoke a regex, you get the full power whether you need it or not. One consequence is that for some regular expressions, the need to do backtracking to find all possible matches means that it is possible to construct malicious input that takes a huge amount of time to be processed by the regex implementation.
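
A classic illustration of the problem (not the specific Django regex) is a pattern with nested quantifiers, which has exponentially many ways to backtrack over input that almost matches:

import re
import time

evil = re.compile(r'^(a+)+$')  # nested quantifiers: many ways to split a run of a's

start = time.time()
evil.match('a' * 25 + 'b')     # no match, but only after exploring a huge number of splits
print('took %.1f seconds' % (time.time() - start))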

This has been the cause of a whole class of Denial Of Service vulnerabilities in many web sites and services, including one in Django due to an accidentally 'evil' regex in the URL validator — CVE-2015-5145.

Django templates vs Jinja templates

The Jinja template engine was inspired by the Django template language, but with some differences in philosophy and syntax.

One major advantage of Jinja2 over Django is performance. Jinja2's implementation strategy is to compile templates to Python code, rather than running an interpreter written in Python, which is how Django's template engine works, and this results in a big performance increase — often 5 to 20 times. (YMMV etc.)

Armin Ronacher, the author of Jinja, attempted to use the same strategy to speed up Django template rendering. There were problems, however.

The first problem he knew about when he proposed the project: the extension API in Django makes the approach taken in Jinja very difficult. Django allows custom template tags that have almost complete control over the compilation and rendering steps. This allows some powerful custom template tags, like addtoblock in django-sekizai, that seem impossible at first glance. However, if a slower fallback were provided for these less common situations, a fast implementation might still have been useful.

However, there is another key difference that affects a lot of templates, which is that the context object that is passed in (which holds the data needed by the template) is writable within the template rendering process in Django. Template tags are able to assign to the context, and in fact some built-in template tags like url do just that.

The result of this is that a key part of the compilation to Python that happens in Jinja is impossible in Django.
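
A toy illustration of the obstacle (these are not Django's or Jinja's real node classes): once any node may write to the context while rendering, a later variable lookup cannot be compiled down to a pre-resolved local variable; it has to remain a dynamic lookup:

class AssignNode:
    # Stands in for a tag like {% url ... as kitten_url %} that writes to the context.
    def render(self, context):
        context['kitten_url'] = '/kittens/23/'
        return ''

class VariableNode:
    # Stands in for {{ kitten_url }}: the lookup has to stay dynamic, because
    # an earlier node may have created the name during this very render pass.
    def __init__(self, name):
        self.name = name

    def render(self, context):
        return str(context.get(self.name, ''))

nodes = [AssignNode(), VariableNode('kitten_url')]
context = {}
print(''.join(node.render(context) for node in nodes))  # /kittens/23/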

Notice that in both of these cases, it is the power of Django's template engine that is the problem — it allows template authors to do things that are not possible in Jinja2. However, the result is that a very large obstacle is placed in the way of attempts to compile to fast code.

This is not a theoretical consideration. At some point, performance of template rendering becomes an issue for many projects, and a number have been forced to switch to Jinja because of that. This is far from an optimal situation!

Often the issues that make optimisation difficult are only clear with the benefit of hindsight, and it isn't true to say that simply adding restrictions to a language is necessarily going to make it easier to optimise. You might also say that for the Django template designers, allowing the context object to be writable was the obvious choice because Python data structures are typically mutable by default. Which brings us to Python...

Python

There are many ways that we could think about the power of the Python language, and how it makes life hard for every person and program that wants to make sense of Python code.

Compilation and performance of Python is an obvious one. The unrestricted effects that are possible at any point, including writable classes and modules etc., not only allow authors to do some very useful things, they make it extremely difficult to execute Python code quickly. PyPy has made some impressive progress, but looking at the curve from PyPy 1.3 onward, which shows diminishing returns, makes it clear that they are unlikely to make much bigger gains in the future. And the gains that have been made in terms of run time have often been at the expense of memory usage. There is simply a limit to how well you can optimise Python code.

(Please note, to all who continue reading this — I'm not a Python basher, or a Django basher for that matter. I'm a core developer of Django, and I use Python and Django in almost all my professional programming work. The point of this post is to illustrate the problems caused by powerful languages).

However, rather than focus on the performance problems of Python, I'm going to talk about refactoring and maintenance. If you do any serious work in a language, you find yourself doing a lot of maintenance, and being able to do it quickly and safely often becomes very important.

So, for example, in Python, and with typical VCS tools (Git or Mercurial, for instance), if you re-order functions in a module e.g. move a 10 line function to a different place, you get a 20 line diff, despite the fact that nothing changed in terms of the meaning of the program. And if something did change (the function was both moved and modified), it's going to be very difficult to spot.

This happened to me recently, and set me off thinking just how ridiculously bad our toolsets are. Why on earth are we treating our highly structured code as a bunch of lines of text? I can't believe that we are still programming like this; it is insane.

At first, you might think that this could be solved with a more intelligent diff tool. But the problem is that in Python, the order in which functions are defined can in fact change the meaning of a program (i.e. change what happens when you execute it).

Here are a few examples:

Using a previously defined function as a default argument:

def foo():
    pass

def bar(a, callback=foo):
    pass

These functions can't be re-ordered or you'll get a NameError for foo in the definition of bar.

Using a decorator:

@decorateit
def foo():
    pass

@decorateit
def bar():
    pass

Due to unrestricted effects that are possible in @decorateit, you can't safely re-order these functions and be sure the program will do the same thing afterwards. Similarly, calling some code in the function argument list:

def foo(x=Something()):
    pass

def bar(x=Something()):
    pass

Similarly, class level attributes can't be re-ordered safely:

class Foo():
    a = Bar()
    b = Bar()

Due to unrestricted effects possible inside the Bar constructor, the definitions of a and b cannot be re-ordered safely. (This might seem theoretical, but Django, for instance, actually uses this ability inside Model and Form definitions to provide a default order for the fields, using a class level counter inside the base Field constructor).
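
A minimal sketch of that counter technique (this is not Django's actual code):

import itertools

class Field:
    _counter = itertools.count()  # shared, class-level counter

    def __init__(self):
        # Record how many Field instances existed before this one,
        # which is exactly the declaration order within a class body.
        self.creation_counter = next(Field._counter)

class Form:
    name = Field()
    email = Field()

fields = sorted(
    ((attr, value) for attr, value in vars(Form).items() if isinstance(value, Field)),
    key=lambda item: item[1].creation_counter,
)
print([attr for attr, _ in fields])  # ['name', 'email']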

Ultimately, you have to accept that a sequence of function statements in Python is a sequence of actions in which objects (functions and default arguments) are created, possibly manipulated, etc. It is not a re-orderable set of function declarations as it might be in other languages.

This gives Python a certain power when it comes to writing it, but imposes massive restrictions on what you can do in any automated way to manipulate Python source code.

Above I used the simple example of re-ordering two functions or class attributes. But every single type of refactoring that you might do in Python becomes virtually impossible to do safely because of the power of the language e.g. duck typing means you can't do method renames, the possibility of reflection/dynamic attribute access (getattr and friends) means you can't in fact do any kind of automated renames (safely).
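
For instance, an entirely ordinary dynamic attribute access is enough to defeat an automated rename:

class Kitten:
    def purr(self):
        return "prrr"

action = "purr"                     # could come from config, user input, a database...
print(getattr(Kitten(), action)())  # renaming purr() cannot safely rewrite this string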

So, if we are tempted to blame our crude VCS or refactoring tools, we actually have to blame the power of Python — despite the huge amount of structure in correct Python source code, there is very little that any software tool can do with it when it comes to manipulating it, and line-based diffing is actually a reasonable approach.

In an ideal world, with my dream language, when you rename a function, the entire 'diff' in your VCS should simply be "Function foo renamed to bar". (And, this should be exportable, so that when you upgrade a dependency to a version in which foo is renamed to bar, it should be exactly zero work to deal with this). In a “less powerful” language, this would be possible, but the power given to the program author in Python has taken power from all the other tools in the environment.

Does this matter? It depends on how much time you spend manipulating your code, compared to using code to manipulate data.

At the beginning of a project, you may be tempted to desire the most powerful language possible, because it gives you the most help and freedom in terms of manipulating data. But later on, you spend a huge amount of time manipulating code, and often using an extremely basic tool to do so — a text editor. This treats your highly structured code as one of the least structured forms of data — a string of text — exactly the kind of manipulation you would avoid at all costs inside your code. But all the practices you would choose and rely on inside your program (manipulating all data inside appropriate containers) are no longer available to you when it comes to manipulating the program itself.

Some popular languages make automated refactoring easier, but more is needed: to actually make use of the structure of your code, you need an editor and VCS that understand your code properly. Projects like Lamdu are a step in the right direction, but they are still in their infancy, and unfortunately involve re-thinking the entire software development stack.

Conclusion

When you consider the total system and all the players (whether software or human), including the need to produce efficient code, and long term maintainability, less powerful languages are actually more powerful — “slavery is freedom”. There is a balance between expressiveness and reasonability.

The more powerful a language, the greater the burden on software tools, which either need to be more complicated in order to work, or are forced to do less than they could. This includes:

  • compilers — with big implications for performance
  • automated refactoring and VCS tools — with big implications for maintenance.

Similarly, the burden also increases for humans — for anyone attempting to understand the code or modify it.

A natural instinct is to go for the most powerful solution, or a solution that is much more powerful than is actually needed. We should try to do the opposite — find the least powerful solution that will do the job.

This won't happen if creating new languages (which might involve parsers etc.) is hard work. We should prefer software ecosystems that make it easy to create very small and weak languages.

Andrew Dunstan: A record with all defaults

From Planet PostgreSQL. Published on Nov 14, 2015.

Fresh from yesterday's help file:

Say you want to insert a record into your table with all the default values set. Later on you'll update some of the values. You might want to do this so you can get the row's ID field. So you try this:
insert into mytable values () returning my_id;
but you find it doesn't work. VALUES isn't allowed to have an empty list. Here's what you do:
insert into mytable values (default) returning my_id;
This will insert the default value explicitly into the first column in the table, and implicitly into all the other columns. It doesn't matter if any of the columns doesn't have an explicit default value set - in that case the default is NULL. Of course, that means this won't work if you have NOT NULL columns without a default set.

Abdul Yadi: Extension for QR Code Bitmap

From Planet PostgreSQL. Published on Nov 13, 2015.

I have gone through an interesting small-size C code package for QR encoding. Its logic is contained in two small files: QR_Encode.h (3.7KB) and QR_Encode.c (61KB). Then, on a sun-shiny Saturday morning, I managed to add in-memory monochrome bitmap construction (1 bit per pixel) and wrap the whole package as a PostgreSQL extension module. I share it at qrcode.tar.bz2.

Please modify PG_CONFIG = /opt/pgsql/9.4/bin/pg_config in Makefile as necessary. Then make and make install.

Connect to a database, then invoke the SQL command CREATE EXTENSION qr;

SELECT qr('QR Code with PostgreSQL', 0, 0, 4);

  1. Text to be encoded.
  2. Error correction level (0 to 3).
  3. Accepted model number (0 to 2).
  4. Scale 4 means: a dot in QR image will be 4 pixel width and 4 pixel height.

It returns byte array representing monochrome bitmap. You can save it as a file or directly render it to HTML page.
QRCode Demo

Bruce Momjian: My Daughter Speaks

From Planet PostgreSQL. Published on Nov 13, 2015.

My daughter Catherine, then 12 years old, was interviewed at Postgres Open 2013 by Craig Kerstiens and Selena Deckelmann. Community members who have heard the 23-minute recording have found it humorous, and I have recently received permission to release it to a wider audience.

Shaun M. Thomas: PG Phriday: Parallel Sequence Scans

From Planet PostgreSQL. Published on Nov 13, 2015.

A couple days ago, Robert Haas announced that he checked in the first iteration of parallel sequence scans in the Postgres 9.6 branch. And no, that’s not a typo. One of the great things about the Postgres devs is that they have a very regimented system of feature freezes to help ensure timely releases. Thus even though 9.5 just released its second beta, they’re already working on 9.6.

So what is a sequence scan, and why does this matter? Past articles have covered this, but Postgres uses sequence scans when it needs to read the entire contents of a table. For larger entities consisting of tens or hundreds of millions of rows, this is a time-consuming process. Even the best processor can only handle a few million rows per second, so as scale increases vertically, there’s currently no way to address these larger data volumes efficiently without significant effort.

As slow as a sequence scan is, it’s also a relatively simple process, and a great starting point for the long road to a parallel Postgres engine. Postgres knows how big the table files are and obviously knows its own storage format, so it merely needs to set scan ranges and farm them out to workers that report the results back to a coordinator process. In theory, that’s a clean transition that avoids complicated aggregation rules or clause pushdown headaches.

But how well does it work? To see this for myself, I did something I’ve actually never done before now: download the Postgres git tree. Then I set up a test case:

CREATE TABLE test_tab AS
SELECT a.id, repeat(' ', 20) AS junk
  FROM generate_series(1, 20000000) a(id);
ANALYZE test_tab;

With this test table, there are 20-million rows of empty junk and no indexes, so we’re effectively forcing Postgres to use a sequence scan for any query. Then we have to enable the feature with the max_parallel_degree parameter. And finally we invoke a query with a naive WHERE clause applied to every row so the engine has to actually do some work.

SET max_parallel_degree TO 1;
EXPLAIN ANALYZE
SELECT *
  FROM test_tab
 WHERE junk LIKE '%s%';
                             QUERY PLAN
 Gather  (cost=1000.00..265706.30 ROWS=1 width=25)
         (actual TIME=1832.456..1832.456 ROWS=0 loops=1)
   NUMBER OF Workers: 1
   ->  Parallel Seq Scan ON test_tab
         (cost=0.00..264706.20 ROWS=1 width=25)
         (actual TIME=1828.830..5489.878 ROWS=0 loops=1)
         FILTER: (junk ~~ '%s%'::text)
         ROWS Removed BY FILTER: 29594528
 Planning TIME: 0.062 ms
 Execution TIME: 1834.332 ms

There’s a lot going on here. First, we need to talk about how many processes actually worked on this query. The max_parallel_degree parameter controls how many background workers assist the main process, so there are actually two Postgres processes performing independent sequence scans. Some parallel systems use the parent as a mere coordinator, so we might expect the number of workers to be greater than 1 before actual parallel operation occurs. That isn’t the case with this implementation.

The query itself simply asks for something that doesn’t exist in our sample set. This helps us get a best-case scenario where no results are handed between the processes, but time is still consumed scanning table data. And the resulting plan from that query is rather different from a standard scan. We can see how many extra workers were involved and that the results were “gathered” by an extra execution layer.

The only odd detail is that 29-million rows were removed from the results of a 20-million row table. We’ll just chalk that up as an implementation quirk considering this is pre-alpha code. Otherwise, this patch appears to scale in a relatively linear manner. Let’s check out a few different variants of max_parallel_degree.

Workers   Avg Time (s)
0         3.8
1         1.9
2         1.3
3         1.1
4         0.9

There’s a bit of jitter in the timings on our test system, but the trend is fairly clear. With no extra workers, one process can scan a 20M row table in about four seconds. With three extra workers, those four processes can perform the same task in about one second.

This iteration of the patch takes the size of the table into account, possibly to compensate for implicit worker management overhead. With 20M rows, we couldn’t get more than five dedicated workers, while 100M rows got seven workers, which together with the parent process saturated all eight of our CPU cores. Beyond that are a couple important caveats:

  1. Aggregates are not currently handled.
  2. Certain clauses are not pushed down into the workers.

The first can actually be circumvented rather easily. For example:

SET max_parallel_degree TO 3;
SELECT count(*)
  FROM (
    SELECT *
      FROM test_tab
     WHERE junk LIKE '%s%'
) s;
(1 ROW)
TIME: 1111.904 ms

The second is a natural result of the patch’s immaturity. We need to have parallel functionality before it can be optimized. I’m perfectly content waiting for it to be done right. In the meantime, we have all of the functionality added to make this possible. After 9.4 added background workers, new Postgres extensions began leveraging them. And now 9.6 will probably use them for their original purpose, based on how stable the patch appears so far.

It’s exciting to see that Postgres will finally be able to scale vertically in a way that can handle the large tables some organizations have been accumulating. We have a 30TB database with tens of billions of rows. Even though we’ll be sharding that in the future, looming parallel features imply the shards themselves will scale as we cross over into trillions of rows.

It’s an exciting time to be a Postgres end-user.

Terry Erisman: PGConf Silicon Valley Keynotes & Community Happy Hour

From Planet PostgreSQL. Published on Nov 13, 2015.

We are pleased to announce the final agenda of keynote addresses for the first annual PGConf Silicon Valley conference. Keynotes will be delivered by Jay Kreps, CEO of Confluent, Ivan Novick, Greenplum Database Product Manager at Pivotal Software, and Umur Cubukcu, CEO and Co-founder of Citus Data. The conference is next week, November 17-18, 2015, at the South San Francisco Conference Center. Organized by Citus Data in cooperation with the San Francisco PostgreSQL Users Group, PGConf Silicon Valley is a technical conference aimed at the Silicon Valley PostgreSQL community and beyond.

This year's conference theme is "High Performance & Scalability". Breakout session tracks include DevOps, New Features, PostGIS, Tales from the Trenches, Hacking Postgres, and Data at Scale.

The Keynotes

On Wednesday, November 18, we will hear three keynote addresses that will launch a full day of breakout sessions:

  • Apache Kafka: A Commit Log for the Datacenter
    Jay Kreps, CEO of Confluent, will discuss what database logs, event data, stream processing and Franz Kafka have to do with each other.
  • PostgreSQL License, Vendor Eco-systems, Cloud Providers, and Future of RDBMS
    Ivan Novick, Greenplum Database Product Manager at Pivotal Software, will explore how the permissive PostgreSQL license has created an ecosystem of vendors providing spin-off products based on the PostgreSQL database. He will cover the evolution of PostgreSQL innovations, ranging from databases for big data analytics to OLTP, as well as running them as shrink-wrapped software or in the cloud. He will also provide a vision for how these initiatives are converging to shape the future of relational databases.
  • Scaling Out PostgreSQL in the Distributed Era
    Umur Cubukcu, CEO and Co-founder of Citus Data.

Community Happy Hour

On Tuesday, November 17, following the day’s tutorial sessions, there will be a no host community happy hour at The Brass Elephant Lounge and Sports Bar. Promoted in conjunction with the San Francisco PostgreSQL Users Group (SFPUG), the lounge is within walking distance of the South San Francisco Conference Center. The happy hour is open to conference attendees and all members of the Silicon Valley Postgres community, whether they are registered for PGConf Silicon Valley or not. The event is the perfect opportunity for those who are not able to attend the conference to connect with other members of the PostgreSQL community. Those planning to attend the event are asked to RSVP for planning purposes.

Conference Registration

Register now to attend the conference. Use discount code "CitusData20" to receive a 20% discount.

I hope to see you next week!

Josh Williams: Odd pg_basebackup Connectivity Failures Over SSL

From Planet PostgreSQL. Published on Nov 13, 2015.

A client recently came to me with an ongoing mystery: a remote Postgres replica needed to be replaced, but pg_basebackup repeatedly failed. It would stop part way through every time, reporting something along the lines of:

pg_basebackup: could not read COPY data: SSL error: decryption failed or bad record mac

The first hunch we had was to turn off SSL renegotiation, as that isn't supported in some OpenSSL versions. By default it renegotiates keys after 512MB of traffic, and setting ssl_renegotiation_limit to 0 in postgresql.conf disables it. That helped pg_basebackup get much further along, but they were still seeing the process bail out before completion.

The client's Chef has a strange habit of removing my ssh key from the database master, so while that was being fixed I connected in and took a look at the replica. Two pg_basebackup runs later, a pattern started to emerge:
$ du -s 9.2/data.test*
67097452        9.2/data.test
67097428        9.2/data.test2
While being nearly identical in size, those numbers are also suspiciously close to 64GB. I like round numbers; when a problem happens close to one, that's often a pretty good tell of some boundary or limit. On a hunch that it wasn't a coincidence I checked around for any similar references and found a recent openssl package bug report.

RHEL 6, check. SSL connection, check. Failure at 64 GiB, check. And lastly, a connection with psql confirmed AES-GCM:
SSL connection (cipher: DHE-RSA-AES256-GCM-SHA384, bits: 256)

Once the Postgres service could be restarted to load in the updated OpenSSL library, the base backup process completed without issue.

Remember, keep those packages updated!

MIT uses Evennia!

By Griatch's Evennia musings (MU* creation with Django+Twisted) from Django community aggregator: Community blog posts. Published on Nov 12, 2015.

Evennia was recently used as a test bed to train an AI system to attempt to play a MUD as a human would - by only reading and understanding the text on the screen.

Researchers at MIT (Massachusetts Institute of Technology) recently presented the paper Language understanding for Text-based games using Deep reinforcement learning (PDF) at a conference on natural language processing. A summary is in the MIT press release.

I was contacted by these fine folks some time ago so I knew they had plans to use Evennia for their research. It's great to see they now have an article out on it! Evennia devs are also mentioned in the acknowledgements - so something for the Evennia dev community to be proud of! 

MUDs are tricky

The main complication for an AI playing a MUD is that the computer has no access to the actual game state but must try to surmise how well it's doing only from the text given (same as a human would). The researchers compare the results from a range of deep-learning neural network algorithms that they train to play.

To test their AI, the researchers first used Evennia to build a simple training "Home World": a 4-room "house" where the simple goal is to find and eat an apple while refraining from going to sleep. The room descriptions used here were pretty formulaic, although not trivial, so as to give a challenge. This they used to train their AI system.

They then took this trained neural network and applied it to the real challenge, playing the Evennia Tutorial World. You can yourself try this out in our demo install or by just running a single command when starting Evennia. They call it "Fantasy World" in the article.

The tutorial world has hand-written descriptions and often describes the exits as part of the room text. The article actually makes a comprehensive analysis of the tutorial world, including the available game states and transitions as well as the number of words and number of commands per state. Interesting stuff in itself. I presume the scientists have modified their copy of the tutorial world to provide better metrics for their analysis.

A bridge too far

As far as I understand from the article, the AI does learn to use commands with one or two arguments (like eat apple or move red-root right), but they note that actually finding the tomb of the fallen hero (the main quest of the tutorial) is too hard for the AI:

[...]However, this is a complex quest that requires the player to memorize game events and perform high-level planning which are beyond the scope of this current work.
So instead they evaluate the AI's performance on a more mundane task: Getting across the bridge to the castle. It's not clear to me if the AI actually plays more of the game too or if their test just exposes the AI to the bridge itself. I suspect it does play more due to the examples they use from other rooms; evaluating the bridge-crossing is just a clear-cut metric to use for "success".

The MIT press release claims that the AI is also scored on how much health/magic it has, but I don't see that mentioned in the article itself (and the tutorial world only has magic if you find the hero's tomb which they claim they cannot do).

The bridge in Evennia's tutorial world is actually a single "room" that takes multiple steps to cross. At every step the room description changes to describe the progress. Random texts will appear as the bridge sways in the wind and various environmental cues are heard and seen. There is also a small chance of falling off the bridge if one lingers too long on it.

So although all you really need to do is to walk east repeatedly, I can see why this can be a challenge to a neural network having no mental image of what a bridge is. It can only work off the text it's given at any given time.

In the paper, the algorithms are evaluated both on their ability to actually cross the bridge and on how optimal their solution was, for example by not issuing invalid commands to the situation.

Beyond the bridge

The results are that after being trained on the training house setup, the AI will eventually be able to cross the bridge. The particular algorithm proposed also performs slightly better than the comparison ones (and a lot better than simple randomness).

So from the perspective of the researchers this seems to be a success. Even so, this reinforces the fact that there is quite some way to go before an AI can *actually* play a real MUD successfully. Using MUDs for this type of research is a good idea though, and I do hope they expand and continue this line of work in the future.

Who knows, maybe the AI will even find that ancient tomb eventually!

Image from MIT news

How to send Jabber (XMPP) messages from Django

By CTO with a CEO flavour: Alex Morozov's blog from Django community aggregator: Community blog posts. Published on Nov 11, 2015.

Did you ever want to have a simple Django notification bot? An intranet one which just sends you (or someone you tell it to) Jabber messages when certain events occur? So did I. Please, welcome: django-jabber.

Tip: How to get a single object`s value with Django ORM

By CTO with a CEO flavour: Alex Morozov's blog from Django community aggregator: Community blog posts. Published on Nov 11, 2015.

There are times when you want to get a single field of a single object in the database. For example, just get the headline of blog post #1, not fetching its body. How do you do it the usual way?

>>> BlogPost.objects.only('headline').get(pk=1).headline
'Hello world'

# Or maybe even this way:
>>> BlogPost.objects.values('headline').filter(pk=1)[0]['headline']
'Hello world'

Recently I’ve stumbled upon a shorter one:
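
Presumably something along the lines of values_list with flat=True, which hands back the bare value in a single queryset call:

>>> BlogPost.objects.values_list('headline', flat=True).get(pk=1)
'Hello world'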

When you shouldn’t use the Django admin

By CTO with a CEO flavour: Alex Morozov's blog from Django community aggregator: Community blog posts. Published on Nov 11, 2015.

In case you’ve thought I detest django.contrib.admin — by no means. Actually it’s one of the Django’s greatest features, I really love it. In most cases.

Here’s a real life story. We’ve had to quickly put up the first version of an intranet claim tracking system, and one of our developers was just crazy about the Django admin… So why not, we’ve got along with the stock interface as the primary one for ...

Initial Data in Django

By Caktus Consulting Group from Django community aggregator: Community blog posts. Published on Nov 10, 2015.

I've struggled to find an ideal way to load initial data for Django projects. By “initial data,” I'm referring to the kind of data that you need on a new system for it to be functional, but could change later. These are largely lists of possible choices, such as time zones, countries, or crayon colors.

Here are my requirements:

  • Fairly simple to run on initial server deploy, initial development environment setup, and when starting a test run.
  • Does not risk overwriting changes that are made to records in the live database after they're initially created.
  • Not too hard to update from the current live data, so that future new deploys etc get the latest data
  • Copes well as models evolve, because they will
  • Well supported by Django
  • Not too great a performance impact on testing

Here are some of the approaches that I've tried.

Fixtures

Fixtures are how Django used to recommend loading initial data.

Pros:

  • It's fairly easy to update the fixtures as the "initial" data evolves - e.g. you've added more options in your live server, and want to preserve them in the initial data, so just do another dumpdata.
  • Fixtures don't slow down test startup if they're not named "initial.XXXX" or you're using a recent version of Django, because they don't get loaded automatically.
  • Easy enough to load at the beginning of tests that need them by adding a fixtures attribute to the test case class.

Cons:

  • fatal - If a fixture is loaded again, it overwrites any changed data in the database with the original values
  • Discouraged by current Django documentation
  • Hard to keep valid when models evolve. The right way would be every time a model changes, you update the fixtures from the current data, then create a fresh temporary database without applying the new migrations, load the current fixtures, apply the new migrations, and make a fresh dump of the initial data. But that’s a lot of work, and hard to remember to do every time models change.
  • Data is not automatically available during tests, and since our system won't run correctly without some of this data, you have to arrange to load or create it at test setup.
  • Not loaded automatically, so:
    • When setting up new development environments, you must document it and it’s still easily overlooked, or else get a developer to run some script that includes it
    • For automated deploys, not safe to run on every deploy. Probably the only safe approach is to run manually after the first deploy.

Summary: rejected due to risk of data loss, inconvenience during development, and negative recommendation from Django documentation.

Fixture hack

I played around with a modified loaddata command that checked (using natural keys) if a record in the fixture was already in the database and did not overwrite any data if the record had previously been loaded.

This means it's safer to add to scripts and automatic deploys.

Pros:

  • Fairly easy to update as "initial" data evolves - e.g. you've added more options in your live server, and want to preserve them in the initial data, so just do another dumpdata
  • Fixtures don't slow down test startup if they're not named "initial.XXXX" or you're using a recent version of Django, because they don't get loaded automatically
  • Easy enough to load at the beginning of tests that need them by adding a fixtures attribute to the test case class.
  • Can add to env setup scripts and automated deploys safely

Cons:

  • Hard to keep valid when models evolve
  • Data is not automatically available during tests
  • Not loaded automatically, so when setting up new development environments, you must document it and it’s still easily overlooked, or else get a developer to run some script that includes it

Summary: rejected; it mitigates one problem with fixtures, but all the others remain.

Post-migrate signal

Something else I experimented with was running code to create the new records in a post-migrate signal, even though the docs warn against data modification in that signal.

Pros:

  • Runs automatically each time migrations are run, so will automatically get run during most automated deploys
  • Runs automatically when tests are setting up the test database, so all tests have the data available - but is part of the initial database, so we don't have the overhead of loading initial data during every test's setUp.

Cons:

  • fatal - Runs every time migrations are run, even reverse migrations - so it runs when tables are in the wrong state and breaks development when you might be migrating forward and back
  • If it fails, the whole migration fails, so you can't just ignore a failure even if you didn't care about creating the initial data that time
  • Slows down database creation when running tests, unless you use --keepdb

Summary: rejected; not a valid way to load initial data.

In a migration

Add a migration that creates the initial records.
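
A sketch of what such a migration can look like (the app and model names here are made up; get_or_create is what makes it safe to re-run without overwriting later edits):

# crayons/migrations/0002_initial_colors.py (hypothetical app and model)
from django.db import migrations

INITIAL_COLORS = ['red', 'green', 'blue']

def load_colors(apps, schema_editor):
    # Use the historical model, not the current crayons.models.Color.
    Color = apps.get_model('crayons', 'Color')
    for name in INITIAL_COLORS:
        # Only creates missing records; existing rows are left untouched.
        Color.objects.get_or_create(name=name)

def unload_colors(apps, schema_editor):
    # Optional reverse step: do nothing rather than delete live data.
    pass

class Migration(migrations.Migration):

    dependencies = [
        ('crayons', '0001_initial'),
    ]

    operations = [
        migrations.RunPython(load_colors, unload_colors),
    ]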

Pros:

  • This is what the Django documentation currently recommends
  • Runs automatically
  • The migration only runs when the database schema matches what it was when you wrote it, so it won't break as models evolve
  • You can write it to ignore records that already exist, so it won't overwrite later changes in the database

Cons:

  • fatal in some cases - migrations don't use the actual model class, so models with custom behavior (like MPTTModel) won't get created correctly. You might be able to find workarounds for this on a case-by-case basis.
  • Slows down database creation when running tests, unless you use --keepdb
  • Harder than fixtures to update as the initial data evolves. Options:
    • Go back and edit the original migration - but then it won't run on existing databases and they won't get the new records
    • Add a new migration that adds the whole updated initial data set, then go back and comment out the code in the previous initial data migration since there's no point running it twice on new database setup
    • Add yet another migration for just the new data - probably the simplest in terms of updating the migrations, but it'll be harder to extract just the new data from the current database than to just extract the whole dataset again. Also, you don't preserve any edits that might have been made over time to older records.

Summary: best option so far. It has some drawbacks, but not as bad as the other options.

Conclusion

The best approach in most cases is probably to load initial data in a migration, as the Django documentation recommends. It's not perfect, but it avoids some of the fatal flaws of other approaches. And the new (in Django 1.8) --keepdb option helps ameliorate the slow test startup.

I'm still curious if there are other approaches that I haven't considered, though.

Speeding up a Django web site without touching the code

By Stein Magnus Jodal from Django community aggregator: Community blog posts. Published on Nov 08, 2015.

I’ve recently been tweaking my server setup for a Django 1.3 web site with the goal of making it a bit faster. Of course, there is a lot of speed to gain by improving e.g. the number of database queries needed to render a web page, but the server setup also has an effect on the web site performance. This is a log of my findings.

All measurements have been done using the ab tool from Apache using the arguments -n 200 -c 20, which means that each case has been tested with 20 concurrent requests up to 200 requests in total. The tests were run from a machine other than the web server, with around 45ms RTT to the server. This is not a scientific measurement, but good enough to let me quickly test my assumptions on what increases or decreases performance.

The Django app isn’t particularly optimized in itself, so I don’t care much about the low number of requests per second (req/s) that it manages to process. The main point here is the relative improvement with each change to the server setup.

The baseline setup is a Linode 1024 VPS (Referral link: I get USD 20 off my bill if you sign up and remain a customer for 90 days), running Apache 2.2.14 with mpm-itk, mod_wsgi in daemon mode with maximum 50 threads and restart every 10000 requests, SSL using mod_ssl, and PostgreSQL 8.4.8 as the database. For the given Django app and hardware, this setup is strolling along at 4.0 req/s.

With this blog post as reference, I switched from Apache+mod_wsgi to using nginx 0.7.5 as SSL terminator, for serving static media, and as a proxy in front of Gunicorn 0.13.4. Gunicorn is a WSGI HTTP server, hosting the Django site. The Linode VPS got access to four CPU cores (n=4), so I set up nginx with 4 workers (n) and Gunicorn with 9 workers (2n+1). Different values for these settings are sometimes recommended, but this is what I’m currently using. This setup resulted in an increase to 9.0 req/s.

A nice improvement, but I changed multiple components here, so I don’t know exactly what helped. It would be interesting to test e.g. Apache with mod_proxy in front of Gunicorn, as well as different number of nginx and Gunicorn workers. The nginx version is also a bit old, because I used the one packaged in Ubuntu 10.04 LTS. I should give nginx 1.0.x a spin.

Next up, I added pgbouncer 1.3.1 (as packaged in Ubuntu 10.04 LTS, latest is 1.4.2) as a PostgreSQL connection pooler. I let pgbouncer do session pooling, which is the safest choice and the default. Then I changed the Django app settings to use pgbouncer at port 6432, instead of connecting directly to PostgreSQL’s port 5432. This increased the performance further to 10.5 req/s.
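
In Django settings terms, that change amounts to something like this (database name and credentials are placeholders):

DATABASES = {
    'default': {
        'ENGINE': 'django.db.backends.postgresql_psycopg2',
        'NAME': 'mysite',
        'USER': 'mysite',
        'PASSWORD': 'secret',
        'HOST': '127.0.0.1',
        'PORT': '6432',  # pgbouncer, instead of PostgreSQL's 5432
    }
}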

Then, I started looking at SSL performance, even though it wasn't the bottleneck at all. I learned a lot about SSL performance, but didn’t improve the test results at all. Some key points were:

  • nginx defaults to offering Diffie-Hellman Ephemeral (DHE) which takes a lot of resources. Notably, the SSL terminators stud and stunnel does not use DHE. See this blog post for more details and how to turn off DHE in nginx.

  • If you’re using AES, you can process five times as many requests with a 1024 bit key compared to a 2048 bit key. I use a 2048 bit key.

  • 64-bit OS and userland doubles the connections per second compared to 32-bit. My VPS is stuck at 32-bit for historical reasons.

  • SSL session reuse eliminates one round-trip for subsequent connections. I set this up, but my test setup only use fresh connections, so this improvement isn’t visible in the test results.

  • Browsers will go a long way to get hold of missing certificates in the certificate chain between known CA certificates and the site’s certificate. To avoid having the browser doing requests to other sites to find missing certificates, make sure all certificates in the chain are provided by your server.

    If you’re switching from Apache to Nginx, note that Apache uses separate files for your SSL certificate and the SSL certificate chain, while Nginx wants these two files to be concatenated to a single file, with your SSL certificate first.

Next, I read about transaction management and the use of autocommit in Django. The Django site I’m testing is read-heavy, with almost no database writes at all. It doesn’t use Django’s transaction middleware, which means that each select/update/insert happens in its own transaction instead of having one database transaction spanning the entire Django view function.

Since I’m using PostgreSQL >= 8.2, which supports INSERT ... RETURNING, I can turn on autocommit in the Django settings, and keep the transaction semantics of a default Django setup without the transaction middleware. Turning on autocommit makes PostgreSQL wrap each query with a transaction, instead of Django adding explicit BEGIN, and COMMIT or ROLLBACK statements around each and every query. Somewhat surprisingly, this reduced the performance to 9.2 req/s. Explanations as to why this reduced the performance are welcome.
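
The setting in question is, as far as I know, the database-level autocommit option of the postgresql_psycopg2 backend in pre-1.6 Django; sketched as a one-line settings change:

# settings.py sketch: database-level autocommit for the PostgreSQL backend
DATABASES['default']['OPTIONS'] = {'autocommit': True}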

Reverting the autocommit change, I got back to 10.5 req/s. Then I tried tuning the PostgreSQL configuration using the pgtune tool. I went for the web profile, with autodetection of the amount of memory (1024 MB):

pgtune -i /etc/postgresql/8.4/main/postgresql.conf -o postgresql-tuned.conf -T Web
mv postgresql-tuned.conf /etc/postgresql/8.4/main/postgresql.conf

pgtune changed the following settings:

maintenance_work_mem = 60MB        # From default 16MB
checkpoint_completion_target = 0.7 # From default 0.5
effective_cache_size = 704MB       # From default 128MB
work_mem = 5MB                     # From default 1MB
wal_buffers = 4MB                  # From default 64kB
checkpoint_segments = 8            # From default 3
shared_buffers = 240MB             # From default 28MB
max_connections = 200              # From default 100

After restarting PostgreSQL with the updated settings, the performance increased to 11.7 req/s.

To summarize: in a few hours, I’ve learned a lot about SSL performance tuning, and–without touching any application code–I’ve almost tripled the amount of requests that the site can handle. The performance still isn’t great, but it’s a lot better than what I started with, and the setup is still far from perfect.

To get further speed improvements, I would mainly look into three areas: adding page (or block) caching where appropriate, log database queries and tweak the numerous or slow ones, and look further into tweaking the PostgreSQL settings. But, that’s for another time.

If you have suggestions for other server setup tweaks, please share them in the comments, and I’ll try them out.

Updated: Removed the “mean response time” numbers, which simply is (time of full test run) / (number of requests). It just told us the same as req/s in a less intuitive way. The other interesting number here is the perceived latency for a single user/request. I’ll make sure to include it in future posts.

Comics v2.2.0 released with Django 1.5 support

By Stein Magnus Jodal from Django community aggregator: Community blog posts. Published on Nov 08, 2015.

Version 2.2.0 of my comics aggregator is now released. It features a general upgrade of dependencies, including the move from Django 1.4 to Django 1.5, and a lot of updates to comic crawlers. The Django upgrade was completed months ago and it’s been running my Comics instance since, so it’s about time to get it released before Django 1.6 arrives in a month or two. Regarding the crawler updates, it’s a bit sad to see that many of the crawlers have been broken for months without me or anybody else noticing, but it’s hard to catch some content lacking in the middle of a firehose of similar content. I guess I’ll have to make it a monthly task to look through the crawler status page of my Comics instance and do patch releases with updated crawlers.

Check out the project docs for more information about Comics and this release in particular.

Comics v2.3.0 released with better mobile support

By Stein Magnus Jodal from Django community aggregator: Community blog posts. Published on Nov 08, 2015.

Version 2.3.0 of my comics aggregator is now released. As always, dependencies have been updated, including the routine Django upgrade from Django 1.5 to 1.6, just in time for the upcoming 1.7 release. The largest change this time around is the move from Bootstrap 2 to 3, which includes a refreshed, flatter design and lots of tweaking to make Comics both look good and work nicely on mobile devices, something it didn’t use to do.

The dependency overview has been a great motivation for doing some maintenance work on Comics. The only dependency left outdated is django-registration, as 1.0 moves to class-based views, which requires some larger changes to my customizations. Thus, the upgrade of django-registration has been deferred together with the related upgrade of my vendorized copy of django-invitation.

Most, if not all, of the other dependencies seem to support Python 3 now, though some lack the right Trove classifiers in their PyPI registration, so they are not correctly labeled. I found an open pull request for cssmin and a recent commit for django-compressor adding the missing Trove classifiers. I’ve also done my part to improve the Python 3 metadata by sending a pull request to django-bootstrap-form.

Check out the Comics project docs for more information about Comics and this release in particular.


Django under the hood: documentation workshop - Mikey Ariel

By Reinout van Rees' weblog from Django community aggregator: Community blog posts. Published on Nov 06, 2015.

(One of my summaries of a talk at the 2015 django under the hood conference).

Documentation used to be an afterthought of software delivery. Now it is a key component of the success of a software project.

Content strategy

Content strategy is originally a marketing term. Which is fine, as documentation is an important part of your project's marketing!

The core is asking the right questions (even if the answer is simple).

  • Who are my readers? Sounds like a simple question. But... are your readers advanced users? Or beginners? Do you need "persona-based" documentation, so documentation for specific groups ("admins", "developers", etc)?

  • What do my readers want to know? Often your readers need context before they can understand reference documentation. Do you need an end-to-end tutorial? Or just explanations?

    Does the textual content need to be enhanced with video or diagrams, for instance?

  • When do my readers need the content? Installation documentation right at the beginning to get started? A reference guide when you're already working with it? Tutorials for learning it?

    "When" also relates to "when do I need/want to update the documentation?"

  • Where do my readers consume the content? Do you need a "man" page? Embedded help in your GUI app? Good, helpful error messages? Online documentation that can be found by Google?

  • Why do my readers even need this content? Minimize double work. Can you point at another project's documentation or do you need to describe some feature yourself?

    Similarly, if you need to add documentation to work around bugs or things that don't work right yet: should you not actually fix the code instead?

DevOps for docs

"Content strategy" leverages marketing concepts to make your documentation better. Likewise, "devops for docs" leverages engineering for your documentation.

  • Look for a unified toolchain. If possible, use the same tools as the developers of the project you're documenting (especially if you're both the developer and the documenter). Git, for instance. Don't use Google Docs if the project uses Git. By using the same kind of system, everybody can help each other.

  • Use out of the box documentation tools like asciidoctor, gitbook, MkDocs, sphinx.

  • Use continuous integration! An automatic checker for broken links, perhaps an automatic spell checker, and automatic builds (like Read the Docs does).

    There are automatic checkers like "Hemingway" that can be used as a kind of text unit test.

    You can add custom checks, like making sure your project name is always spelled correctly; see the sketch after this list.

  • Iterative documentation. Divide the work into sprints, for instance, if it is documentation for a big project. Use your issue tracker or Trello or something like that to manage it.
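
As a concrete illustration of the custom-check idea above, here is a minimal sketch of a check that could run in CI and fail the build when the project name is misspelled in the docs. The project name ("MyProject"), the list of misspellings and the docs/ directory layout are hypothetical; adapt them to your own project:

#!/usr/bin/env python
"""Minimal sketch of a custom documentation check (hypothetical names)."""
import re
import sys
from pathlib import Path

# Spellings we never want to see in the docs; "MyProject" is the assumed
# correct form, the alternatives below are the ones we want to catch.
BAD_SPELLINGS = re.compile(r'\b(Myproject|my-project|MyProjekt)\b')


def check(docs_dir='docs'):
    problems = []
    for path in Path(docs_dir).rglob('*.rst'):
        with path.open() as f:
            for lineno, line in enumerate(f, 1):
                if BAD_SPELLINGS.search(line):
                    problems.append('{0}:{1}: {2}'.format(path, lineno, line.strip()))
    return problems


if __name__ == '__main__':
    problems = check()
    for problem in problems:
        print(problem)
    # A non-zero exit code makes the CI job fail when misspellings are found.
    sys.exit(1 if problems else 0)

Hooking a script like this into the same pipeline that runs the test suite keeps the documentation checks as visible as the code checks, which is the point of "devops for docs".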

Keep in mind: we're all in this together. Designers, developers, product managers, quality assurance, support engineers, technical writers, users.

Docs or it didn't happen

Some ideas.

  • Treat docs as a development requirement. Write it down in your contribution guidelines. Write down what your definition of "documented" is.

  • Contribution guidelines are a good idea! They're an important part of your documentation in itself. Do you want people to help you? Write those guidelines.

    With contrib guidelines you can also steer the direction and the atmosphere of your project. If you suggest that 20% of a participant's time is spent mentoring new contributors, you send a strong message that you're a welcoming and helpful community, for instance.

    Also look at non-code areas. Do you want contributions from designers? Would you explicitly like someone to work only on the documentation side of things?

  • Provide templates. "If you add a new feature, use this template as a basis." "Add a 'version added' link to your description." Those kinds of helpful suggestions.

  • Contributing to documentation is a great (and often reasonably easy) way to get into contributing to a project as a whole.

  • Collaboration and training. Sprints and hackfests are great to get people started. There are communities and conferences: "Open Help", "Write the Docs". Also mini-conferences inside bigger ones.

Image: my recumbent bike in front of Troisvierges station in the north of Luxembourg, our starting point this summer for cycling along the former 'Vennbahn' railway.


Is Open Source Consulting Dead?

By chrism from . Published on Sep 10, 2013.

Has Elvis left the building? Will we be able to sustain ourselves as open source consultants?

Consulting and Patent Indemnification

By chrism from . Published on Aug 09, 2013.

Article about consulting and patent indemnification

Python Advent Calendar 2012 Topic

By chrism from . Published on Dec 24, 2012.

An entry for the 2012 Japanese advent calendar at

Why I Like ZODB

By chrism from . Published on May 15, 2012.

Why I like ZODB better than other persistence systems for writing real-world web applications.

A str.__iter__ Gotcha in Cross-Compatible Py2/Py3 Code

By chrism from . Published on Mar 03, 2012.

A bug caused by a minor incompatibility can remain latent for long periods of time in a cross-compatible Python 2 / Python 3 codebase.

In Praise of Complaining

By chrism from . Published on Jan 01, 2012.

In praise of complaining, even when the complaints are absurd.

2012 Python Meme

By chrism from . Published on Dec 24, 2011.

My "Python meme" replies.

In Defense of Zope Libraries

By chrism from . Published on Dec 19, 2011.

A much too long defense of Pyramid's use of Zope libraries.

Plone Conference 2011 Pyramid Sprint

By chrism from . Published on Nov 10, 2011.

An update about the happenings at the recent 2011 Plone Conference Pyramid sprint.

Jobs-Ification of Software Development

By chrism from . Published on Oct 17, 2011.

Try not to Jobs-ify the task of software development.

WebOb Now on Python 3

By chrism from . Published on Oct 15, 2011.

Report about porting to Python 3.

Open Source Project Maintainer Sarcastic Response Cheat Sheet

By chrism from . Published on Jun 12, 2011.

Need a sarcastic response to a support interaction as an open source project maintainer? Look no further!

Pylons Miniconference #0 Wrapup

By chrism from . Published on May 04, 2011.

Last week, I visited the lovely Bay Area to attend the 0th Pylons Miniconference in San Francisco.

Pylons Project Meetup / Minicon

By chrism from . Published on Apr 14, 2011.

In the SF Bay Area on the 28th, 29th, and 30th of this month (April): 3 separate Pylons Project events.

PyCon 2011 Report

By chrism from . Published on Mar 19, 2011.

My personal PyCon 2011 Report