Skip to content. | Skip to navigation

Personal tools
Log in
You are here: Home

Open Source Posts

Climbing up Branches

By Griatch's Evennia musings (MU* creation with Django+Twisted) from Django community aggregator: Community blog posts. Published on Feb 14, 2016.

Today I pushed the latest Evennia development branch "wclient". This has a bunch of updates to how Evennia's webclient infrastructure works, by making all exchanged data be treated equal (instead of treating text separately from other types of client instructions).

It also reworks the javascript client into a library that should be a lot easier to expand on and customize. The actual client GUI is still pretty rudimentary though, so I hope a user with more web development experience can take upon themselves to look it over for best practices.

A much more detailed description of what is currently going on (including how to check out the latest for yourself) is found in this mailing list post. Enjoy!

Pavan Deolasee: Looking forward to PGDay India 2016

From Planet PostgreSQL. Published on Feb 14, 2016.

With just couple of weeks to go for PGDay India, I’m quite excited about the upcoming PostgreSQL conference at Bangalore on 26th February, 2016. This is going to be the biggest ever conglomeration of users, developers and companies interested in PostgreSQL in India. This is also by far the largest PostgreSQL conference in terms of number of speakers and attendees. Till last year, we’d to almost find speakers for the conference, but this year we’d a good problem of rejecting almost 1 in every 2 submitted proposals. It’s almost certain that in coming years, we either have to do multiple-track or multiple-day event. Again, a nice to have problem.

For the first time, we are looking to easily cross 100 mark in terms of attendees. We have secured 3 Platinum Sponsors, 2 Gold Sponsors, 1 Silver Sponsor and a media partner. All these things clearly indicate a fast growing community of users and developers in India. The conference will see representation from many companies, including Government organizations. Again, a solid demonstration of the reach of this fantastic open source database technology.

From 2ndQuadrant’s perspective, we are sponsoring the event at the highest level, showing our commitment to the adoption of PostgreSQL and our interest in this fast growing market. Abhijit Menon-Sen will speak about “VACUUM Internals” and I’ll do a talk on “What’s New in Postgres-XL 9.5″. I’ve worked very hard to improve Postgres-XL in the last few months and catching it up with the latest release of PostgreSQL, I’m quite excited to talk about the work we have done on this release. I’m keeping my fingers crossed, but I sincerely hope that Postgres-XL 9.5 Beta will be out before the conference.

So if you are a PostgreSQL user or a developer or just an IT enthusiast wanting to know about the technology, this is a great opportunity for you to learn from other users and hear from some of the finest developers of the technology. Visit for more details.

Oleg Bartunov: About "bikeshedding" in Postgres world

From Planet PostgreSQL. Published on Feb 13, 2016.

Some of my younger colleagues by Postgres Professional, who recently open the world of postgres hackers, sometimes wonder about meaning some jargon words, which they don't well understand. Recent example was "that should be orange", which I didn't understand myself well, so I asked Bruce for the meaning. I'm not sure I got an idea, what does it means, Bruce :)

One of the popular jargon word is "bikeshedding", which came from FreeBSD world and introduced to it-world by phk ( I like his quote from "Parkinson's Law":

Parkinson shows how you can go in to the board of directors and
get approval for building a multi-million or even billion dollar
atomic power plant, but if you want to build a bike shed you will
be tangled up in endless discussions.

Translated to our reality it means, that community is inclined to discuss some simple topics, while leaving very interesting patches without attention. I explain that to myself, that there are more people who understand simple/common topics and could participate in discussion, than developers who understand some very specific and difficult patch.

btw, the first time it was used by Jim Nasby (Decibel! <>) in 2007.

select sent, ts_headline(body_plain, to_tsquery('bikeshedding'),'StartSel="",StopSel=""')  from messages where body_tsvector @@ to_tsquery('bikeshedding') order by sent asc;
        sent         |                                           ts_headline
 2007-08-10 18:04:28 | bikeshedding on this list list is a very big danger.
                     | Advocacy is something everyone feels
 2008-02-20 14:55:14 | bikeshedding can start as to how to "use" it through the SQL interface.
                     | But, until
 2008-02-20 15:03:49 | bikeshedding can start as to how to "use" it through the SQL interface.
                     | >
                     | > But, until
 2009-02-08 16:46:59 | back and rip that functionality back out of his patch.
                     | Sounds like bikeshedding to me.
                     | ...Robert
 2009-04-14 05:04:36 | close to bikeshedding now...

Using mysql load data infile with django

By Maykin Media blog from Django community aggregator: Community blog posts. Published on Feb 13, 2016.

Using mysql load data infile with django

Using Fabric to update a remote svn checkout with ssh public key authentication

By Maykin Media blog from Django community aggregator: Community blog posts. Published on Feb 13, 2016.

Using Fabric to update a remote svn checkout with ssh public key authentication

Allow squid/mod_wsgi to pass the HTTP_AUTHORIZATION header to Apache

By Maykin Media blog from Django community aggregator: Community blog posts. Published on Feb 13, 2016.

Allow squid/mod_wsgi to pass the HTTP_AUTHORIZATION header to Apache

Django "view-permissions" for related objects

By Maykin Media blog from Django community aggregator: Community blog posts. Published on Feb 13, 2016.

Django "view-permissions" for related objects

Nested resources in Tastypie

By Maykin Media blog from Django community aggregator: Community blog posts. Published on Feb 13, 2016.

Nested resources in Tastypie

Custom choices in Django admin

By Maykin Media blog from Django community aggregator: Community blog posts. Published on Feb 13, 2016.

Custom choices in Django admin

RestORM - The client side of REST

By Maykin Media blog from Django community aggregator: Community blog posts. Published on Feb 13, 2016.

RestORM - The client side of REST

Customizing Django startproject with templates

By Maykin Media blog from Django community aggregator: Community blog posts. Published on Feb 13, 2016.

Customizing Django startproject with templates

Django redirects with regular expressions

By Maykin Media blog from Django community aggregator: Community blog posts. Published on Feb 13, 2016.

Django redirects with regular expressions

Using jsPDF in Django templates to export as PDF

By Micropyramid django from Django community aggregator: Community blog posts. Published on Feb 12, 2016.

Using jsPDF in Django templates to export as PDF

jsPDF is used to generate pdf files in client-side Javascript.

You can find the links for jsPDF here and also you can find the link to project homepage.

You've to include the scripting files/links in head section to work properly.

Tip: We have to download the newest version of the library and include it in the HEAD or at the end of the BODY section.

Example to run the script from local storage:

In the HEAD section:

    <script src="js/jspdf.js">script>


In the BODY section:

    <script src="js/jspdf.js">script>
    <script src="js/main.js">script>

After loading the scripts in HEAD/BODY section, now we can start using the jsPDF library.

Lets start with some of basics of jsPDF to get the idea of using it in our applications:

First let us discuss how to create a new document?

It's simple as below mentioned:

var doc = new jsPDF(orientation, unit, format, compress);

The constructor can take several parameters.

  • orientation - The default value of orientation is "portrait". We can set it to "landscape" if we want a different page orientation.
  • unit - We can tell jsPDF in which units we want to work. Use on of the following: "pt" (points), "mm" (default), "cm", "in".
  • format - It's default page format. We can change it we want other formats also like: "a3", "a4" (default), "a5", "letter", "legal".

As an order to understand, here is an example:

var doc = new jsPDF('landscape');
doc.text(20, 20, 'Hello landscape world!');

We can add new page using the following code:

doc.addPage(width, height);

As parameters we can pass the page width and height in the units defined in the docuent constructor. Adding pages moves us to this page, so many operations will be executed on that page. If we want to go to another page we can use the setPage function.


You can also get the actual number of pages by using the below code:


In an order to understand, here is an example:

var doc = new jsPDF();
doc.text(20, 20, 'Hello world!');
doc.text(20, 30, 'This is client-side Javascript, pumping out a PDF.');
doc.text(20, 20, 'Do you like that?');

 Now it's time to work with text:

1. The most important thing is displaying text, we do it using the funtion named doc.text which takes 3 parameters. like below:

doc.text(X, Y, "Text to be displayed");

- The first two are X and Y positions of the text in units defined in the document constructor.

* Here we have to notice a point: Y position, it is the position of the text baseline, so printing something with the Y position set to 0 will actually print it over the top edge of the document.

- The third argument is the text to be displayed.

2. The second thing is the font name used to draw the text. We can choose one of the following: couriertimesarial. We can change the font family and font style by running the doc.setFont function.

doc.setFont("arial", "bold");

By executing the doc.getFontList function we can find out what fonts are available and what font styles we can set for given font.

    "courier": ["normal", "bold", "italic", "bolditalic"],
    "Courier": ["", "Bold", "Oblique", "BoldOblique"],
    "times": ["normal", "bold", "italic", "bolditalic"],
    "Times": ["Roman", "Bold", "Italic", "BoldItalic"],
    "helvetica": ["normal", "bold", "italic", "bolditalic"],
    "Helvetica": ["", "Bold", "", ""] 

We can also change the font style individually by using the doc.setFontStyle or the doc.setFontType function, which is alias to the first one.

// is the same as calling

3. Next things is the font size. Now, for that we use doc.setFontSize function.



var doc = new jsPDF();
doc.text(20, 20, 'This is a title');

doc.text(20, 30, 'This is some normal sized text underneath.');

4. Last but not the least, the text color. We can change the text color using the doc.setTextColor function and passing three parameters which are RGB (Red, Green, Blue) color values.

var doc = new jsPDF();
// I know the proper spelling is colour ;)
doc.text(20, 20, 'This is gray.');

doc.text(20, 30, 'This is light gray.');

doc.setTextColor(255, 0, 0);
doc.text(20, 40, 'This is red.');

doc.setTextColor(0, 255, 0);
doc.text(20, 50, 'This is green.');

doc.setTextColor(0, 0, 255);
doc.text(20, 60, 'This is blue.');

In a order of full example:

var doc = new jsPDF();

doc.text(20, 20, 'This is the default font.');

doc.text(20, 30, 'This is courier normal.');

doc.text(20, 40, 'This is times italic.');

doc.text(20, 50, 'This is helvetica bold.');

doc.text(20, 60, 'This is courier bolditalic.');

Now it's time to work with Images:

We have only function for the images is the doc.addImage. It takes image as a first parameter, image format/type as a second and X,Y positions of the image as a third and fourth arguments respectively. Here we can also optionally pass new image size as a fifth and sixth arguments.

var img = new Image();
img.addEventListener('load', function() {
    var doc = new jsPDF();
    doc.addImage(img, 'png', 10, 50);
img.src = 'image_path/image_name.png';

In the above example, we passed an Image HTML DOM element as a first argument of the addImage function, however it can also be a base64 encoded image string.

var imgData = '.......base64code.....iiigAoooo//2Q==';
var doc = new jsPDF();
doc.text(35, 25, "Octonyan loves jsPDF");
doc.addImage(imgData, 'JPEG', 15, 40, 180, 180);

Working With Graphics:

First of all, we have to set the drawn shapes fill and stroke with colors. We do it using the doc.setFillColor and the doc.setDrawColor accordingly, passing RGB color values as parameters. 

doc.setFillColor(100, 100, 240);
doc.setDrawColor(100, 100, 0);

We can also set the stroke width. The stroke width unit is the same as defined in the document constructor.


Every shape drawing function takes the center point co-ordinates (triangle is the only exception) as two first parameters. They also take the last parameter for drawing style. It can be "S", "F", "DF", "FD" string and the meanings are: "stroke", "fill", "stroke and fill", "fill and stroke". The last two of course differ in the order of the drawing operations.

For example, we can draw an ellipse, by passing two radiuses :

// Empty ellipse
doc.ellipse(50, 50, 10, 5);
// Filled ellipse
doc.ellipse(100, 50, 10, 5, 'F');
// Filled circle with borders

or a circle, by passing only one radius as parameter :, 50, 5, 'FD');

or a rectangle, by passing it's width and height as parameters :

var doc = new jsPDF();

// Empty square
doc.rect(20, 20, 10, 10); 

// Filled square
doc.rect(40, 20, 10, 10, 'F');

// Empty red square
doc.rect(60, 20, 10, 10);

// Filled square with red borders
doc.rect(80, 20, 10, 10, 'FD'); 

// Filled red square
doc.rect(100, 20, 10, 10, 'F'); 

 // Filled red square with black borders
doc.rect(120, 20, 10, 10, 'FD');

or a rounded rectangle, by passing it's width, height and border radius as parameters : 

// Filled sqaure with rounded corners
doc.roundedRect(50, 150, 10, 10, 3, 3, 'FD');
// Black sqaure with rounded corners
doc.setFillColor(255, 255, 255);
doc.roundedRect(140, 20, 10, 10, 3, 3, 'FD'); 

and a triangle, by passing each corners co-ordinates.

// Filled triangle with borders
var doc = new jsPDF();
doc.triangle(60, 100, 60, 120, 80, 110, 'FD');
doc.triangle(100, 100, 110, 100, 120, 130, 'FD');

Last but not least, we can also draw lines passing through the co-ordinates of two points.

// Line
doc.line(50, 250, 100, 250);

Here is an example and some of functionalities we can apply on lines:

var doc = new jsPDF();

doc.line(20, 20, 60, 20); // horizontal line
doc.line(20, 25, 60, 25);

doc.line(20, 30, 60, 30);

doc.line(20, 35, 60, 35);

doc.setDrawColor(255,0,0); // draw red lines

doc.line(100, 20, 100, 60); // vertical line

doc.line(105, 20, 105, 60);

doc.line(110, 20, 110, 60);

doc.line(115, 20, 115, 60);

Here is a full example to understand in a better way:

var doc = new jsPDF();

doc.ellipse(40, 20, 10, 5);

doc.ellipse(80, 20, 10, 5, 'F');

doc.setFillColor(255,0,0);, 20, 5, 'FD');

Now, we are going to know about an interesting functionality over here - addHTML():

The below example will show you a preview of the html document that you've added this jsPDF script to generate pdf. This will generate the pdf as shown in the browsers.

var pdf = new jsPDF('p','pt','a4');

pdf.addHTML(document.body,function() {
 var string = pdf.output('datauristring');
 $('.preview-pane').attr('src', string);

The below example code will render the html and dispalys as plain text.

var doc = new jsPDF();

// We'll make our own renderer to skip this editor
var specialElementHandlers = {
 '#editor': function(element, renderer){
  return true;

// All units are in the set measurement for the document
// This can be changed to "pt" (points), "mm" (Default), "cm", "in"
doc.fromHTML($('body').get(0), 15, 15, {
 'width': 170, 
 'elementHandlers': specialElementHandlers

We can also add Metadata to our Generating pdf:

var doc = new jsPDF();
doc.text(20, 20, 'This PDF has a title, subject, author, keywords and a creator.');

// Optional - set properties on the document
 title: 'Title',
 subject: 'This is the subject',
 author: 'Author Name',
 keywords: 'generated, javascript, web 2.0, ajax',
 creator: 'Creator Name'

// Output as Data URI'Test.pdf');

For the people who want a full script with an example:

1. Add the following script to the HEAD:

<script src="">--script>

or Download Locally

2. Add HTML script to execute jsPDF code:

Customize this to pass the identifier or just change the #content to be the identifier you need.

    function demoFromHTML() {
        var pdf = new jsPDF('p', 'pt', 'letter');
        // source can be HTML-formatted string, or a reference
        // to an actual DOM element from which the text will be scraped.
        source = $('#content')[0];

        // we support special element handlers. Register them with jQuery-style 
        // ID selector for either ID or node name. ("#iAmID", "div", "span" etc.)
        // There is no support for any other type of selectors 
        // (class, of compound) at this time.
        specialElementHandlers = {
            // element with id of "bypass" - jQuery style selector
            '#bypassme': function (element, renderer) {
                // true = "handled elsewhere, bypass text extraction"
                return true
        margins = {
            top: 80,
            bottom: 60,
            left: 40,
            width: 522
        // all coords and widths are in jsPDF instance's declared units
        // 'inches' in this case
        source, // HTML string or DOM elem ref.
        margins.left, // x coord, { // y coord
            'width': margins.width, // max width of content on PDF
            'elementHandlers': specialElementHandlers

        function (dispose) {
            // dispose: object with X, Y of the last line add to the PDF 
            //          this allow the insertion of new lines after html
        }, margins);

3. Add your body content to generate as PDF:

<a href="javascript:demoFromHTML()" class="button">Generate PDFa>
<div id="content">
        Here is a complete example to generate a pdf using jsPDF.


Creating Django App

By Micropyramid django from Django community aggregator: Community blog posts. Published on Feb 12, 2016.


Django is a high-level Python Web framework that encourages rapid development.
Install django in virtualenv to keep it safe from messing around with the other versions you may have.
$ virtualenv env
$ env/bin/activate
$ pip install django

If Django is installed, you can see the version of our installation by running following command 

    $ python -c "import django; print(django.get_version())"

otherwise you get an error “No module named django”

Creating a Project:

If this is your first time using Django, you’ll need to auto-generate code that establishes a Django Project
 A collection of settings, 
 Including database configuration, 
 Django-specific options and application-specific settings.

With the following command:

    $ startproject myproject

This will Create "myproject" directory in your current directory.The above command(startproject) creates

    |- myproject/

The outer "myproject/" root directory is where your project is stored.

It is automatically created in each Django project. is a thin wrapper around before delegating to puts your project’s package on sys.path,Sets the DJANGO_SETTINGS_MODULE environment variable so that it points to your project’s file,  Calls django.setup() to initialize various internals of Django.

The inner "myproject/" directory is the actual Python package for your project.


An empty file that tells Python, this directory should be considered a Python package.


Settings/configuration for this Django project. 


The URL declarations for Django project


An entry-point for WSGI-compatible web servers to serve your project.

Database setup:

By default, SQLite is included in Python, so you won’t need to install anything to support your database.If you wish to use another database, install the appropriate database, and change ENGINE and NAME keys in the DATABASES to match your database connection settings

ENGINE – example: 'django.db.backends.sqlite3'

NAME – The name of your database. 


INSTALLED_APPS holds the names of all Django applications that are activated in this Django instance. By default, INSTALLED_APPS contains the following applications:

django.contrib.admin – The admin site. django.contrib.auth – An authentication system. django.contrib.contenttypes – A framework for content types. django.contrib.sessions – A session framework. django.contrib.messages – A messaging framework. django.contrib.staticfiles – A framework for managing static files.


Before using these applications you need to create a database by running the following command

$ python migrate

The migrate command looks at the INSTALLED_APPS setting and creates necessary database tables according to the database settings in your myapplication/ file.

Creating an App:

To create your app, make sure you’re in the same directory as and type this command:


$ python startapp myapp   #This creates a directory myapp


In our myapp application,create one Model: An Article has publication_date,title and content.

Edit the myapp/ file

  from django.db import models
  class Article(models.Model):
    publication_date = models.DateField()
    title = models.CharField(max_length=50)
    content = models.TextField(max_length=200)

Activating models: First you need to tell your project that "myapp" app is,Include your app name(myapp) in INSTALLED_APPS of file

  INSTALLED_APPS = ( 'django.contrib.admin',


Now,run following command:

$ python makemigrations myapp

By running makemigrations, you’re telling Django that you’ve made some changes to your models. Now, run migrate again to create those model tables in your database:

 $ python migrate

For writting view. Open the file myapp/ and write your code:

    from django.http import HttpResponse
    def index(request):
        return HttpResponse("Hello,You're at the myapp index.")

To call the view, we need to map it to a URL we need a URLconf.To create a URLconf in the "myapp" directory, create a file called
Now your app looks like:




from django.conf.urls import patterns, url from polls import views

urlpatterns = patterns('myapp.views', url(r'^$', index, name='index'),)

The next step is to point the root URLconf at the myapp.urls module. In myproject/ insert an include()




from django.conf.urls import patterns, include, url

from django.contrib import admin

urlpatterns = patterns('', url(r'^myapp/', include('myapp.urls')), url(r'^admin/', include(,   )

The development server:

To start Django Development server change into your outer directory and run the following command


$ python runserver

Now Go to http://localhost:8000/myapp/ in your browser,you will see the text which you defined in the index view.       

“Hello,You’re at the myapp index.”

django Payu Payment gateway Integration

By Micropyramid django from Django community aggregator: Community blog posts. Published on Feb 12, 2016.

In this blog, we will see how to integrate Django and PayU Payment Gateway. To integrate with PayU, we have package called "django-payu" - a pluggable Django application.

GitHub Repository: django-payu



$ pip install django-payu

Integration Process:

1. To start the integartion process, you need test merchant account and test credit card credentials to have the experience of overall transaction flow.

  • Once you create the account with PayU, they will provide SALT and KEY, we need this two credentials for the integration.

    NOTE: Here, you need to make the transaction request to the test-server and not on the production-server. Once you are ready and understood the entire payment flow, you can move to the production server.

2. To initialise the transaction, you need to generate a post request to the below urls with the parameters mentioned below

  • For PayU Test Server: POST URL:
  • For PayU Production (LIVE) Server: POST URL:
  • Parameters to be posted by Merchant to PayU in transaction request are:
    • key (Mandatory), txnid (Mandatory), amount(Mandatory), productinfo(Mandatory), firstname(Mandatory), email (Mandatory), phone (Mandatory), lastname, udf1-udf5, address1, address2, city, state, country, zipcode, surl(Mandatory), furl(Mandatory), curl(Mandatory), hash(Checksum)(Mandatory): sha512(key|txnid|amount|productinfo|firstname|email| udf1|udf2|udf3|udf4|udf5||||||SALT)
  • NOTE:
    • udf : User defined field
    • surl : Sucess URL (Success redirection URL)
    • furl : Failure URL (Failure redirection URL)
    • curl : Cancel URL (Cancel redirection URL)
    • hash(Checksum) : A checksum is generated by a mathematical function using the Merchant ID(key), message and the Salt as input. The Checksum algorithm used is SHA512.

3. Add the following information in the setting file using the details from your PayU account:

PAYU_MERCHANT_KEY = "Your MerchantID",


# Change the PAYU_MODE to 'LIVE' for production.

4. When the user click on the checkout button in your template, generate the mandatory parameter named "hash" using the get_hash() method.

from payu import get_hash
from uuid import uuid4

data = {
    'txnid':uuid4().hex, 'amount':10.00, 'productinfo': 'Sample Product',
    'firstname': 'test', 'email': '', 'udf1': 'Userdefined field',
hash_value = get_hash(data)

5. Then, send a post request to the PayU server using the HTML form filled with the data submitted by the buyer containing all the fields mentioned above.

6. When the transaction "post request" hits the PayU server, a new transaction is created in the PayU Database. For every new transaction in the PayU Database, a unique identifier is created every time at PayU’s end. This identifier is known as the PayU ID (or MihPayID).

7. Then the customer would be re-directed to PayU’s payment page. After the entire payment process, PayU provides the transaction response string to the merchant through a "post response". In this response, you would receive the status of the transaction and the hash.

8. Similar to Step 4, you need verify this hash value at your end and then only you shouls accept/reject the invoice order. You can verfy this hash using check_hash() method.

from django.http import HttpResponse
from payu import check_hash
from uuid import uuid4

def success_response(request):
    hash_value = check_hash(request.POST)
    if check_hash(request.POST):
        return HttpResponse("Transaction has been Successful.")

In this "django-payu" package, there are many other functions to Capture, Refund, Cancel etc., For the detailed documentation, See here.

Ernst-Georg Schmid: Fun with a integrarelational DBMS: SoilGrids

From Planet PostgreSQL. Published on Feb 12, 2016.

PostgreSQL has many ways to extend it's capabilities in well defined ways: Custom datatypes, custom functions, custom operators, even custom indexes.

And then there's the Foreign Data Wrapper, an API to pull (almost) any kind of data into PostgreSQL and treat it (almost) like a native table. There is already an impressive list of them, but sometimes you have to write your own one. Like yesterday, when I wanted to integrate pH data from the SoilGrids1km project into an already existing PostgreSQL/PostGIS system.

The data from SoilGrids is freely available, so I just could have downloaded it and put it into PostGIS. But the data set is a) huge and b) constantly updated and c) they have a REST API, so why not tap into it directly?

Maybe because the native language of the Foreign Data Wrapper is C and that's not exactly well suited for a fast prototype. :-) But then there is Multicorn, a bridge between the FDW API and Python, so I gave it a try...

After two hours, including the installation of Multicorn itself, I had this up and running:

CREATE SERVER soilgrids_srv
  OPTIONS (wrapper 'soilgrids_fdw.SoilGridsForeignDataWrapper');

CREATE FOREIGN TABLE public.soilgrids1km
   (latitude real ,
    longitude real ,
    ph_l real ,
    ph_m real ,
    ph_u real ,
    depth real ,
    publication_date date )
   SERVER soilgrids_srv;

select * from public.soilgrids1km where latitude = '51.57' and longitude = '5.39'

latitude longitude ph_l ph_m ph_u depth  publication_date
51.57    5.39      4.2  5.8  7.4  -0.025 2014-04-02 

I bet a more pythonic person than me could write something like this in under one hour or so.

And that level of extensibility is one of the reasons why I think that PostgreSQL is one awesome DBMS. Integrarelational even...

Shaun M. Thomas: PG Phriday: Database and Schema Basics

From Planet PostgreSQL. Published on Feb 12, 2016.

Sure, managing a Postgres cluster is great. We’ve got the software installed, more tool wrappers than we can wave several sticks at, and a running cluster that’s ready to do our bidding. But what do we do with it? In fact, what is a Postgres cluster anyway? How are databases involved? What about schemas? There’s critically important nomenclature that new users may find somewhat non-intuitive, so let’s demystify things a bit.

First, I’ll go on record that I never really liked the “cluster” label when it comes to how it’s often used in relation to Postgres. When people in the tech industry think of a cluster, they probably imagine something like this. Yes, a cluster can simply mean a grouping of related things, but context is everything. This gets even more confusing in larger contexts where a “cluster” of Postgres servers is a legitimate architecture, for horizontal scalability or otherwise.

So let’s back away for a second and consider the components themselves. Starting with the Postgres software, we have something similar to how programmers think of objects: instances. Classes are unusable until they’re instantiated as an object (barring virtual classes, interfaces, and other special cases.) In this way, the Postgres software is just there, doing nothing, until there’s an instance that’s created and running. Whether we use initdb, pg_createcluster, or some other wrapper to do this, no Postgres databases can exist anywhere without an instance to hold them.

The primary job of an instance is to merely exist; by itself, a Postgres instance is not usable. Postgres instances are assigned a port number for traffic control purposes, so it’s entirely possible to run multiple simultaneous instances on the same server. But otherwise, you can’t create tables in one, and they certainly are not a database.

This should not blow anyone’s mind, but there is an implicit lie here. Every new Postgres instance gets created along with with a database named “postgres”. Because of this, it’s very common for new users to consider an instance the database itself. After all, they can connect to “postgres” and create tables, issue queries, and so on. When using tools like pg_createcluster, an easy mistake to make, is that each application needs its own Postgres “cluster”. Even the Postgres initdb tool is a giant misnomer (or perhaps more of an anachronism); it creates a new instance which can contain databases.

So what’s a database then, and how do we create one and subsequently use it? When connecting to a Postgres database, most drivers make usage pretty clear. In order to connect, we need a port number and the database name. A default install leaves this at port 5432, with a database named “postgres”. Thus it’s not a stretch to view databases as an exclusive namespace. Any tables we create in one database can not be seen or used from another database.

As for creation, let’s assume we have an instance running on the defaults already. There are two ways to create new custom databases. First, we can use the createdb utility from the command line any number of times:

createdb myapp
createdb yourapp
createdb anotherapp

Or we can connect to the “postgres” database as a superuser (the “postgres” user by default), and then create the new databases using SQL syntax:


This of course raises the question: why would I want to create separate databases? Some use cases might not demand this, but it’s generally considered good practice to place objects in a container that describes its contents. The word “postgres” means nothing to anyone but people who really like Postgres, and those who interact with it regularly. It doesn’t suggest how data might be related, what application or project the data might belong to, or any other myriad reasons. It’s a nice playground, but shouldn’t really be used otherwise.

Some companies name databases after intended environment. In such cases, we often see “dev” or “stage” or “production” or some other variant. Others use the application name itself, making it fairly clear why the database exists, and what its primary role entails. Regardless of the approach, we strongly recommend creating databases with meaningful names, no matter how that might be interpreted.

But what comes next? Say we followed the rules and created a database for our application. Does that mean we can simply start creating tables there immediately? Yes, of course we can! But please do everyone a favor and refrain from doing so. Because as with all things Postgres, we often get a default we should try to avoid for more official projects. This time, that default is the public schema.

Here’s what happens if we connect and create a table right away:

  foo_desc  TEXT,
  lame_num  INT
              List OF relations
 Schema |      Name       | TYPE  |  Owner   
 public | foo             | TABLE | postgres

Any object we create, so long as we don’t provide a different schema, goes in the public context. To see why this might be a problem, consider how often global variables are used in large projects. This can be even more problematic for databases, as they can contain data from any number of related applications within a project. If any of these share a single table, view, or function with the same name, we could have a serious issue.

Schemas are the object namespaces of the database world. Without explicitly defining the desired schema, any new table or view simply goes into public, since it has to live somewhere. There are a few ways to use schemas to our advantage, and Postgres provides a few methods to simplify this even further.

Let’s start with the basics and create a schema and recreate our table there:

  foo_desc  TEXT,
  lame_num  INT
INSERT INTO (foo_desc, lame_num)
       VALUES ('This is a thing.', 10);
 id |     foo_desc     | lame_num 
  1 | This IS a thing. |       10

So long as a schema exists, we can use it simply by prefixing the table, view, or function with its name. But let’s say that’s annoying, because honestly who wants to do more typing? We can repeat the exercise by setting the Postgres search_path variable within our session.

CREATE SCHEMA more_stats;
SET search_path TO more_stats;
  foo_desc  TEXT,
  lame_num  INT
INSERT INTO foo (foo_desc, lame_num)
       VALUES ('This is a another thing.', 100);
 id |     foo_desc     | lame_num 
  1 | This IS a thing. |       10
 id |         foo_desc         | lame_num 
  1 | This IS a another thing. |      100

Notice how we were able to utilize content from both versions of the foo table? By changing the schema search path, we essentially modify what data we can access without requiring a prefix. What’s more, we can specify multiple schemas in a single search path. Let’s create another table and try it out:

CREATE SCHEMA less_stats;
SET search_path TO less_stats, more_stats, stats;
  bar_desc  TEXT,
  cool_num  INT
INSERT INTO bar (bar_desc, cool_num)
       VALUES ('I like cows.', 42);
 id |         foo_desc         | lame_num 
  1 | This IS a another thing. |      100
 id |   bar_desc   | cool_num 
  1 | I LIKE cows. |       42
          List OF relations
   Schema   | Name | TYPE  |  Owner   
 less_stats | bar  | TABLE | postgres
 more_stats | foo  | TABLE | postgres

This example was pretty overloaded with information, so let’s decompress a bit. What happened?

  1. The bar table was created in the new less_stats schema, so schemas are used in the order listed.
  2. The foo and bar tables are in different schemas, but we could use both without a prefix. So multiple schemas in the search path mean we don’t have to use the prefix for any objects they contain.
  3. Since both more_stats and stats contain a foo table, the more_stats version gets precedence. So not only are schemas used in the order listed, objects get view priority as well. Objects that exist in multiple schemas can only be “seen” without a prefix if they’re the first match.

Given these rules, there is some potential for undefined behavior when relying on search_path for larger projects that may contain multiple schemas. If any of those schemas contain tables or views with the same name, it might come as a surprise which version gets used without strict adherence to some kind of defined hierarchy. As such, for automated use, we generally recommend being explicit and prefixing table and view names in queries with the expected schema.

But for ad-hoc use, search paths are extremely handy. We can even “automate” this to a degree. In the postgresql.conf file, we can change the search_path to whatever we wish. Thus, anyone who connects to any database in an instance will automatically get a search path that removes public or enforces a certain schema view tree. Of course, doing this effectively exposes any of the objects in those schemas to a single global context, so some caution is probably warranted.

Finally, we can specify a search path and explicitly associate a database user with it. Thus, any time this user connects, they will get that list of schemas by default. This carries far less risk than setting it for all databases, and lets us customize access vectors based on how the connection might be used. Here’s how that might look for the stats application:

ALTER USER stats_read SET search_path TO less_stats, more_stats;

Now when the stats_read user connects, they don’t need to use prefixes when interacting with tables in those schemas. If that’s the primary role of the user, we don’t really have to worry about namespace collisions, since it’s probably a small subset of the available schemas.

Taking this idea further, A single database can contain schemas for each version of an application. Or hosting providers can restrict users to their own schema, so hundreds or even thousands of users can share a single database. Or perhaps each schema can correspond to a logical shard name, so the tables within reflect the same basic architecture, making shard-aware applications possible. In fact, this is how I designed shard_manager to work.

In the end, it boils down to this:

Instance -> Database -> Schema -> Object (table, view, etc.)

Each one of these can contain multiples of the thing it’s designed to hold. So unless there’s some kind of explicit need to physically separate data in such a way it needs another network port to access, multiple databases can coexist peacefully. Please keep this stack in mind before reaching for initdb, pg_createcluster, or similar tools.

Pavan Deolasee: Postgres-XL: Dealing with read-only transactions

From Planet PostgreSQL. Published on Feb 12, 2016.

One of the problems with Postgres-XL 9.2 is that it assigns a global transaction identifier (GXID) for every transaction that is started in the cluster. Since its hard for the system to know if a transaction is read-only or not (remember, even SELECT queries can do write activities), Postgres-XL would always assign a GXID and send that to the datanodes. This is quite bad for system performance because every read-only transaction now must go to the GTM, start a new transaction, grab an GXID and also finish the transaction on the GTM when its committed or aborted. For short transactions, like read-only pgbench workload, this adds a lot of overhead and latency. The overhead is even more noticeable when the actual work done by the transaction is very fast, say because the data is fully cached in shared buffers.

This is now set to change in the upcoming Postgres-XL 9.5 release. We have implemented the same lazy GXID assignment technique similar to PostgreSQL. This required us to enhance the concept of global session. A transaction which involves more than one node runs a global session and gets a unique global session identifier. If a node while executing the query decides to assign GXID to the transaction, it sends the global session identifier to the GTM. If the global transaction, identified by the global session, had already been assigned a GXID, the GTM sends back the same GXID to the node or a new GXID is assigned to the transaction. Further, the node sends back the assigned GXID to the coordinator so that it can decide whether to run a single phase or two-phase commit for the transaction.

While the lazy GXID assignment resulted in significant reduction in the GXID consumption rate, its still not enough because of yet another problem that we must solve. As you’re aware, Postgres-XL uses Multi-version Concurrency Control or MVCC for providing various transaction guarantees. But to reclaim the dead row versions, every node must check row versions against something called as RecentGlobalXmin which, in simple terms, is the oldest transaction that might be visible to the oldest transaction running in the system. To correctly compute this value, the GTM must be aware of the snapshot used by every running transaction in the cluster, even read-only transactions. That implies that every transaction must still report its start and end to the GTM, along with the oldest transaction its still interested in (known as snapshot xmin). If we wanted to do any significant improvement for read-only, short transaction workload, we must address this problem too.

So what we now have is a new cluster-monitor process that every datanode and coordinator runs as a postmaster child process. This process is responsible for periodically reporting node-local state to the GTM, which then computes the global state and sends back to all the nodes. To address the RecentGlobalXmin  described above, the cluster-monitor sends the RecentGlobalXmin  computed using node-local information. The GTM then computes a cluster-wide RecentGlobalXmin  which node must use. The entire algorithm is much more complex than it sounds because we must account for node failures, node disconnections and ensure that the RecenGlobalXmin  computation does not unnecessarily stalled because that will prevent any dead row versions getting removed from the system.

To summarise, the old way of executing a read-only transaction suffered from the following problems that the new Postgres-XL 9.5 will solve:

  1. Coordinator to GTM roundtrip to start a transaction and get a GXID
  2. Coordinator to GTM roundtrip for finishing the transaction
  3. Extra state and load on the GTM
  4. Rapid consumption of GXID, leading to wrap-around related problems

These features together give us an edge over other MPP databases as far as short, read-only transactions are concerned. Stay tuned for my next blog on OLTP performance benchmark results using these new enhancements.

Christophe Pettus: Indexes Are Not Cheap

From Planet PostgreSQL. Published on Feb 11, 2016.

I’ve noticed an increasing tendency in PostgreSQL users to over-index tables, often constructing very complex partial indexes to try to speed up very particular queries.

Be careful about doing this. Not only do additional indexes increase the plan time, they greatly increase insert time.

By way of example, I created a table with a single bigint column, and populated it with:

time psql -c "insert into t select generate_series(1, 100000000)"

That run without any indexes took 1 minute, 55 seconds; that run with eight indexes on the same table took 26 minutes, 39 seconds, or almost 14 times slower.

Regularly consult pg_stat_user_indexes and drop indexes that aren’t being used. Your disk space and insert times will thank you.

CSV Files - Read and Create

By GoDjango - Django Screencasts from Django community aggregator: Community blog posts. Published on Feb 11, 2016.

CSV files seem to be what some companies live by. Because of that we need to know how to work with them. In this video learn the basics of using csv reader, writer, DictReader, and DictWriter
Watch Now...

Devrim GÜNDÜZ: PostgreSQL YUM repo receives largest update set, ever.

From Planet PostgreSQL. Published on Feb 11, 2016.

Today, the PostgreSQL Global Development Group announced new minor releases for all supported version: 9.5.1, 9.4.6, 9.3.11, 9.2.15 and 9.1.20. As usual, RPMs are out, too:

Since the last set of updates, I have been working hard with other members in the community for shuffling the repo a bit, towards an aim for better and easier maintenance.

Below is the list of today's update set. They all include the latest versions of each software as of today:


Please let us know if you encounter any issues with the packaging.


Federico Campoli: Brigthton PostgreSQL Meetup news

From Planet PostgreSQL. Published on Feb 11, 2016.

I've been  busy recently  and I failed to update on the last meetup news.
I apologise for that.

We had a very interesting meetup in January.

Alexey Bashtanov explained how the grouping works in postgres and how to improve or even re implement in C the grouping functions.
On the meetup page there are the pictures from the meeting .
The presentation's recording is available there and the slides are free to download on slideshare there.

We are already preparing the next meetup announced for 18th of March.
That's the abstract.

The next postgresql meetup will be a technical talk on the ways in which PostgreSQL can communicate with the outside world.

This talk will cover PostgreSQL Foreign Data Wrappers which are tables that
can read and write to external services. 

Efficient ways to run queries from applications will also be covered.

About the speaker.
Matthew Franglen works as a team lead in Brandwatch. He has a long history
of development, with over 10 years of experience.

As usual the talk will be preceded by pizza and a nice chat. We'll also do a recording.

If you are in Brighton please join us here.

Chris Travers: Why Commons Should Not Have Ideological Litmus Tests

From Planet PostgreSQL. Published on Feb 10, 2016.

This will likely be my last post on this topic.  I would like to revive this blog on technical rather than ideological issues but there seems like a real effort to force ideology in some cases.  I don't address this in terms of specific rights, but in terms of community function and I have a few more things to say on this topic before I return to purely technical questions.

I am also going to say at the outset that LedgerSMB adopted the Ubuntu Code of Conduct very early (thanks to the suggestion of Joshua Drake) and this was a very good choice for our community.  The code of conduct provides a reminder for contributors, users, participants, and leadership alike to be civil and responsible in our dealings around the commons we create.  Our experience is we have had a very good and civil community with contributions from every walk of life and a wide range of political and cultural viewpoints.  I see  this as an unqualified success.

Lately I have seen an increasing effort to codify a sort of political orthodoxy around open source participation.  The rationale is usually about trying to make people feel safe in a community, but these are usually culture war issues so invariably the goal is to exclude those with specific political viewpoints (most of the world) from full participation, or at least silence them in public life.  I see this as extremely dangerous.

On the Economic Nature of Open Source

Open source software is economically very different from the sorts of software developed by large software houses.  The dynamics are different in terms of the sort of investment taken on, and the returns are different.  This is particularly true for community projects like PostgreSQL and LedgerSMB, but it is true to a lesser extent even for corporate projects like MySQL.  The economic implications thus are very different.

With proprietary software, the software houses build the software and absorb the costs for doing so, and then later find ways to monetize that effort.  In open source, that is one strategy among many but software is built as a community and in some sense collectively owned (see more on the question of ownerership below).

So with proprietary software, you may have limited ownership over the software, and this will be particularly limited when it comes to the use in economic production (software licenses, particularly for server software, are often written to demand additional fees for many connections etc).

Like the fields and pastures before enclosure, open source software is an economic commons we can all use in economic production.  We can all take the common software and apply it to our communities, creating value in those areas we value.  And we don't all have to share the same values to do it.  But it often feeds our families and more.

But acting as a community has certain requirements.  We have to treat eachother with humanity generally.  That doesn't mean we have to agree on everything but it does mean that some degree of civility must be maintained and cultivated by those who have taken on that power in open source projects.

On the Nature of Economic Production, Ownership and Power (Functionally Defined)

I am going to start by defining some terms here because I am using these terms in functional rather than formal ways.

Economic Production:  Like all organisms we survive by transforming our environment and making it more conducive to our ability to live and thrive.  In the interpersonal setting, we would call this economic production.  Note that understood in this way, this is a very broad definition and includes everything from cooking dinner for one's family to helping people work together.  Some of this may be difficult to value but it can (what is the difference between eating out and eating at home?  How much does a manager contribute to success through coordination?).

Ownership:  Defining ownership in functional rather than formal terms is interesting.  It basically means the right to use and direct usage of something.  Seen in this way, ownership is rarely all or nothing.  Economic ownership is the right to utilize a resource in economic production.  The extent to which one is restricted in economic production using a piece of software the less one owns it, so CAL requirements in commercial software and anti-TIVOization clauses in the GPL v3 are both restrictions on functional ownership.

Economic Power:  Economic power is the power to direct or restrict economic production.  Since economic production is required for human life, economic power is power over life itself.  In an economic order dominated by corporations, corporations control every aspect of our lives.  In places where the state has taken over from the corporations, the state takes over this as well.  But such power is rarely complete because not all economic production can be centrally controlled.

I am going to come back to these below because my hesitation on kicking people out of the community due to ideological disagreements (no matter how wrong one side may seem to be) have to do with this fear of abuse of economic power.

On Meritocracy (and what should replace it)

Meritocracy is an idea popularized by Eric Raymond, that power in a community should be given to technical merit.  In short, one should judge the code, not the person.  The idea has obvious appeal and is on the surface hyper-inclusive.  We don't have to care about anything regarding each other other than quality of code.  There is room for everyone.

More recently there has been push-back in some corners against the idea of meritocracy.  This push-back comes from a number of places, but what they have in common is questioning how inclusive it really is.

The most popular concern is that meritocracy suggests that we should tolerate people who actively make the community less welcoming, particularly for underrepresented groups. and therefore meritocracy becomes a cover for excluding the same groups who are otherwise excluded in other social dimensions, that the means of exclusion differs but who is excluded might not.

There is something to be said for the above concern, but advocates have often suggested that any nexus between community and hostile ideas is sufficient to raise a problem and therefore when an Italian Catholic expresses a view of gender based on his religion on Twitter, people not even involved in the project seek his removal from it on the grounds that the ideas are toxic.  For reasons that will become clear, that is vast overreach, and a legitimate complaint is thus made toxic by the actions of those who promote it.  And similarly toxic are the efforts by some to use social category to insist that their code should be included just to show a welcoming atmosphere.

A larger problem with meritocracy though is the way it sets up open source communities to be unbalanced, ruled by technical merit and thus not able to attract the other sorts of contributions needed to make most software successful.  In a community where technical merit is the measure by which we are judged, non-technical contributions are systematically devalued and undervalued.  How many open source communities produce software which is poorly documented and without a lot of attention to user interface?  If you devalue the efforts at documentation and UI design, how will you produce software which really meets people's needs?  If you don't value the business analysts and technical writers, how will you create economic opportunities for them in your community?  If you don't value them, how will you leverage their presence to deliver value to your own customers?  You can't if your values are skewed.

The successor to meritocracy should be economic communitarianism, i.e. the recognition that what is good for the community is economically good for all its members.  Rather than technical merit, the measure of a contribution and a contributor ought to be the value that a contribution brings the community.    Some of those will be highly technical but some will not.  Sometimes a very ordinary contribution that anyone could offer will turn the tide because only one person was brave enough to do it, or had the vision to see it as necessary.  Just because those are not technical does not mean that they are not valuable or should not be deeply respected.  I would argue that in many ways the most successful open source communities are the ones which have effectively interpreted meritocracy loosely as economic communitarianism.

On Feeling Safe in the Community

Let's face it  People need to feel safe and secure in the community regarding their physical safety and economic interests.  Is there any disagreement on this point?  If there is, please comment below.  But the community cannot be responsible for how someone feels, only in making sure that people are objectively physically and economically secure within it.  If someone feels unsafe in attending conferences, community members can help address security concerns and if someone severely misbehaves in community space, then that has to be dealt with for the good of everyone.

I don't think the proponents of ideological safety measures have really thought things through entirely.  The world is a big place and it doesn't afford people ideological safety unless they don't go out and work with people they disagree with.  As soon as you go across an international border, disagreements will spring up everywhere and if you aren't comfortable with this then interacting on global projects is probably not for you.

Worse, when it comes to conduct outside of community circles, those in power in the community cannot really act constructively most of the time.  We don't have intimate knowledge and even if we do, our viewpoints have to be larger than the current conflict.

On "Cultural Relativism:" A welcoming community for all?

One of the points I have heard over and over in discussions regarding community codes of conduct is that welcoming people regardless of viewpoint (particularly on issues like abortion, sexuality, etc) is cultural relativism and thus not acceptable.  I guess the question is not acceptable to whom?  And do we really want an ideological orthodoxy on every culture war topic to be a part of an open source project?  Most people I have met do not want this.

But the overall question I have for people who push culture war codes of conduct is "when you say a welcoming community for all, do you really mean it?  Or do you just mean for everyone you agree with?  What if the majority changes their minds?"

In the end, as I will show below, trying to enforce an ideological orthodoxy in this way does not bring marginal groups into the community but necessary forces a choice of which marginal groups to further exclude.  I don't think that is a good choice and I will go on record and say it is a choice I will steadfastly refuse to make.

A Hypothetical

Ideology is shaped by culture, and ideology of sexuality is shaped by family structures, so consequently where family structures are different, views on sexuality will be also.

So suppose someone on a community email list includes a pro-same-sex marriage email signature, something like:

"Marriage is an institution for the benefit of the spouses, not [to] bind parents to their children" -- Ted Olson, arguing for a right to same-sex marraige before the United States Supreme Court.

So a socially conservative software developer from southern India complaints to the core committee saying that this is an attack on his culture, saying that traditional Indian marriages are not real marriages.  Now, I assume most people would agree that it would be best for the core committee not to insist that the email signature be changed for someone to continue to participate.  So with such a decision, suppose the complainant changes his signature instead to read:

"If mutual consent makes a sexual act moral, whether within marriage or without, and, by parity of reasoning, even between members of the same sex, the whole basis of sexual morality is gone and nothing but misery and defect awaits the youth of the country… " -- Mohandas Gandhi

Now the first person decries the signature as homophobic and demands the Indian fellow be thrown off the email list.  And the community, if it has decided to follow the effort at ideological safety has to resolve the issue.  Which group to exclude?  The sexual minority?  Or the group marginalized through a history of being on the business end of colonialism?  And if one chooses the latter, then what does that say about the state of the world?  Should Indians, Malaysians, Catholics, etc. band together to fork a competing project?  Is that worth it as a cost?  Doesn't that hurt everyone?

On Excluding People from the Commons

In my experience, excluding people from the commons carries with it massive cost, and this is a good thing because it keeps economic power from being abused.  I have watched the impact first hand.  LedgerSMB would not even exist if this weren't an issue with SQL-Ledger.  That we are now the only real living fork of SQL-Ledger and far more active than the project we forked from is a testament to the cost.

Of course in that case the issue was economic competition and a developer who did not want to leverage community development to build his own business.  I was periodically excluded from SQL-Ledger mailing lists etc for building community documentation (he sold documentation).  Finally the fork happened beccause he wouldn't take security reports seriously.  And this is one of the reasons why I would like to push for an inclusive community.

But I also experienced economic ramifications from being excluded.  It was harder to find customers (again, the reason for exclusion was economic competition so that was the point).  In essence, I am deeply aware of the implications of kicking people out.

I have seen on email lists and tracker tickets the comparison of the goal of excluding people with problematic ideologies with McCarthyism.  The goal of McCarthyism was indeed similar, to make sure that if you had the wrong ideas you would be unable to continue a professional career.  I have had relatives who suffered because they defended the legal rights of the Communist Party during that time.  I am aware of cases where the government tried to take away their professional career (unsuccessfully).

Management of community is political and the cost of excluding someone is also political.  We already exist in some ways on the margins of the software industry.  Exclude too many people and you create your own nemesis.  That's what happened to SQL-Ledger and why LedgerSMB is successful today.

Notes on former FreeBSDGirl

One blog entry that comes from the other side of this issue is Randi Harper's piece on why she no longer will go to FreeBSD conferences and participate on IRC channels.   I am not familiar with the facts surrounding her complaints and frankly I don't have time to be so what the nature of her harassment complaint is, I will not be the judge.

There is however another side to the issue that is outside what she evidently has experience with, and that is the role of software maintainers in addressing the sorts of complaints she made.  Consequently I want to address that side and then discuss her main points at the bottom.

One thing to remember is that when people make accusations of bullying, harassment, etc. the people in charge are also the people with the least actual knowledge of what is going on.  Expecting justice from those in power in cases like this will lead, far more often than not, to feelings of betrayal.  This is not because of bad intentions but because of lack of knowledge.  This was one thing I learned navigating schoolyard bullies when I was growing up and we as project maintainers are in an even lower knowledge role than school administrators are.  Bullies are furthermore usually experts at navigating the system and take advantage of those who are not as politically adept, so the more enforcement you throw at the problem, the worse it gets.

So there is an idea that those in charge will stop people from treating eachother badly.  That has to stop because it isn't really possible (as reasonable as it sounds).  What we can do is keep the peace in community settings and that is about it.  One needs bottom up solutions, not top down ones.

So if someone came to me as a maintainer of a project alleging harassment on Twitter and demanding that an active committer be removed, that demand would probably go nowhere.  If political statements were mentioned, the response would be "do we want a political orthodoxy?"  Yet LedgerSMB has avoided these problems largely because, I think, we are a community of small businesses and therefore are used to working through disagreements and maybe because we are used to seeing these sorts of things as political.

Her main points though are worth reading and pondering.  In some areas she is perfectly right and in some areas dangerously wrong.

Randi is most right in noting that personal friction cannot be handled like a technical problem.  It is a political problem and needs to be handled as such.  I don't think official processes are the primary protection here, and planning doesn't get you very far, but things do need to be handled delicately.

Secondly, there is a difference between telling someone to stay quiet and telling someone not to be shouting publicly.   I think it is worth noting that if mediation is going to work then one cannot have people trying to undermine that in public, but people do need friends and family for support and so it is important to avoid the impression that one is insisting on total confidentiality.

Randi is also correct that how one deals with conflict is a key gauge of how healthy an open source community is.  Insisting that people be banished because of politically offensive viewpoints however does not strike me as healthy or constructive.  Insisting that people behave themselves in community spaces does.  In very rare cases it may be necessary to mediate cases that involve behavior outside that, but insisting on strict enforcement of some sort of a codified policy will not bring peace or prosperity.

More controversially I will point out that there is a point that Randi makes implicitly that is worth making explicit here, namely that there is a general tone-deafness to women's actual experiences in open source.  I think this is very valid.  I can remember a former colleague in LedgerSMB making a number of complaints about how women were treated in open source.  Her complaints included both unwanted sexual attention ("desperate geeks") and more actionably the fact that she was repeatedly asked how to attract more women to open source (she responded once on an IRC channel with "do you know how annoying that is?").  She ultimately moved on to other projects following a change in employment that moved LedgerSMB outside the scope of duties,  but one obvious lesson that those of us in open source can take from this is just to listen to complaints.  Many of these are not ones that policies can solve (you really want a policy aimed at telling people not to ask what needs to be done to attract more women to open source?) but if we listen, we can learn something.

One serious danger in the current push for more expansive codes of conduct is that it puts those who have the least knowledge in the greatest responsibility.  My view is that expansive codes of conduct, vesting greater power with maintainers over areas of political advocacy outside community fora will lead to greater, not less conflict.  So I am not keen in her proposed remedies.

How Codes of Conducts Should be Used

The final point I want to bring up here is how codes of conduct should be used.  These are not things which should be seen as pseudo-legal or process-oriented documents.  If you go this way, people will abuse the system.  It is better in my experience to vest responsibility with the maintainers in keeping the peace, not dispensing out justice, and to have codes of conduct aimed at the former, not the latter.  Justice is a thorny issue, one philosophers around the world have been arguing about for millennia with no clear resolution.

A major problem is the simple fact that perception and reality don't always coincide.  I was reminded of this controversy while reading an article in The Local about the New Years Eve sexual assaults, about work by a feminist scholar in Sweden to point out that actually men are more at risk from bodily violence than women are, and that men suffer disproportionately from crime but are the least likely to modify behavior to avoid being victimized.  The article is worth reading in light of the current issues.

So I think if one expects justice from a code of conduct, one expects too much.  If one expects fairness from a code of conduct, one expects too much.  If one expects peace and prosperity for all, then that may be attainable but that is not compatible with the idea that one has a right not to be confronted by people with dangerous ideologies.

Codes of conducts, used right, provide software maintainers with a valuable tool for keeping the peace.  Used wrong, they lead open source projects into ruin.  In the end, we have to be careful to be ideologically and culturally inclusive and that means that people cannot guarantee that they are safe from ideas they find threatening.

Magnus Hagander: A new version of Planet PostgreSQL

From Planet PostgreSQL. Published on Feb 10, 2016.

I have just pushed code for a new version of the codebase for Planet PostgreSQL.

For those of you who are just reading Planet, hopefully nothing at all should change. There will probably be some bugs early on, but there are no general changes in functionality. If you notice something that is wrong (given a couple of hours from this post at least), please send an email to planet(at) and we'll look into it!

For those who have your blogs aggregated at Planet PostgreSQL, there are some larger changes. In particular, you will notice the whole registration interface has been re-written. Hopefully that will make it easier to register blogs, and also to manage the ones you have (such as removing a post that needs to be hidden). The other major change is that Planet PostgreSQL will now email you whenever something has been fetched from your blog - to help you catch configuration mistakes bigger.

The by far largest changes are in the moderation and administration backend. This will hopefully lead to faster processing of blog submissions, and less work for the moderators.

Special Offer for the Readers of DjangoTricks Blog

By DjangoTricks from Django community aggregator: Community blog posts. Published on Feb 09, 2016.

Packt Publishing, the company that published my Django book, has a special offer for enthusiast and professional developers reading this blog. For two weeks you can get the eBook "Web Development with Django Cookbook - Second Edition" for half price. The eBook is available in PDF, ePub, Mobi, and Kindle formats. Also you will get access to download the related code files.

Use the discount code DJGTRK50 at the Packt Publishing bookstore.
The discount is valid until the 24th of February, 2016.

Just one comma

By Reinout van Rees' weblog from Django community aggregator: Community blog posts. Published on Feb 09, 2016.

In Django, if you say managed = False in a model's Meta, you tell Django not to touch the database table. So: no automatic database migrations, for instance.

Now, what is the problem if you have managed = False, and Django does do migrations?

Some of you will have seen the error already.

The comma after False is the problem:

>>> a = False,
>>> a
>>> bool(a)

The comma turns it into a tuple. And a non-empty tuple evaluates to True!

I found it quite funny. My colleague was torn between "extremely relieved" and "extremely annoyed" :-)

Christophe Pettus: Django 1.8/1.9 and PostgreSQL: An Ever-Closer Union

From Planet PostgreSQL. Published on Feb 09, 2016.

I’ll be speaking about Django and PostgreSQL at PyCon US 2016.

Richard Yen: 4 Steps to Using UUIDs with JPA

From Planet PostgreSQL. Published on Feb 09, 2016.

Postgres supports a variety of data types that allow data architects to store their data consistently, enforce constraints through validation, maximize performance, and maximize space.  Recently, I was asked to show someone how to store a UUID (Universally Unique Identifier) into Postgres by way of JPA (Java Persistence API).

read more

Wagtail and Streamfields

By Nicolas Kuttler tagged Django from Django community aggregator: Community blog posts. Published on Feb 09, 2016.

Django was always great for developers out of the box, but creating friendly admin interfaces was always a little too much work. This post explains why I now consider using Wagtail to make this task easier.

Umair Shahid: 2ndQ talks at PGDay FOSDEM 2016

From Planet PostgreSQL. Published on Feb 09, 2016.

It is always exciting to meet community members and exchange ideas about PostgreSQL and the eco-system around the database. I was lucky enough to be a part of FOSDEM PGDay in Brussels on 29th January this year. 3 of 2ndQuadrant’s very best spoke at the event. If you missed the talks, you can take a look at their slide decks shared below.

I will be sure to direct any queries you might have to the experts themselves! :-)

Christophe Pettus: PostgreSQL High Availability, 2016 Edition

From Planet PostgreSQL. Published on Feb 08, 2016.

The slides from my talk at PG Day at FOSDEM 2016 are now available.

Hubert 'depesz' Lubaczewski: – new change and some stats

From Planet PostgreSQL. Published on Feb 08, 2016.

Quite a long time ago (in October), Oskar Liljeblad reported a bug in anonymization. Namely – group keys were not anonymized. You can see example of such plan here. I finally got to it, fixed the bug, pushed new version to live site, and now such plans will be correctly anonymized. Thanks Oskar, and sorry […]

Hubert 'depesz' Lubaczewski: Waiting for 9.6 – Add num_nulls() and num_nonnulls() to count NULL arguments.

From Planet PostgreSQL. Published on Feb 08, 2016.

On 5th of February, Tom Lane committed patch: Add num_nulls() and num_nonnulls() to count NULL arguments.   An example use-case is "CHECK(num_nonnulls(a,b,c) = 1)" to assert that exactly one of a,b,c isn't NULL. The functions are variadic, so they can also be pressed into service to count the number of null or nonnull elements in […]

Gulcin Yildirim: Prague PostgreSQL Developer Day 2016

From Planet PostgreSQL. Published on Feb 08, 2016.

The 9th annual “Prague PostgreSQL Developer Day” conference will be held on 17th and 18th February in Prague.


Me with PostgreSQL elephants at Prague Zoo :)

Welcome to Prague PostgreSQL Developers Day

Prague PostgreSQL Developer Day is a two day conference and talks & trainings are mostly in Czech language. Conference program is also ready, you can check the full schedule.

1st Day: Trainings – 17th of February

The first day of the conference, there will be trainings and practical lectures for a smaller number of attendees. You can find the detailed trainings program. All trainings will be in Czech language.

One of the most interesting ones is “Reading query execution plans” training. With this training Tomas Vondra from 2ndQuadrant aims to help trainees to understand the basic stages of processing a SQL query (parsing, planning and execution) and how the database trying to find the best plan.

If you are more into PostgreSQL database administration you might want to join Pavel Stěhule‘s “Configuration, administration and monitoring PostgreSQL” training. He will focus on monitoring PostgreSQL for preventing problems and unwanted downtimes also figuring out possible bottlenecks of the PostgreSQL systems.

If you are interested in awesome PostGIS you might consider joining Jan Michalek‘s “Introduction to PostGIS” training. After the training participants are expected to have ability to perform simple spatial queries, import / export data, etc.

The last training that I would like to mention is Aleš Zelený‘s “Learning PostgreSQL for Oracle DBAs”. He will show differences and similarities between Oracle and PostgreSQL databases from the perspective of an Oracle DBA. If you are interested in listening his journey while he has been adapting PostgreSQL with a different DBMS background and different experiences, you might want to join his training.

With these 4 trainings that I mentioned above, the first day will end and the conference will continue with the talks both in Czech and English languages.

2nd Day: Talks – 18th of February

There will be interesting talks on the second day of the conference. You can find the detailed list of talks here.

Petr Jelinek from 2ndQuadrant will give a talk about “Open-source clustering solutions for PostgreSQL” and Tomas Vondra will talk about “Major changes in PostgreSQL 9.5″.

The two English talks will be given by Andres Freund and Marco Slot. Andres will talk about “IO in Postgres – Architecture, Tuning, Problems” and Marco will talk about “pg_paxos: Table replication distributed through consensus”. You don’t want to miss any of the talks though!

Last year’s event

This year, Prague PostgreSQL Developer Day conference expects to have around 200 people attending. They sold out in 7 days!

Don’t miss this great chance of meeting PostgreSQL people and having 2 days full of PostgreSQL!

Prague PostgreSQL Developer Day 2015 was a huge success with more than 150 people attending. If you would like to read about last year’s event, check out Tomas’s blog post.

We would like to meet with you!

Prague is an amazing city with its dark and gothic atmosphere, I enjoy spending my time in Prague and I’m so excited to join this year’s event!

Like I said before, Prague is awesome and 17th-18th of February will be more awesome with PostgreSQL people discussing about PostgreSQL and meeting with each other.

We will have a company booth and I will be there to talk about PostgreSQL and answer your questions about 2ndQuadrant.

Hope to see you all there. Happy P2D2!

Reuven Lerner: [Video 435] Simon Riggs: Databases — The Long View

From Planet PostgreSQL. Published on Feb 06, 2016.

Databases form the cornerstone of many applications, Web sites, and platforms. A huge amount of time, money, and research has been poured into databases over the last few decades. But our thirst for data, and the quantities that we’re trying to read and analyze, continue to grow. What can and should we do? How can we ensure reliability?  How can we communicate with a growing number of other systems? And where does PostgreSQL, an open-source relational database with a growing number of features, fit into this trend? In this talk, Siimon Riggs answers all of these questions, and describes how PostgreSQL’s developers are working to keep as many of these questions in mind as they continue to improve their contribution to the world of databases.

Time: 22 minutes

The post [Video 435] Simon Riggs: Databases — The Long View appeared first on Daily Tech Video.

Fresh Book for Developers Working with Django 1.8

By DjangoTricks from Django community aggregator: Community blog posts. Published on Feb 06, 2016.

This week the post office delivered a package that made me very satisfied. It was a box with three paper versions of my "Web Development with Django Cookbook - Second Edition". The book was published at the end of January after months of hard, but fulfilling work in the late evenings and on weekends.

The first Django Cookbook was dealing with Django 1.6. Unfortunately, the support for that version is over. So it made sense to write an update for a newer Django version. The second edition was adapted for Django 1.8 which has a long-term support until April 2018 or later. This edition introduces new features added to Django 1.7 and Django 1.8, such as database migrations, QuerySet expressions, or System Check Framework. Most concepts in this new book should also be working with Django 1.9.

My top 5 favourite new recipes are these:

  • Configuring settings for development, testing, staging, and production environments
  • Using database query expressions
  • Implementing a multilingual search with Haystack
  • Testing pages with Selenium
  • Releasing a reusable Django app

The book is worth reading for any Django developer, but will be best understood by those who already know the basics of web development with Django. You can learn more about the book and buy it at the Packt website or Amazon.

I thank the Packt Publishing very much for long cooperation in the development of this book. I am especially thankful to acquisition editor Nadeem N. Bagban, content development editors Arwa Manasawala and Sumeet Sawant, and technical editor Bharat Patil. Also I am grateful for insightful feedback from the reviewer Jake Kronika.

What 5 recipes do you find the most useful?

Shaun M. Thomas: PG Phriday: Tackling Intractable ACLs

From Planet PostgreSQL. Published on Feb 05, 2016.

Say that three times fast! Joking aside, managing database object access is a headache for users and DBAs alike. Postgres isn’t really an exception in that regard, but it does provide a few tools to greatly reduce the pain involved. The crazy thing is that few people even know this feature exists. I’ve known about it for a while myself, but it always slips my mind because it feels so wrong not to explicitly grant permissions.

What does that mean? Well, consider we have a schema with some existing tables, and a role we want to have read permission on tables in that schema. Here’s one as an example:

CREATE TABLE acl_app.t1 (foo INT);
CREATE TABLE acl_app.t2 (bar INT);
CREATE ROLE acl_read;
GRANT USAGE ON SCHEMA acl_app TO acl_read;

If this were a preexisting schema, normally we would grant read permission to tables like this:

GRANT SELECT ON acl_app.t1 TO acl_read;
GRANT SELECT ON acl_app.t2 TO acl_read;

And that’s also the usual suggestion for grants after tables are created. Create the table, then grant the permissions. It’s fairly straight-forward, and an expected part of database administration. But what about when we have an existing table with dozens or hundreds of tables? Doing the grants manually as shown above would be monumentally irritating! In fact, for many database systems, the only way forward is to use system catalogs to generate a script, and then execute the output in the database. Postgres lets you do that:

  SELECT 'GRANT SELECT ON acl_app.' || tablename ||
         ' TO acl_read;'
    FROM pg_tables
   WHERE schemaname = 'acl_app'
) TO '/tmp/grant_perms.sql';
\i /tmp/grant_perms.SQL

But the kind Postgres devs have also provided us with some extremely useful shorthand, because while usable, the script approach is something of an ugly hack. Here’s how that looks:

   TO acl_read;
\z acl_app.*
 Schema  | Name | TYPE  |     Access privileges     
 acl_app | t1   | TABLE | postgres=arwdDxt/postgres
         |      |       | acl_read=r/postgres       
 acl_app | t2   | TABLE | postgres=arwdDxt/postgres
         |      |       | acl_read=r/postgres

As we can see here, the permissions show that the acl_read role can read the tables in this schema. Unfortunately this is a one-time operation. All subsequent tables created in the future will not be readable by the acl_read role. Here’s what happens if we create a new table now:

CREATE TABLE acl_app.t3 (baz INT);
\z acl_app.*
 Schema  | Name | TYPE  |     Access privileges     
 acl_app | t1   | TABLE | postgres=arwdDxt/postgres
         |      |       | acl_read=r/postgres       
 acl_app | t2   | TABLE | postgres=arwdDxt/postgres
         |      |       | acl_read=r/postgres       
 acl_app | t3   | TABLE |

Now we have a new table that’s only accessible by the postgres user, since that’s who created it. In most database software, that’s where the story ends. We can either re-run the above command after adding new tables, or ask developers to add an explicit GRANT statement following every new table, for every user or role that needs access.

That’s not so hard when using a template. But templates aren’t always maintained, new hires might not know where they are or use them properly, and sometimes people just plain forget. If testing or staging environments don’t exactly match production, or developers use these environments with elevated privileges, missing grants might not show up until the new tables and the associated code hit production. What then?

Well, Postgres provides some non-standard syntax that solves this conundrum. We can actually modify the default privileges for several database objects, so they’re explicitly set upon creation. Let’s try this out with another new table:

   IN SCHEMA acl_app
   TO acl_read;
CREATE TABLE acl_app.t4 (buz SERIAL);
\z acl_app.*
 Schema  |    Name    |   TYPE   |     Access privileges     
 acl_app | t1         | TABLE    | postgres=arwdDxt/postgres
         |            |          | acl_read=r/postgres       
 acl_app | t2         | TABLE    | postgres=arwdDxt/postgres
         |            |          | acl_read=r/postgres       
 acl_app | t3         | TABLE    |
 acl_app | t4         | TABLE    | postgres=arwdDxt/postgres
         |            |          | acl_read=r/postgres       
 acl_app | t4_buz_seq | SEQUENCE |

As we can see, the new t4 table has the expected grant. But this is actually a trick example. When tables are declared in Postgres with a SERIAL or BIG SERIAL column type, a new sequence is created, and grants to the parent table do not cascade to this new sequence. Thus, if a role needed to insert into this table and we granted INSERT privileges, it would not have that capability so long as it used the sequence.

Of course, the way to fix that comes through the same mechanism. For roles that require insert capabilities, two default modifications cover that scenario. Here’s another role meant for inserting into our tables, and how those permissions show up:

CREATE ROLE acl_writer;
GRANT USAGE ON SCHEMA acl_app TO acl_writer;
   IN SCHEMA acl_app
   TO acl_writer;
   IN SCHEMA acl_app
   TO acl_writer;
CREATE TABLE acl_app.t5 (bla SERIAL);
\z acl_app.t5*
 Schema  |    Name    |   TYPE   |     Access privileges
 acl_app | t5         | TABLE    | postgres=arwdDxt/postgres
         |            |          | acl_read=r/postgres
         |            |          | acl_writer=arw/postgres
 acl_app | t5_buz_seq | SEQUENCE | postgres=rwU/postgres
         |            |          | acl_writer=U/postgres

Now, when new tables are created that automatically generate an associated sequence, the acl_writer role will be able to utilize both. We can also see that the acl_read role also worked as expected. The best part is that we don’t need to remember anything when creating new objects, so long as we set everything up beforehand. And if we forget and don’t add this stuff until later? Well, we can always use the ALL TABLES IN grant syntax to fix any existing objects.

There is one caveat we must point out, however. New functions in Postgres are always granted usage to PUBLIC following creation. This is mostly because Postgres doesn’t differentiate between functions and procedures, so there is no assumption that a function will act on database objects. Normally this isn’t a problem, but if a function is created with SECURITY DEFINER, it executes as the user who created it. If the function does modify table data, that means any user with access to the database can invoke it, and that’s a massive security hole.

We recommend fixing this in all production databases with these commands:


The last command should be repeated for any other schema in the database. From that point on, all function grants must either be explicit, or use the ALTER DEFAULT... technique we’ve covered in this article. It probably seems draconian, and in most cases that might be the case. However, in critical or sensitive database systems, sometimes being a little zealous can be beneficial. Preventing accidental privilege escalations also stops malicious ones.

And data can rarely be too safe.

gabrielle roth: PDXPUG: February meeting in two weeks

From Planet PostgreSQL. Published on Feb 04, 2016.

When: 6-8pm Thursday Feb 18, 2016
Where: Iovation
Who: Jason Owen
What: Incremental Schema Discovery via JSON Wrangling

Over the last few releases, Postgres has added and expanded support for storing and querying JSON documents. While the simplicity and flexibility of storing free-form data can be appealing, frequently what’s stored is actually fairly structured. Using data from the GitHub Archive, Jason will show how easy it is to load JSON into Postgres, demonstrate some fairly mechanical schema refactorings to extract structure from JSON data, and then compare the resulting normalized tables with the original, with the documentation, and with the data set loaded into Google BigQuery.

Jason has been using Postgres in earnest for two years now. While he’s worn the DBA hat as a full stack engineer from time to time, he is primarily a developer. He works at Tripwire and is an alumnus of Portland State University. Sometimes he says things online as @jasonaowen.

If you have a job posting or event you would like me to announce at the meeting, please send it along. The deadline for inclusion is 5pm the day before the meeting.

Our meeting will be held at Iovation, on the 32nd floor of the US Bancorp Tower at 111 SW 5th (5th & Oak). It’s right on the Green & Yellow Max lines. Underground bike parking is available in the parking garage; outdoors all around the block in the usual spots. No bikes in the office, sorry!

Elevators open at 5:45 and building security closes access to the floor at 6:30.

When you arrive at the Iovation office, please sign in on the iPad at the reception desk.

See you there!

David Kerr: Amazon Maintenance and your RDS Instances

From Planet PostgreSQL. Published on Feb 04, 2016.

I recently had the pleasure of Amazon telling me that they had to reboot all of my Postgres RDS instances to apply some security patches. When using RDS you generally expect that Amazon is going to do something like this and I was at least happy that they told me about it and gave me the option to trigger it on a specific maintenance window or else on my own time (up to a drop dead date where they'd just do it for me) One thing that you can't really know is what the impact of the operation is going to be. You know it's a downtime, but for how long? My production instances, are of course, Multi-AZ but all of my non-production instances are not. Fortunately, my non-production instances and my production instances both needed to get rebooted, so I could do some up-front testing on the timing. What I found was that the process takes about 10 to 15 minutes and, in this particular case, it was not impacted by database size. Although it is impacted by the number of instances you're rebooting at the same time. It seems Amazon queues the instances up so that some instances take longer than others. The pre-reboot security patches took about 5 minutes to load during this time the database was up. This was followed by a shutdown / reboot during which the database was unavailable. After the reboot which took less than a minute the database was immediately available while the system did post processing. After that a backup is performed which doesn't impact the system. So total downtime was about a minute, but I scheduled 10 minutes just to be safe. For the Multi-AZ instances the same process is followed but the shutdown / reboot is accompanied by an AZ failover which takes place nearly instantly. This is pretty cool as long as your applications are robust enough to re-connect. (Mine were not, so they required a restart) I timed the reboot to go with a deploy so no additional downtime was required. In the end it was fairly painless, if you don't trust your applications ability to reconnect it's good to baby sit them. Otherwise kicking it off during a maintenance window and not worrying about it is certainly doable.

Gulcin Yildirim: PgBouncer 1.7 – “Colors Vary After Resurrection”

From Planet PostgreSQL. Published on Feb 03, 2016.

PgBouncer is a lightweight connection pooler for PostgreSQL. PgBouncer 1.7 was announced on the 18th of December 2015. In this blog post we’ll talk about the major new improvements in PgBouncer.

I enjoy checking their funny release names at every new release and for celebrating PgBouncer 1.7: “Colors Vary After Resurrection” release, I used the lovely Dumbo image.


The Most Colorful Features

  • PgBouncer 1.7 supports TLS connections and, I think this is the biggest improvement of the new release. They used OpenSSL/LibreSSL libraries as backend implementation of the feature.

Note: TLS (Transport Layer Security) is a protocol that ensures privacy between communicating applications and their users on the Internet. When a server and client communicate, TLS ensures that no third party may eavesdrop or tamper with any message. TLS is the successor to the Secure Sockets Layer (SSL).

  • PgBouncer now supports authentication via TLS client certificate.

Note: Traditionally, TLS Client Authentication has been considered the alternative to bearer tokens (passwords and cookies) for web authentication. In TLS Client Authentication, the client (browser) uses a certificate to authenticate itself during the TLS handshake. Once the TLS connection is established (and authenticated), the client and server run HTTP on top of the TLS layer.

Let’s dig into details of TLS settings of PgBouncer. There are 14 config parameters related with TLS setup (client side + server side settings).

For assigning which TLS mode to use for connections from clients, we should set client_tls_sslmode parameter. TLS connections are disabled by default. When enabled, client_tls_key_file and client_tls_cert_file must be also configured to set up key and cert PgBouncer uses to accept client connections.

We can assign a root certificate to validate client certificates by setting client_tls_ca_file parameter, default is unset.

We can specify which TLS protocol versions are allowed by setting client_tls_protocols parameter, default is all.

For more detailed client side settings, you can check client_tls_ciphers, client_tls_ecdhcurve and client_tls_dheparams parameters.

Now, let’s talk about TLS server side config parameters. First, we need to declare TLS mode to use for connections to PostgreSQL servers with server_tls_sslmode parameter. TLS connections are disabled by default. We can assign CA server with server_tls_ca_file parameter. If we would like to assign private key for PgBouncer to authenticate against PostgreSQL server, we can use server_tls_key_file parameter, we can even assign a certificate for private key that PostgreSQL server can validate with server_tls_cert_file parameter. Like we did in client side TLS connection settings, we can declare which TLS protocol versions are allowed with server_tls_protocols parameter.

  • After TLS support, other significant new feature is support for “peer” authentication on Unix sockets.

Note: The peer authentication method works by obtaining the client’s operating system user name from the kernel and using it as the allowed database user name (with optional user name mapping). This method is only supported on local connections.

  • As a last major feauture of this version I would like to mention support for Host Based Access control file, like pg_hba.conf in Postgres. This allows to configure TLS for network connections and “peer” authentication for local connections.

We can configure how to authenticate users with auth_type parameter of PgBouncer. All of the config parameters are defined in the configuration file pgbouncer.ini. Let’s look into details of auth_type parameter.

auth_type parameter can be assigned one of the 6 values listed below. Let’s see the explanations and the usage of these values.

  • hba : If we set auth_type parameter with the value hba, we should set auth_hba_file parameter as well for showing which pg_hba.conf file will be used as a configuration. By doing this we allow actual authentication type to be loaded from auth_hba_file. This means we can use different authentication methods for different access paths. For example, with the version 1.7 connection over Unix socket use peer authentication method, at the same time connection over TCP must use TLS. So far, HBA file format does not support all of the authentication methods of pg_hba.conf. Supported methods are: trust, reject, md5, password, peer and cert.

  • cert : Client must connect over TLS connection with valid client certificate. Username is then taken from CommonName field from certificate.

  • md5 : Use MD5-based password check. auth_file (the name of the file to load user names and passwords from) may contain both MD5-encrypted or plain-text passwords. This is the default authentication method.

  • plain : Clear-text password is sent over wire. Deprecated.

  • trust : No authentication is done. Username must still exist in auth_file.

  • any : Like the trust method, but the username given is ignored. Requires that all databases are configured to log in as specific user. Additionally, the console database allows any user to log in as admin.

Other Shiny Features

There are more features released in this version. You can visit the PgBouncer changelog page or check the list below for the other improvements:

  • Set query_wait_timeout to 120s by default. This parameter defines the maximum time queries are allowed to spend waiting for execution. If the query is not assigned to a server during that time, the client is disconnected. This is used to prevent unresponsive servers from grabbing up connections. It also helps when server is down or database rejects connections for any reason. If this is disabled, clients will be queued infinitely. Current default (0) causes infinite queueing, which is not useful.With the version 1.7 if client has pending query and has not been assigned to server connection, the client connection will be dropped after 120 seconds by default.

  • Disable server_reset_query_always by default. Now reset query is used only in pools that are in session mode.

  • Increase pkt_buf to 4096 bytes. Improves performance with TLS. The behaviour is probably load-specific, but it should be safe to do as since v1.2 the packet buffers are split from connections and used lazily from pool.

  • Support pipelining count expected ReadyForQuery packets. This avoids releasing server too early. Fixes #52.

  • Improved sbuf_loopcnt logic – socket is guaranteed to be reprocessed even if there are no event from socket. Required for TLS as it has it’s own buffering.

  • Adapt system tests to work with modern BSD and MacOS. (Eric Radman)

  • Remove crypt auth. It’s obsolete and not supported by PostgreSQL since 8.4.

  • Fix plain “–with-cares” configure option – without argument it was broken.

What is PgBouncer?

PgBouncer is a utility for managing client connections to the PostgreSQL database. In a nutshell, it maintains a connection pool to the PostgreSQL server and reuses those existing connections. While this can be useful for reducing the client connection overhead, it also enables limiting the maximum number of open connections to the database server. It can also be used for traffic shaping like redirecting connections to one or more databases to different database servers. In addition to these, PgBouncer can be used for managing security on user and even on database level.

You can download PgBouncer via their downloads page and start using it now!

For more information about PgBouncer you can check my previous blog post about PgBouncer.

Happy readings!

gabrielle roth: RDS OS update

From Planet PostgreSQL. Published on Feb 03, 2016.

I’ve some more info for the AWS RDS OS update I mentioned last week. The announcement states that this update will take an outage. In my experience, it’ll be the usual failover time for a Multi-AZ instance (a minute or so), and 8-10 minutes for no-Multi-AZ instances. According to my event logs, the entire update process takes 20-25 minutes. Since I […]

How to no-mincss links with django-pipeline

By Peter Bengtsson from Django community aggregator: Community blog posts. Published on Feb 03, 2016.

This might be the kind of problem only I have, but I thought I'd share in case others are in a similar pickle.

Warming Up

First of all, the way my personal site works is that every rendered page gets cached as rendered HTML. Midway, storing the rendered page in the cache, an optimization transformation happens. It basically takes HTML like this:

<link rel="stylesheet" href="vendor.css">
<link rel="stylesheet" href="stuff.css">

into this:

/* optimized contents of vendor.css and stuff.css minified */

Just right-click and "View Page Source" and you'll see.

When it does this it also filters out CSS selectors in those .css files that aren't actually used in the rendered HTML. This makes the inlined CSS much smaller. Especially since so much of the CSS comes from a CSS framework.

However, there are certain .css files that have references to selectors that aren't in the generated HTML but are needed later when some JavaScript changes the DOM based on AJAX or user actions. For example, the CSS used by the Autocompeter widget. The program that does this CSS optimization transformation is called mincss and it has a feature where you can tell it to NOT bother with certain CSS selectors (using a CSS comment) or certain <link> tags entirely. It looks like this:

<link rel="stylesheet" href="ajaxstuff.css" data-mincss="no">

Where Does django-pipeline Come In?

So, setting that data-mincss="no" isn't easy when you use django-pipeline because you don't write <link ... in your Django templates, you write {% stylesheet 'name-of-bundle %}. So, how do you get it in?

Well, first let's define the bundle. In my case it looks like this:

  # Bundle of CSS that strictly isn't needed at pure HTML render-time
  'base_dynamic': {
        'source_filenames': (
        'extra_context': {
            'no_mincss': True,
        'output_filename': 'css/base-dynamic.min.css',

But that isn't enough. Next, I need to override how django-pipeline turn that block into a <link ...> tag. To do that, you need to create a directory and file called pipeline/css.html (or pipeline/css.jinja if you use Jinja rendering by default).

So take the default one from inside the pipeline package and copy it into your project into one of your apps's templates directory. For example, in my case, peterbecom/apps/base/templates/pipeline/css.jinja. Then, in that template add at the very end somehting like this:

{% if no_mincss %} data-mincss="no"{% endif %} />

The Point?

The point is that if you're in a similar situation where you want django-pipeline to output the <link> or <script> tag differently than it's capable of, by default, then this is a good example of that.

Alexey Lesovsky: Streaming replication in practice.

From Planet PostgreSQL. Published on Feb 03, 2016.

My slides about PostgreSQL Streaming Replication from PgConf 2016 Russia.

Christophe Pettus: Always Do This #4: Put stats_temp_directory on a memory file system

From Planet PostgreSQL. Published on Feb 02, 2016.

The PostgreSQL statistics collector generates a lot of very important statistics about the state of the database. If it’s not working, autovacuum doesn’t work, among other problems. But it does generate a lot of write activity, and by default, that goes back onto the database volume.

Instead, always set statstempdirectory to point to a RAM disk (which has to be owned by the postgres user, with 0600 permissions). The statistics are written back to the database on shutdown, so in normal operations, you won’t lose anything on a reboot. (You’ll lose the most recent statistics on a crash, but you will anyway; the statistics are reset on recovery operations, including restart from a crash.)

This can substantially cut down the amount of write I/O the main database storage volume has to receive, and it’s free!

Writing Unit Tests for Django Migrations

By Caktus Consulting Group from Django community aggregator: Community blog posts. Published on Feb 02, 2016.

Testing in a Django project ensures the latest version of a project is as bug-free as possible. But when deploying, you’re dealing with multiple versions of the project through the migrations.

The test runner is extremely helpful in its creation and cleanup of a test database for our test suite. In this temporary test database, all of the project's migrations are run before our tests. This means our tests are running the latest version of the schema and are unable to verify the behavior of those very migrations because the tests cannot set up data before the migrations run or assert conditions about them.

We can teach our tests to run against those migrations with just a bit of work. This is especially helpful for migrations that are going to include significant alterations to existing data.

The Django test runner begins each run by creating a new database and running all migrations in it. This ensures that every test is running against the current schema the project expects, but we'll need to work around this setup in order to test those migrations. To accomplish this, we'll need to have the test runner step back in the migration chain just for the tests against them.

Ultimately, we're going to try to write tests against migrations that look like this:

class TagsTestCase(TestMigrations):

    migrate_from = '0009_previous_migration'
    migrate_to = '0010_migration_being_tested'

    def setUpBeforeMigration(self, apps):
        BlogPost = apps.get_model('blog', 'Post')
        self.post_id = BlogPost.objects.create(
            title = "A test post with tags",
            body = "",
            tags = "tag1 tag2",

    def test_tags_migrated(self):
        BlogPost = self.apps.get_model('blog', 'Post')
        post = BlogPost.objects.get(id=self.post_id)

        self.assertEqual(post.tags.count(), 2)
        self.assertEqual(post.tags.all()[0].name, "tag1")
        self.assertEqual(post.tags.all()[1].name, "tag2")

Before explaining how to make this work, we'll break down how this test is actually written.

We're inheriting from a TestCase helper that will be written to make testing migrations possible named TestMigrations and defining for this class two attributes that configure the migrations before and after that we want to test. migrate_from is the last migration we expect to be run on machines we want to deploy to and migrate_to is the latest new migration we're testing before deploying.

class TagsTestCase(TestMigrations):

    migrate_from = '0009_previous_migration'
    migrate_to = '0010_migration_being_tested'

Because our test is about a migration, data modifying migrations in particular, we want to do some setup before the migration in question (0010_migration_being_tested) is run. An extra setup method is defined to do that kind of data setup after 0009_previous_migration has run but before 0010_migration_being_tested.

def setUpBeforeMigration(self, apps):
    BlogPost = apps.get_model('blog', 'Post')
    self.post_id = BlogPost.objects.create(
        title = "A test post with tags",
        body = "",
        tags = "tag1 tag2",

Once our test runs this setup, we expect the final 0010_migration_being_tested migration to be run. At that time, one or more test_*() methods we define can do the sort of assertions tests would normally do. In this case, we're making sure data was converted to the new schema correctly.

def test_tags_migrated(self):
    BlogPost = self.apps.get_model('blog', 'Post')
    post = BlogPost.objects.get(id=self.post_id)

    self.assertEqual(post.tags.count(), 2)
    self.assertEqual(post.tags.all()[0].name, "tag1")
    self.assertEqual(post.tags.all()[1].name, "tag2")

Here we've fetched a copy of this Post model's after-migration version and confirmed the value we set up in setUpBeforeMigration() was converted to the new structure.

Now, let's look at that TestMigrations base class that makes this possible. First, the pieces from Django we'll need to import to build our migration-aware test cases.

from django.apps import apps
from django.test import TransactionTestCase
from django.db.migrations.executor import MigrationExecutor
from django.db import connection

We'll be extending the TransactionTestCase class. In order to control migration running, we'll use MigrationExecutor, which needs the database connection to operate on. Migrations are tied pretty intrinsically to Django applications, so we'll be using django.apps.apps and, in particular, get_containing_app_config() to identify the current app our tests are running in.

class TestMigrations(TransactionTestCase):

    def app(self):
        return apps.get_containing_app_config(type(self).__module__).name

    migrate_from = None
    migrate_to = None

We're starting with a few necessary properties.

  • app is a dynamic property that'll look up and return the name of the current app.
  • migrate_to will be defined on our own test case subclass as the name of the migration we're testing.
  • migrate_from is the migration we want to set up test data in, usually the latest migration that's currently been deployed in the project.
def setUp(self):
    assert self.migrate_from and self.migrate_to, \
        "TestCase '{}' must define migrate_from and migrate_to properties".format(type(self).__name__)
    self.migrate_from = [(, self.migrate_from)]
    self.migrate_to = [(, self.migrate_to)]
    executor = MigrationExecutor(connection)
    old_apps = executor.loader.project_state(self.migrate_from).apps

After insisting the test case class had defined migrate_to and migrate_from migrations, we use the internal MigrationExecutor utility to get a state of the applications as of the older of the two migrations.

We'll use old_apps in our setUpBeforeMigration() to work with old versions of the models from this app. First, we'll run our migrations backwards to return to this original migration and then call the setUpBeforeMigration() method.

# Reverse to the original migration


Now that we've set up the old state, we simply run the migrations forward again. If the migrations are correct, they should update any test data we created. Of course, we're validating that in our actual tests.

# Run the migration to test

And finally, we store a current version of the app configuration that our tests can access and define a no-op setUpBeforeMigration()

    self.apps = executor.loader.project_state(self.migrate_to).apps

def setUpBeforeMigration(self, apps):

Here's a complete version:

from django.apps import apps
from django.test import TransactionTestCase
from django.db.migrations.executor import MigrationExecutor
from django.db import connection

class TestMigrations(TransactionTestCase):

    def app(self):
        return apps.get_containing_app_config(type(self).__module__).name

    migrate_from = None
    migrate_to = None

    def setUp(self):
        assert self.migrate_from and self.migrate_to, \
            "TestCase '{}' must define migrate_from and migrate_to properties".format(type(self).__name__)
        self.migrate_from = [(, self.migrate_from)]
        self.migrate_to = [(, self.migrate_to)]
        executor = MigrationExecutor(connection)
        old_apps = executor.loader.project_state(self.migrate_from).apps

        # Reverse to the original migration


        # Run the migration to test

        self.apps = executor.loader.project_state(self.migrate_to).apps

    def setUpBeforeMigration(self, apps):

class TagsTestCase(TestMigrations):

    migrate_from = '0009_previous_migration'
    migrate_to = '0010_migration_being_tested'

    def setUpBeforeMigration(self, apps):
        BlogPost = apps.get_model('blog', 'Post')
        self.post_id = BlogPost.objects.create(
            title = "A test post with tags",
            body = "",
            tags = "tag1 tag2",

    def test_tags_migrated(self):
        BlogPost = self.apps.get_model('blog', 'Post')
        post = BlogPost.objects.get(id=self.post_id)

        self.assertEqual(post.tags.count(), 2)
        self.assertEqual(post.tags.all()[0].name, "tag1")
        self.assertEqual(post.tags.all()[1].name, "tag2")

Hubert 'depesz' Lubaczewski: Waiting for 9.6 – Add trigonometric functions that work in degrees.

From Planet PostgreSQL. Published on Feb 01, 2016.

On 22nd of January, Tom Lane committed patch: Add trigonometric functions that work in degrees.   The implementations go to some lengths to deliver exact results for values where an exact result can be expected, such as sind(30) = 0.5 exactly.   Dean Rasheed, reviewed by Michael Paquier The description seems to explain everything, but […]

Hubert 'depesz' Lubaczewski: Waiting for 9.6 – Support parallel joins, and make related improvements.

From Planet PostgreSQL. Published on Feb 01, 2016.

On 20th of January, Robert Haas committed patch: The core innovation of this patch is the introduction of the concept of a partial path; that is, a path which if executed in parallel will generate a subset of the output rows in each process. Gathering a partial path produces an ordinary (complete) path. This allows […]

Shaun M. Thomas: PG Phriday: COPY and Alternative Import Methods

From Planet PostgreSQL. Published on Jan 29, 2016.

I recently noted that the COPY command in Postgres doesn’t have syntax to skip columns in source data when importing it into a table. This necessitates using one or more junk columns to capture data we’ll just be throwing away. During that, I completely forgot that friendly devs had contributed alternative file handling methods as Foreign Data Wrappers. Most people think of foreign wrappers as a method for interacting with remote databases. Perusing the full list however, reveals some surprising data sources. Twitter? Hive? Video cards?!

Well, let’s take a quick look at a real case. Here’s a table we want to import data into. We can’t insert into it directly, because of that extra column. So we’ll use an intermediate table as a raw import target, then insert into the target table. Let’s use COPY as a first approximation with 500,000 rows:

CREATE UNLOGGED TABLE option_quote_raw
  ext_quote_id BIGINT,
  quote_date DATE NOT NULL,
  quote_time_ns BIGINT NOT NULL,
  sub_seq_no BIGINT NOT NULL,
  exchange_id INT NOT NULL,
  time_us BIGINT NOT NULL,
  feed_side VARCHAR NOT NULL,
  flag INT NOT NULL,
  bid NUMERIC(16, 8) NOT NULL,
  ask NUMERIC(16, 8) NOT NULL,
  bid_size INT NOT NULL,
  ask_size INT NOT NULL,
  trash TEXT
  ext_quote_id BIGINT,
  quote_date DATE NOT NULL,
  quote_time_ns BIGINT NOT NULL,
  sub_seq_no BIGINT NOT NULL,
  exchange_id INT NOT NULL,
  time_us BIGINT NOT NULL,
  feed_side VARCHAR NOT NULL,
  flag INT NOT NULL,
  bid NUMERIC(16, 8) NOT NULL,
  ask NUMERIC(16, 8) NOT NULL,
  bid_size INT NOT NULL,
  ask_size INT NOT NULL
CREATE INDEX idx_quote_id ON option_quote (ext_quote_id);
\timing ON
COPY option_quote_raw FROM '/tmp/quote_source.csv' WITH CSV;
TIME: 1705.967 ms
INSERT INTO option_quote
SELECT ext_quote_id, quote_date, quote_time_ns, seq_no,
       sub_seq_no, exchange_id, time_us, feed_side, flag,
       bid, ask, bid_size, ask_size
  FROM option_quote_raw;
TIME: 2062.863 ms

Just under four seconds for half a million records on a table that has a pre-existing index is pretty good. But that process is pretty convoluted. Going back to that list of foreign wrappers, there’s an included file_fdw for tying a table directly to a file. Let’s repeat the same process from above by using FDW syntax:

  ext_quote_id BIGINT,
  quote_date DATE NOT NULL,
  quote_time_ns BIGINT NOT NULL,
  sub_seq_no BIGINT NOT NULL,
  exchange_id INT NOT NULL,
  time_us BIGINT NOT NULL,
  feed_side VARCHAR NOT NULL,
  flag INT NOT NULL,
  bid NUMERIC(16, 8) NOT NULL,
  ask NUMERIC(16, 8) NOT NULL,
  bid_size INT NOT NULL,
  ask_size INT NOT NULL,
  trash TEXT
SERVER file_server
OPTIONS (format 'csv', filename '/tmp/quote_source.csv');
\timing ON
INSERT INTO option_quote
SELECT ext_quote_id, quote_date, quote_time_ns, seq_no,
       sub_seq_no, exchange_id, time_us, feed_side, flag,
       bid, ask, bid_size, ask_size
  FROM file_quote;
TIME: 3867.848 ms

Looks like the overall time is roughly the same. Of course, the best use case for this approach is when using a standard file path location that doesn’t change. The contents of the “table” do not resolve until it’s used, so the filename option could represent a standard file upload location. This means the foreign file table can be repeatedly reused, while the raw import table in our first attempt needs to be truncated and refilled before the INSERT statement.

There’s also the case of archives, where CSV files are compressed for long term storage. What if we want to examine the contents of these files, or import them at a later date? Luckily, the Multicorn extension exists as a kind of generic python FDW API. Using that API, anyone can use Python to build a quick foreign data wrapper. Someone did just that and contributed an extension that can process gzip compressed files.

Let’s see how that works:

    WRAPPER 'csvgz_fdw.CSVGZForeignDataWrapper'
  ext_quote_id BIGINT,
  quote_date DATE NOT NULL,
  quote_time_ns BIGINT NOT NULL,
  sub_seq_no BIGINT NOT NULL,
  exchange_id INT NOT NULL,
  time_us BIGINT NOT NULL,
  feed_side VARCHAR NOT NULL,
  flag INT NOT NULL,
  bid NUMERIC(16, 8) NOT NULL,
  ask NUMERIC(16, 8) NOT NULL,
  bid_size INT NOT NULL,
  ask_size INT NOT NULL,
  trash TEXT
SERVER csvgz
OPTIONS (format 'csv', file_name '/tmp/quote_source.csv.gz');
\timing ON
INSERT INTO option_quote
SELECT ext_quote_id, quote_date, quote_time_ns, seq_no,
       sub_seq_no, exchange_id, time_us, feed_side, flag,
       bid, ask, bid_size, ask_size
  FROM file_quote_gz;

This takes about 17 seconds for the same 500k records, probably because the stack includes Multicorn, which depends on Python. Despite multiple levels of redirection, at least it’s possible! There is a native C version, but the author neglected to include build instructions. It’s possible performance of this extension could be improved with better streaming, as zcat on the same file requires less than a second. It also opens up the exciting potential of another extension that has the compression engine as another option, so tables could also access bzip2, xz, or other compressed formats.

Compressed files aside, is there any way to handle wildcards? Sometimes it makes sense to process all CSV files in a single folder, for example. Well, that doesn’t exist yet, but at the rate new wrappers are appearing, it’s only a matter of time. It’s also possible there’s already one out there, but it simply isn’t on the Wiki page, or the author hasn’t formalized it for contribution yet. In fact, there are probably quite a few foreign data wrappers that share that particular attribute.

It’s great to see other import methods appearing to address deficits in COPY functionality. I’ll definitely be watching the Wiki page for more.

Leo Hsu and Regina Obe: An almost idiot's guide to install PostgreSQL 9.5, PostGIS 2.2 and pgRouting 2.1.0 with Yum

From Planet PostgreSQL. Published on Jan 29, 2016.

If you already have a working PostgreSQL 9.5 install, and just want to skip to relevant sections, follow this list:

As a general note, these instructions are what I did for CentOS 7. For lower versions ther are some differences in packages you'll get. For example currently if you are installing on CentOS 6 (and I presume by extension other 6 family), you won't get SFCGAL and might have pgRouting 2.0 (instead of 2.1)

Continue reading "An almost idiot's guide to install PostgreSQL 9.5, PostGIS 2.2 and pgRouting 2.1.0 with Yum"

Greg Sabino Mullane: Postgres 9.5: three little things

From Planet PostgreSQL. Published on Jan 28, 2016.

The recent release of Postgres 9.5 has many people excited about the big new features such as UPSERT (docs) and row-level security (docs). Today I would like to celebrate three of the smaller features that I love about this release.

Before jumping into my list, I'd like to thank everyone who contributes to Postgres. I did some quick analysis and found that 85 people, from Adrien to Zeus, have helped version 9.5 of Postgres, at least according to the git logs. Of course, that number is actually higher, as it doesn't take into account people helping out on the #postgresql channel, running buildfarm animals, doing packaging work, keeping the infrastructure running, etc. Thanks to you all!


The first feature is one I've been wishing for a long time - a verbose form of the REINDEX command. Thanks to Sawada Masahiko for adding this. Similar to VACUUM, REINDEX gets kicked off and then gives no progress or information until it finishes. While VACUUM has long had the VERBOSE option to get around this, REINDEX gives you no clue to which index it was working on, or how much work each index took to rebuild. Here is a normal reindex, along with another 9.5 feature, the ability to reindex an entire schema:

greg=# reindex schema public;
## What seems like five long minutes later...

The new syntax uses parenthesis to support VERBOSE and any other future options. If you are familiar with EXPLAIN's newer options, you may see a similarity. More on the syntax in a bit. Here is the much improved version in action:

greg=# reindex (verbose) schema public;
INFO:  index "foobar_pkey" was reindexed
DETAIL:  CPU 11.00s/0.05u sec elapsed 19.38 sec.
INFO:  index "foobar_location" was reindexed
DETAIL:  CPU 5.21s/0.05u sec elapsed 18.27 sec.
INFO:  index "location_position" was reindexed
DETAIL:  CPU 9.10s/0.05u sec elapsed 19.70 sec.
INFO:  table "public.foobar" was reindexed
INFO:  index "foobaz_pkey" was reindexed
DETAIL:  CPU 7.04s/0.05u sec elapsed 19.61 sec.
INFO:  index "shoe_size" was reindexed
DETAIL:  CPU 12.26s/0.05u sec elapsed 19.33 sec.
INFO:  table "public.foobaz" was reindexed

Why not REINDEX VERBOSE TABLE foobar? Or even REINDEX TABLE foobar WITH VERBOSE? There was a good discussion of this on pgsql-hackers when this feature was being developed, but the short answer is that parenthesis are the correct way to do things moving forward. Robert Haas summed it up well:

The unparenthesized VACUUM syntax was added back before we realized that that kind of syntax is a terrible idea. It requires every option to be a keyword, and those keywords have to be in a fixed order. I believe the intention is to keep the old VACUUM syntax around for backward-compatibility, but not to extend it. Same for EXPLAIN and COPY.

Alas, this will probably trip a lot of people up, as REINDEX VERBOSE TABLE foobar does not work, nor does \h REINDEX in psql give the correct output as of version 9.5.0:

greg=# \h REINDEX
Command:     REINDEX
Description: rebuild indexes

What is should output is this:

greg=# \h REINDEX
Command:     REINDEX
Description: rebuild indexes

Digging in the source code for psql show that sql_help.c and sql_help.h are the files that generate this help output. I'd submit a patch, but those files are not in git, and the Makefile specifies:

sql_help.h: $(wildcard $(REFDOCDIR)/*.sgml)
    $(PERL) $< $(REFDOCDIR) $*

Which means that a Perl script is responsible for slurping the correct syntax in from our SGML documentation. Happily, the SGML is up to date, which you can see by visiting the offical REINDEX page (HTML). Maybe someone forgot to run

Feature: pg_ctl defaults to "fast" mode

The second feature in Postgres 9.5 I am happy about is the change in niceness of pg_ctl from "smart" mode to "fast" mode. The help of pg_ctl explains the different modes fairly well:

pg_ctl is a utility to initialize, start, stop, or control a PostgreSQL server.

  pg_ctl stop    [-W] [-t SECS] [-D DATADIR] [-s] [-m SHUTDOWN-MODE]
Shutdown modes are:
  smart       quit after all clients have disconnected
  fast        quit directly, with proper shutdown
  immediate   quit without complete shutdown; will lead to recovery on restart

In the past, the default was 'smart'. Which often means your friendly neighborhood DBA would type "pg_ctl restart -D data", then watch the progress dots slowly marching across the screen, until they remembered that the default mode of "smart" is kind of dumb - as long as there is one connected client, the restart will not happen. Thus, the DBA had to cancel the command, and rerun it as "pg_ctl restart -D data -m fast". Then they would vow to remember to add the -m switch in next time. And promptly forget to the next time they did a shutdown or restart. :) Now pg_ctl has a much better default. Thanks, Bruce Momjian!

Feature: new 'cluster_name' option

When you run a lot of different Postgres clusters on your server, as I tend to do, it can be hard to tell them apart as the version and port are not reported in the ps output. I sometimes have nearly a dozen different clusters running, due to testing different versions and different applications. Similar in spirit to the application_name option, the new cluster_name option solves the problem neatly by allowing a custom string to be put in to the process title. Thanks to Thomas Munro for inventing this. So instead of this:

greg      7780     1  0 Mar01 pts/0    00:00:03 /home/greg/pg/9.5/bin/postgres -D data
greg      7787  7780  0 Mar01 ?        00:00:00 postgres: logger process   
greg      7789  7780  0 Mar01 ?        00:00:00 postgres: checkpointer process   
greg      7790  7780  0 Mar01 ?        00:00:09 postgres: writer process   
greg      7791  7780  0 Mar01 ?        00:00:06 postgres: wal writer process   
greg      7792  7780  0 Mar01 ?        00:00:05 postgres: autovacuum launcher process   
greg      7793  7780  0 Mar01 ?        00:00:11 postgres: stats collector process  
greg      6773     1  0 Mar01 pts/0    00:00:02 /home/greg/pg/9.5/bin/postgres -D data2
greg      6780  6773  0 Mar01 ?        00:00:00 postgres: logger process   
greg      6782  6773  0 Mar01 ?        00:00:00 postgres: checkpointer process   
greg      6783  6773  0 Mar01 ?        00:00:04 postgres: writer process   
greg      6784  6773  0 Mar01 ?        00:00:02 postgres: wal writer process   
greg      6785  6773  0 Mar01 ?        00:00:02 postgres: autovacuum launcher process   
greg      6786  6773  0 Mar01 ?        00:00:07 postgres: stats collector process

One can adjust the cluster_name inside each postgresql.conf (for example, to 'alpha' and 'bravo'), and get this:

greg      8267     1  0 Mar01 pts/0    00:00:03 /home/greg/pg/9.5/bin/postgres -D data
greg      8274  8267  0 Mar01 ?        00:00:00 postgres: alpha: logger process   
greg      8277  8267  0 Mar01 ?        00:00:00 postgres: alpha: checkpointer process   
greg      8278  8267  0 Mar01 ?        00:00:09 postgres: alpha: writer process   
greg      8279  8267  0 Mar01 ?        00:00:06 postgres: alpha: wal writer process   
greg      8280  8267  0 Mar01 ?        00:00:05 postgres: alpha: autovacuum launcher process   
greg      8281  8267  0 Mar01 ?        00:00:11 postgres: alpha: stats collector process  
greg      8583     1  0 Mar01 pts/0    00:00:02 /home/greg/pg/9.5/bin/postgres -D data2
greg      8590  8583  0 Mar01 ?        00:00:00 postgres: bravo: logger process   
greg      8592  8583  0 Mar01 ?        00:00:00 postgres: bravo: checkpointer process   
greg      8591  8583  0 Mar01 ?        00:00:04 postgres: bravo: writer process   
greg      8593  8583  0 Mar01 ?        00:00:02 postgres: bravo: wal writer process   
greg      8594  8583  0 Mar01 ?        00:00:02 postgres: bravo: autovacuum launcher process   
greg      8595  8583  0 Mar01 ?        00:00:07 postgres: bravo: stats collector process

There are a lot of other things added in Postgres 9.5. I recommend you visit this page for a complete list, and poke around on your own. A final shout out to all the people that are continually improving the tab-completion of psql. You rock.

Django Model Inheritance - Multi-Table Inheritance

By GoDjango - Django Screencasts from Django community aggregator: Community blog posts. Published on Jan 21, 2016.

In the last post we talked about Model Inheritance with regards to Abstract Model Classes, it was a great introduction into using OOP with your models to keep your code slim and tidy. You can read it at: Django Abstract Base Class - Model Inheritance

In this post we are going to discuss Multi-Table Inheritance. We will create a base model, that is not an abstract model, and inherit from the model. Then we are going to show you how to get and reference data from that models inheritance.

Multi Table Inheritance

When we talk about multi-table inheritance we are referring to having a top level table with all the fields that are core to that context. Then we are going to have other tables with the more specific fields. You select from the table you want, and then do a join on a relationship to get the parent table data. Since Django does this through models and code, we will talk about it in those terms.

In our example we are going to have a Ticket Model.

class Ticket(models.Model):
    title = models.CharField(max_length=255, blank=True)
    description = models.TextField(blank=True)
    name = models.CharField(max_length=255, blank=True)

This model is a good piece of code because no matter what type of ticket we ever have we are going to have a title, description, and name. This is normal for any OOP we would do as a base class. Since we have this base model we can subclass it to have more specific models with data associated with them.

class InternalTicket(Ticket):
    user = models.ForeignKey('auth.User', null=True)

class CustomerTicket(Ticket):
    product = models.TextField(blank=True)

Now we have 3 models available to us: Ticket, InternalTicket, and CustomerTicket. Each one has a title, description and name as fields. However, CustomerTicket does not have a user field and InternalTicket does not have product field. Also both InternalTicket and CustomerTicket are stored in separate database table from title, description and name. So if you looked up the tables in the database you would see a FK reference to the ticket table, id field and the field you declared. Everything else lives in the ticket table.

Getting and Creating Data

With all that said lets look at creating some data and getting it back out. Because that is really when it makes sense on how it works.

In [1]: CustomerTicket.objects.create(product='Videos', title='Odd Playback', description='doesnt play back fast enough', name='Jane Smith')
Out[1]: <CustomerTicket: CustomerTicket object>

We have created a CustomerTicket object with all of our fields set, and it succeeds. (Further proof it works)

In [2]: Ticket.objects.all()
Out[2]: [<Ticket: Ticket object>]

Here we are getting all tickets from the ticket model, which is including our CustomerTicket we create above.

In [3]: ticket = Ticket.objects.get()

In [4]: ticket.customerticket
Out[4]: <CustomerTicket: CustomerTicket object>

If we reference the customerticket property, which is the related name, it will return back the model so we can get the data from the model. We can look at that in action.

In [5]: ticket.customerticket.product
Out[5]: 'Videos'

In [6]: ticket.description
Out[6]: 'doesnt play back fast enough'

Finally, you can see the description is still on the ticket level and not the cutomerticket model.

How does it work

From the django docs:

multi-table inheritance uses an implicit OneToOneField to link the child and the parent

This means there is a one-to-one relationship between the ticket record and the corresponding subclassed models. Django just handles the translation of it from multiple tables to a single model so you don't really notice it.

CustomerTicket object

As our final look we can look at how the CustomerTicket object we created earlier references the ticket object.

n [1]: customer = CustomerTicket.objects.get()

In [2]: customer.description
Out[2]: 'doesnt play back fast enough'

As you can see django just handles the fact that description is on the model, even though the description data is in the ticket table.

In [3]: customer.ticket_ptr
Out[3]: <Ticket: Ticket object>

The .ticket_ptr property is the reference to the ticket object so if you need to actually use that model you can.

In [4]: customer.ticket_ptr_id
Out[4]: 1

ticket_ptr_id holds the id of the Ticket object in the database giving you another avenue for getting information out of the appropriate place.


You don't often need to use Multi-table inheritance, but when you do need to use it, it is nice that django makes it super simple to use. I have only seen a few instances where it makes sense, tickets being one, and social media posts as another. It provides great value in giving you core data, then when needed get the rest of the data. It does this while allowing you to use the models as separate models as if the inheritance didn't exist. Finally, another way to think of it is as data being used with polymorphism.

ShipIt Day Recap: Q1 2016

By Caktus Consulting Group from Django community aggregator: Community blog posts. Published on Jan 20, 2016.

Last Friday, the Cakti set aside regular client projects for our quarterly ShipIt Day, a chance for personal development and independent projects. People work individually or in groups to flex their creativity, tackle interesting problems, or expand their personal knowledge. This quarter’s ShipIt Day saw everything from cat animations to improvements on our Taylor Swift lyric generator app. Read about the various ShipIt Day projects for Q1 of 2016 below.

Neil Ashton worked with the open source DemocracyOS, a platform for collaborative decision making. The platform is used around the world to increase participation in democratic systems. Neil took a look at the basic web app stack for the platform and submitted several pull requests to fix bugs in that system. All of his pull requests have since been approved, making Neil an official contributor to DemocracyOS!

Calvin Spealman went into ShipIt Day with the intention of building a story generator. However, after running into a roadblock late Thursday evening, he transitioned instead to work on frontend tooling upgrades and related documentation in Caktus’ Django project template. In the meantime, Vinod upgraded one of Calvin’s projects to the latest version of Margarita. This enabled Calvin to demonstrate the new frontend tooling upgrades, while allowing Vinod to test his upgrade pathway documentation.

Like Calvin, Colin Copeland built in frontend changes to the Django project template and integrated those changes into an existing project.

cat animation still

Inspired by the work of Rachel Nabors, Erin Mullaney made a CSS3 animation out of one of her own drawings. By cycling through sprites of different images, she made an animation of a cat blinking while a mouse walked by. You can see the full animation here.

taylor swift lyric generator

Mark Lavin, Rebecca Conley, and Dmitriy Chukhin built on their work from a previous ShipIt Day, The Taylor Swift Song Generator. The team cleaned up some of the code and fixed the test suite, using Codecov for code test coverage and OpBeat for exception and performance monitoring. Mark also created Twitter preview cards for tweeting particularly choice TayTay lyrics generated by the app while Rebecca and Dmitriy enabled user-creation of a song title that would then become the first line of the chorus, making the app more interactive. By the end of the day the team was working on word visualization and cleaning up their data set, which are the main goals for the next chance the team has to work on the app.

Dan Poirier, Scott Morningstar, and Jeff Bradberry continued their work from a previous ShipIt Day as well, with the goal of getting the project template to deploy on Ansible in order to move away from Salt. They improved the usage of variables, added Supervisor, nginx, and gunicorn, pushed source code to a deployment target without going through github, and updated documentation. Though the team couldn’t get deployment to a virtual machine to work, they are incredibly close and hope to have a deploy by the next ShipIt Day!

Hunter MacDermut got into using es2015 with Gulp, Browserify, and Babel, building a task list that would auto-organize by priority. This to-do list app would be organized by project and task, and further taps on each item would increase that item’s priority value. Though Hunter didn’t have time to finish the sorting feature, in its current state, it is a functional to-do list app that relies on localStorage getting and setting. The repo for the app can be found here.

Though she didn’t technically participate in ShipIt Day, NC Nwoko did break from her usual routine and went to the Regional AIDS Interfaith Network (RAIN) in Charlotte, which helps youth get involved in their treatment. She helped train caseworkers in the use of our Epic Allies app.

durham school navigator

Victor Rocha continued his work on Code for Durham’s School Navigator app, building a new school profile to be tested in the community. David Ray added features and improvements to the mobile app version of the School Navigator, including geolocation functionality, a clear input button to make subsequent address searches more efficient, and technical debt cleanup, changing some code that was in triplicate to an angular directive.

Finally, Tobias McNulty worked on refactoring and cleaning up the deployment code for a client project. The project was based on an old Fabric-based deployment that Caktus used to use. He cleaned up the Fabric file, making it client agnostic, pulled out the configuration into a yml file, and merged the changes back into FabulAWS, the parent project. The next step will be to break these down into smaller files for autoscaling purposes. Meanwhile, Karen Tracey reviewed Tobias’ work.

hasattr() Considered Harmful

By Blog of Hynek Schlawack from Django community aggregator: Community blog posts. Published on Jan 12, 2016.

Don’t use Python hasattr() unless you want its quirks or are writing Python 3-only code.

Django CMS Plugin Authenticated User Variations

By Imaginary Landscape's Tech Blog from Django community aggregator: Community blog posts. Published on Jan 11, 2016.

Django CMS Plugin Authenticated User Variations

Basics of Ember.js application development

By Piotr Maliński from Django community aggregator: Community blog posts. Published on Jan 10, 2016.

After setting up the base of our ember application we can start doing something more with it. For my tutorials I've picked AdminLTE template to make a admin panel alike application.

Is Open Source Consulting Dead?

By chrism from . Published on Sep 10, 2013.

Has Elvis left the building? Will we be able to sustain ourselves as open source consultants?

Consulting and Patent Indemification

By chrism from . Published on Aug 09, 2013.

Article about consulting and patent indemnification

Python Advent Calendar 2012 Topic

By chrism from . Published on Dec 24, 2012.

An entry for the 2012 Japanese advent calendar at

Why I Like ZODB

By chrism from . Published on May 15, 2012.

Why I like ZODB better than other persistence systems for writing real-world web applications.

A str. __iter__ Gotcha in Cross-Compatible Py2/Py3 Code

By chrism from . Published on Mar 03, 2012.

A bug caused by a minor incompatibility can remain latent for long periods of time in a cross-compatible Python 2 / Python 3 codebase.

In Praise of Complaining

By chrism from . Published on Jan 01, 2012.

In praise of complaining, even when the complaints are absurd.

2012 Python Meme

By chrism from . Published on Dec 24, 2011.

My "Python meme" replies.

In Defense of Zope Libraries

By chrism from . Published on Dec 19, 2011.

A much too long defense of Pyramid's use of Zope libraries.

Plone Conference 2011 Pyramid Sprint

By chrism from . Published on Nov 10, 2011.

An update about the happenings at the recent 2011 Plone Conference Pyramid sprint.

Jobs-Ification of Software Development

By chrism from . Published on Oct 17, 2011.

Try not to Jobs-ify the task of software development.

WebOb Now on Python 3

By chrism from . Published on Oct 15, 2011.

Report about porting to Python 3.

Open Source Project Maintainer Sarcastic Response Cheat Sheet

By chrism from . Published on Jun 12, 2011.

Need a sarcastic response to a support interaction as an open source project maintainer? Look no further!

Pylons Miniconference #0 Wrapup

By chrism from . Published on May 04, 2011.

Last week, I visited the lovely Bay Area to attend the 0th Pylons Miniconference in San Francisco.

Pylons Project Meetup / Minicon

By chrism from . Published on Apr 14, 2011.

In the SF Bay Area on the 28th, 29th, and 30th of this month (April), 3 separate Pylons Project events.

PyCon 2011 Report

By chrism from . Published on Mar 19, 2011.

My personal PyCon 2011 Report