Database Archive

How to install PostgreSQL 9.3

This tutorial explains how to install PostgreSQL 9.3 on a CentOS 7 or RHEL 7 machine. It goes through the process of how to get the latest version of PostgreSQL through the very basic configurations needed to boot up the server.

Adding PostgreSQL Repository

CentOS ships with PostgreSQL installed available in it’s default repositories. It may not be the version you would like to install. For example CentOS 7 has PostgreSQL 9.2, while CentOS 6.5 has PostgreSQL 8.4. The current version of PostgreSQL is 9.3. For that reason we are going to add the PostgreSQL repority to our list of available repositories.

 Available Packages

PostgreSQL has quite a few packages available. For our purposes we only care about the server and it’s dependencies. But if you wanted some of the other utilities, like pg_upgrade(), you will need to install the contrib package. Or if you want to enable the Perl extension, then install the plperl package. Here is a listing of the available packages:

 Install PostgreSQL 9.3

To install PostgreSQL just run the yum install command as follows:

Initialize PostgreSQL 9.3

Before we can do anything we need to initialize PostgreSQL. This creates the database cluster and all the file and directories needed for PostgreSQL to store the database.

Configure PostgreSQL 9.3

By default PostgreSQL listens on localhost. To enable it to listen to any available address we must change this value either listen on all addresses with * or to a specific address. To do this we must edit the postgresql.conf file. This file holds all of the configuration parameters for PostgreSQL. The postgresql.conf file is stored in the data directory in our case on /var/lib/pgsql/9.3/data/.

Let’s set the listen_addresses to listen to all IP’s assigned to the machine:

Now PostgreSQL will listen on any external IP available.

In order to allow clients to connect from anywhere except the localhost, we must allow them by editing the pg_hba.conf file. This file also is stored in the data directory.

Let’s suppose you want your PostgreSQL server to accept connections from anywhere in your network. Your network range is 192.168.1.0/24. Then we would add the following line to the pg_hba.conf file:

Defining the above we are adding connection socket of type host which can be any TCP/IP socket connection, this is for all available databases and all available users. From connections comming from the 192.168.1.0/24 network and are being authenticated with md5 encrypted passwords.

You can add the settings manually or run the following command, which you should edit to accommodate your network range.

Start PostgreSQL 9.3 Server

Now that we have our server configured with basic settings we can start it.

You can run any of the two following commands to start PostgreSQL

To start PostgreSQL on startup enable the boot service

Check PostgreSQL 9.3 is Running

If you want to check the status of PostgreSQL

Login To PostgreSQL 9.3

To login to the PostgreSQL server we must change user to postgres. If you are logged in as root:

Once logged in as the postgres user execute the psql client command:

You are now logged into your new PostgreSQL 9.3 database.

 

If you followed all the instructions correctly you now have a running PostgreSQL 9.3 server on a CentOS 7 or RHEL 7 machine.

 

How To Calculate A PostgreSQL Database Size

Knowing where you stand on disk usage is really important in a database. The last thing you want to do is run out of disk space. The following tutorial will list some simple commands to help you find or calculate the disk usage of several PostgreSQL objects. This includes tables, indexes, views, tablespaces, and so on. For the following examples I used the PostgreSQL 9.3.2

How to find the largest table in a PostgreSQL database

To get the largest table we need to query the pg_class table. This is a catalog containing pretty much anything that has columns or is similar to a table. We can get the ten largest tables with the following query:

Once we execute the query we get our results:

Defining the above SELECT:

  • relname is the name of the object. This could be a table, index, or view
  • relpages is the  size of the given table in pages. This is only an estimate and is updated periodically.
  • relkind is the type of object and is defined as follows: r = ordinary table, i = index, S = sequence, v = view, m = materialized view, c = composite type, t = TOAST table, f = foreign table.

A note on pg_toast tables. TOAST (The Oversized-Attribute Storage Technique) tables are created when the data set of a table is too big. By default PostgreSQL uses 8kB page sizes. If tuples are larger than 8kB they cannot span multiple pages. To fix this, large values are compressed and split up into multiple physical rows.

How to find the largest index, sequence, view, materialized view, or composite type in a PostgreSQL database

Similar to our query for finding the largest table, however now we look for a different type of relkind value.

How to calculate the disk space used by a PostgreSQL database

The pg_database_size() function will return the file size in bytes of a given database. It only accepts one parameter which is the database name.

The output is returned in bytes. To print out the size in a more user friendly format we can call the same pg_database_size() function, but this time we pass it into pg_size_pretty().

How to calculate the disk space used by a PostgreSQL table

There are two functions for retrieving the disk usage of a table. The first, pg_relation_size() which returns the size of a table excluding the index.

The second option, pg_total_relation_size() which will include the index and toasted data.

How to calculate the index size of a table

pg_indexes_size() returns the physical disk usage of only indexes for a particular table.

How to calculate the size of a PostgreSQL tablespace

Lastly we calculate the size of tablespaces, this we do with the function pg_tablespace_size().

 

Reload PostgreSQL Configuration

PostgreSQL allows for many settings in their postgresql.conf file. Depending on your database you may need to tweak any of these settings to find the optimal performance or to work the way you want it in your environment. The easy way to apply any changes is just by restarting the database. But that would be counter productive, if your blog or application runs on PostgreSQL, then that would leave your users seeing an “Unable to connect to database.” error until the database starts up.

Luckily PostgreSQL has two options for applying changes made to the postgresql.conf without restarting.

1. With an SQL command

If you have a superuser credentials you can just execute the pg_reload_conf() function. This will apply any changes that have been made to the postgresql.conf.

2. With the terminal

It is possible to load changes done to the postgresql.conf file via the terminal. You will need to login as the postgres user or su into it if you have root access. Then execute the pg_ctl command with the reload parameter.

If your data folder is not in the default location, pg_ctl will complain

In that case you will need to give it the location. That is done by passing the -D flag followed b the location of the data folder

The current version of PostgreSQL I am using is 9.3.5. Not all versions of PostgreSQL offer the same settings, they may change over version numbers. Also not all settings are available to be changed on the fly. Some require that PostgreSQL be restarted to take effect. There are not many of these settings, but they are settings that you usually only set once. For example the ip and port number.

Below are all the settings available for PostgreSQL v9.3.5 on Centos 6.5 as returned by the SHOW ALL command. I have marked the ones that require a restart to take effect. All other can modified and applied either via the pg_reload_conf() function or the pg_ctl command.