Showing posts with label Tuning Hardware. Show all posts
Showing posts with label Tuning Hardware. Show all posts

Monday, June 5, 2017

How to Tuning Your PostgreSQL Server

In this articles, we will explain about How to Tuning Your PostgreSQL Server.  Actually, when we Tuning a server, we will customize list of parameters and configurations that own by a Server.



PostgreSQL ships with a basic configuration tuned for wide compatibility rather than performance. Odds are good the default parameters are very undersized for your system. Rather than get dragged into the details of everything you should eventually know (which you can find if you want it at the GUC Three Hour Tour), here we're going to sprint through a simplified view of the basics, with a look at the most common things people new to PostgreSQL aren't aware of. You should click on the name of the parameter in each section to jump to the relevant documentation in the PostgreSQL manual for more details after reading the quick intro here. There is also additional information available about many of these parameters, as well as a list of parameters you shouldn't adjust, at Server Configuration Tuning.


PostgreSQL settings can be manipulated a number of different ways, but generally you will want them changed in your configuration files, either directly or, starting with PostgreSQL 9.4, through ALTER SYSTEM. The specific options available change from release to release, the definitive list is in the source code at src/backend/utils/misc/guc.c for your version of PostgreSQL (but the pg_settings view works well enough for most purposes).

The types of settings

There are several different types of configuration settings, divided up based on the possible inputs they take


  • Boolean: true, false, on, off
  • Integer: Whole numbers (2112)
  • Float: Decimal values (21.12)
  • Memory / Disk: Integers (2112) or "computer units" (512MB, 2112GB). Avoid integers--you need to know the underlying unit to figure out what they mean.
  • Time: "Time units" aka d,m,s (30s). Sometimes the unit is left out; don't do that
  • Strings: Single quoted text ('pg_log')
  • ENUMs: Strings, but from a specific list ('WARNING', 'ERROR')
  • Lists: A comma separated list of strings ('"$user",public,tsearch2) 


And, When they can  take effect ??

When You Do Tuning Your PostgreSQL Server with above configuration settings, will have different levels of flexibility for when they can be changed, usually related to internal code restrictions. The complete list of levels is:

  • Postmaster: requires restart of server
  • Sighup: requires a HUP of the server, either by kill -HUP (usually -1), pg_ctl reload, or SELECT pg_reload_conf();
  • User: can be set within individual sessions, take effect only within that session
  • Internal: set at compile time, can't be changed, mainly for reference
  • Backend: settings which must be set before session start
  • Superuser: can be set at runtime for the server by superusers 


Most of the time you'll only use the first of these, but the second can be useful if you have a server you don't want to take down, while the user session settings can be helpful for some special situations. You can tell which type of parameter a setting is by looking at the "context" field in the pg_settings view.

Important notes about configuration files

  • Command line options override postgresql.auto.conf settings override postgresql.conf settings.
  • If the same setting is listed multiple times, the last one wins.
  • You can figure out the postgresql.conf location with SHOW config_file. It will generally be $PGDATA/postgresql.conf (SHOW data_directory), but watch out for symbolic links, postmaster.opts and other trickiness
  • Lines with # are comments and have no effect. For a new database, this will mean the setting is using the default, but on running systems this may not hold true! Changes to the configuration files do not take effect without a reload/restart, so it's possible for the system to be running something different from what is in the file. 


Viewing the current settings

  • Look at the configuration files. This is generally not definitive!
  • SHOW ALL, SHOW <setting> will show you the current value of the setting. Watch out for session specific changes
  • SELECT * FROM pg_settings will label session specific changes as locally modified 

Tuning Case : Tuning postgresql for large amounts of ram

When we have two identical servers (in terms of hardware), they are both standard installations of windows server 2008 r2, with minimal software installed (basically my code and required stuff like jvm etc).



On the one server, I am running sql server 2005, on the second server postgresql 9.1.  The difference in performance b/n these 2 servers is staggering, it's so bad on postgresql that I'm regretting my initial "let's use postgresql instead of paying for the sql server license" speech to my boss. We're talking differences of 30 seconds vs 15 mins for the same command, and it's not just this one command, it's any query or command I throw at it. They both have pretty much the same data (records were inserted in different order), and both databases have the exact same structure / indexes etc.

But I'm hoping it's just a matter of performance tuning. The thing is, sql server is pretty much using all 32 gigs of ram on the server, whereas postgresl is using nothing, definitely less than a gig though I haven't actually figured it out in fine detail.

How do We get postgresql to use 20+ gigs of ram? These servers were built specifically for this database stuff, so any ram not in use by the database and supporting processes is wasted in my opinion.

Here is supposed Solution that We can do for Tuning Case : Tuning postgresql for large amounts of ram.

Tuning postgresql can vary depond on suitable condition and approach. There are many tweakable constants, initialised via postgres.conf. The most important ones are:
  • max_connections: the number of concurrent sessions
  • work_mem : the maximal amount of memory to be used for intermediate results such as hash tables
  • shared_buffers the amount of memory dedicated to 'pinned' buffer space.
  • effective_cache_size the amount of memory assumed to be used by the OS's LRU buffers.
  • random_page_cost : an estimate for the relative cost of disk seeks.
max_connections should not be set higher than needed, connections cost resources even when idle; in most cases a connection would spend more time waiting inside than waiting outside. (at the price of concurrency) A nice rule-of-thumb formula is "number of spindles+number of processors+X"
work_mem is tricky: is can be applied to every subquery, so a query with 5 HASHJOINS might cost 5*work_mem. And for worst-case scenarios, you should also think of multiple sessions consuming this amount (again a reason to keep max_connections low).

shared_buffers is (IMHO) overrated. Normally it is advised to set it to about 1/4...1/2 of all available "free" memory, but I tend to keep it low, and set effective_cache_size to all available "free" memory.

random_page_cost is the cost for a seek+read on the disk. It is relative to the sequential_disk_cost, which is 1. The default (4) for random_page_cost is set too high for modern machines and network storage, normally it can be lowered to between 2 and 1.x.