Problem statement
Postgresql crashes randomly. Increasing more memory do not helped much but only delaying the crashing. There is no particularl activities that causing or caused to crash. It can be any DMLs or vacuum. Out of Memory was pretty consistent but do not always crashed the instance.Errors from the logs
.................
3 tm:2015-08-06 03:52:33.326 UTC db:VCDB pid:19666 STATEMENT: UPDATE VPX_DVPORT SET
1 tm:2015-08-06 03:52:33.520 UTC db:VCDB pid:15745 LOG: unexpected EOF on client connection 1 tm:2015-08-06 03:52:33.524 UTC db:VCDB pid:8630 LOG: could not receive data from client: Connection reset by peer 2 tm:2015-08-06 03:52:33.524 UTC db:VCDB pid:8630 LOG: unexpected EOF on
..........................................
could not receive data from client: Connection reset by peer
: could not fork autovacuum worker process: Cannot allocate memory
: could not fork autovacuum worker process: Cannot allocate memory
.......................................................................
ErrorContext: 8192 total in 1 blocks; 8160 free (4 chunks); 32 used
1 tm:2015-08-06 03:53:27.838 UTC db:VCDB pid:1234 ERROR: out of memory
2 tm:2015-08-06 03:53:27.838 UTC db:VCDB pid:1234 DETAIL: Failed on request of size 32.
3 tm:2015-08-06 03:53:27.838 UTC db:VCDB pid:1234 CONTEXT: SQL statement "SELECT 1 FROM ONLY "vpx"."vpx_vdevice_file_backing" x WHERE "backing_id" OPERATOR(pg_catalog.=) $1 FOR SHARE OF x"
Upon looking at the VM, I realized the swap was 0 even though the memory usage were very high. At such a high memory usage, I would expect swap to be used.
esxp-vc01:/var/log/vmware/vpx # free -m
total used free shared buffers cached
Mem: 24161 21849 2312 0 414 7797
-/+
buffers/cache: 13637 10524
Swap: 0 0 0
At that point, I suggested my customer to check with their administrator of why the Swap was at 0. Was it allocated? Was it turned on?
To get an overview of the swap, look at the fstab and swapon.
cat /etc/fstab
swapon -s
Conclusion
In ideal scenario, if swap is configured and turned on, it should have something similar as the following.
[root@Cloud3 ~]# swapon -s
Filename Type Size Used Priority
/dev/mapper/VolGroup00-LogVol01 partition 2064376 0 -1
[root@Cloud3 ~]# free -m
total used free shared buffers cached
Mem: 16050 13949 2101 0 313 13160
-/+ buffers/cache: 475 15575
Swap: 2015 0 2015
[root@Cloud3 ~]# cat /etc/fstab|grep swap
/dev/VolGroup00/LogVol01 swap swap defaults 0 0
[root@Cloud3 ~]# blkid|grep swap
/dev/mapper/VolGroup00-LogVol01: TYPE="swap"
/dev/VolGroup00/LogVol01: TYPE="swap"
Now that, the Postgresql: Cannot allocate memory, can be triggered by anything that causing the memory pool to exhaust. It basically saying, the box has ran out of memory.