Monday, January 26, 2015

PostgreSQL: Corruption: WAL files filled up

As a last resort, when the environment is down and no transaction log backup is available for restoration, the WAL can be reset. Resetting the WAL will result in data loss.

In this case, the transaction log segment 0000000100000039000000A8 was missing.


From postgresql.log


Beginning archiving of 0000000100000039000000A8
0000000100000039000000A8 not found yet in archives, continuing
Last backup time found: 2014-10-24 00:00:01
Last backup path: /var/vmware/vpostgres/9.3-archive/backup/20141024/000001/arclog
grep: write error
ls: write error: Broken pipe
Beginning archiving of 0000000100000039000000A8
0000000100000039000000A8 not found yet in archives, continuing
Last backup time found: 2014-10-24 00:00:01
Last backup path: /var/vmware/vpostgres/9.3-archive/backup/20141024/000001/arclog
grep: write error
ls: write error: Broken pipe
Beginning archiving of 0000000100000039000000A8
0000000100000039000000A8 not found yet in archives, continuing
Last backup time found: 2014-10-24 00:00:01
Last backup path: /var/vmware/vpostgres/9.3-archive/backup/20141024/000001/arclog
grep: write error

From vpxd.log
2015-01-21 14:48:01.967 UTC,,,42343,,54bfbc21.a567,2,,2015-01-21 14:48:01 UTC,,0,LOG,00000,"invalid primary checkpoint record",,,,,,,,,""
2015-01-21 14:48:01.977 UTC,,,42343,,54bfbc21.a567,3,,2015-01-21 14:48:01 UTC,,0,LOG,00000,"invalid secondary checkpoint record",,,,,,,,,""
2015-01-21 14:48:01.977 UTC,,,42343,,54bfbc21.a567,4,,2015-01-21 14:48:01 UTC,,0,PANIC,XX000,"could not locate a valid checkpoint record",,,,,,,,,""


Stop all application services, then reset the WAL.
sudo -u postgres /opt/vmware/vpostgres/current/bin/pg_resetxlog /var/vmware/vpostgres/current/pgdata/
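
Before running the reset for real, it may help to confirm that nothing is still running against the data directory and to preview what pg_resetxlog would write. The following is only a sketch: the helper names pg_is_stopped and preview_reset are hypothetical, and the paths are assumed to match the ones used in this post. The -n option makes pg_resetxlog print the control values it would use and exit without modifying anything.

```shell
#!/bin/sh
# Hypothetical pre-flight helpers; the default paths are the ones quoted in this post.
PGBIN=${PGBIN:-/opt/vmware/vpostgres/current/bin}
PGDATA=${PGDATA:-/var/vmware/vpostgres/current/pgdata}

# Succeeds when the data directory has no postmaster.pid lock file,
# i.e. no server appears to be running against it.
pg_is_stopped() {
    [ ! -e "$1/postmaster.pid" ]
}

# Dry run: print the values pg_resetxlog would use (-n), changing nothing.
# Refuses to continue while the lock file is still present.
preview_reset() {
    if ! pg_is_stopped "$PGDATA"; then
        echo "postmaster.pid found in $PGDATA; stop the database first" >&2
        return 1
    fi
    sudo -u postgres "$PGBIN/pg_resetxlog" -n "$PGDATA"
}
```

Calling preview_reset on the affected host shows the dry-run output. If the values look plausible, the actual reset is the command shown above, without -n.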



Some data will be lost, but that is better than the database not coming back up. This typically happens after the disk fills up.
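
Since the failure usually follows a full disk, a periodic free-space check on the data and archive volumes may catch the problem earlier. This is a minimal sketch, assuming a POSIX df. The function names usage_pct and disk_above are hypothetical, and the 90 percent threshold in the usage line is only an example value.

```shell
#!/bin/sh
# Print the used-capacity percentage (digits only) of the filesystem holding $1.
# df -P gives the POSIX single-line format; column 5 is capacity, e.g. "42%".
usage_pct() {
    df -P "$1" | awk 'NR==2 { gsub("%", "", $5); print $5 }'
}

# Succeeds when usage of the filesystem holding $1 is at or above $2 percent.
disk_above() {
    [ "$(usage_pct "$1")" -ge "$2" ]
}
```

For example, running `disk_above /var/vmware/vpostgres/current/pgdata 90 && echo "pgdata volume is nearly full"` from cron would give a warning before archiving starts to fail.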
