Saturday, January 7, 2017

postgresql: corruption: ERROR: unexpected chunk number 0 (expected 1) for toast value XXXXXXXXX in pg_toast_XXXX


This issue is very common but luckily, they can be resolved majority of the time (it was once it did not worked for me). The corruption is a bit difficult to description and I am using a real scenario here.

The vcenter crashed and in the logs, the error reported was.


2017-01-05T15:51:06.640Z [7FC61081D700 error 'Default' opID=SWI-3a1cc46] [VdbStatement] SQLError was thrown: "ODBC error: (XX000) - ERROR: unexpected chunk number 0 (ex
pected 1) for toast value 25270341 in pg_toast_16875;
--> Error while executing the query" is returned when executing SQL statement "SELECT IS_CONSOLIDATE_NEEDED, DAS_PROTECTED_ACTUAL, DAS_PROTECTED_DESIRED, MANAGED_BY_TYP
E, MANAGED_BY_EXT_KEY, STORAGE_SPACE_UPDATED_TIME, AGGR_UNSHARED_STORAGE_SPACE, AGGR_UNCOMMITED_STORAGE_SPACE, AGGR_COMMITED_STORAGE_SPACE, RECORD_REPLAY_STATE, FAULT_T
OLERANCE_STATE, PENDING_ANNOT_SET_FLG, PENDING_NAME, ONLINE_STANDBY, MKS_CONNECTIONS, SCREEN_WIDTH, SCREEN_HEIGHT, TOOLS_MOUNTED, MEMORY_OVERHEAD, SUSPEND_INTERVAL, BOO
T_TIME, SUSPEND_TIME, UUID_INSTANCE, UUID_BIOS, NUM_DISK, NUM_NIC, NUM_VCPU, MEMORY_RES"


When trying to delete the offending entry from the vpx_vm using the id, it throws a different errors.

ERROR: tuple concurrently updated

Backing up the table will throw the same problem happens.

ERROR: unexpected chunk number 0 (expected 1) for toast value 25270341 in pg_toast_16875;


Steps to resolve the issue for VCSA.


1: Back up the database first.

2: Stop vpxd & vdcs service:
     service vmware-vpxd stop
     service vmware-vdcs stop

3: Connect to VCDB with:
   /opt/vmware/vpostgres/current/bin/psql -U postgres -d VCDB

4: Execute below query:
   select pg_toast_16875::regclass;

   Above query should give you output something like:
   regclass
   --------
   VPX_VM

5: Now re-index the table with below queries:
   REINDEX table pg_toast.pg_toast_16875;
   REINDEX table VPX_VM; (Table name should be whatever is returned from step#4)

6: Do vacuum and analyze for vpx_vm table with following query:
    VACUUM ANALYZE VPX_VM; (Table name should be whatever is returned from step#4)

7: \q

8: Start vpxd & vdcs service:
   service vmware-vpxd start
   service vmware-vdcs start

NOTE: The steps can lead to some data loss from table.

The key is to locate the pg_toast then reindex the pg_toast and its corresponding table.