Gavan Schneider
2012-11-24 01:52:17 UTC
Context:
After an unfortunate experience I have implemented WAL archive
backup strategy as per documentation with the minor extensions that:
1. The archive files are compressed;
2. Only 8Mb WAL files (it's a low activity setup);
3. Purge WAL segment every 10 minutes (still under
examination); and,
4. Able to rotate archive directory when desired, i.e.,
when doing a new baseline archive.
With this debugged and working I restored my salvaged data.
Assumptions (please correct me here):
From my reading I understand the WAL log file sequence as
recording all changes to the cluster such that replaying them
will recreate the cluster to a consistent state as at the time
the last WAL file was written to disk. From this I have
extrapolated that no activity should mean no need to put
alterations into a WAL file, i.e., minimal information in the
WAL file when PostgreSQL is idle.
I am also assuming the compressed size of a WAL file is a
rough measure of real information, as opposed to padding of the
file. And I did see compressed WAL files of less than 300 bytes
(0.3Kb!) before my data was loaded into the database, and much
bigger (compressed) files reflecting the data getting imported.
All very sensible and understandable.
My surprise is that with the data on board, and nothing
happening, those WAL files now appear to have a lot more real
content, i.e., I am assuming this means the data and/or
configuration are getting altered even when there are no
connections. I know vacuum can change things but surely it
should run out of things to do after a few passes.
This is a snapshot of the WAL archive files. There is a file
every ten minutes. No data has been changed for 12 hours, no
connections open to the database for at least 30 minutes. No
automatic processes other than default vacuum settings, etc. The
archive commit process always seems to be 6-8 files behind.
dir:Archive postgres$ ls -ls *21 active_wal_files/
20121121:
total 351744
...
584 -rw-------+ 1 postgres _postgres 295904 Nov 24 10:46 000000010000000100000087.bz2
232 -rw-------+ 1 postgres _postgres 116480 Nov 24 10:56 000000010000000100000088.bz2
3104 -rw-------+ 1 postgres _postgres 1585487 Nov 24 11:06 000000010000000100000089.bz2
1224 -rw-------+ 1 postgres _postgres 624755 Nov 24 11:16 00000001000000010000008A.bz2
744 -rw-------+ 1 postgres _postgres 377394 Nov 24 11:26 00000001000000010000008B.bz2
80 -rw-------+ 1 postgres _postgres 38856 Nov 24 11:36 00000001000000010000008C.bz2
32 -rw-------+ 1 postgres _postgres 16147 Nov 24 11:46 00000001000000010000008D.bz2
48 -rw-------+ 1 postgres _postgres 21710 Nov 24 11:56 00000001000000010000008E.bz2
6816 -rw-------+ 1 postgres _postgres 3487066 Nov 21 07:56 base_archive_20121121_075602.tbz
active_wal_files/:
total 131080
8 -rw-------+ 1 postgres _postgres 381 Nov 21 07:56 0000000100000000000000C5.00000020.backup
16384 -rw-------+ 1 postgres _postgres 8388608 Nov 24 11:46 00000001000000010000008D
16384 -rw-------+ 1 postgres _postgres 8388608 Nov 24 11:56 00000001000000010000008E
16384 -rw-------+ 1 postgres _postgres 8388608 Nov 24 10:46 00000001000000010000008F
16384 -rw-------+ 1 postgres _postgres 8388608 Nov 24 10:56 000000010000000100000090
16384 -rw-------+ 1 postgres staff 8388608 Nov 24 11:06 000000010000000100000091
16384 -rw-------+ 1 postgres staff 8388608 Nov 24 11:16 000000010000000100000092
16384 -rw-------+ 1 postgres _postgres 8388608 Nov 24 11:26 000000010000000100000093
16384 -rw-------+ 1 postgres _postgres 8388608 Nov 24 11:36 000000010000000100000094
0 drwx------+ 2 postgres staff 170 Nov 24 11:56 archive_status
dir:Archive postgres$
Question: what is behind the idle content in the WAL files?
Thank you for your time.
Regards
Gavan
--
Gavan Schneider <***@galeel.org>
Ph. +61 2 6355 6256 "Pendari" Anarel Road
Mb. +61 405 124 883 Sodwalls NSW 2790
Australia
After an unfortunate experience I have implemented WAL archive
backup strategy as per documentation with the minor extensions that:
1. The archive files are compressed;
2. Only 8Mb WAL files (it's a low activity setup);
3. Purge WAL segment every 10 minutes (still under
examination); and,
4. Able to rotate archive directory when desired, i.e.,
when doing a new baseline archive.
With this debugged and working I restored my salvaged data.
Assumptions (please correct me here):
From my reading I understand the WAL log file sequence as
recording all changes to the cluster such that replaying them
will recreate the cluster to a consistent state as at the time
the last WAL file was written to disk. From this I have
extrapolated that no activity should mean no need to put
alterations into a WAL file, i.e., minimal information in the
WAL file when PostgreSQL is idle.
I am also assuming the compressed size of a WAL file is a
rough measure of real information, as opposed to padding of the
file. And I did see compressed WAL files of less than 300 bytes
(0.3Kb!) before my data was loaded into the database, and much
bigger (compressed) files reflecting the data getting imported.
All very sensible and understandable.
My surprise is that with the data on board, and nothing
happening, those WAL files now appear to have a lot more real
content, i.e., I am assuming this means the data and/or
configuration are getting altered even when there are no
connections. I know vacuum can change things but surely it
should run out of things to do after a few passes.
This is a snapshot of the WAL archive files. There is a file
every ten minutes. No data has been changed for 12 hours, no
connections open to the database for at least 30 minutes. No
automatic processes other than default vacuum settings, etc. The
archive commit process always seems to be 6-8 files behind.
dir:Archive postgres$ ls -ls *21 active_wal_files/
20121121:
total 351744
...
584 -rw-------+ 1 postgres _postgres 295904 Nov 24 10:46 000000010000000100000087.bz2
232 -rw-------+ 1 postgres _postgres 116480 Nov 24 10:56 000000010000000100000088.bz2
3104 -rw-------+ 1 postgres _postgres 1585487 Nov 24 11:06 000000010000000100000089.bz2
1224 -rw-------+ 1 postgres _postgres 624755 Nov 24 11:16 00000001000000010000008A.bz2
744 -rw-------+ 1 postgres _postgres 377394 Nov 24 11:26 00000001000000010000008B.bz2
80 -rw-------+ 1 postgres _postgres 38856 Nov 24 11:36 00000001000000010000008C.bz2
32 -rw-------+ 1 postgres _postgres 16147 Nov 24 11:46 00000001000000010000008D.bz2
48 -rw-------+ 1 postgres _postgres 21710 Nov 24 11:56 00000001000000010000008E.bz2
6816 -rw-------+ 1 postgres _postgres 3487066 Nov 21 07:56 base_archive_20121121_075602.tbz
active_wal_files/:
total 131080
8 -rw-------+ 1 postgres _postgres 381 Nov 21 07:56 0000000100000000000000C5.00000020.backup
16384 -rw-------+ 1 postgres _postgres 8388608 Nov 24 11:46 00000001000000010000008D
16384 -rw-------+ 1 postgres _postgres 8388608 Nov 24 11:56 00000001000000010000008E
16384 -rw-------+ 1 postgres _postgres 8388608 Nov 24 10:46 00000001000000010000008F
16384 -rw-------+ 1 postgres _postgres 8388608 Nov 24 10:56 000000010000000100000090
16384 -rw-------+ 1 postgres staff 8388608 Nov 24 11:06 000000010000000100000091
16384 -rw-------+ 1 postgres staff 8388608 Nov 24 11:16 000000010000000100000092
16384 -rw-------+ 1 postgres _postgres 8388608 Nov 24 11:26 000000010000000100000093
16384 -rw-------+ 1 postgres _postgres 8388608 Nov 24 11:36 000000010000000100000094
0 drwx------+ 2 postgres staff 170 Nov 24 11:56 archive_status
dir:Archive postgres$
Question: what is behind the idle content in the WAL files?
Thank you for your time.
Regards
Gavan
--
Gavan Schneider <***@galeel.org>
Ph. +61 2 6355 6256 "Pendari" Anarel Road
Mb. +61 405 124 883 Sodwalls NSW 2790
Australia
--
Sent via pgsql-novice mailing list (pgsql-***@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-novice
Sent via pgsql-novice mailing list (pgsql-***@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-novice