Small. Fast. Reliable.
Choose any three.
Write Ahead Log Notes
Table Of Contents

1. Log File Format

A Write Ahead Log (WAL) file is divided into pages in the same way as an SQLite database file. Like an SQLite database file, pages are numbered starting from 1. The page size of a WAL file is the same as the page size of the corresponding database file. There are three types of pages in the log file, as follows:

A write-ahead log consists of a series of zero or more log segments. Each log segment consists of a single control page followed by a set of data pages. Each data page contains the data for a new version of a page within the database image. The control page contains the database page numbers corresponding to each data page in the log segment, and various checksums used to detect corruption in the log file.

The header page always begins at offset 0 of the log file. Following this is a special control page - "page two". The page designated "page two" is usually stored immediately following the header page (on page 2 of the log file). However, if the disk sector size is larger than the log file page size, "page two" is stored in the first page of the second sector of the log file. This allows both the header page and this first control page to be written to without risking damage to the other in the event of a power failure.

The contents of the latest version of the database may be found by taking the current contents of the database file and replacing the old versions of pages with the new versions found in each log segment in the log file, in order.

When it is read (recovered), a log file is in one of three forms, depending on the contents of the header page stored at the start of the file:

  1. The first page of the log file is a valid header that indicates that the log starts with a control page located immediately following (or shortly after, depending on the disk sector size) the header page.

  2. The first page of the log file does not contain a valid header file. In this case, the log starts with a control page located immediately following (or shortly after) the header page. This is similar to case 1, except that since the page-size and disk-sector size are both unknown (the disk sector size when the log file was created may be different to the sector size when it is being read) the control page must be located by depending on the checksums and trial and error.

  3. The first page of the log file is a valid header file that indicates that the log segments stored in a specified range of the file should be processed (read) first, before returning to process the log segment that begins with a control page immediately following (or shortly after) the header file.

In all three cases, the end of the log occurs when either the end of the file is reached or the checksum values found on a control page fail to match the contents of a control page or the data pages that follow it. Except, if the log file is in form 3, and a checksum fails while reading log segments from within the specified range inside the file, discard all information read so far and begin reading from the start of the file as in form 1.

1.1. Header Page Detail

Field Description
Page size Size of log pages in bytes.
Page two location Location of the physically first control page in the log file. This is determined by the disk sector size.
Gap region start If the log is arranged within the log file as in form 3 above, this field stores the page number of the first page in the range of pages from the body of the file to read first. If the log file is not in form 3, this field is set to 0.
Gap region end If the log is arranged within the log file as in form 3 above, this field stores the page number of the last page in the range of pages from the body of the file to read first. If the log file is not in form 3, this field is set to 0.
Gap control page checksum If the log is arranged within the log file as in form 3 above, this field stores a copy of the control page checksum field of the first page in the gap region.
Checksum Checksum based on the contents of this page (except for the bytes occupied by this field).

1.2. Control Page Detail

Field Description
Page size Size of log pages in bytes. TODO: This is only really necessary on the first control page in the file.
Segment id Integer value incremented with each log segment written to the log file. The purpose of this is currently to make sure that the control page checksum changes even if the data does not.
Next control page The page number of the next control page in the log. Specifying this explicitly (instead of implicitly based on the number of data pages that follow the current control page) allows control pages to be aligned to the start of disk sector boundaries.
Commit flag A Boolean flag indicating whether or not the log segment headed by the control page is a commit record. The last log segment written to the log by each SQL transaction is marked as a commit record.
Database size This field is only used by commit records. The number of pages in the database at the conclusion of the transaction.
Data page count The number of data pages in the log segment headed by the control page.
First skip page Normally, data pages are stored as a contiguous block immediately following the control page. However, occasionally this block is interrupted, as in "data 4" in "Log File Form 3" in the diagram at the top of this page. In this case, this field is set to the page number of the first log page that is part of the block not used for data pages associated with this log segment.
Last skip page Similar to the "first skip field", except this field contains the page number of the last log file page not used as a data page associated with this segment.
Master journal pointer This field may contain the file name of a master journal file. It is only ever used by commit records.
Page Number Array A database page number for each data page in the log segment headed by this control page.
Data checksum Checksum based on:
  • The contents of each of the data pages in the current log segment.
Control page checksum Checksum based on:
  • The contents of this control page, except for the bytes consumed by this field, and
  • The control page checksum of the previous control page in the log.
Internal checksum This field is only used by the "page two" control page. Checksum based on:
  • The contents of this control page, except for this field and the control page checksum field (but including the data checksum).

2. Recovering a Log File

This section describes the process of recovering (reading) a log file.

3. Database Write Operations

3.1. Writing a Transaction

3.2. Checkpoint Operations

3.3. Wrapping a Log File

The following diagram shows the procedure for safely "wrapping" the log file. This can only be done once one or more log segments from the start of the log file have been checkpointed. The following figure depicts the procedure.

If an OS or power failure occurs during step 2 and the header page is left in a corrupted state, the log file is left in form 2. Upon recovery, all segments are read from the log file starting from the first control page in the log file (marked segment "1" in the diagram above).

As shown in step 3 above, once step 2 has completed and the new header page synced to disk, the checkpointed part of the log file is no longer a valid part of the log. If recovery is required, after reading log segment 5, the recovery process will return to the start of the log file and attempt to read a log segment. However, attempting to verify the control page checksum using the control page checksum of segment 5 as one of the inputs will fail, signalling the end of the log.

In scenario A above, if the log segments in the gap region have already been checkpointed, the log may simply overwrite the old gap region when it encounters it. If a recovery is required, then the header page will still point to the old gap region. However, when the recovery process attempts to read it, the control page checksum will fail, causing the recovery to fallback to starting at the start of the log file (segment 6).

In scenario B, the control page for log segment 8 contains the first and last pages of the skip region - a region of pages to jump over when reading the data pages for log segment 8.

4. Database Locking and Read Operations

Both writers and checkpointers must obtain a lock before performing any work, as follows: