WAL

This document provides an explanation of WAL (Write Ahead Log).

Index

WAL

Introduction

LevelDB does not have an option to enable/disable WAL. To activate or deactivate WAL, you need to either use RocksDB or modify LevelDB's source code.

LevelDB uses a WAL (Write Ahead Log) when writing data. The WAL records every transaction (write) applied to LevelDB, so that transactions are not lost if the process fails before the data reaches disk.

In LevelDB, data is temporarily stored in the Memtable first. However, since Memtable exists only in memory, data stored in Memtable can be lost if the system terminates abnormally or encounters an error. Analysis of this is detailed in the ETC section.

To prevent data loss, LevelDB stores transactions in a separate log file, which is the WAL.
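
The write-ahead idea can be illustrated with a minimal sketch. This is not LevelDB's implementation: the ToyStore class, the JSON-lines log format, and the file name are hypothetical and chosen only for illustration. The point is the ordering: each write is appended to the log before the in-memory table is updated, so replaying the log rebuilds the table after a failure.

```python
import json
import os
import tempfile


class ToyStore:
    """Toy key-value store: append to a log file first, then update memory."""

    def __init__(self, log_path):
        self.log_path = log_path
        self.memtable = {}
        self._replay()  # rebuild in-memory state from the log, if one exists
        self.log = open(log_path, "a", encoding="utf-8")

    def _replay(self):
        if not os.path.exists(self.log_path):
            return
        with open(self.log_path, encoding="utf-8") as f:
            for line in f:
                entry = json.loads(line)
                self.memtable[entry["key"]] = entry["value"]

    def put(self, key, value):
        # 1. Write ahead: the record goes to the log before the memtable.
        self.log.write(json.dumps({"key": key, "value": value}) + "\n")
        self.log.flush()  # hands the data to the OS; this is not an fsync
        # 2. Only then apply the write in memory.
        self.memtable[key] = value

    def get(self, key):
        return self.memtable.get(key)


path = os.path.join(tempfile.mkdtemp(), "toy.log")
store = ToyStore(path)
store.put("A", "Hello world!")
store.log.close()
del store  # simulate losing the in-memory state

recovered = ToyStore(path)  # replaying the log restores the write
```

After the simulated failure, `recovered.get("A")` returns the value again because the log was replayed in the constructor.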

Format

LevelDB's WAL is stored as a .log file in binary format. The WAL file stores two things:
- Transaction header
- Actual transaction data (payload)

The WAL header is stored in the following format:

Header components:
1. CRC checksum (4 bytes)
2. Data size (2 bytes)
3. Record type (1 byte)
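
The three fields above can be packed and unpacked with a short sketch. It assumes the fields are stored in the order listed, in little-endian byte order; the helper names pack_header and unpack_header and the sample values are hypothetical.

```python
import struct

# Little-endian: uint32 checksum, uint16 data size, uint8 record type.
HEADER = struct.Struct("<IHB")


def pack_header(checksum, length, record_type):
    """Serialize the three header fields into 7 bytes."""
    return HEADER.pack(checksum, length, record_type)


def unpack_header(raw):
    """Parse the first 7 bytes of a record into its header fields."""
    checksum, length, record_type = HEADER.unpack(raw[:HEADER.size])
    return {"checksum": checksum, "length": length, "type": record_type}


# Sample values only; a real checksum is computed from the record contents.
raw = pack_header(0x12345678, 26, 1)
fields = unpack_header(raw)
```

With these field widths the header occupies 4 + 2 + 1 = 7 bytes, which is what `HEADER.size` reports.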

Payload

After the WAL header, the payload is stored in the following format.

Example: When PUT-ting 3 pairs of Key-Value data like {"A": "Hello world!", "B": "Good bye world!", "C": "I am hungry"}, the WAL file looks like this:
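
As a rough guide to the bytes involved, the payloads for this example can be sketched as below. The sketch assumes the payload follows LevelDB's write batch encoding (an 8-byte sequence number, a 4-byte entry count, then one entry per PUT made of a type tag and a length-prefixed key and value). It also assumes each PUT is written as its own batch and that sequence numbers start at 1; both depend on how the writes were issued. The helper names are hypothetical.

```python
import struct

TYPE_VALUE = 0x01  # tag for a PUT entry (a deletion uses 0x00)


def varint32(n):
    """Encode a non-negative integer as a variable-length integer."""
    out = bytearray()
    while n >= 0x80:
        out.append((n & 0x7F) | 0x80)  # low 7 bits, continuation bit set
        n >>= 7
    out.append(n)
    return bytes(out)


def encode_put_batch(sequence, key, value):
    """Payload for a batch that holds a single PUT."""
    header = struct.pack("<QI", sequence, 1)  # 8-byte sequence, 4-byte count
    entry = (bytes([TYPE_VALUE])
             + varint32(len(key)) + key
             + varint32(len(value)) + value)
    return header + entry


pairs = [(b"A", b"Hello world!"),
         (b"B", b"Good bye world!"),
         (b"C", b"I am hungry")]
payloads = [encode_put_batch(seq, k, v)
            for seq, (k, v) in enumerate(pairs, start=1)]
```

In the .log file, each of these payloads would be preceded by the 7-byte header described in the Format section.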

Functions

Here are the findings from analyzing WAL and MANIFEST related functions.

The log reader source file (db/log_reader.cc) contains the log::Reader class and several functions.

The analysis focused on the bool Reader::ReadRecord(Slice* record, std::string* scratch) function.
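
The general shape of reading records back can be sketched as follows. This is a simplified illustration rather than the logic of Reader::ReadRecord itself: it skips checksum verification and error reporting, and it assumes 32 KB blocks, the 7-byte header, and the record types FULL, FIRST, MIDDLE, and LAST (1 to 4). The scratch buffer plays the role of the scratch argument: it accumulates fragments until a record is complete. The names read_records and frame are hypothetical.

```python
import struct

BLOCK_SIZE = 32768
HEADER_SIZE = 7
FULL, FIRST, MIDDLE, LAST = 1, 2, 3, 4


def read_records(log_bytes):
    """Yield logical records, joining FIRST/MIDDLE/LAST fragments."""
    offset = 0
    scratch = bytearray()
    while offset + HEADER_SIZE <= len(log_bytes):
        room = BLOCK_SIZE - offset % BLOCK_SIZE
        if room < HEADER_SIZE:
            offset += room  # block trailer: skip the padding
            continue
        _checksum, length, record_type = struct.unpack_from(
            "<IHB", log_bytes, offset)
        start = offset + HEADER_SIZE
        if start + length > len(log_bytes):
            break  # truncated record at the end of the file
        fragment = log_bytes[start:start + length]
        offset = start + length
        if record_type == FULL:
            yield bytes(fragment)
        elif record_type == FIRST:
            scratch = bytearray(fragment)
        elif record_type == MIDDLE:
            scratch += fragment
        elif record_type == LAST:
            scratch += fragment
            yield bytes(scratch)
            scratch = bytearray()


def frame(record_type, fragment):
    """Helper for the example: header with a dummy checksum of 0."""
    return struct.pack("<IHB", 0, len(fragment), record_type) + fragment


log = (frame(FULL, b"abc") + frame(FIRST, b"de")
       + frame(MIDDLE, b"f") + frame(LAST, b"gh"))
records = list(read_records(log))
```

Here the first record arrives whole, while the second is reassembled from three fragments.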

log::Writer::AddRecord

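Receives the data as a Slice and writes it to the WAL. If the data does not fit in the space left in the current 32 KB block, it is split into fragments. Each fragment is written by calling log::Writer::EmitPhysicalRecord.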
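
The fragmenting step can be sketched as below. The sketch assumes 32 KB blocks, a 7-byte header, and the record types FULL, FIRST, MIDDLE, and LAST (1 to 4); the function name plan_fragments is hypothetical, and it only plans the fragment types and sizes instead of writing anything.

```python
BLOCK_SIZE = 32768  # log blocks are 32 KB
HEADER_SIZE = 7     # checksum (4) + data size (2) + record type (1)
FULL, FIRST, MIDDLE, LAST = 1, 2, 3, 4


def plan_fragments(payload_len, block_offset=0):
    """Return [(record_type, fragment_length), ...] for one logical record."""
    fragments = []
    left = payload_len
    begin = True
    while True:
        if BLOCK_SIZE - block_offset < HEADER_SIZE:
            # Too little room for a header: the rest of the block is
            # padding and the fragment starts in a fresh block.
            block_offset = 0
        avail = BLOCK_SIZE - block_offset - HEADER_SIZE
        length = min(left, avail)
        end = (left == length)
        if begin and end:
            record_type = FULL
        elif begin:
            record_type = FIRST
        elif end:
            record_type = LAST
        else:
            record_type = MIDDLE
        fragments.append((record_type, length))
        block_offset += HEADER_SIZE + length
        left -= length
        begin = False
        if left == 0:
            return fragments


small = plan_fragments(28)      # fits in one block
large = plan_fragments(70000)   # spans three blocks
```

A small record becomes a single FULL fragment, while a record larger than a block becomes a FIRST fragment, zero or more MIDDLE fragments, and a LAST fragment.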

log::Writer::EmitPhysicalRecord

Receives a record type and a fragment of data, builds the header (checksum, data size, record type), and writes the header followed by the data to the .log file through PosixWritableFile::Append().
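
Building one physical record can be sketched as follows. The sketch assumes the checksum is a CRC32C computed over the record type byte followed by the data, stored in a masked form (rotated right by 15 bits, plus a constant), and that fields are little-endian. The CRC32C routine here is a plain table-driven version written for this example, and the function names are hypothetical.

```python
import struct


def _make_table():
    """Lookup table for CRC32C (Castagnoli), reflected polynomial."""
    table = []
    for i in range(256):
        crc = i
        for _ in range(8):
            crc = (crc >> 1) ^ 0x82F63B78 if crc & 1 else crc >> 1
        table.append(crc)
    return table


_TABLE = _make_table()
MASK_DELTA = 0xA282EAD8


def crc32c(data):
    crc = 0xFFFFFFFF
    for byte in data:
        crc = _TABLE[(crc ^ byte) & 0xFF] ^ (crc >> 8)
    return crc ^ 0xFFFFFFFF


def mask(crc):
    """Rotate right by 15 bits and add a constant, modulo 2**32."""
    rotated = ((crc >> 15) | (crc << 17)) & 0xFFFFFFFF
    return (rotated + MASK_DELTA) & 0xFFFFFFFF


def emit_physical_record(record_type, fragment):
    """Return header + fragment as it would be appended to the .log file."""
    if len(fragment) > 0xFFFF:
        raise ValueError("fragment does not fit in the 2-byte size field")
    checksum = mask(crc32c(bytes([record_type]) + fragment))
    header = struct.pack("<IHB", checksum, len(fragment), record_type)
    return header + fragment


record = emit_physical_record(1, b"abc")  # 1 = FULL record type
```

The resulting record is the 7-byte header followed by the unmodified fragment, so a 3-byte fragment yields 10 bytes.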

PosixWritableFile::Append()

The POSIX implementation of WritableFile. WAL files are written through PosixWritableFile. Append() copies the Slice data into an in-memory buffer and writes the buffer to the file when it becomes full.
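
The buffering behavior described above can be sketched as below. This is an illustration, not the PosixWritableFile code: the class name BufferedAppender is hypothetical, the 64 KB default buffer size is an assumption, and an in-memory io.BytesIO stands in for the file.

```python
import io

BUFFER_SIZE = 65536  # assumed buffer size for this sketch


class BufferedAppender:
    """Collect appended bytes in a buffer and write them out when it fills."""

    def __init__(self, raw, buffer_size=BUFFER_SIZE):
        self.raw = raw
        self.buffer_size = buffer_size
        self.buf = bytearray()

    def append(self, data):
        # Fit as much as possible into the buffer.
        room = self.buffer_size - len(self.buf)
        self.buf += data[:room]
        rest = data[room:]
        if not rest:
            return
        # The buffer is full: write it out, then handle the remainder.
        self.flush()
        if len(rest) < self.buffer_size:
            self.buf += rest      # small remainder goes back into the buffer
        else:
            self.raw.write(rest)  # large remainder is written directly

    def flush(self):
        if self.buf:
            self.raw.write(bytes(self.buf))
            self.buf.clear()


sink = io.BytesIO()
writer = BufferedAppender(sink, buffer_size=8)
writer.append(b"abcde")            # fits in the buffer
pending = sink.getvalue()          # nothing has reached the sink yet
writer.append(b"fghij")            # fills the buffer and triggers a write
after_overflow = sink.getvalue()
writer.flush()                     # push out what is still buffered
```

In this run, the first eight bytes reach the sink only when the second append fills the buffer, and the last two bytes arrive after the explicit flush.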

ETC

This section includes the results of experiments on data loss with WAL enabled and disabled. Since an option to disable WAL is only available in RocksDB, the experiments were conducted using RocksDB.

Summary

Experiments were conducted in the following situation:
- After writing data with the PUT command, forcefully terminate the process (SIGINT)
- Check whether the data written before termination was lost

Design

The code used in experiments can be found here.

Scenarios

Experiments were conducted with 4 scenarios (O = yes, X = no):

| Scenario | WAL Enabled | Manual Flush |
| --- | --- | --- |
| Scenario 1 | X | X |
| Scenario 2 | X | O |
| Scenario 3 | O | X |
| Scenario 4 | O | O |

Results

Results are as follows:

| Scenario | Data Integrity |
| --- | --- |
| Scenario 1 | Lost all data in Memtable |
| Scenario 2 | Lost some data before Manual Flush |
| Scenario 3 | Preserved all data |
| Scenario 4 | Preserved all data |
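
Why the four scenarios behave differently can be reasoned about with a toy model. This is not the experiment code and does not use RocksDB: the ToyDB class, the 10 writes, and the flush interval of 4 are hypothetical values chosen for illustration. The model only encodes two assumptions: a crash discards the memtable, and both the WAL and flushed tables survive it.

```python
class ToyDB:
    """Toy model: memtable in memory, WAL and flushed tables on 'disk'."""

    def __init__(self, wal_enabled):
        self.wal_enabled = wal_enabled
        self.memtable = {}
        self.wal = []      # survives a crash
        self.sstable = {}  # survives a crash

    def put(self, key, value):
        if self.wal_enabled:
            self.wal.append((key, value))
        self.memtable[key] = value

    def flush(self):
        self.sstable.update(self.memtable)
        self.memtable.clear()
        self.wal.clear()  # flushed data no longer needs the log

    def crash_and_recover(self):
        self.memtable = dict(self.wal)  # memory is lost; replay the WAL
        return {**self.sstable, **self.memtable}


def run(wal_enabled, manual_flush, n=10, flush_every=4):
    """Return how many of the n writes survive a crash."""
    db = ToyDB(wal_enabled)
    for i in range(n):
        db.put(i, i)
        if manual_flush and (i + 1) % flush_every == 0:
            db.flush()
    return len(db.crash_and_recover())


survivors = {
    "Scenario 1": run(wal_enabled=False, manual_flush=False),
    "Scenario 2": run(wal_enabled=False, manual_flush=True),
    "Scenario 3": run(wal_enabled=True, manual_flush=False),
    "Scenario 4": run(wal_enabled=True, manual_flush=True),
}
```

In this model, without WAL only the writes covered by a manual flush survive, while with WAL every write survives regardless of flushing, which is the same pattern as in the results above.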

Additional experiments used these conditions:
- A total of 1,000,000 PUT commands executed
- Manual flush interval: [0] every 1,000 PUTs, [1] every 10,000 PUTs

| Type | Elapsed Time | Data Integrity |
| --- | --- | --- |
| WAL Disabled | 1.843s | Lost all data |
| WAL Enabled | 3.604s | Preserved all data |
| Manual Flush [0] & WAL Disabled | 39.128s | Lost some data |
| Manual Flush [1] & WAL Disabled | 5.611s | Lost some data |

Bottom line