PMTiles Specification Deep Dive

Mapping infrastructure has been moving away from directory-based tile caches toward single-file archives designed for cloud object storage and HTTP range requests. PMTiles is one such format: it packages an entire tileset in a single file that clients read with byte-range requests. This page examines the binary layout, indexing strategy, and implementation patterns needed to integrate the format into automated vector tile generation and caching pipelines. For teams building distributed mapping platforms, understanding the low-level structure helps reduce delivery latency and storage overhead while keeping pipelines compatible with modern tile servers.

Prerequisites & Pipeline Requirements

Before implementing PMTiles in production, ensure your engineering pipeline meets these baseline requirements:

  • Familiarity with HTTP/1.1 and HTTP/2 range request semantics
  • Working knowledge of tile coordinate systems (EPSG:3857, TMS/XYZ grid)
  • Proficiency in Python or Node.js for binary stream manipulation
  • Access to a cloud storage backend supporting byte-range fetches (S3, GCS, Cloudflare R2, or equivalent)
  • Baseline understanding of Vector Tile Architecture & Format Fundamentals to contextualize how PMTiles encapsulates protobuf-encoded geometry

The complete binary layout, versioning rules, and compression standards are formally documented in the official PMTiles v3 Specification.

Core Binary Architecture

PMTiles is engineered around five contiguous sections: a fixed-length header, a root directory, JSON metadata, optional leaf directories, and the tile data payload. Unlike legacy SQLite-based archives, PMTiles relies entirely on sequential byte offsets and HTTP range requests, eliminating database query overhead and locking contention.

Fixed-Length Header (127 Bytes)

The header occupies exactly 127 bytes at the beginning of the archive. The byte layout is:

Offset    Length    Field
0–6       7 bytes   Magic bytes: ASCII PMTiles
7         1 byte    Version (currently 3)
8–15      8 bytes   Root directory offset (little-endian uint64)
16–23     8 bytes   Root directory length (little-endian uint64)
24–31     8 bytes   JSON metadata offset (little-endian uint64)
32–39     8 bytes   JSON metadata length (little-endian uint64)
40–47     8 bytes   Leaf directories offset (little-endian uint64)
48–55     8 bytes   Leaf directories length (little-endian uint64)
56–63     8 bytes   Tile data offset (little-endian uint64)
64–71     8 bytes   Tile data length (little-endian uint64)
72–79     8 bytes   Number of addressed tiles (uint64)
80–87     8 bytes   Number of tile entries (uint64)
88–95     8 bytes   Number of tile contents (uint64)
96        1 byte    Clustered flag
97        1 byte    Internal compression
98        1 byte    Tile compression
99        1 byte    Tile type
100       1 byte    Min zoom
101       1 byte    Max zoom
102–109   8 bytes   Min position (lon/lat as E7 int32 pairs)
110–117   8 bytes   Max position
118       1 byte    Center zoom
119–126   8 bytes   Center position

Because the header size is immutable, clients can fetch exactly 127 bytes to bootstrap the entire parsing process.
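As a minimal sketch of how the table above maps onto code, the whole layout can be written as one struct format string. The field names, the parse_header helper, and the sample header values are illustrative assumptions, not part of any official library.

```python
import struct

# "<" selects little-endian with no padding: 7-byte magic, version byte,
# eleven uint64 fields, six uint8 fields, min/max positions (4 x int32),
# center zoom (uint8), center position (2 x int32).
HEADER_FORMAT = "<7sB11Q6B4iB2i"
HEADER_SIZE = struct.calcsize(HEADER_FORMAT)  # 127 bytes

# Illustrative field names, in the same order as the table.
FIELD_NAMES = (
    "magic", "version",
    "root_offset", "root_length",
    "metadata_offset", "metadata_length",
    "leaf_offset", "leaf_length",
    "tile_data_offset", "tile_data_length",
    "addressed_tiles", "tile_entries", "tile_contents",
    "clustered", "internal_compression", "tile_compression", "tile_type",
    "min_zoom", "max_zoom",
    "min_lon_e7", "min_lat_e7", "max_lon_e7", "max_lat_e7",
    "center_zoom", "center_lon_e7", "center_lat_e7",
)


def parse_header(raw: bytes) -> dict:
    """Unpack every header field and add degree versions of the E7 positions."""
    if len(raw) < HEADER_SIZE:
        raise ValueError(f"need {HEADER_SIZE} bytes, got {len(raw)}")
    fields = dict(zip(FIELD_NAMES, struct.unpack(HEADER_FORMAT, raw[:HEADER_SIZE])))
    if fields["magic"] != b"PMTiles":
        raise ValueError("missing PMTiles magic bytes")
    # E7 means degrees multiplied by 10,000,000 and stored as a signed int32.
    for key in [k for k in fields if k.endswith("_e7")]:
        fields[key[:-3]] = fields[key] / 1e7
    return fields


# Synthetic header used only to exercise the parser.
sample = struct.pack(
    HEADER_FORMAT, b"PMTiles", 3,
    127, 300, 427, 200, 0, 0, 627, 5000, 10, 10, 10,
    1, 2, 2, 1, 0, 14,
    -1800000000, -850000000, 1800000000, 850000000,
    0, 0, 0,
)
parsed = parse_header(sample)
```

Unpacking everything in one call keeps the offsets in a single place, so a mistake in one field shows up as a size mismatch rather than a silently wrong value.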

Tile Type & Compression Enums

The tile_type and tile_compression header fields use compact integer enums:

Tile Type:

Value Meaning
0 Unknown / Other
1 MVT Vector Tile
2 PNG
3 JPEG
4 WebP
5 AVIF

Compression:

Value Meaning
0 Unknown
1 None
2 gzip
3 Brotli
4 Zstandard (zstd)
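The two enums translate directly into Python IntEnum classes. This is a small sketch; the class names and the describe helper are hypothetical and chosen only for illustration.

```python
from enum import IntEnum


class TileType(IntEnum):
    UNKNOWN = 0
    MVT = 1
    PNG = 2
    JPEG = 3
    WEBP = 4
    AVIF = 5


class Compression(IntEnum):
    UNKNOWN = 0
    NONE = 1
    GZIP = 2
    BROTLI = 3
    ZSTD = 4


def describe(tile_type: int, tile_compression: int) -> str:
    """Turn the raw header bytes into a readable label.

    Values outside the tables above raise ValueError.
    """
    return f"{TileType(tile_type).name} tiles, {Compression(tile_compression).name} compression"
```

Wrapping the raw bytes this way makes an unexpected value fail loudly at parse time instead of propagating as a bare integer.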

Directory Index & Hilbert Curve Tile IDs

The directory acts as a spatial index, mapping tile coordinates to precise byte offsets and lengths within the payload section. PMTiles assigns each tile a unique integer ID based on its position on a Hilbert curve — a space-filling curve that maps 2D coordinates to a 1D sequence while preserving spatial locality. Tiles adjacent on the Hilbert curve tend to be geographically close, which means fetching a viewport’s worth of tiles requires fewer, larger byte-range requests rather than many scattered fetches.
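The mapping from z/x/y to a tile ID can be sketched with the textbook Hilbert xy-to-distance algorithm. The function names are illustrative, and the sketch assumes that IDs for zoom z begin after all tiles of the lower zoom levels, so zoom 0 is ID 0 and zoom 1 occupies IDs 1 through 4.

```python
def _rotate(n: int, x: int, y: int, rx: int, ry: int):
    """Rotate or flip a quadrant so consecutive curve segments stay connected."""
    if ry == 0:
        if rx == 1:
            x = n - 1 - x
            y = n - 1 - y
        x, y = y, x
    return x, y


def zxy_to_tile_id(z: int, x: int, y: int) -> int:
    """Convert z/x/y to a Hilbert-ordered tile ID (assumed numbering, see lead-in)."""
    n = 1 << z
    if not (0 <= x < n and 0 <= y < n):
        raise ValueError("x and y must lie inside the grid for zoom z")
    # Tiles at lower zooms come first: 4**0 + 4**1 + ... + 4**(z - 1).
    base = sum(4 ** i for i in range(z))
    d = 0
    s = n // 2
    while s > 0:
        rx = 1 if (x & s) else 0
        ry = 1 if (y & s) else 0
        d += s * s * ((3 * rx) ^ ry)
        x, y = _rotate(n, x, y, rx, ry)
        s //= 2
    return base + d
```

Because the result is a single integer, a directory can store IDs in sorted order and a client can locate a tile with a binary search.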

The directory entries are varint-encoded and then compressed with the codec named in the header's internal compression field (commonly gzip), which shrinks the index significantly for dense tilesets. When a client requests a specific tile, it:

  1. Fetches the 127-byte header
  2. Issues a targeted range request for the root directory
  3. Decompresses the directory and resolves the target tile’s offset
  4. Requests that exact byte range for the tile payload

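For very large archives, the index is split into a root directory and leaf directories so that clients never have to load one enormous index into memory. When the root entry covering a tile points to a leaf directory instead of tile data, the client fetches that leaf with an additional range request before requesting the tile.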

Tile Payload & Compression

Individual tiles are stored as contiguous raw byte sequences. For vector tiles, the payload follows the Mapbox Vector Tile Specification v2.1. The PMTiles container does not interpret tile contents: the header's tile type and tile compression fields tell clients how to decode them, and each archive declares one tile type and one tile compression for all of its tiles.

When evaluating delivery strategies, weigh the storage savings of compressed payloads against the CPU overhead of on-the-fly decompression — a tradeoff also relevant in Vector vs Raster Tile Tradeoffs.
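On the client side, decoding a payload comes down to dispatching on the header's tile compression value. The sketch below handles only the codecs available in the Python standard library; the function name is illustrative, and Brotli or zstd support would need a third-party package that is deliberately left out here.

```python
import gzip


def decompress_tile(payload: bytes, compression: int) -> bytes:
    """Decode a tile payload according to the header's tile compression value."""
    if compression == 1:   # None: bytes are stored as-is
        return payload
    if compression == 2:   # gzip
        return gzip.decompress(payload)
    if compression in (3, 4):
        # Brotli (3) and zstd (4) are not in the standard library.
        raise NotImplementedError("install a Brotli or zstd codec to decode this archive")
    raise ValueError(f"unknown or unsupported compression value: {compression}")
```

A server that forwards tiles to browsers can often skip this step and pass the compressed bytes through with a matching Content-Encoding header, which moves the decompression cost to the client.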

HTTP Range Request Workflow

The operational efficiency of PMTiles hinges on HTTP byte-range requests. Clients request specific byte ranges using the Range: bytes=start-end header, where both positions are inclusive; the server responds with 206 Partial Content. These semantics were originally defined in RFC 7233 and are now maintained in RFC 9110.

CDNs that cache range responses can effectively turn a single archive into thousands of cacheable micro-assets, but handling of the Range header varies by provider, so confirm how yours caches 206 responses. Ensure your origin server returns Accept-Ranges: bytes and does not buffer full responses. Proxy layers that ignore Range headers negate the format's primary advantage.
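A client can guard against misbehaving proxies by checking both the status code and the Content-Range header of every response. The helpers below are a sketch with hypothetical names; they operate on plain values so they can be used with any HTTP library.

```python
import re

# Matches e.g. "bytes 0-126/50000" or "bytes 0-126/*".
_CONTENT_RANGE = re.compile(r"^bytes (\d+)-(\d+)/(\d+|\*)$")


def range_header(offset: int, length: int) -> str:
    """Build a Range header value; the end position is inclusive."""
    if length <= 0:
        raise ValueError("length must be positive")
    return f"bytes={offset}-{offset + length - 1}"


def check_partial_response(status: int, content_range: str, offset: int, length: int) -> None:
    """Raise RuntimeError unless the response covers exactly the requested range."""
    if status != 206:
        raise RuntimeError(f"expected 206 Partial Content, got {status}")
    match = _CONTENT_RANGE.match(content_range or "")
    if match is None:
        raise RuntimeError(f"unparseable Content-Range: {content_range!r}")
    start, end = int(match.group(1)), int(match.group(2))
    if (start, end) != (offset, offset + length - 1):
        raise RuntimeError(f"server returned bytes {start}-{end}, not the requested range")
```

For the header fetch, range_header(0, 127) produces bytes=0-126, and a response of 200 or a mismatched Content-Range is rejected before any bytes are parsed.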

Implementation Patterns for Automation

Python Stream Parsing

Python’s struct module and the requests library provide a lightweight foundation for header parsing. The byte offsets below follow the v3 spec.

python
import struct
import requests
from requests.adapters import HTTPAdapter
from urllib3.util.retry import Retry

def get_session() -> requests.Session:
    """Configure a resilient HTTP session with automatic retries."""
    session = requests.Session()
    retry = Retry(total=3, backoff_factor=0.5, status_forcelist=[500, 502, 503, 504])
    adapter = HTTPAdapter(max_retries=retry)
    session.mount("https://", adapter)
    session.mount("http://", adapter)
    return session

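def fetch_pmtiles_header(url: str, session: requests.Session) -> dict:
    """Fetch and parse the 127-byte PMTiles v3 header."""
    # stream=True defers the body download so the status can be checked first.
    response = session.get(url, headers={"Range": "bytes=0-126"}, stream=True, timeout=30)
    response.raise_for_status()
    if response.status_code != 206:
        # A plain 200 means the server ignored Range and would send the whole archive.
        response.close()
        raise RuntimeError(f"Expected 206 Partial Content, got {response.status_code}")
    h = response.content
    if len(h) != 127:
        raise ValueError(f"Expected 127 header bytes, got {len(h)}")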

    magic = h[:7].decode("ascii")
    if magic != "PMTiles":
        raise ValueError("Invalid PMTiles archive: missing magic bytes")

    version = struct.unpack("<B", h[7:8])[0]
    if version != 3:
        raise NotImplementedError(f"Unsupported PMTiles version: {version}")

    # Offsets per PMTiles v3 spec (https://github.com/protomaps/PMTiles/blob/main/spec/v3/spec.md)
    dir_offset    = struct.unpack("<Q", h[8:16])[0]
    dir_length    = struct.unpack("<Q", h[16:24])[0]
    meta_offset   = struct.unpack("<Q", h[24:32])[0]
    meta_length   = struct.unpack("<Q", h[32:40])[0]
    tile_type     = struct.unpack("<B", h[99:100])[0]
    tile_compress = struct.unpack("<B", h[98:99])[0]
    min_zoom      = struct.unpack("<B", h[100:101])[0]
    max_zoom      = struct.unpack("<B", h[101:102])[0]

    return {
        "version":      version,
        "dir_offset":   dir_offset,
        "dir_length":   dir_length,
        "meta_offset":  meta_offset,
        "meta_length":  meta_length,
        "tile_type":    tile_type,
        "tile_compression": tile_compress,
        "min_zoom":     min_zoom,
        "max_zoom":     max_zoom,
    }

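def fetch_tile_range(url: str, offset: int, length: int, session: requests.Session) -> bytes:
    """Retrieve a tile payload via HTTP range request.

    offset is absolute within the archive: the header's tile data offset
    plus the offset stored in the directory entry.
    """
    end = offset + length - 1  # Range end positions are inclusive
    headers = {"Range": f"bytes={offset}-{end}"}
    response = session.get(url, headers=headers, stream=True, timeout=30)
    response.raise_for_status()
    if response.status_code != 206:
        response.close()
        raise RuntimeError(f"Expected 206 Partial Content, got {response.status_code}")
    return response.content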

Node.js Buffer Handling

In Node.js, use DataView for little-endian 64-bit reads. The example relies on the global fetch available in Node.js 18 and later, and the byte offsets match the v3 spec.

javascript
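async function fetchPMTilesHeader(url) {
  const res = await fetch(url, { headers: { Range: "bytes=0-126" } });
  // res.ok is also true for a 200, which means the server ignored the Range header.
  if (res.status !== 206) throw new Error(`Header fetch failed: expected 206, got ${res.status}`);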

  const buffer = await res.arrayBuffer();
  const view = new DataView(buffer);
  const magic = new TextDecoder().decode(new Uint8Array(buffer.slice(0, 7)));

  if (magic !== "PMTiles") throw new Error("Invalid archive magic bytes");

  const version = view.getUint8(7);
  if (version !== 3) throw new Error(`Unsupported version: ${version}`);

  // Byte offsets per PMTiles v3 spec. Number() is exact up to 2^53 - 1,
  // which is far beyond realistic archive sizes.
  const dirOffset   = Number(view.getBigUint64(8, true));
  const dirLength   = Number(view.getBigUint64(16, true));
  const metaOffset  = Number(view.getBigUint64(24, true));
  const metaLength  = Number(view.getBigUint64(32, true));
  const tileType    = view.getUint8(99);
  const tileCompress = view.getUint8(98);
  const minZoom     = view.getUint8(100);
  const maxZoom     = view.getUint8(101);

  return { version, dirOffset, dirLength, metaOffset, metaLength,
           tileType, tileCompress, minZoom, maxZoom };
}

async function fetchTileRange(url, offset, length) {
  const end = offset + length - 1; // Range end positions are inclusive
  const res = await fetch(url, { headers: { Range: `bytes=${offset}-${end}` } });
  if (res.status !== 206) throw new Error(`Tile fetch failed: expected 206, got ${res.status}`);
  return new Uint8Array(await res.arrayBuffer());
}

Both implementations validate the magic bytes and version before reading any other field, so a malformed or unsupported archive fails immediately instead of yielding garbage offsets in automated workflows.

Metadata, Clustering, and Validation

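The JSON metadata block (located at meta_offset and compressed with the header's internal compression) stores descriptive tileset properties such as name, description, and attribution. In v3 the zoom range and bounds are carried in the header fields, so clients should read them from there. For MVT archives, this block should include a vector_layers array as described in the TileJSON 3.0 specification.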
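Reading the metadata block can be sketched as below, assuming the raw bytes have already been fetched with a range request and that the block uses the header's internal compression. The function names are illustrative, and only the standard-library codecs are handled.

```python
import gzip
import json


def parse_metadata(raw: bytes, internal_compression: int) -> dict:
    """Decompress and decode the JSON metadata block."""
    if internal_compression == 2:        # gzip
        raw = gzip.decompress(raw)
    elif internal_compression != 1:      # 1 means no compression
        raise NotImplementedError("only uncompressed and gzip metadata are handled here")
    metadata = json.loads(raw.decode("utf-8"))
    if not isinstance(metadata, dict):
        raise ValueError("metadata must be a JSON object")
    return metadata


def has_vector_layers(metadata: dict) -> bool:
    """Report whether an MVT archive carries a usable vector_layers array."""
    layers = metadata.get("vector_layers")
    return isinstance(layers, list) and len(layers) > 0
```

In a CI pipeline, a check such as has_vector_layers can fail the build for MVT archives before an incomplete tileset reaches production.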

The clustered flag in the header indicates whether the tile data section is ordered by Hilbert tile ID. Clustered archives enable clients to fetch viewport tiles with fewer, larger HTTP requests, which is particularly valuable on high-latency mobile networks.
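One way a client can exploit clustering is to merge the byte ranges of neighboring tiles before issuing requests. The sketch below is an illustrative helper, not part of any PMTiles library; the max_gap parameter is an assumption that lets the caller trade a few wasted bytes for fewer round trips.

```python
from typing import Iterable, List, Tuple


def coalesce_ranges(ranges: Iterable[Tuple[int, int]], max_gap: int = 0) -> List[Tuple[int, int]]:
    """Merge (offset, length) ranges that touch or lie within max_gap bytes of each other."""
    merged: List[Tuple[int, int]] = []
    for offset, length in sorted(ranges):
        if merged and offset <= merged[-1][0] + merged[-1][1] + max_gap:
            start, prev_length = merged[-1]
            end = max(start + prev_length, offset + length)
            merged[-1] = (start, end - start)
        else:
            merged.append((offset, length))
    return merged
```

For example, three tiles at (0, 10), (10, 20), and (100, 5) collapse into two requests with the default gap, and into a single request when max_gap is at least 70.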

When troubleshooting archive integrity or inspecting metadata in CI/CD pipelines, use dedicated utilities outlined in How to Inspect PMTiles Metadata with CLI Tools.

Edge Caching & Pipeline Integration

To deploy PMTiles successfully at scale:

  1. Validate Origin Support: Confirm your object storage returns 206 Partial Content and respects Range headers. S3 and GCS support this natively; custom Nginx configurations may require proxy_cache_valid 206 directives.
  2. Optimize Directory Compression: Use zstd level 3–4 for the directory index to balance size and decompression latency.
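  3. Implement Cache Headers: The archive is a single object, so its Cache-Control header applies to every byte range, including the header and directories. Set Cache-Control: public, max-age=31536000, immutable when each build is published under a new, versioned URL; use a shorter TTL (max-age=3600) if the archive is overwritten in place or its metadata updates frequently.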
  4. Monitor Range Request Metrics: Track 206 response ratios and partial download errors to identify proxy misconfigurations.
  5. Automate Archive Packaging: Integrate the pmtiles CLI (available from protomaps/go-pmtiles) into your tile generation pipeline to produce versioned archives, validate checksums, and upload to cloud storage with correct CORS headers.

For teams migrating from SQLite-backed tile storage, reviewing MBTiles Architecture & Limits clarifies where the older format bottlenecked and how PMTiles resolves those constraints through stateless HTTP delivery.
