PMTiles Specification Deep Dive
Modern mapping infrastructure has shifted decisively away from directory-based tile caches toward single-file archives optimized for cloud storage and HTTP range requests. The PMTiles format represents a foundational evolution in how geospatial data is packaged, distributed, and consumed at scale. This page examines the binary architecture, indexing strategy, and implementation patterns required to integrate the format into automated vector tile generation and caching pipelines. For teams building distributed mapping platforms, understanding the low-level structure is essential for optimizing delivery latency, reducing storage overhead, and maintaining compatibility with modern tile servers.
Prerequisites & Pipeline Requirements
Before implementing PMTiles in production, ensure your engineering pipeline meets these baseline requirements:
- Familiarity with HTTP/1.1 and HTTP/2 range request semantics
- Working knowledge of tile coordinate systems (EPSG:3857, TMS/XYZ grid)
- Proficiency in Python or Node.js for binary stream manipulation
- Access to a cloud storage backend supporting byte-range fetches (S3, GCS, Cloudflare R2, or equivalent)
- Baseline understanding of Vector Tile Architecture & Format Fundamentals to contextualize how PMTiles encapsulates protobuf-encoded geometry
The complete binary layout, versioning rules, and compression standards are formally documented in the official PMTiles v3 Specification.
Core Binary Architecture
PMTiles is engineered around five contiguous sections: a fixed-length header, a root directory, JSON metadata, optional leaf directories, and the tile data payload. Unlike legacy SQLite-based archives, PMTiles relies entirely on sequential byte offsets and HTTP range requests, eliminating database query overhead and locking contention.
Fixed-Length Header (127 Bytes)
The header occupies exactly 127 bytes at the beginning of the archive. The byte layout is:
| Offset | Length | Field |
|---|---|---|
| 0–6 | 7 bytes | Magic bytes: ASCII PMTiles |
| 7 | 1 byte | Version (currently 3) |
| 8–15 | 8 bytes | Root directory offset (little-endian uint64) |
| 16–23 | 8 bytes | Root directory length (little-endian uint64) |
| 24–31 | 8 bytes | JSON metadata offset (little-endian uint64) |
| 32–39 | 8 bytes | JSON metadata length (little-endian uint64) |
| 40–47 | 8 bytes | Leaf directories offset (little-endian uint64) |
| 48–55 | 8 bytes | Leaf directories length (little-endian uint64) |
| 56–63 | 8 bytes | Tile data offset (little-endian uint64) |
| 64–71 | 8 bytes | Tile data length (little-endian uint64) |
| 72–79 | 8 bytes | Number of addressed tiles (uint64) |
| 80–87 | 8 bytes | Number of tile entries (uint64) |
| 88–95 | 8 bytes | Number of tile contents (uint64) |
| 96 | 1 byte | Clustered flag |
| 97 | 1 byte | Internal compression |
| 98 | 1 byte | Tile compression |
| 99 | 1 byte | Tile type |
| 100 | 1 byte | Min zoom |
| 101 | 1 byte | Max zoom |
| 102–109 | 8 bytes | Min position (lon/lat as E7 int32 pairs) |
| 110–117 | 8 bytes | Max position |
| 118 | 1 byte | Center zoom |
| 119–126 | 8 bytes | Center position |
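Because the header size is fixed, clients can fetch exactly 127 bytes to bootstrap the entire parsing process. The v3 spec also requires the root directory to sit within the first 16,384 bytes (16 KiB) of the archive, so a latency-sensitive client can retrieve the header and root directory in a single request.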
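The position fields are the least self-explanatory part of the table. A minimal sketch of decoding them, assuming each position stores longitude first and latitude second as little-endian signed 32-bit integers scaled by 10,000,000; the function name `decode_bounds` and the returned dictionary keys are illustrative and not part of any library:

```python
import struct

E7 = 10_000_000  # positions are stored as degrees multiplied by 10^7


def decode_bounds(header: bytes) -> dict:
    """Decode the position fields of a 127-byte PMTiles v3 header."""
    if len(header) < 127:
        raise ValueError("header must be at least 127 bytes")
    # Each position is (longitude, latitude) as two little-endian int32 values.
    min_lon, min_lat = struct.unpack_from("<ii", header, 102)
    max_lon, max_lat = struct.unpack_from("<ii", header, 110)
    center_lon, center_lat = struct.unpack_from("<ii", header, 119)
    return {
        "min": (min_lon / E7, min_lat / E7),
        "max": (max_lon / E7, max_lat / E7),
        "center_zoom": header[118],
        "center": (center_lon / E7, center_lat / E7),
    }
```

The result can be passed straight to a map client as a bounding box and initial view.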
Tile Type & Compression Enums
The tile_type and tile_compression header fields use compact integer enums:
Tile Type:
| Value | Meaning |
|---|---|
| 0 | Unknown / Other |
| 1 | MVT Vector Tile |
| 2 | PNG |
| 3 | JPEG |
| 4 | WebP |
| 5 | AVIF |
Compression:
| Value | Meaning |
|---|---|
| 0 | Unknown |
| 1 | None |
| 2 | gzip |
| 3 | Brotli |
| 4 | Zstandard (zstd) |
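In code, the two enums map naturally onto `IntEnum` classes, which turn raw header bytes into readable names and reject out-of-range values. This is a sketch; the class names and the `describe` helper are illustrative choices, not identifiers from any PMTiles library:

```python
from enum import IntEnum


class TileType(IntEnum):
    UNKNOWN = 0
    MVT = 1
    PNG = 2
    JPEG = 3
    WEBP = 4
    AVIF = 5


class Compression(IntEnum):
    UNKNOWN = 0
    NONE = 1
    GZIP = 2
    BROTLI = 3
    ZSTD = 4


def describe(tile_type: int, compression: int) -> str:
    """Render raw header values as text; raises ValueError for undefined values."""
    return f"{TileType(tile_type).name} tiles, {Compression(compression).name} compression"
```

For example, `describe(1, 2)` yields `"MVT tiles, GZIP compression"`.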
Directory Index & Hilbert Curve Tile IDs
The directory acts as a spatial index, mapping tile coordinates to precise byte offsets and lengths within the payload section. PMTiles assigns each tile a unique integer ID: zoom levels are numbered in order starting from zoom 0, and within a zoom level tiles follow a Hilbert curve, a space-filling curve that maps 2D coordinates to a 1D sequence while preserving spatial locality. Tiles adjacent on the Hilbert curve tend to be geographically close, so when tile data is stored in ID order, a viewport’s worth of tiles can often be fetched with fewer, larger byte-range requests rather than many scattered fetches.
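The ID computation can be sketched as follows. It assumes IDs start at 0 for the single zoom-0 tile and that each zoom level's IDs follow those of all lower levels; the Hilbert step is the standard iterative xy-to-distance algorithm, and the function name `zxy_to_tile_id` is illustrative:

```python
def zxy_to_tile_id(z: int, x: int, y: int) -> int:
    """Convert a z/x/y tile coordinate to a Hilbert-ordered tile ID."""
    if not 0 <= z <= 31:
        raise ValueError("zoom must be between 0 and 31")
    n = 1 << z  # tiles per axis at this zoom
    if not (0 <= x < n and 0 <= y < n):
        raise ValueError("x/y outside the grid for this zoom")
    # All tiles of lower zooms come first: 4^0 + 4^1 + ... + 4^(z-1).
    base = (4**z - 1) // 3
    d = 0
    s = n >> 1
    while s > 0:
        rx = 1 if x & s else 0
        ry = 1 if y & s else 0
        d += s * s * ((3 * rx) ^ ry)
        # Rotate or flip the quadrant so the curve stays continuous.
        if ry == 0:
            if rx == 1:
                x = n - 1 - x
                y = n - 1 - y
            x, y = y, x
        s >>= 1
    return base + d
```

At zoom 1 this visits the four tiles in the order (0,0), (0,1), (1,1), (1,0), giving IDs 1 through 4.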
The directory entries are varint-encoded and compressed (typically with zstd or gzip), reducing index size significantly for dense tilesets. When a client requests a specific tile, it:
- Fetches the 127-byte header
- Issues a targeted range request for the root directory
- Decompresses the directory and resolves the target tile’s offset
- Requests that exact byte range for the tile payload
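The decoding step can be sketched as below. It operates on a directory that has already been decompressed, and it assumes the column layout as I read it in the v3 spec: an entry count, then all tile ID deltas, all run lengths, all lengths, and all offsets, where a stored offset of 0 means "directly after the previous entry" and any other value is the real offset plus one. The `Entry` type and function names are illustrative:

```python
from typing import NamedTuple


class Entry(NamedTuple):
    tile_id: int
    offset: int
    length: int
    run_length: int


def read_varint(buf: bytes, pos: int) -> tuple[int, int]:
    """Decode one unsigned LEB128 varint; return (value, next position)."""
    result = 0
    shift = 0
    while True:
        byte = buf[pos]
        pos += 1
        result |= (byte & 0x7F) << shift
        if not byte & 0x80:
            return result, pos
        shift += 7


def deserialize_directory(raw: bytes) -> list[Entry]:
    """Decode an already-decompressed directory into a list of entries."""
    count, pos = read_varint(raw, 0)
    columns = []
    for _ in range(4):  # id deltas, run lengths, lengths, offsets
        column = []
        for _ in range(count):
            value, pos = read_varint(raw, pos)
            column.append(value)
        columns.append(column)
    id_deltas, run_lengths, lengths, raw_offsets = columns

    entries: list[Entry] = []
    tile_id = 0
    for i in range(count):
        tile_id += id_deltas[i]  # IDs are delta-encoded
        if raw_offsets[i] == 0 and i > 0:
            offset = entries[i - 1].offset + entries[i - 1].length
        else:
            offset = raw_offsets[i] - 1
        entries.append(Entry(tile_id, offset, lengths[i], run_lengths[i]))
    return entries
```

Storing each field as its own column keeps similar values next to each other, which is what lets the general-purpose compressor shrink the index so effectively.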
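For very large archives, a two-level directory structure (root + leaf directories) avoids loading a single enormous index into memory. A directory entry whose run length is 0 points to a leaf directory rather than to tile data, so the client fetches and decompresses that leaf and resolves the tile within it. Entry offsets are relative to the start of the section they point into (the tile data section for tiles, the leaf directories section for leaf pointers), not to the start of the file.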
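A lookup over root and leaf directories might look like the sketch below. It assumes entries are sorted by tile ID, that a run length of 0 marks a leaf pointer, and that a run length of N covers N consecutive tile IDs. `load_leaf` is a hypothetical callable that fetches, decompresses, and decodes the leaf directory at a given offset and length:

```python
from bisect import bisect_right
from typing import Callable, NamedTuple, Optional


class Entry(NamedTuple):
    tile_id: int
    offset: int
    length: int
    run_length: int


def find_entry(entries: list[Entry], tile_id: int) -> Optional[Entry]:
    """Return the entry covering tile_id, or None if the tile is absent."""
    # Rebuilding the ID list per call keeps the sketch short; cache it in real code.
    ids = [e.tile_id for e in entries]
    i = bisect_right(ids, tile_id) - 1  # last entry with tile_id <= target
    if i < 0:
        return None
    entry = entries[i]
    if entry.run_length == 0:
        return entry  # leaf pointer: the caller must descend
    if tile_id - entry.tile_id < entry.run_length:
        return entry  # target falls inside this entry's run
    return None


def resolve(
    tile_id: int,
    root: list[Entry],
    load_leaf: Callable[[int, int], list[Entry]],
    max_depth: int = 4,
) -> Optional[tuple[int, int]]:
    """Return (offset, length) relative to the tile data section, or None."""
    entries = root
    for _ in range(max_depth):
        entry = find_entry(entries, tile_id)
        if entry is None:
            return None
        if entry.run_length > 0:
            return entry.offset, entry.length
        entries = load_leaf(entry.offset, entry.length)
    raise ValueError("directory nesting deeper than expected")
```

The `max_depth` guard is a defensive choice for this sketch, so a corrupt archive whose leaf pointers form a loop cannot hang the client.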
Tile Payload & Compression
Individual tiles are stored as contiguous raw byte sequences. For vector tiles, the payload strictly adheres to the Mapbox Vector Tile Specification v2.1. The container does not interpret tile contents, but the header declares a single tile type and a single tile compression for the whole archive, so every tile in an archive is expected to share the same format and codec.
When evaluating delivery strategies, weigh the storage savings of compressed payloads against the CPU overhead of on-the-fly decompression — a tradeoff also relevant in Vector vs Raster Tile Tradeoffs.
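On the client side, the decompression step is driven by the `tile_compression` value from the header. A minimal sketch using only the Python standard library follows; Brotli and Zstandard need third-party decoders, so this example deliberately refuses them rather than assuming a particular package, and the function name is illustrative:

```python
import gzip


def decompress_payload(data: bytes, compression: int) -> bytes:
    """Decompress a tile payload according to the header's compression enum."""
    if compression == 1:  # None: bytes are stored as-is
        return data
    if compression == 2:  # gzip
        return gzip.decompress(data)
    if compression in (3, 4):  # Brotli, Zstandard
        raise NotImplementedError("Brotli and Zstandard need a third-party decoder")
    raise ValueError(f"Unknown compression value: {compression}")
```

When tiles are served to browsers, an alternative is to pass the bytes through untouched with a matching `Content-Encoding` header and let the client decompress.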
HTTP Range Request Workflow
The operational efficiency of PMTiles hinges on HTTP byte-range requests. Clients request specific byte ranges using the `Range: bytes=start-end` header, where both positions are inclusive, and a server that honors the request responds with 206 Partial Content. Range requests were defined in RFC 7233, which has since been consolidated into RFC 9110.
Many CDNs can cache range responses, but how they do so varies by provider: some cache each requested range separately, while others fetch the whole object or fixed-size slices and serve ranges from that copy. Check your provider's behavior before relying on it. Ensure your origin server returns Accept-Ranges: bytes and does not buffer full responses. Proxy layers that ignore Range headers negate the format’s primary advantage.
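A client can guard against proxies that ignore `Range` by checking the response before trusting its body. The sketch below works on a status code and a plain header dictionary so it stays independent of any HTTP library. It assumes the header key is spelled exactly `Content-Range`, and the function name is illustrative:

```python
import re

_CONTENT_RANGE = re.compile(r"^bytes (\d+)-(\d+)/(\d+|\*)$")


def check_range_response(status: int, headers: dict, start: int, end: int) -> None:
    """Raise RuntimeError unless the response covers exactly bytes start..end."""
    if status != 206:
        # A 200 usually means the Range header was dropped or ignored.
        raise RuntimeError(f"Expected 206 Partial Content, got {status}")
    match = _CONTENT_RANGE.match(headers.get("Content-Range", "").strip())
    if match is None:
        raise RuntimeError("Missing or malformed Content-Range header")
    got = (int(match.group(1)), int(match.group(2)))
    if got != (start, end):
        raise RuntimeError(f"Requested {start}-{end}, server returned {got[0]}-{got[1]}")
```

The exact-match check is intentionally strict. A server may legitimately return a shorter range when `end` lies past the end of the file, so relax the comparison if your client requests ranges speculatively.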
Implementation Patterns for Automation
Python Stream Parsing
Python’s struct module and requests library provide a lightweight foundation for header parsing. The correct byte offsets per the v3 spec are shown below.
```python
import struct

import requests
from requests.adapters import HTTPAdapter
from urllib3.util.retry import Retry


def get_session() -> requests.Session:
    """Configure a resilient HTTP session with automatic retries."""
    session = requests.Session()
    retry = Retry(total=3, backoff_factor=0.5, status_forcelist=[500, 502, 503, 504])
    adapter = HTTPAdapter(max_retries=retry)
    session.mount("https://", adapter)
    session.mount("http://", adapter)
    return session


def fetch_pmtiles_header(url: str, session: requests.Session) -> dict:
    """Fetch and parse the 127-byte PMTiles v3 header."""
    # stream=True defers the body download until the status has been checked.
    response = session.get(url, headers={"Range": "bytes=0-126"}, stream=True, timeout=30)
    response.raise_for_status()
    if response.status_code != 206:
        # A 200 means the server ignored Range and is sending the whole archive.
        response.close()
        raise RuntimeError(f"Expected 206 Partial Content, got {response.status_code}")
    h = response.content
    if len(h) != 127:
        raise ValueError(f"Expected 127 header bytes, got {len(h)}")

    if h[:7] != b"PMTiles":
        raise ValueError("Invalid PMTiles archive: missing magic bytes")
    version = h[7]
    if version != 3:
        raise NotImplementedError(f"Unsupported PMTiles version: {version}")

    # Offsets per PMTiles v3 spec (https://github.com/protomaps/PMTiles/blob/main/spec/v3/spec.md)
    dir_offset, dir_length, meta_offset, meta_length = struct.unpack_from("<4Q", h, 8)
    return {
        "version": version,
        "dir_offset": dir_offset,
        "dir_length": dir_length,
        "meta_offset": meta_offset,
        "meta_length": meta_length,
        "tile_compression": h[98],
        "tile_type": h[99],
        "min_zoom": h[100],
        "max_zoom": h[101],
    }


def fetch_tile_range(url: str, offset: int, length: int, session: requests.Session) -> bytes:
    """Retrieve a specific tile payload via HTTP range request."""
    end = offset + length - 1  # Range end positions are inclusive
    headers = {"Range": f"bytes={offset}-{end}"}
    response = session.get(url, headers=headers, stream=True, timeout=30)
    response.raise_for_status()
    if response.status_code != 206:
        response.close()
        raise RuntimeError(f"Expected 206 Partial Content, got {response.status_code}")
    return response.content
```
Node.js Buffer Handling
In Node.js, use DataView for little-endian 64-bit reads. The byte offsets below match the v3 spec.
```javascript
async function fetchPMTilesHeader(url) {
  const res = await fetch(url, { headers: { Range: "bytes=0-126" } });
  // res.ok is also true for 200, which means the server ignored the Range header.
  if (res.status !== 206) throw new Error(`Header fetch failed: ${res.status}`);

  const buffer = await res.arrayBuffer();
  if (buffer.byteLength !== 127) throw new Error("Unexpected header length");
  const view = new DataView(buffer);

  const magic = new TextDecoder().decode(new Uint8Array(buffer, 0, 7));
  if (magic !== "PMTiles") throw new Error("Invalid archive magic bytes");
  const version = view.getUint8(7);
  if (version !== 3) throw new Error(`Unsupported version: ${version}`);

  // Byte offsets per PMTiles v3 spec.
  // Number() is exact only up to 2^53 - 1, which is ample for realistic archive sizes.
  const dirOffset = Number(view.getBigUint64(8, true));
  const dirLength = Number(view.getBigUint64(16, true));
  const metaOffset = Number(view.getBigUint64(24, true));
  const metaLength = Number(view.getBigUint64(32, true));
  const tileCompress = view.getUint8(98);
  const tileType = view.getUint8(99);
  const minZoom = view.getUint8(100);
  const maxZoom = view.getUint8(101);

  return { version, dirOffset, dirLength, metaOffset, metaLength,
           tileType, tileCompress, minZoom, maxZoom };
}

async function fetchTileRange(url, offset, length) {
  const end = offset + length - 1; // Range end positions are inclusive
  const res = await fetch(url, { headers: { Range: `bytes=${offset}-${end}` } });
  if (res.status !== 206) throw new Error(`Tile fetch failed: ${res.status}`);
  return new Uint8Array(await res.arrayBuffer());
}
```
Both implementations validate magic bytes and version before proceeding, preventing silent corruption in automated workflows.
Metadata, Clustering, and Validation
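The JSON metadata block (located at meta_offset) is a UTF-8 JSON object, compressed with the header's internal compression, that stores descriptive tileset properties such as name, description, and attribution. In v3, the zoom range and bounding coordinates live in the header rather than in the metadata. For MVT archives, this block must include a vector_layers array per the TileJSON 3.0 specification.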
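A validation step suitable for a CI check can be sketched as follows. It assumes the metadata bytes are compressed with the codec named by the header's internal compression field, and it handles only the uncompressed and gzip cases from the standard library; the function name `parse_metadata` is illustrative:

```python
import gzip
import json


def parse_metadata(raw: bytes, internal_compression: int, tile_type: int) -> dict:
    """Decode the metadata block and enforce the vector_layers rule for MVT."""
    if internal_compression == 2:  # gzip
        raw = gzip.decompress(raw)
    elif internal_compression != 1:  # anything other than "None"
        raise NotImplementedError(f"Unsupported internal compression: {internal_compression}")
    metadata = json.loads(raw.decode("utf-8"))
    if not isinstance(metadata, dict):
        raise ValueError("Metadata must be a JSON object")
    if tile_type == 1 and "vector_layers" not in metadata:  # 1 = MVT
        raise ValueError("MVT archive is missing vector_layers")
    return metadata
```

Failing fast here catches archives that render as blank maps because the client has no layer list to style against.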
The clustered flag in the header indicates whether the tile data section is ordered by Hilbert tile ID. Clustered archives enable clients to fetch viewport tiles with fewer, larger HTTP requests, which is particularly valuable on high-latency mobile networks.
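The benefit of clustering shows up when a client merges neighboring tile ranges before fetching. The sketch below coalesces `(offset, length)` pairs that touch, overlap, or sit within a configurable gap. The `max_gap` parameter is an illustrative tuning knob and not something the format defines:

```python
def coalesce_ranges(
    ranges: list[tuple[int, int]], max_gap: int = 0
) -> list[tuple[int, int]]:
    """Merge (offset, length) pairs separated by at most max_gap bytes."""
    merged: list[tuple[int, int]] = []
    for offset, length in sorted(ranges):
        if merged:
            last_offset, last_length = merged[-1]
            last_end = last_offset + last_length
            if offset <= last_end + max_gap:
                # Extend the previous range; max() handles overlaps and duplicates.
                new_end = max(last_end, offset + length)
                merged[-1] = (last_offset, new_end - last_offset)
                continue
        merged.append((offset, length))
    return merged
```

A larger `max_gap` trades some wasted bytes for fewer round trips, which tends to pay off on high-latency links.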
When troubleshooting archive integrity or inspecting metadata in CI/CD pipelines, use dedicated utilities outlined in How to Inspect PMTiles Metadata with CLI Tools.
Edge Caching & Pipeline Integration
To deploy PMTiles successfully at scale:
- Validate Origin Support: Confirm your object storage returns `206 Partial Content` and respects `Range` headers. S3 and GCS support this natively; custom Nginx configurations may require `proxy_cache_valid 206` directives.
- Optimize Directory Compression: Use zstd level 3–4 for the directory index to balance size and decompression latency.
- Implement Cache Headers: Set `Cache-Control: public, max-age=31536000, immutable` for tile byte ranges; use shorter TTLs (`max-age=3600`) for the header/directory if metadata updates frequently.
- Monitor Range Request Metrics: Track `206` response ratios and partial download errors to identify proxy misconfigurations.
- Automate Archive Packaging: Integrate the `pmtiles` CLI (available from protomaps/go-pmtiles) into your tile generation pipeline to produce versioned archives, validate checksums, and upload to cloud storage with correct CORS headers.
For teams migrating from SQLite-backed tile storage, reviewing MBTiles Architecture & Limits clarifies where the older format bottlenecked and how PMTiles resolves those constraints through stateless HTTP delivery.