Dropping Unused Attributes to Reduce Tile Size
Unused attributes are dropped by explicitly filtering GeoJSON or database properties before or during vector tile encoding. In production pipelines, this is usually done by passing a strict property allowlist to the tile encoder via Tippecanoe’s -y / --include flag, which keeps only the specified keys in the Protocol Buffers payload. Removing redundant metadata typically shrinks .pbf files by 15–60%, depending on how attribute-heavy the source data is, which lowers CDN egress costs, speeds up tile fetches, and reduces browser memory overhead during WebGL rendering.
How MVT Encoding Amplifies Attribute Bloat
Vector tiles bundle geometry coordinates, property keys, and property values into a compact binary format. Every unused attribute still takes up space in that payload. The Protocol Buffers encoding relies on per-layer tables: each key is stored once in the layer's keys array, each distinct value once in its values array, and features reference both by integer index. If your source data contains 40 properties but your frontend only uses 6, the remaining 34 keys still populate the key table, their distinct values still populate the value table, and every feature carries an extra index pair for each of them. At scale, this inflates the keys, values, and per-feature tags arrays, offsetting the savings gained from geometry simplification.
The Mapbox Vector Tile Specification defines this dictionary-based encoding explicitly: each layer stores its keys and values in separate arrays, and each feature's tags field lists pairs of integer indices into them. Implementing strict Attribute Filtering Rules early in your workflow prevents bloat from propagating into your tile cache and ensures downstream style expressions only reference guaranteed properties.
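To make the dictionary encoding described above concrete, here is a minimal Python sketch that mimics how a layer's key table, value table, and per-feature tags could be built. It is an illustration only, not Tippecanoe's or any encoder's actual code; the function names (encode_layer_tags, drop_unused) and the sample road features are hypothetical.

```python
from typing import Any


def encode_layer_tags(
    features: list[dict[str, Any]],
) -> tuple[list[str], list[Any], list[list[int]]]:
    """Build per-layer key/value tables plus [key_idx, value_idx, ...] tags per feature.

    Assumes property values are hashable scalars (strings, numbers, booleans).
    """
    keys: list[str] = []
    values: list[Any] = []
    key_index: dict[str, int] = {}
    value_index: dict[tuple[type, Any], int] = {}
    all_tags: list[list[int]] = []

    for props in features:
        tags: list[int] = []
        for key, value in props.items():
            if key not in key_index:
                key_index[key] = len(keys)
                keys.append(key)
            # Include the type so that 1, "1", and True stay distinct entries.
            typed_value = (type(value), value)
            if typed_value not in value_index:
                value_index[typed_value] = len(values)
                values.append(value)
            tags.extend([key_index[key], value_index[typed_value]])
        all_tags.append(tags)
    return keys, values, all_tags


def drop_unused(features: list[dict[str, Any]], keep: set[str]) -> list[dict[str, Any]]:
    """Return copies of the property dicts restricted to the allowlisted keys."""
    return [{k: v for k, v in props.items() if k in keep} for props in features]


# Hypothetical sample data: two roads carrying two useful and two unused attributes.
features = [
    {"name": "Main St", "highway": "primary", "legacy_id": "a-001", "created_at": "2019-04-02"},
    {"name": "Oak Ave", "highway": "primary", "legacy_id": "a-002", "created_at": "2019-04-03"},
]

keys_all, values_all, tags_all = encode_layer_tags(features)
keys_kept, values_kept, tags_kept = encode_layer_tags(
    drop_unused(features, {"name", "highway"})
)

print(f"unfiltered: {len(keys_all)} keys, {len(values_all)} values")
print(f"filtered:   {len(keys_kept)} keys, {len(values_kept)} values")
```

In this toy layer the shared "primary" value deduplicates well, while the unused legacy_id and created_at values are unique per feature and so each adds a new entry to the value table. High-cardinality attributes of that kind tend to be the most expensive ones to leave in.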
Core Implementation: Tippecanoe Attribute Inclusion
Tippecanoe’s -y / --include flag accepts a single attribute name and retains only that attribute in the output tiles. Use it repeatedly to build an allowlist:
tippecanoe \
  -o roads.mbtiles \
  -l roads \
  -y name -y highway -y surface -y maxspeed \
  --drop-densest-as-needed \
  --maximum-zoom=14 \
  roads.geojson
Alternatively, use -x / --exclude to drop specific columns when you want to keep most attributes:
tippecanoe \
  -o buildings.mbtiles \
  -l buildings \
  -x legacy_id -x internal_audit_flag -x created_at \
  --maximum-zoom=16 \
  buildings.geojson
When Tippecanoe processes these flags, it performs two critical optimizations:
- Key Dictionary Pruning: Only allowlisted keys are written to the layer’s key table. Excluded keys never receive an index.
- Value Deduplication: Values for dropped keys are never added to the layer's value table, saving significant bytes across millions of features.
Refer to the official Tippecanoe documentation for the full CLI reference.
Production Python Pipeline
The following script sketches a production-oriented automation wrapper around Tippecanoe. It validates the input, passes an explicit -y flag for every allowlisted attribute, and logs the size of the resulting tileset relative to the source GeoJSON.
import logging
import subprocess
from pathlib import Path

logging.basicConfig(level=logging.INFO, format="%(levelname)s: %(message)s")


def build_tile_with_filtered_attributes(
    input_geojson: Path,
    output_mbtiles: Path,
    keep_attributes: list[str],
    max_zoom: int = 14,
    min_zoom: int = 0,
) -> None:
    """Encode vector tiles while dropping unused attributes to reduce tile size."""
    if not input_geojson.exists():
        raise FileNotFoundError(f"Source GeoJSON not found: {input_geojson}")
    if not keep_attributes:
        # With no -y flags at all, Tippecanoe would keep every attribute.
        raise ValueError("keep_attributes must list at least one attribute")

    # Make sure the output directory exists before Tippecanoe writes to it.
    output_mbtiles.parent.mkdir(parents=True, exist_ok=True)

    # Build Tippecanoe command with per-attribute -y flags
    include_flags = []
    for attr in keep_attributes:
        include_flags.extend(["-y", attr])

    cmd = [
        "tippecanoe",
        "--output", str(output_mbtiles),
        "--maximum-zoom", str(max_zoom),
        "--minimum-zoom", str(min_zoom),
        "--drop-densest-as-needed",
        "--force",  # overwrite an existing output file
        *include_flags,
        str(input_geojson),
    ]

    logging.info("Encoding tiles with strict attribute filtering...")
    result = subprocess.run(cmd, capture_output=True, text=True, check=False)
    if result.returncode != 0:
        logging.error("Tippecanoe failed:\n%s", result.stderr)
        raise RuntimeError("Tile encoding failed. Check logs for details.")

    # Compare the source GeoJSON size with the encoded tileset size.
    original_size = input_geojson.stat().st_size
    tile_size = output_mbtiles.stat().st_size
    reduction = ((original_size - tile_size) / original_size) * 100
    logging.info(
        "Encoding complete. Original: %.2f MB | Tiles: %.2f MB | Reduction: %.1f%%",
        original_size / 1_048_576,
        tile_size / 1_048_576,
        reduction,
    )


if __name__ == "__main__":
    build_tile_with_filtered_attributes(
        input_geojson=Path("data/osm_extracts.geojson"),
        output_mbtiles=Path("dist/filtered_tiles.mbtiles"),
        keep_attributes=["name", "highway", "surface", "maxspeed", "oneway"],
        max_zoom=15,
    )
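Filtering can also happen before the data reaches Tippecanoe, as the introduction notes. The following is a minimal sketch of such a pre-filter step, assuming the source is a single GeoJSON FeatureCollection small enough to load into memory; the helper names (filter_properties, prefilter_geojson) are hypothetical, and very large extracts would likely need a streaming parser instead.

```python
import json
from pathlib import Path
from typing import Any


def filter_properties(collection: dict[str, Any], keep_attributes: list[str]) -> int:
    """Strip non-allowlisted properties in place; return how many entries were removed."""
    keep = set(keep_attributes)
    dropped = 0
    for feature in collection.get("features", []):
        # GeoJSON allows "properties": null, so fall back to an empty dict.
        props = feature.get("properties") or {}
        filtered = {k: v for k, v in props.items() if k in keep}
        dropped += len(props) - len(filtered)
        feature["properties"] = filtered
    return dropped


def prefilter_geojson(src: Path, dst: Path, keep_attributes: list[str]) -> int:
    """Read src, keep only allowlisted properties, and write compact GeoJSON to dst."""
    with src.open(encoding="utf-8") as fh:
        collection = json.load(fh)
    dropped = filter_properties(collection, keep_attributes)
    with dst.open("w", encoding="utf-8") as fh:
        # Compact separators avoid writing whitespace the encoder never needs.
        json.dump(collection, fh, separators=(",", ":"))
    return dropped


# Hypothetical in-memory example.
sample = {
    "type": "FeatureCollection",
    "features": [
        {
            "type": "Feature",
            "geometry": {"type": "Point", "coordinates": [0.0, 0.0]},
            "properties": {"name": "Main St", "legacy_id": "a-001", "created_at": "2019-04-02"},
        },
        {"type": "Feature", "geometry": None, "properties": None},
    ],
}
removed = filter_properties(sample, ["name", "highway"])
print(removed, sample["features"][0]["properties"])
```

Pre-filtering shrinks the intermediate file that Tippecanoe has to parse, while keeping the -y flags on the command line still guards against an attribute slipping through.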
Validation & Performance Trade-offs
Attribute pruning delivers measurable gains, but requires validation to prevent rendering regressions:
- Verify with tippecanoe-decode: Run tippecanoe-decode output.mbtiles <z> <x> <y> | jq '[.. | objects | select(.type == "Feature") | .properties | keys[]] | unique' to confirm that only allowlisted keys appear in the decoded tile.
- Network Waterfall Analysis: Compare tile fetch times before and after filtering. A 30–50% reduction in .pbf size should shorten transfer and decompression time roughly in proportion. Time to First Byte (TTFB) depends mostly on server and network latency, so expect less change there.
- WebGL Memory Profiling: Fewer properties mean smaller Feature objects in MapLibre GL JS. Monitor heap allocation in browser dev tools; aggressive filtering can cut per-tile memory by 20–40%.
- Style Expression Safety: Ensure your frontend style layers never reference dropped keys. Use has guards or provide fallback values through coalesce expressions to prevent silent rendering failures.
- Dynamic Data Handling: If your pipeline ingests evolving datasets, version your attribute lists alongside your tile generation jobs. Integrate this logic into Automated Generation Pipelines with Tippecanoe to enforce schema consistency across CI/CD runs.
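Two of the checks above lend themselves to automation. The sketch below assumes the decoded tile is available as parsed JSON (nested FeatureCollections, as in the hypothetical sample) and that styles use expression syntax such as ["get", "key"]. Both helpers are illustrative heuristics with made-up names, not part of Tippecanoe or MapLibre, and the second one does not understand legacy filter syntax.

```python
from typing import Any


def collect_property_keys(node: Any) -> set[str]:
    """Gather property keys from every GeoJSON Feature nested anywhere under node."""
    found: set[str] = set()
    if isinstance(node, dict):
        if node.get("type") == "Feature":
            found.update((node.get("properties") or {}).keys())
        # Recurse into nested collections (tile -> layer -> feature).
        for child in node.get("features", []):
            found |= collect_property_keys(child)
    elif isinstance(node, list):
        for child in node:
            found |= collect_property_keys(child)
    return found


def referenced_keys(expression: Any) -> set[str]:
    """Collect keys read via ["get", key] or ["has", key] inside a style fragment."""
    found: set[str] = set()
    if isinstance(expression, list):
        if (
            len(expression) >= 2
            and expression[0] in ("get", "has")
            and isinstance(expression[1], str)
        ):
            found.add(expression[1])
        for item in expression:
            found |= referenced_keys(item)
    elif isinstance(expression, dict):
        for item in expression.values():
            found |= referenced_keys(item)
    return found


allowlist = {"name", "highway"}

# Hypothetical decoded tile: a tile-level collection wrapping one layer.
decoded_tile = {
    "type": "FeatureCollection",
    "properties": {"zoom": 14, "x": 0, "y": 0},
    "features": [
        {
            "type": "FeatureCollection",
            "properties": {"layer": "roads"},
            "features": [
                {
                    "type": "Feature",
                    "geometry": None,
                    "properties": {"name": "Main St", "highway": "primary"},
                }
            ],
        }
    ],
}

# Hypothetical paint properties; "lanes" is not in the allowlist.
paint = {
    "line-color": ["match", ["get", "highway"], "primary", "#d33", "#888"],
    "line-width": ["coalesce", ["get", "lanes"], 1],
}

unexpected_in_tile = collect_property_keys(decoded_tile) - allowlist
missing_from_tiles = referenced_keys(paint) - allowlist
print("unexpected keys in tile:", sorted(unexpected_in_tile))
print("style keys not in allowlist:", sorted(missing_from_tiles))
```

Running both checks in CI against a handful of representative tiles and the deployed style gives an early warning when the allowlist and the style drift apart.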
Dropping unused attributes to reduce tile size is not a one-time optimization. It is a continuous constraint that keeps your mapping infrastructure lean, cost-effective, and responsive at scale.