Data retention

Raw telemetry is retained for five years in TimescaleDB and backed by three continuous-aggregate tiers for longer-horizon charts.

Storage model

Telemetry is stored in a TimescaleDB hypertable (telemetry). TimescaleDB automatically partitions the hypertable into time-ordered chunks, enabling efficient range scans across large datasets without manual table partitioning.

Partners asking "how far back can I query?" should distinguish raw telemetry from downsampled chart history:

Raw telemetry: retained for 5 years (1825 days).
Continuous aggregates: not covered by the raw retention policy. Aggregated rows are kept until a separate retention policy is added for them.

Continuous aggregates (CAGGs)

Three pre-computed materialized views sit on top of the raw hypertable. They are refreshed on a rolling schedule by TimescaleDB's background worker, so recent data lands in the raw table first and appears in each CAGG after that view's next scheduled refresh: within minutes for telemetry_5m, and up to a day for telemetry_1d.

Each CAGG pre-aggregates avg_value, min_value, max_value, and sample_count per (plant_id, sub_device_id, metric, bucket).
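To make the shape of a CAGG row concrete, here is a minimal TypeScript sketch of the same roll-up done in memory. It is an illustration only, not how TimescaleDB computes the views: the Sample and BucketRow types, the camel-cased field names, and the epoch-aligned bucketStart() helper are all assumptions introduced for this example.

```typescript
// Hypothetical shapes for illustration; the real column types are not shown in this document.
interface Sample {
  plantId: string;
  subDeviceId: string;
  metric: string;
  time: Date;
  value: number;
}

interface BucketRow {
  plantId: string;
  subDeviceId: string;
  metric: string;
  bucket: Date;
  avgValue: number;
  minValue: number;
  maxValue: number;
  sampleCount: number;
}

// Floor a timestamp to the start of its bucket (aligned to the Unix epoch, an assumption).
function bucketStart(time: Date, bucketMs: number): Date {
  return new Date(Math.floor(time.getTime() / bucketMs) * bucketMs);
}

// Group samples by (plantId, subDeviceId, metric, bucket) and compute avg/min/max/count.
function aggregate(samples: Sample[], bucketMs: number): BucketRow[] {
  const acc = new Map<string, { row: BucketRow; sum: number }>();
  for (const s of samples) {
    const bucket = bucketStart(s.time, bucketMs);
    const key = [s.plantId, s.subDeviceId, s.metric, bucket.getTime()].join("|");
    let entry = acc.get(key);
    if (!entry) {
      entry = {
        row: {
          plantId: s.plantId,
          subDeviceId: s.subDeviceId,
          metric: s.metric,
          bucket,
          avgValue: 0,
          minValue: s.value,
          maxValue: s.value,
          sampleCount: 0,
        },
        sum: 0,
      };
      acc.set(key, entry);
    }
    entry.sum += s.value;
    entry.row.sampleCount += 1;
    entry.row.minValue = Math.min(entry.row.minValue, s.value);
    entry.row.maxValue = Math.max(entry.row.maxValue, s.value);
    entry.row.avgValue = entry.sum / entry.row.sampleCount;
  }
  // Map iteration preserves insertion order, so rows come back in first-seen order.
  return Array.from(acc.values()).map((e) => e.row);
}
```

Calling aggregate(samples, 5 * 60 * 1000) would produce rows comparable in shape to telemetry_5m. Keeping sample_count alongside avg_value lets a consumer weight averages correctly when combining several buckets.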

CAGG view	Bucket size	Refresh schedule	Look-back window
`telemetry_5m`	5 minutes	Every 5 minutes	1 hour
`telemetry_1h`	1 hour	Every hour	3 hours
`telemetry_1d`	1 day	Every day	3 days

All three CAGGs roll up from the raw telemetry hypertable directly (not cascaded), so each is independently consistent regardless of whether a lower tier has been refreshed.

Automatic tier selection

TelemetryService.querySubDevice() selects the appropriate source automatically based on the requested time window. You do not need to call different endpoints for different resolutions.

Requested span	Source used	Bucket returned
≤ 2 hours	Raw `telemetry` table (aggregated on-the-fly into 1-minute buckets)	1 minute
> 2 hours – 24 hours	`telemetry_5m`	5 minutes
1 day – 7 days	`telemetry_1h`	1 hour
> 7 days	`telemetry_1d`	1 day

For short windows (≤ 2 hours), the raw table is queried directly with a time_bucket('1 minute', time) aggregation applied at query time. For wider windows, the pre-computed CAGG is used and the bucket column is returned as-is.
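The selection rule in the table can be sketched as a small pure function. This is not the actual body of TelemetryService.querySubDevice(), which is not shown here; selectTier, the Tier type, and the constants are hypothetical names. The sketch also assumes every upper bound is inclusive, so a span of exactly 24 hours goes to telemetry_5m; the table does not state which tier serves that exact boundary.

```typescript
type Source = "telemetry" | "telemetry_5m" | "telemetry_1h" | "telemetry_1d";

interface Tier {
  source: Source;
  bucket: string;
}

const HOUR_MS = 60 * 60 * 1000;
const DAY_MS = 24 * HOUR_MS;

// Pick the data source for a requested window, following the tier table.
// Assumption: upper bounds are inclusive (exactly 24 h -> telemetry_5m, exactly 7 d -> telemetry_1h).
function selectTier(from: Date, to: Date): Tier {
  const spanMs = to.getTime() - from.getTime();
  if (spanMs <= 0) {
    throw new RangeError("`to` must be later than `from`");
  }
  if (spanMs <= 2 * HOUR_MS) {
    // Raw table, aggregated at query time into 1-minute buckets.
    return { source: "telemetry", bucket: "1 minute" };
  }
  if (spanMs <= DAY_MS) {
    return { source: "telemetry_5m", bucket: "5 minutes" };
  }
  if (spanMs <= 7 * DAY_MS) {
    return { source: "telemetry_1h", bucket: "1 hour" };
  }
  return { source: "telemetry_1d", bucket: "1 day" };
}
```

A client can use a helper like this to predict the bucket size it will receive for a given window, for example to size a chart axis before the response arrives.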

Retention policy

As of the CPI meeting on 2026-05-05, Voke applies a TimescaleDB retention policy to the raw telemetry hypertable:

SELECT add_retention_policy('telemetry', INTERVAL '1825 days', if_not_exists => TRUE);

This drops raw chunks once all of their data is older than 1825 days (about five years). CPI owns any archive requirement beyond that window.
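For partners who want to check whether a timestamp is still inside the raw window, the cutoff is simple date arithmetic. The sketch below is illustrative: rawRetentionCutoff and isPastRawRetention are hypothetical helper names, and the only value taken from this document is the 1825-day interval.

```typescript
const MS_PER_DAY = 24 * 60 * 60 * 1000;
const RAW_RETENTION_DAYS = 1825; // interval used by the retention policy

// Earliest instant still covered by the raw retention window at time `now`.
function rawRetentionCutoff(now: Date): Date {
  return new Date(now.getTime() - RAW_RETENTION_DAYS * MS_PER_DAY);
}

// True if `ts` is older than the cutoff, meaning its raw rows may already be gone.
function isPastRawRetention(ts: Date, now: Date): boolean {
  return ts.getTime() < rawRetentionCutoff(now).getTime();
}

// 1825 days is 5 x 365, so it ignores leap days:
// for now = 2031-05-05T00:00:00Z the cutoff is 2026-05-06T00:00:00Z, not 2026-05-05.
```

Because the policy removes whole chunks, rows slightly older than the cutoff can remain queryable until their chunk ages out, so treat the cutoff as the point after which raw data is no longer guaranteed rather than as an exact deletion time.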

Compression

The raw hypertable applies TimescaleDB native compression on chunks older than 7 days:

ALTER TABLE telemetry SET (
  timescaledb.compress,
  timescaledb.compress_segmentby = 'plant_id, sub_device_id, metric',
  timescaledb.compress_orderby   = 'time DESC'
);
SELECT add_compression_policy('telemetry', INTERVAL '7 days');

Compressed chunks are transparent to queries: TimescaleDB decompresses them on read. Compression typically reduces storage by 90–95% for IoT time-series data without affecting query semantics.
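Taken together, the 7-day compression policy and the 1825-day retention policy give each raw chunk a three-stage lifecycle. The sketch below models that lifecycle from a chunk's age. It is a simplification under stated assumptions: expectedChunkState is a hypothetical helper, age is measured from the end of the chunk's time range, and both policies are treated as if they ran instantly, whereas the real background jobs run on a schedule and may lag.

```typescript
type ChunkState = "uncompressed" | "compressed" | "dropped";

const ONE_DAY_MS = 24 * 60 * 60 * 1000;
const COMPRESS_AFTER_DAYS = 7; // from the compression policy
const RETAIN_DAYS = 1825; // from the retention policy

// Expected state of a raw chunk whose time range ends at `chunkEnd`, evaluated at `now`.
// Assumption: policies are applied immediately once a chunk crosses a threshold.
function expectedChunkState(chunkEnd: Date, now: Date): ChunkState {
  const ageDays = (now.getTime() - chunkEnd.getTime()) / ONE_DAY_MS;
  if (ageDays > RETAIN_DAYS) {
    return "dropped";
  }
  if (ageDays > COMPRESS_AFTER_DAYS) {
    return "compressed";
  }
  return "uncompressed";
}
```

For example, at 2026-06-01 a chunk ending 2026-05-30 would be expected to be uncompressed, one ending 2026-05-01 compressed, and one ending 2020-01-01 dropped.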

Implications for partners

Raw history is available for five years. ESM partners doing regulatory or subsidy-related reconciliation beyond that window should export or archive the needed raw data before it ages out.

Use the right query window. Wide scans (months, years) that fall into the telemetry_1d CAGG are fast. Queries that span more than 7 days but request a finer granularity than 1 day will be slower because they bypass the CAGGs and hit compressed raw chunks. Structure analytics queries to match the tier boundaries above.

No indefinite raw retention. Older docs said telemetry was kept forever. That is no longer true for raw rows after migration 1778100000000-AddTelemetryRetentionPolicy.