
Data retention

Raw telemetry is retained for five years in TimescaleDB. Three continuous-aggregate tiers sit alongside it and serve longer-horizon charts.

Storage model

Telemetry is stored in a TimescaleDB hypertable (telemetry). TimescaleDB automatically partitions the hypertable into time-ordered chunks, enabling efficient range scans across large datasets without manual table partitioning.
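
The chunk layout can be inspected from the catalog. The query below is a minimal sketch, assuming direct SQL access and a TimescaleDB 2.x installation where the timescaledb_information.chunks view is available:

```sql
-- List the five oldest chunks of the telemetry hypertable with their time ranges.
SELECT chunk_name, range_start, range_end, is_compressed
FROM timescaledb_information.chunks
WHERE hypertable_name = 'telemetry'
ORDER BY range_start
LIMIT 5;
```

Each row corresponds to one time-ordered partition. A range scan only touches the chunks whose range overlaps the requested window.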

Partners asking "how far back can I query?" should distinguish raw telemetry from downsampled chart history:

  • Raw telemetry: retained for 5 years (1825 days).
  • Continuous aggregates: not covered by the raw retention policy. Aggregated rows are kept until a separate retention policy is added for them.

Continuous aggregates (CAGGs)

Three pre-computed materialized views sit on top of the raw hypertable. TimescaleDB's background worker refreshes them on a rolling schedule, so recent data lands in the raw table first and appears in each CAGG after that view's next scheduled refresh: within minutes for telemetry_5m, and up to a day later for telemetry_1d.

Each CAGG pre-aggregates avg_value, min_value, max_value, and sample_count per (plant_id, sub_device_id, metric, bucket).

CAGG view    | Bucket size | Refresh schedule | Look-back window
telemetry_5m | 5 minutes   | Every 5 minutes  | 1 hour
telemetry_1h | 1 hour      | Every hour       | 3 hours
telemetry_1d | 1 day       | Every day        | 3 days

All three CAGGs roll up from the raw telemetry hypertable directly (not cascaded), so each is independently consistent regardless of whether a lower tier has been refreshed.
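
For orientation, a 5-minute tier of this shape could be declared roughly as follows. This is a sketch, not Voke's actual migration: the raw column name value and the end_offset are assumptions, and mapping the look-back window to start_offset is one plausible reading of the table above. The view name, grouping columns, and output columns come from this page.

```sql
-- Hypothetical definition of the 5-minute tier, built directly on the raw hypertable.
CREATE MATERIALIZED VIEW telemetry_5m
WITH (timescaledb.continuous) AS
SELECT
  plant_id,
  sub_device_id,
  metric,
  time_bucket(INTERVAL '5 minutes', time) AS bucket,
  avg(value) AS avg_value,     -- "value" is an assumed raw column name
  min(value) AS min_value,
  max(value) AS max_value,
  count(*)   AS sample_count
FROM telemetry                 -- raw table, not another CAGG (no cascading)
GROUP BY plant_id, sub_device_id, metric, bucket
WITH NO DATA;

-- Refresh every 5 minutes, re-materializing buckets from the last hour.
SELECT add_continuous_aggregate_policy('telemetry_5m',
  start_offset      => INTERVAL '1 hour',      -- look-back window from the table
  end_offset        => INTERVAL '5 minutes',   -- assumed
  schedule_interval => INTERVAL '5 minutes');
```

The hourly and daily tiers would follow the same pattern with their own bucket size, schedule, and look-back window.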


Automatic tier selection

TelemetryService.querySubDevice() selects the appropriate source automatically based on the requested time window. You do not need to call different endpoints for different resolutions.

Requested span     | Source used                                                       | Bucket returned
≤ 2 hours          | Raw telemetry table (aggregated on-the-fly into 1-minute buckets) | 1 minute
2 hours – 24 hours | telemetry_5m                                                      | 5 minutes
1 day – 7 days     | telemetry_1h                                                      | 1 hour
> 7 days           | telemetry_1d                                                      | 1 day

For short windows (≤ 2 hours), the raw table is queried directly with a time_bucket('1 minute', time) aggregation applied at query time. For wider windows, the pre-computed CAGG is used and the bucket column is returned as-is.
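
The two query shapes can be sketched as follows. These are illustrative queries, not the statements TelemetryService.querySubDevice() actually issues: the raw column name value is an assumption, and $1 to $3 are bind parameters for the plant, sub-device, and metric.

```sql
-- Span of 2 hours or less: aggregate raw rows into 1-minute buckets at query time.
SELECT
  time_bucket('1 minute', time) AS bucket,
  avg(value) AS avg_value,     -- "value" is an assumed raw column name
  min(value) AS min_value,
  max(value) AS max_value,
  count(*)   AS sample_count
FROM telemetry
WHERE plant_id = $1
  AND sub_device_id = $2
  AND metric = $3
  AND time >= now() - INTERVAL '2 hours'
GROUP BY bucket
ORDER BY bucket;

-- Span of 3 days: read the pre-computed hourly buckets as-is.
SELECT bucket, avg_value, min_value, max_value, sample_count
FROM telemetry_1h
WHERE plant_id = $1
  AND sub_device_id = $2
  AND metric = $3
  AND bucket >= now() - INTERVAL '3 days'
ORDER BY bucket;
```

Both queries return the same column set, which is what lets a single endpoint serve every resolution.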


Retention policy

As of the CPI meeting on 2026-05-05, Voke applies a TimescaleDB retention policy to the raw telemetry hypertable:

SELECT add_retention_policy('telemetry', INTERVAL '1825 days', if_not_exists => TRUE);

This drops raw chunks older than 5 years. CPI owns any archive requirement beyond that window.
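
To confirm that the policy is registered, the background job catalog can be queried. A minimal sketch, assuming direct SQL access and TimescaleDB 2.x:

```sql
-- Show the retention job attached to the telemetry hypertable.
-- The config column is JSON and includes the drop_after interval.
SELECT job_id, schedule_interval, config, next_start
FROM timescaledb_information.jobs
WHERE hypertable_name = 'telemetry'
  AND proc_name = 'policy_retention';
```

An empty result would suggest the migration has not been applied to that database.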


Compression

The raw hypertable applies TimescaleDB native compression on chunks older than 7 days:

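-- Enable native compression: one segment per series, newest rows first within a segment.
ALTER TABLE telemetry SET (
  timescaledb.compress,
  timescaledb.compress_segmentby = 'plant_id, sub_device_id, metric',
  timescaledb.compress_orderby   = 'time DESC'
);
-- Compress chunks once they are older than 7 days.
SELECT add_compression_policy('telemetry', INTERVAL '7 days');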

Compressed chunks are fully transparent to queries: TimescaleDB decompresses them on read, so query semantics do not change. For typical IoT time-series, compression reduces storage by roughly 90–95%.
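
The ratio achieved on a given deployment can be measured rather than assumed. A minimal sketch using TimescaleDB's hypertable_compression_stats function, assuming direct SQL access (the output aliases are illustrative):

```sql
-- Compare total on-disk size before and after compression for the telemetry hypertable.
SELECT
  pg_size_pretty(before_compression_total_bytes) AS size_before,
  pg_size_pretty(after_compression_total_bytes)  AS size_after,
  round(100 * (1 - after_compression_total_bytes::numeric
                   / nullif(before_compression_total_bytes, 0)), 1) AS saved_pct
FROM hypertable_compression_stats('telemetry');
```

The figures cover compressed chunks only, so chunks from the most recent 7 days do not contribute.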


Implications for partners

Raw history is available for five years. ESM partners doing regulatory or subsidy-related reconciliation beyond that window should export or archive the needed raw data before it ages out.

Use the right query window. Wide scans (months, years) that fall into the telemetry_1d CAGG are fast. Queries that span more than 7 days but request a finer granularity than 1 day will be slower because they bypass the CAGGs and hit compressed raw chunks. Structure analytics queries to match the tier boundaries above.

No indefinite raw retention. Older docs said telemetry was kept forever. That is no longer true for raw rows after migration 1778100000000-AddTelemetryRetentionPolicy.


See also

  • PLC / Snapshot envelope — how raw telemetry rows are created from the snapshot envelope
  • Signals — storage strategies (STATE_CHANGE vs TICK) that control write volume into the hypertable
