Encodes events to Apache Parquet columnar format.

Parquet Encoder

Encodes Arrow RecordBatches to Apache Parquet columnar format.

Configuration

Field        Type    Required  Default  Description
codec        string  yes       -        Must be "parquet"
compression  string  no        none     Compression: none, lz4, snappy, zstd, gzip
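
The rules in the table can be sketched as a small check. This is only an illustration of the documented constraints, not the encoder's actual implementation: the function name `validate_parquet_encoding` and the `ALLOWED_COMPRESSION` set are hypothetical, and the sketch assumes the `encoding` block has already been parsed into a dictionary.

```python
# Hypothetical validator mirroring the configuration table above.
# It is NOT the encoder's real code; names here are illustrative only.
ALLOWED_COMPRESSION = {"none", "lz4", "snappy", "zstd", "gzip"}


def validate_parquet_encoding(encoding: dict) -> dict:
    """Return a normalized copy of an `encoding` block, or raise ValueError."""
    # `codec` is required and must be exactly "parquet".
    codec = encoding.get("codec")
    if codec != "parquet":
        raise ValueError(f'codec must be "parquet", got {codec!r}')

    # `compression` is optional and falls back to the documented default, "none".
    compression = encoding.get("compression", "none")
    if compression not in ALLOWED_COMPRESSION:
        raise ValueError(f"unsupported compression: {compression!r}")

    return {"codec": codec, "compression": compression}
```

Under these assumptions, calling `validate_parquet_encoding({"codec": "parquet"})` fills in `compression` as `"none"`, while an unknown codec or compression value raises `ValueError`.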

Example

sinks:
  my_s3:
    type: s3
    bucket: data-lake
    prefix: events/
    encoding:
      codec: parquet
      compression: zstd

When to Use

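  • Data lakes - Well suited to object storage such as S3 and GCS
  • Analytics queries - Readers can skip unneeded columns (column pruning) and skip data that cannot match a filter (predicate pushdown)
  • Cost-effective storage - The columnar layout typically yields high compression ratios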