Parquet Encoder
Encodes Arrow RecordBatches to Apache Parquet columnar format.
Configuration
| Field | Type | Required | Default | Description |
|---|---|---|---|---|
| codec | string | yes | — | Must be "parquet" |
| compression | string | no | none | Compression: none, lz4, snappy, zstd, gzip |
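The rules in the table can be read as a small validation step. The following is a minimal sketch of that step in Python, not the encoder's actual implementation: `ParquetEncoding`, `parse_encoding`, and `ALLOWED_COMPRESSION` are hypothetical names introduced here for illustration, and it assumes the `encoding` block has already been parsed into a plain dictionary.

```python
from dataclasses import dataclass

# Compression values listed in the configuration table.
ALLOWED_COMPRESSION = ("none", "lz4", "snappy", "zstd", "gzip")


@dataclass(frozen=True)
class ParquetEncoding:
    """Hypothetical in-memory form of the `encoding` block."""
    codec: str
    compression: str = "none"  # table default


def parse_encoding(raw: dict) -> ParquetEncoding:
    """Check a raw `encoding` mapping against the table's rules."""
    # `codec` is required and has exactly one valid value.
    if "codec" not in raw:
        raise ValueError("encoding.codec is required")
    codec = raw["codec"]
    if codec != "parquet":
        raise ValueError(f"encoding.codec must be 'parquet', got {codec!r}")

    # `compression` is optional and falls back to "none".
    compression = raw.get("compression", "none")
    if compression not in ALLOWED_COMPRESSION:
        allowed = ", ".join(ALLOWED_COMPRESSION)
        raise ValueError(
            f"encoding.compression must be one of: {allowed}; got {compression!r}"
        )
    return ParquetEncoding(codec=codec, compression=compression)
```

For instance, `parse_encoding({"codec": "parquet"})` yields an encoding with `compression` set to `"none"`, while an unknown codec or compression value raises `ValueError`.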
Example
sinks:
  my_s3:
    type: s3
    bucket: data-lake
    prefix: events/
    encoding:
      codec: parquet
      compression: zstd
When to Use
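- Data lakes: a well-suited format for object stores such as S3 and GCS
- Analytics queries: the columnar layout allows column pruning and predicate pushdown
- Cost-effective storage: columnar data typically compresses well, giving high compression ratios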