Parquet Decoder
Decodes Apache Parquet columnar data into Arrow RecordBatches.
Configuration
| Field | Type | Required | Default | Description |
|---|---|---|---|---|
| codec | string | yes | - | Must be "parquet" |
| max_size | integer | no | 10485760 | Maximum message size in bytes (default is 10 MiB) |
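
To illustrate the optional field, here is a minimal sketch of a `decoding` block that raises the size limit. It assumes `max_size` sits next to `codec` under `decoding`, which the table does not state explicitly, and the 50 MiB value is only an illustrative choice.

```yaml
decoding:
  codec: parquet
  # Assumed placement: max_size alongside codec.
  # 52428800 bytes = 50 * 1024 * 1024 (50 MiB), an illustrative value.
  max_size: 52428800
```

Leaving `max_size` out keeps the documented default of 10485760 bytes.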
Example

sources:
  my_s3:
    type: aws_s3
    bucket: data-lake
    prefix: events/
    decoding:
      codec: parquet

When to Use
- Data lake ingestion: Parquet files stored in S3 or GCS
- Batch analytics: columnar layout for efficient queries
- Historical data: compressed columnar storage