etl_lib.data_sink.CSVBatchSink module
- class CSVBatchSink(context, task, predecessor, file_path, **kwargs)[source]
Bases:
BatchProcessor
BatchProcessor to write batches of data to a CSV file.
- Parameters:
context (ETLContext)
task (Task)
predecessor (BatchProcessor)
file_path (Path)
- __init__(context, task, predecessor, file_path, **kwargs)[source]
Constructs a new CSVBatchSink.
- Parameters:
context (ETLContext) – etl_lib.core.ETLContext.ETLContext instance.
task (Task) – etl_lib.core.Task.Task instance owning this BatchProcessor.
predecessor (BatchProcessor) – BatchProcessor whose get_batch() function will be called to receive batches to process.
file_path (Path) – Path to the CSV file where data will be written. If the file exists, data will be appended.
**kwargs – Additional arguments passed to csv.DictWriter to allow tuning the CSV creation.
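The append behaviour and the forwarding of **kwargs to csv.DictWriter can be illustrated with the standard library alone. This is a minimal sketch and not the CSVBatchSink implementation: append_rows is a hypothetical helper, and writing the header only when the file is new or empty is an assumption made for this example.

```python
import csv
from pathlib import Path


def append_rows(file_path: Path, rows: list[dict], **kwargs) -> None:
    """Hypothetical helper: append dict rows to a CSV file.

    Extra keyword arguments are forwarded to csv.DictWriter unchanged.
    """
    if not rows:
        return
    # Assumption: emit the header only when the file is new or empty.
    write_header = not file_path.exists() or file_path.stat().st_size == 0
    # Mode "a" appends; newline="" lets the csv module control line endings.
    with file_path.open("a", newline="", encoding="utf-8") as handle:
        writer = csv.DictWriter(handle, fieldnames=list(rows[0].keys()), **kwargs)
        if write_header:
            writer.writeheader()
        writer.writerows(rows)
```

Calling append_rows(path, rows, delimiter=";") twice on the same path would leave one header line followed by the rows of both calls, separated by semicolons.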
- get_batch(batch_size)[source]
Provides a batch of data to the caller.
The batch itself can be requested from the provided predecessor and processed, or generated from other sources.
- Parameters:
batch_size (int) – The max size of the batch the caller expects to receive.
- Return type:
- Returns:
A generator that yields batches.
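The chained get_batch() pattern described above can be sketched with plain Python generators. ListSource and CollectingSink are hypothetical stand-ins, not etl_lib classes, and batches are modelled here as plain lists of dicts, which is an assumption about their shape.

```python
from typing import Iterator


class ListSource:
    """Hypothetical predecessor: yields slices of an in-memory list."""

    def __init__(self, rows: list[dict]):
        self.rows = rows

    def get_batch(self, batch_size: int) -> Iterator[list[dict]]:
        # Yield at most batch_size rows per batch.
        for start in range(0, len(self.rows), batch_size):
            yield self.rows[start:start + batch_size]


class CollectingSink:
    """Hypothetical sink: records each batch, then passes it on."""

    def __init__(self, predecessor: ListSource):
        self.predecessor = predecessor
        self.written: list[dict] = []

    def get_batch(self, batch_size: int) -> Iterator[list[dict]]:
        # Pull batches from the predecessor one at a time.
        for batch in self.predecessor.get_batch(batch_size):
            # A CSV sink would write the rows to its file at this point.
            self.written.extend(batch)
            yield batch


source = ListSource([{"id": i} for i in range(5)])
sink = CollectingSink(source)
batch_sizes = [len(batch) for batch in sink.get_batch(2)]
```

Because get_batch() returns a generator, nothing is pulled from the predecessor or recorded by the sink until the caller iterates over the result.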