etl_lib.data_sink.CypherBatchSink module
- class CypherBatchSink(context, task, predecessor, query, **kwargs)[source]
Bases:
BatchProcessorBatchProcessor to write batches of data to a Neo4j database.
- Parameters:
context (ETLContext)
task (Task)
predecessor (BatchProcessor)
query (str)
- __init__(context, task, predecessor, query, **kwargs)[source]
Constructs a new CypherBatchSink.
- Parameters:
context (
ETLContext) –etl_lib.core.ETLContext.ETLContextinstance.task (
Task) –etl_lib.core.Task.Taskinstance owning this batchProcessor.predecessor (
BatchProcessor) – BatchProcessor whichget_batch()function will be called to receive batches to process.query (
str) – Cypher to write the query to Neo4j. Data will be passed as batch parameter. Therefore, the query should start with a UNWIND $batch AS row.
- get_batch(batch_size)[source]
Provides a batch of data to the caller.
The batch itself could be called and processed from the provided predecessor or generated from other sources.
- Parameters:
max_batch__size – The max size of the batch the caller expects to receive.
batch_size (int)
- Return type:
- Returns
A generator that yields batches.