etl_lib.task.data_loading.SQLLoad2Neo4jTask module
- class SQLLoad2Neo4jTask(context, batch_size=5000)
Bases: Task
Load the output of the specified SQL query to Neo4j.
Uses BatchProcessors to read and write data. Subclasses must implement the methods returning the SQL and Cypher queries.
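Example usage (from the MusicBrainz example):

    # Import path follows the module name given at the top of this page.
    from etl_lib.task.data_loading.SQLLoad2Neo4jTask import SQLLoad2Neo4jTask


    class LoadArtistCreditTask(SQLLoad2Neo4jTask):

        def _sql_query(self) -> str:
            # Source query; the column aliases become the keys of each row.
            return """
                SELECT ac.id AS artist_credit_id,
                       ac.name AS credit_name
                FROM artist_credit ac;
            """

        def _cypher_query(self) -> str:
            # Each batch of rows is passed as the $batch parameter.
            return """
                UNWIND $batch AS row
                MERGE (ac:ArtistCredit {id: row.artist_credit_id})
                SET ac.name = row.credit_name
            """

        def _count_query(self) -> str | None:
            return "SELECT COUNT(*) FROM artist_credit;"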
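To make the batching idea concrete, the following is a minimal sketch of how SQL rows could be grouped into the lists that a Cypher query receives as `$batch`. It is not the library's BatchProcessor implementation: the helper `iter_batches` is hypothetical, and an in-memory SQLite table stands in for the real source database.

```python
import sqlite3
from typing import Dict, Iterator, List


def iter_batches(conn: sqlite3.Connection, sql: str,
                 batch_size: int) -> Iterator[List[Dict]]:
    """Hypothetical helper: yield query results as lists of row dicts."""
    cursor = conn.execute(sql)
    # Column aliases from the SELECT become the dict keys of each row.
    columns = [description[0] for description in cursor.description]
    while True:
        rows = cursor.fetchmany(batch_size)
        if not rows:
            break
        yield [dict(zip(columns, row)) for row in rows]


# Stand-in source table (assumption: the real task reads from the
# database configured in the ETL context, not from SQLite).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE artist_credit (id INTEGER PRIMARY KEY, name TEXT)")
conn.executemany(
    "INSERT INTO artist_credit (id, name) VALUES (?, ?)",
    [(i, f"credit {i}") for i in range(1, 8)],
)

sql = """
    SELECT ac.id AS artist_credit_id, ac.name AS credit_name
    FROM artist_credit ac
    ORDER BY ac.id
"""

# Each element of `batches` is what one Cypher execution would
# receive as the $batch parameter.
batches = list(iter_batches(conn, sql, batch_size=3))
batch_sizes = [len(batch) for batch in batches]
```

With seven source rows and a batch size of 3, the rows are split into three batches, the last one partial. A larger `batch_size` (the class default is 5000) means fewer round trips to Neo4j at the cost of larger transactions.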
- Parameters:
context (ETLContext)
batch_size (int)
- __init__(context, batch_size=5000)
Construct a Task object.
- Parameters:
context (ETLContext) – ETLContext instance. Will be available to subclasses.
batch_size (int)
- run_internal()
The place to provide the logic to be performed.
The base class provides all the housekeeping and reporting, so implementations do not need to take care of them. Implementations should not catch exceptions; the base class handles them.
- Parameters:
kwargs – will be passed to run_internal
- Return type:
TaskReturn
- Returns:
An instance of TaskReturn.
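The division of labor described for run_internal() can be illustrated with a small template-method sketch. All names here (`SketchTask`, `SketchResult`, `run`) are hypothetical and do not reflect the actual Task or TaskReturn classes; the sketch only shows why an implementation can let exceptions propagate when a base class wraps the call.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class SketchResult:
    """Hypothetical stand-in for a task result object."""
    success: bool = True
    summary: Dict[str, int] = field(default_factory=dict)
    error: Optional[str] = None


class SketchTask:
    """Hypothetical base class: owns reporting and error handling."""

    def __init__(self) -> None:
        self.log: List[str] = []

    def run(self) -> SketchResult:
        # Housekeeping lives here, not in the subclasses.
        self.log.append(f"start {type(self).__name__}")
        try:
            result = self.run_internal()
        except Exception as exc:
            # Exceptions raised by the implementation are converted
            # into a failed result in one central place.
            result = SketchResult(success=False, error=str(exc))
        self.log.append(f"end {type(self).__name__} success={result.success}")
        return result

    def run_internal(self) -> SketchResult:
        raise NotImplementedError


class OkTask(SketchTask):
    def run_internal(self) -> SketchResult:
        return SketchResult(summary={"rows": 3})


class FailingTask(SketchTask):
    def run_internal(self) -> SketchResult:
        # No try/except here: the base class deals with the error.
        raise ValueError("bad input")


ok = OkTask().run()
failed = FailingTask().run()
```

Keeping error handling in one place means every task reports failures the same way, and implementations stay focused on the work itself.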