etl_lib.task.data_loading.SQLLoad2Neo4jTask module

class SQLLoad2Neo4jTask(context, batch_size=5000)

Bases: Task

Load the output of the specified SQL query to Neo4j.

Uses BatchProcessors to read the rows from the SQL source and write them to Neo4j. Subclasses must implement the methods that return the SQL and Cypher queries (_sql_query() and _cypher_query() in the example below).

Example usage (from the MusicBrainz example):

from etl_lib.task.data_loading.SQLLoad2Neo4jTask import SQLLoad2Neo4jTask


class LoadArtistCreditTask(SQLLoad2Neo4jTask):
    def _sql_query(self) -> str:
        return """
            SELECT ac.id AS artist_credit_id, ac.name AS credit_name
            FROM artist_credit ac;
            """

    def _cypher_query(self) -> str:
        return """
               UNWIND $batch AS row
               MERGE (ac:ArtistCredit {id: row.artist_credit_id})
               SET ac.name = row.credit_name
              """

    def _count_query(self) -> str | None:
        return "SELECT COUNT(*) FROM artist_credit;"
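
The batching that the example relies on can be sketched with the standard library alone. This is a minimal illustration, not etl_lib's BatchProcessor implementation: it assumes an in-memory sqlite3 database in place of the real SQL source, and read_batches is a hypothetical helper. Each yielded list has the shape that a Cypher query starting with UNWIND $batch AS row would receive as its $batch parameter.

```python
import sqlite3
from typing import Iterator


def read_batches(conn: sqlite3.Connection, sql: str,
                 batch_size: int = 5000) -> Iterator[list[dict]]:
    """Hypothetical helper: yield query rows as lists of dicts,
    with at most batch_size rows per list."""
    cursor = conn.execute(sql)
    # Column aliases from the SELECT become the dict keys (row.artist_credit_id, ...).
    columns = [col[0] for col in cursor.description]
    while True:
        rows = cursor.fetchmany(batch_size)
        if not rows:
            break
        yield [dict(zip(columns, row)) for row in rows]


# Stand-in for the MusicBrainz source table (assumption: sqlite3 instead of the real database).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE artist_credit (id INTEGER PRIMARY KEY, name TEXT)")
conn.executemany(
    "INSERT INTO artist_credit (id, name) VALUES (?, ?)",
    [(i, f"Artist {i}") for i in range(1, 6)],
)

sql = """
    SELECT ac.id AS artist_credit_id, ac.name AS credit_name
    FROM artist_credit ac
    ORDER BY ac.id
"""
batches = list(read_batches(conn, sql, batch_size=2))
```

With five rows and a batch size of 2, the sketch produces three batches. A larger batch size means fewer round trips to Neo4j at the cost of more memory per transaction.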

__init__(context, batch_size=5000)

Construct a Task object.

run_internal()

The place to implement the logic that the task performs.

The base class handles all housekeeping and reporting, so implementations do not need to take care of either. Implementations should not catch exceptions; the base class handles them.

Parameters:

kwargs – will be passed to run_internal

Return type:

TaskReturn

Returns:

An instance of TaskReturn.
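
The division of labour described for run_internal() can be illustrated with a generic template-method sketch. It is not etl_lib's code: SketchTask, SketchResult, and execute() are hypothetical stand-ins for the base class, for TaskReturn, and for whatever entry point the library uses to call run_internal().

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class SketchResult:
    """Hypothetical stand-in for TaskReturn."""
    success: bool
    error: Optional[str] = None


class SketchTask:
    """Hypothetical base class: wraps run_internal() with housekeeping."""

    def execute(self, **kwargs) -> SketchResult:
        name = type(self).__name__
        print(f"starting {name}")            # housekeeping / reporting
        try:
            result = self.run_internal(**kwargs)
        except Exception as exc:             # handled centrally, not by subclasses
            result = SketchResult(success=False, error=str(exc))
        print(f"finished {name}: success={result.success}")
        return result

    def run_internal(self, **kwargs) -> SketchResult:
        raise NotImplementedError


class OkTask(SketchTask):
    def run_internal(self, **kwargs) -> SketchResult:
        return SketchResult(success=True)


class FailingTask(SketchTask):
    def run_internal(self, **kwargs) -> SketchResult:
        # No try/except here: the exception propagates to the base class.
        raise ValueError("source table missing")


ok = OkTask().execute()
failed = FailingTask().execute()
```

Because the subclass lets the exception propagate, the failure is reported in one place and in the same shape as a successful result.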