etl_lib.core.ETLContext module
- class ETLContext(env_vars)[source]
Bases: object
General context information.
Will be passed to every Task to provide access to environment variables and to functionality deemed general enough that all parts of the ETL pipeline would need it.
- Parameters:
env_vars (dict)
- __init__(env_vars)[source]
Create a new ETLContext.
- Parameters:
env_vars (dict) – Environment variables. Stored internally and can be accessed via env().
The context created will contain a Neo4jContext and a ProgressReporter. It also configures an instrumentation writer from environment values. See those classes for the keys they read from the provided env_vars dict.
- class Neo4jContext(env_vars)[source]
Bases: object
Holds the connection to the Neo4j database and provides facilities to execute queries.
- Parameters:
env_vars (dict)
- __init__(env_vars)[source]
Create a new Neo4j context.
Reads the following env_vars keys:
- NEO4J_URI
- NEO4J_USERNAME
- NEO4J_PASSWORD
- NEO4J_DATABASE
Optional: pass Neo4j Python driver configuration via env vars using the prefix NEO4J_DRIVER_. Example:
- NEO4J_DRIVER_MAX_CONNECTION_POOL_SIZE=200
- NEO4J_DRIVER_CONNECTION_TIMEOUT=10
- NEO4J_DRIVER_KEEP_ALIVE=true
- NEO4J_DRIVER_NOTIFICATIONS_MIN_SEVERITY=OFF
- NEO4J_DRIVER_NOTIFICATIONS_DISABLED_CATEGORIES=DEPRECATION,PERFORMANCE
- Parameters:
env_vars (dict)
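As an illustration of the prefix convention described above, the following is a minimal sketch of how NEO4J_DRIVER_-prefixed entries could be turned into keyword arguments for the driver. The helper names `driver_kwargs` and `_coerce`, and the type-coercion rules, are assumptions made for this example; the library's actual parsing may differ.

```python
PREFIX = "NEO4J_DRIVER_"


def _coerce(value: str):
    """Best-effort conversion of an env string (hypothetical rules).

    Tries bool, then int, then float, then a comma-separated list,
    and falls back to the unchanged string.
    """
    lowered = value.strip().lower()
    if lowered in ("true", "false"):
        return lowered == "true"
    for cast in (int, float):
        try:
            return cast(value)
        except ValueError:
            pass
    if "," in value:
        return [part.strip() for part in value.split(",")]
    return value


def driver_kwargs(env_vars: dict) -> dict:
    """Collect prefixed keys, strip the prefix, and lower-case the rest."""
    return {
        key[len(PREFIX):].lower(): _coerce(raw)
        for key, raw in env_vars.items()
        if key.startswith(PREFIX)
    }


example_env = {
    "NEO4J_URI": "neo4j://localhost:7687",  # not prefixed, so ignored here
    "NEO4J_DRIVER_MAX_CONNECTION_POOL_SIZE": "200",
    "NEO4J_DRIVER_KEEP_ALIVE": "true",
    "NEO4J_DRIVER_NOTIFICATIONS_MIN_SEVERITY": "OFF",
    "NEO4J_DRIVER_NOTIFICATIONS_DISABLED_CATEGORIES": "DEPRECATION,PERFORMANCE",
}
example_kwargs = driver_kwargs(example_env)
```

Keys without the prefix, such as NEO4J_URI, are left out of the result, so connection settings and driver tuning stay separate.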
- query_database(session, query, **kwargs)[source]
Executes Cypher and returns (records, counters) with retryable write semantics. Accepts either a single query string or a list of queries. Does not work with CALL {} IN TRANSACTIONS queries.
- Return type:
- Parameters:
session (Session)
- session(database=None)[source]
Create a new Neo4j session in write mode. The caller is responsible for closing the session.
- Parameters:
database – name of the database to use for this session. If not provided, the database name provided during construction will be used.
- Returns:
newly created Neo4j session.
- class QueryResult(data, summery)[source]
Bases: NamedTuple
Result of a query against the Neo4j database.
- class SQLContext(database_url, pool_size=10, max_overflow=20)[source]
Bases: object
- __init__(database_url, pool_size=10, max_overflow=20)[source]
Initializes the SQL context with an SQLAlchemy engine.
- engine: Engine
- append_results(r1, r2)[source]
Appends two QueryResult objects, summing the values for duplicate keys in the summary.
- Parameters:
r1 (QueryResult) – The first QueryResult object.
r2 (QueryResult) – The second QueryResult object to append.
- Return type:
- Returns:
A new QueryResult object with combined data and summed summary counts.
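The merge behaviour described for append_results can be sketched as follows. This is a stand-alone approximation, not the library's source: it assumes that data is a list of records and that the summary is a dict of integer counters, and it defines its own QueryResult stand-in (keeping the documented field spelling `summery`) so that it runs without etl_lib installed.

```python
from typing import NamedTuple


class QueryResult(NamedTuple):
    """Stand-in for the documented QueryResult (field types are assumed)."""
    data: list
    summery: dict  # spelling follows the documented field name


def append_results(r1: QueryResult, r2: QueryResult) -> QueryResult:
    """Concatenate the data and sum summary counters key by key."""
    combined = dict(r1.summery)  # copy so r1 is not mutated
    for key, count in r2.summery.items():
        combined[key] = combined.get(key, 0) + count
    return QueryResult(r1.data + r2.data, combined)


first = QueryResult([{"n": 1}], {"nodes_created": 1, "labels_added": 2})
second = QueryResult([{"n": 2}], {"nodes_created": 3, "properties_set": 4})
merged = append_results(first, second)
```

Because a new QueryResult is returned and the first summary is copied before updating, neither input is modified in this sketch.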