php-etl/sql-plugin

This plugin allows you to perform SQL queries in the ETL pipelines

Installs: 3 013

Dependents: 0

Suggesters: 1

Security: 0

Stars: 0

Watchers: 2

Forks: 1

Open Issues: 1

Type:gyroscops-plugin

v0.4.4 2023-07-10 07:44 UTC

README

Mutations PHPUnit Quality PHPStan level 5 PHPStan level 6 PHPStan level 7 PHPStan level 8 PHP

What is it ?

The SQL plugin allows you to write your own SQL queries and use them into the Pipeline stack.

SQL, Structured Query Language, is a language for manipulating databases.

Installation

composer require php-etl/sql-plugin

Usage

Database connection

The SQL plugin uses the PDO extension and relies on its interface to access databases using the dsn, username and password parameters.

This connection must be present in any case, whether it be when defining the extractor, loader or lookup.

connection:
    dsn: 'mysql:host=127.0.0.1;port=3306;dbname=kiboko'
    username: username
    password: password

It is possible to specify options at the time of this connection using options. Currently, it is only possible to specify if the database connection should be persistent.

connection:
    # ...
    options:
      persistent: true

Building an extractor

sql:
  extractor:
    query: 'SELECT * FROM table1'
  connection:
    dsn: 'mysql:host=127.0.0.1;port=3306;dbname=kiboko'
    username: username
    password: password

Building a lookup

sql:
  lookup:
    query: 'SELECT * FROM table2 WHERE bar = foo'
    merge:
      map:
        - field: '[options]'
          expression: 'lookup["name"]'
  connection:
    dsn: 'mysql:host=127.0.0.1;port=3306;dbname=kiboko'
    username: username
    password: password

Building a loader

sql:
  loader:
    query: 'INSERT INTO table1 VALUES (bar, foo, barfoo)'
  connection:
    dsn: 'mysql:host=127.0.0.1;port=3306;dbname=kiboko'
    username: username
    password: password

Advanced Usage

Using params in your queries

Thanks to the SQL plugin, it is possible to write your queries with parameters.

If you write a prepared statement using named parameters (:param), your parameter key in the configuration will be the name of your parameter without the : :

sql:
  loader:
    query: 'INSERT INTO table1 VALUES (:value1, :value2, :value3)'
    parameters:
      - key: value1
        value: '@=input["value1"]'
      - key: value2
        value: '@=input["value3"]'
      - key: value3
        value: '@=input["value3"]'
    # ... 

If you are using a prepared statement using interrogative markers (?), your parameter key in the configuration will be its position (starting from 1) :

sql:
  loader: 
    query: 'INSERT INTO table1 VALUES (?, ?, ?)'
    parameters:
      - key: 1
        value: '@=input["value1"]'
      - key: 2
        value: '@=input["value3"]'
      - key: 3
        value: '@=input["value3"]'
  # ... 

Creating before and after queries

In some cases, you may need to run queries in order to best prepare for the execution of your pipeline.

Before queries

Before queries will be executed before performing the query written in the configuration. Often, these are queries that set up the database.

sql:
  before:
    queries:
      - 'CREATE TABLE foo (id INTEGER NOT NULL, value VARCHAR(255) NOT NULL)'
      - 'INSERT INTO foo (id, value) VALUES (1, "Lorem ipsum dolor")'
      - 'INSERT INTO foo (id, value) VALUES (2, "Sit amet consecutir")'
  # ...

After queries

After queries will be executed after performing the query written in the configuration. Often, these are queries that clean up the database.

sql:
  after:
    queries:
      - 'DROP TABLE foo'
  # ...