phparrot/parrot

Rule-based generation of fixture databases by sampling operational databases. Forked from maple-syrup-group/dbsampler

Installs: 8

Dependents: 0

Suggesters: 0

Security: 0

Stars: 0

Watchers: 1

Forks: 0

Open Issues: 8

Type:project

0.0.1 2021-02-10 10:36 UTC

This package is auto-updated.

Last update: 2024-12-28 22:14:45 UTC


README

A general tool for extracting and cleaning selected tables from a database for use as fixtures. Copies a subset of tables from one database to another under the control of a json configuration file. The latter database can then be dumped to SQL for use as a fixture file.

Why PHParrot?

When we visited our nan at weekends as kids, my brother and I would wait until all the adults were out of the room and then endlessly repeat all the (admittedly very mild) swear words we knew at her pet parrot, in the hope that it would copy us.

As this tool copies (parrots) text from one place to another (and because this seemed like a perfect opportunity to give a project one of those names with 'PHP' stuffed into it), 'PHParrot' kind-of seemed appropriate.

Aside: Nan's parrot never repeated any words we tried teaching it, but it had learned to belch (my uncle's influence).

Usage

  • Create the target database(s) you wish to fill. The tool will only output to existing databases. The content of the Destination Databases will be trashed.
  • Create config/credentials.json with the DB server configuration.
  • Create config/*.db.json files to define mappings for each required database
  • Run bin/dbsampler.php

Configuration formats

All config files live in the config subdirectory. Files cannot contain comments as the JSON format does not support this, but as a convention, fields called "comment" will be ignored where possible.

credentials.json

"driver": "pdo_mysql" is currently assumed but this may change in future.

MySQL

See config/credentials.dist.json:

{
  "driver": "pdo_mysql",
  "dbUser": "root",
  "dbPassword": "SOMEPASSWORD",
  "dbHost": "127.0.0.1"
}

If you need different source and dest servers, this becomes:

{
  "source": {
    "driver": "pdo_mysql",
    "dbUser": "root",
    "dbPassword": "SOMEPASSWORD",
    "dbHost": "sourceDB.example.com"
  },

  "dest" : {
    "driver": "pdo_mysql",
    "dbUser": "root",
    "dbPassword": "SOMEPASSWORD",
    "dbHost": "127.0.0.1"
  }
}

If you need to prepare the connections, add an initialSql stanza:

{
  "source": {
    "driver": "pdo_mysql",
    "dbUser": "root",
    "dbPassword": "SOMEPASSWORD",
    "dbHost": "sourceDB.example.com",
    "initialSql": [
      "SET NAMES UTF8"
    ]
  },
  "dest": {
    "driver": "pdo_mysql",
    "dbUser": "root",
    "dbPassword": "SOMEPASSWORD",
    "dbHost": "127.0.0.1",
    "initialSql": [
      "SET NAMES UTF8",
      "SET foreign_key_checks = 0"
    ]
  }
}        
Sqlite

See config/credentials.dist.json:

{
    "driver": "pdo_sqlite",
    "directory": "..\/path\/to\/sqlite-dbs"
}

Paths are assumed to be relative to the config file unless they start with a '/'. Sqlite databases to be migrated are assumed to be *.sqlite files in this directory

dbname.db.json

{
  "name": "small-sqlite-test",          # Configuration name
  "sourceDb": "small-source",           # Name of the source DB
  "destDb": "small-dest",               # Name of the destination DB. This DB will get trashed
  "tables": {                           # A set of tables to be copied over. Each table is defined as "table": config
                                        # Every config stanza requires a sampler field. For now, look these up in 
                                        # \PHParrot\Parrot\MigrationConfigProcessor::$samplerMap
                                        # All other fields depend on the specific sampler being used; these should 
                                        # all be documented in their own class files in src/Sample
    "fruits": {
      "sampler": "matched",
      "constraints": {
        "name": [
          "apple",
          "pear"
        ]
      },
      "remember": {
        "id": "fruit_ids"               # Cross-referencing is supported by "remember" stanzas
                                        # These take the field name of which the values are to be remembered
                                        # matched to a variable name in which the values will be stored
                                        # Note: Variable declarations do not include a '$' symbol 
                                        # References MUST be 'remember'ed before being used, there is no
      }                                 # dependency resolution here, so order your config appropriately
    },
    "vegetables": {
      "sampler": "NewestById",
      "idField": "id",
      "quantity": 2
    },
    "fruit_x_basket": {
      "sampler": "matched",
      "constraints": {
        "fruit_id": "$fruit_ids"        # Remembered variables, with $ sign, can be used as cross-references
                                        # This will expand to all ids of the fruits table matched above
      },
      "where" : [
        "basket_id > 1"                 # The matched sampler can also accept a list of arbitrary WHERE clauses
      ],
      "remember": {
        "basket_id": "basket_ids"
      }
    },
    "baskets": {
      "sampler": "matched",             # samplers support field cleaners that are defined in
                                        # \PHParrot\Parrot\FieldCleanerProvider::getCleanerByName
                                        # They modify or replace the content of the field that they are keyed to
      "constraints": {
        "id": "$basket_ids"
      },
      "cleanFields": {
        "name": "fakefullname"
      }
    }
  },
  "views": [                            # view support is experimental
    "some_view"                         # views are specified as name only but format may change
  ]                                     # The destination's CURRENT_USER() is used as the DEFINER for MySQL DBs
}
"Faker" cleaners

Any 'faker' (fzaninotto/faker) generator that does not require parameters can be used directly in the cleanFields stanza by using "name": "faker:GENERATOR", eg:

 "cleanFields": {
   "ip": "faker:ipv4"
 },

Extending the project

The tool is designed to be extended primarily by adding custom Samplers (which must implement \PHParrot\Parrot\Sampler\Sampler) and cleaners (which must implement \PHParrot\Parrot\Cleaner\FieldCleaner ànd be registered with \PHParrot\Parrot\Cleaner\RowCleaner::registerCleaner).

Currently, only mysql and sqlite databases are supported, but this could also be extended.

This tool does not convert between database drivers. This means that if your source database is MySQL, then the destination database must also be MySQL.