maba/database-inconsistency-finder

Library that helps finding inconsistencies between database tables or different databases

0.1.0 2019-08-07 08:25 UTC

This package is not auto-updated.

Last update: 2024-11-08 09:11:43 UTC


README

Latest Version on Packagist Software License Build Status Coverage Status Quality Score Total Downloads

Library that helps to find orphaned records (if they should be removed when nothing points to them), references to non-existing records and invalid number of references to those records.

When could this be helpful?

  • you don't use foreign keys and could have invalid references. This could happen due to different reasons:
    • you have several different databases, for example when sharding or using microservices;
    • you don't use foreign keys for easier database structure migrations;
    • you have application that just does not use foreign keys;
  • you want to find orphaned records. For example, you can have Files in a database table and if nothing points to this record, we want to delete the file itself.

Normally, these restrictions would be guaranteed by your application. Unfortunately, stuff happens and there might be some inconsistencies that just occur time to time.

Installation

composer require maba/database-inconsistency-finder

Configuration and usage

$connection = DriverManager::getConnection(['url' => 'mysql://user:secret@localhost/mydb']);
$connection1 = DriverManager::getConnection(['url' => 'mysql://user:secret@db.example.org/otherdb']);

$referencesConfiguration = (new ReferencesConfiguration())
   ->setReferencedColumn(
       (new ReferencedColumn())
           ->setConnection($connection)
           ->setTableName('files')
           ->setIdColumnName('id')
           ->setReferenceNumberColumnName('reference_count')
   )
   ->addTableReferences(
       (new TableReferences())
           ->setConnection($connection)
           ->setTableName('profiles')
           ->setColumnNames(['avatar_file_id', 'cv_file_id'])
   )
   ->addTableReferences(
       (new TableReferences())
           ->setConnection($connection1)
           ->setTableName('documents')
           ->setColumnNames(['file_id'])
   )
;

$inconsistencyFinder = (new Factory())
    ->createInconsistencyFinder($referencesConfiguration)
;

$result = $inconsistencyFinder->find();

if ($result->areInconsistenciesFound()) {
    var_dump(
        $result->getOrphanedRecordIds(),
        $result->getMissingReferenceCounts(),
        $result->getInvalidReferenceCounts()
    );
}

Currently all work is done synchronously. You can configure this by implementing JobDistributorFactoryInterface and related JobDistributorInterface. In this case create service tree yourself, do not use the Factory class.

Internals

Consistency validation is performed in the following manner:

  • ID range is queried from the database (from-to IDs in the main table)
  • range is divided into separate intervals for job distribution
  • each job is given to concrete worker
  • worker validates consistency by using SUM query to the database, which is relatively fast
  • if inconsistencies are found in the interval, it's split into even smaller intervals
  • with each smaller interval SUM query is repeated
  • for those intervals where inconsistencies are found, inconsistency seeking algorithm is ran

Inconsistency seeking algorithm

  • all IDs and their corresponding reference counts are fetched from the database
  • all related tables are iterated over and all IDs are fetched from there
  • fetched data is looped to find any inconsistencies

These actions are performed in-memory, so it's essential that interval in this stage would be already quite small.

Consistency validation

Consistency is validated by issuing SUM queries to database. To avoid false positives, we select not the sum of reference counts, but sum of CRC32 of referenced IDs (and sum them that many times how many times they were referenced).

Semantic versioning

This library follows semantic versioning.

See Symfony BC rules for basic information about what can be changed and what not in the API.

Running tests

composer update
cd docker
docker-compose up -d
docker exec -it database_inconsistency_finder_test_php bin/phpunit
docker-compose down

Contributing

Feel free to create issues and give pull requests.

You can fix any code style issues using this command:

composer fix-cs