tanoconsulting / datavalidatorbundle
A tool to validate Data stored in Databases and Filesystems
Fund package maintenance!
tanoconsulting
Installs: 208
Dependents: 1
Suggesters: 0
Security: 0
Stars: 0
Watchers: 2
Forks: 0
Open Issues: 0
Type:symfony-bundle
Requires
- php: ^7.3|^8.0
- ext-json: *
- doctrine/dbal: ^2.11|^3.0
- symfony/console: ^5.0.0
- symfony/dependency-injection: ^5.0.0
- symfony/event-dispatcher: ^5.0.0
- symfony/finder: ^5.0.0
- symfony/yaml: ^5.0.0
Requires (Dev)
- phpunit/phpunit: ^8.5.12
- symfony/validator: ^5.0.0
Suggests
- ext-pcntl: To allow gracefully exiting long running validation commands
This package is auto-updated.
Last update: 2025-01-10 23:40:37 UTC
README
Goals
Allow checking integrity of data in a database, going beyond what the database schema definition enforces.
Allow checking the integrity of a set of files (WIP).
Usecase
There are many scenarios in which usage of the constraints configured in a database schema is not sufficient to enforce data integrity, such as f.e.:
- the db engine in use not supporting advanced/complex data validation constraints
- the db engine in use does support advanced data validation constraints, but those are not being used
- data integrity constraints which are too complex to express easily using the db engine
- db native constraints having been disabled for speed during mass import operations
- constraints having been implemented in application code, with multiple apps writing to the database
In all those cases, a separate tool which can validate that the data stored in the database adheres to a set of rules can come in handy.
Requirements
- php 7.3 or later
- a database supported by Doctrine DBAL (2.11 or 3.0 or later)
- Symfony components: see
composer.json
Quick start
-
the set of constraints can be defined in a yaml or json file. This sample shows the supported syntax, using yaml:
constraints: - ForeignKey: child: ezapprove_items: collaboration_id parent: ezcollab_item: id - ForeignKey: child: ezbinaryfile: [contentobject_attribute_id, version] parent: ezcontentobject_attribute: [id, version] - ForeignKey: child: ezcontentobject: id parent: ezcontentobject_version: contentobject_id except: 'ezcontentobject.status = 1 AND ezcontentobject_version.status = 1' - Query: name: classes_with_same_identifier sql: 'SELECT identifier, COUNT(*) AS identical_identifiers FROM ezcontentclass WHERE version = 0 GROUP BY identifier HAVING COUNT(*) > 1' # skip the validation of this constraint in a silent manner if the table is missing by using the line below: requires: {table: ezcontentclass}
-
run the validation command
php bin/console datavalidator:validate:database --config-file=<my_schema_constraints.yaml>
This presumes that your application has set up a database connection configuration doctrine named
default
. If that is not the case, you can run:php bin/console datavalidator:validate:database --config-file=<my_schema_constraints.yaml> --database=<mysql://user:pwd@localhost/mydb>
If you want to list the validations constraints without validating them run:
php bin/console datavalidator:validate:database --config-file=<my_schema_constraints.yaml> --dry-run
By default the results show the number of database rows found which violate each constraint. To see the data of those rows instead, use:
php bin/console datavalidator:validate:database --config-file=<my_schema_constraints.yaml> --display-data
Constraints currently supported
- foreign key definitions
- custom sql queries
See the doc/samples folder for examples constraints of well-known applications' database schemas.
More advanced usage
Defining validation constraints in your code
Instead of using a dedicated configuration file on the command line, you can configure the validation constraints in code, either:
- by setting a value to configuration parameter
data_validator.constraints.database
, or - by tagging services with the
data_validator.constraint_provider.database
tag. Those services will have to implement a public methodgetConstraintDefinitions()
that returns all the relevant constraints definitions
Creating your own constraint types
WIP
Troubleshooting
-
Use the
-v
command line option to see details of execution -
If the execution of the constraint validation is taking a long time, you can use CTRL-C to stop execution halfway: the script will exit gracefully printing any violation found up to that point
-
To avoid excessive memory usage from large queries, when running Symfony in "debug mode", such as commonly for "dev" envs, add the
--no-debug
option to your commands. If possible, use a non-debug Symfony env. -
If you still get an 'allowed memory size' fatal error, run the commands with
php -d memory_limit=-1
.
FAQ
- Q: can I run the validations in a Controller or Event instead of a cli command? A: technically yes, but it is generally not recommended, as the database queries used for validating the whole data set might take long to execute
Thanks
Code based on the Symfony/Validator component; thanks to all its developers!