k-kinzal / sql-faker
Faker Provider for generating syntactically valid SQL statements
Requires
- php: ^8.1
- fakerphp/faker: ^1.23
Requires (Dev)
- brianium/paratest: ^6.11 || ^7
- friendsofphp/php-cs-fixer: ^3.92
- infection/infection: ^0.28 || ^0.29 || ^0.30 || ^0.31 || ^0.32
- k-kinzal/testcontainers-php: ^0.5.1
- nikic/php-fuzzer: ^0.0.11
- phpbench/phpbench: ^1.4
- phpstan/phpstan: ^2.1
- phpstan/phpstan-strict-rules: ^2.0
- phpunit/phpunit: ^10.5 || ^11 || ^12 || ^13
README
A FakerPHP provider for generating syntactically valid MySQL SQL statements based on MySQL's official Bison grammar (sql_yacc.yy).
Overview
SQL Faker uses formal grammar derivation to generate random but syntactically valid MySQL SQL. It parses MySQL's actual Bison grammar definition and uses it to produce any MySQL statement type (DML, DDL, TCL, etc.) and SQL fragments (expressions, clauses, subqueries, CTEs).
- Grammar-accurate - Derived from MySQL's official
sql_yacc.yy, not hand-written rules - Version-aware - Supports MySQL 5.6 through 9.1, generating only syntax valid for the target version
- Depth-controllable -
maxDepthparameter controls SQL complexity from simple to arbitrarily nested - Fragment generation - Generate not just full statements but individual expressions, clauses, and subqueries
- FakerPHP integration - Standard Faker provider pattern, works with seeded generators for reproducibility
Requirements
- PHP 8.1 or higher
- fakerphp/faker ^1.23
Installation
composer require --dev k-kinzal/sql-faker
Usage
Basic Usage
use Faker\Factory; use SqlFaker\MySqlProvider; $faker = Factory::create(); $faker->addProvider(new MySqlProvider($faker)); // Generate random SQL statements $faker->sql(); // Any valid MySQL statement $faker->selectStatement(); // SELECT statement $faker->insertStatement(); // INSERT statement $faker->updateStatement(); // UPDATE statement $faker->deleteStatement(); // DELETE statement
Statement Types
// DML $faker->selectStatement(); $faker->insertStatement(); $faker->updateStatement(); $faker->deleteStatement(); $faker->replaceStatement(); $faker->truncateStatement(); // DDL $faker->createTableStatement(); $faker->alterTableStatement(); $faker->dropTableStatement(); $faker->createIndexStatement(); $faker->dropIndexStatement(); // TCL $faker->beginStatement(); $faker->commitStatement(); $faker->rollbackStatement(); // Any simple statement $faker->simpleStatement();
SQL Fragments
Generate individual SQL components for targeted testing:
// Expressions $faker->expr(); // Any expression $faker->simpleExpr(); // Simple expression $faker->literal(); // Literal value $faker->predicate(); // Predicate (comparison) // Clauses $faker->whereClause(); // WHERE clause $faker->orderClause(); // ORDER BY clause $faker->limitClause(); // LIMIT clause // Table references $faker->tableReference(); // Table reference $faker->joinedTable(); // Joined table expression $faker->tableIdent(); // Table identifier // Subqueries and CTEs $faker->subquery(); // Subquery $faker->withClause(); // CTE (WITH clause)
Terminal Generators
Generate individual lexical tokens:
$faker->identifier(); // e.g., "t1", "col42" $faker->quotedIdentifier(); // e.g., "`my_table`" $faker->stringLiteral(); // e.g., "'abc123'" $faker->nationalStringLiteral(); // e.g., "N'abc'" $faker->integerLiteral(); // e.g., "42" $faker->longIntegerLiteral(); // e.g., "2147483647" $faker->unsignedBigIntLiteral(); // e.g., "18446744073709551615" $faker->decimalLiteral(); // e.g., "123.45" $faker->floatLiteral(); // e.g., "1.23e10" $faker->hexLiteral(); // e.g., "0xdeadbeef" $faker->binaryLiteral(); // e.g., "0b1010"
Controlling Complexity
Use maxDepth to control the complexity of generated SQL. Lower values produce simpler statements:
$faker->selectStatement(maxDepth: 3); // Simple SELECT $faker->selectStatement(maxDepth: 6); // Moderate SELECT $faker->selectStatement(); // Complex SELECT (unlimited depth)
The generator uses shortest-path termination: once the target depth is reached, it selects the shortest production alternative at each step to terminate quickly.
Specifying MySQL Version
// Use specific MySQL version (default: mysql-8.4.7) $faker->addProvider(new MySqlProvider($faker, 'mysql-5.7.44'));
Supported versions:
| Version | Tag |
|---|---|
| MySQL 5.6 | mysql-5.6.51 |
| MySQL 5.7 | mysql-5.7.44 |
| MySQL 8.0 | mysql-8.0.44 |
| MySQL 8.1 | mysql-8.1.0 |
| MySQL 8.2 | mysql-8.2.0 |
| MySQL 8.3 | mysql-8.3.0 |
| MySQL 8.4 | mysql-8.4.7 (default) |
| MySQL 9.0 | mysql-9.0.1 |
| MySQL 9.1 | mysql-9.1.0 |
Reproducible Generation
Use a seeded Faker generator for reproducible SQL output:
$faker = Factory::create(); $faker->seed(12345); $faker->addProvider(new MySqlProvider($faker)); // Same seed always produces the same SQL $sql = $faker->selectStatement(maxDepth: 6);
How It Works
SQL Faker implements formal grammar derivation (leftmost derivation):
- Grammar Loading - Pre-compiled MySQL Bison grammar is loaded from serialized AST
- Derivation - Starting from a non-terminal (e.g.,
select_stmt), the generator repeatedly replaces the leftmost non-terminal with a randomly chosen production alternative - Depth Control - Before
maxDepth, alternatives are chosen randomly; at/aftermaxDepth, the shortest alternative is selected to terminate quickly - Terminal Rendering - Terminal symbols are rendered to SQL strings with proper spacing and syntax
select_stmt
→ SELECT select_item_list FROM table_reference
→ SELECT expr FROM table_ident
→ SELECT simple_expr FROM IDENT
→ SELECT NUM FROM IDENT
→ SELECT 42 FROM t1
Development
# Run tests composer test # Run linter (PHP-CS-Fixer + PHPStan level max) composer lint # Run fuzz tests composer fuzz # Build grammar AST from MySQL's sql_yacc.yy composer build-mysql # Fix code style composer format
License
MIT License. See LICENSE for details.