mottaviani-dev / laravel-reductor
ML-powered test suite optimization for Laravel - Reduce CI/CD time by identifying redundant tests
Requires
- php: ^8.1|^8.2|^8.3
- czproject/git-php: ^4.2
- illuminate/console: ^9.0|^10.0|^11.0
- illuminate/database: ^9.0|^10.0|^11.0
- illuminate/support: ^9.0|^10.0|^11.0
- phpunit/php-code-coverage: ^10.1|^11.0
- symfony/process: ^6.0|^7.0
- symfony/yaml: ^6.0|^7.0
Requires (Dev)
- friendsofphp/php-cs-fixer: ^3.0
- orchestra/testbench: ^7.0|^8.0|^9.0
- phpstan/phpstan: ^1.0
- phpunit/phpunit: ^10.0|^11.0
Suggests
- ext-pcov: For faster code coverage generation (recommended)
- ext-xdebug: For code coverage generation (slower than PCOV)
- pestphp/pest: For Pest PHP test framework support
README
Accelerate your Laravel test suite by identifying and eliminating redundant tests using unsupervised machine learning.
Key Features
- Research-Validated ML: Implements Sebastian et al. (2024) unsupervised machine learning methodology
- Laravel-Aware Design: Handles Laravel-specific patterns, shared bootstraps, and testing idioms
- Semantic + Coverage Analysis: Uses hybrid 640D vectors combining TF-IDF semantics and execution coverage
- Safety-First Clustering: Prevents merging tests with opposite behavior (e.g., success vs failure)
- Interactive CLI Review: Validate clusters before merging, with entropy and safety scores
- CI/CD Integration Ready: Automate detection of test duplication regressions
- Multi-Format Reports: Markdown, CSV, JSON, YAML, and HTML output options
Requirements
- PHP ≥ 8.1, Laravel ≥ 9.0
- Python ≥ 3.7 with numpy, scikit-learn, scipy
- PHPUnit 10+ or Pest
- Code coverage via Xdebug or PCOV
Installation
composer require --dev reductor/laravel-test-reduction
pip3 install numpy scikit-learn scipy
php artisan reductor:install
Quick Start
Prerequisites
Configure phpunit.xml for proper coverage format and exclusions:
<phpunit>
    <!-- Required: .cov format for per-test coverage data -->
    <logging>
        <log type="coverage-php" target="coverage.cov"/>
    </logging>
    <!-- Recommended: Focus on application code only -->
    <coverage processUncoveredFiles="true">
        <include>
            <directory suffix=".php">app</directory>
            <directory suffix=".php">src</directory>
        </include>
        <exclude>
            <directory suffix=".php">bootstrap</directory>
            <directory suffix=".php">config</directory>
            <directory suffix=".php">database</directory>
            <directory suffix=".php">routes</directory>
            <directory suffix=".php">storage</directory>
            <directory suffix=".php">tests</directory>
            <directory suffix=".php">vendor</directory>
        </exclude>
    </coverage>
</phpunit>
Steps
- Generate Coverage:
php artisan test --coverage
- Run Redundancy Analysis:
php artisan test:reduce --cluster --coverage=coverage.cov
- Review Results:
open storage/test-reduction/redundancy_report.md
For advanced usage, interactive review, CI integration, and troubleshooting, see the docs/ folder.
Sample Output
Analysis Summary
================
Total Tests Analyzed: 118
Redundant Test Clusters: 25
Tests in Clusters: 78
Potential Test Reduction: 53 (44.9%)
Top Redundant Clusters
Cluster #11: 8 tests (97.5% similar)
• CreateTest::it_creates_an_asset_with_valid_payload#0
• CreateTest::it_creates_an_asset_with_valid_payload#1
• CreateTest::it_creates_an_asset_with_string_custom_fields#0
... and 5 more
Cluster #12: 7 tests (98.5% similar)
• CreateTest::it_creates_an_asset_with_no_end_date_fail#0
• CreateTest::admin_creates_asset_with_past_end_at_date_fail#0
• CreateTest::admin_creates_asset_with_past_start_at_date_fail#0
... and 4 more
=== Semantic Vector Statistics ===
Non-zero vectors: 118 (100% - no extraction failures)
Average vector magnitude: 1.0
Average non-zero elements: 21.7
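The summary figures can be reproduced by hand. The sketch below assumes that each cluster keeps exactly one representative test; this is an inference from the numbers in the sample report, not something the report itself states.

```python
# Figures copied from the sample summary above.
total_tests = 118
clusters = 25
tests_in_clusters = 78

# Assumption: one test per cluster is kept, the rest are candidates for removal.
removable = tests_in_clusters - clusters

# Reduction is expressed relative to the whole analysed suite.
reduction_pct = round(100 * removable / total_tests, 1)

print(removable, reduction_pct)  # 53 44.9
```

Under that assumption, 78 - 25 = 53 tests are removable, and 53 / 118 rounds to 44.9%, matching the reported line.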
Advanced Usage
For advanced features, see the docs/ folder:
- Laravel Integration Guide - Weight tuning, CI integration, Git hooks
- Usage Examples - GitHub Actions, GitLab CI, programmatic usage
- Troubleshooting - Common issues and solutions
Quick Advanced Examples:
# Interactive review
php artisan tests:reduce --coverage-file=coverage.cov --interactive
# Custom weights
php artisan tests:reduce --semantic-weight=0.8 --coverage-weight=0.2
# Different output formats
php artisan tests:reduce --format=html --output-dir=reports
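The weight options suggest that the final similarity score blends a semantic score with a coverage score. A minimal sketch of such a blend follows, assuming a linear combination of cosine similarity (semantic vectors) and Jaccard similarity (covered lines). The function names and the choice of Jaccard are illustrative assumptions, not the package's actual code.

```python
import math


def cosine(a, b):
    """Cosine similarity between two equal-length numeric vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def jaccard(a, b):
    """Jaccard similarity between two sets of covered lines."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def blended_similarity(sem_a, sem_b, cov_a, cov_b,
                       semantic_weight=0.7, coverage_weight=0.3):
    """Hypothetical linear blend of semantic and coverage similarity."""
    return (semantic_weight * cosine(sem_a, sem_b)
            + coverage_weight * jaccard(cov_a, cov_b))
```

With weights of 0.8 and 0.2, two tests that execute identical lines but share no semantic tokens would score only 0.2, so raising the semantic weight makes the analysis less willing to group tests on coverage alone.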
How It Works
- Test Coverage: Collects executed lines per test
- Semantic Vectorization: Extracts TF-IDF tokens from test methods
- MinHash Fingerprints: Builds a 512-bit sparse binary vector
- Clustering: DBSCAN groups similar test cases
- Safety Checks: Assertion-aware and intent-matching logic to validate clusters
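The steps above can be sketched end to end in plain Python. This is an illustration under stated assumptions, not the package's implementation: a simple hashed bit set stands in for the MinHash fingerprint, the TF-IDF formula is the common smoothed variant, the DBSCAN is a minimal hand-written version (the package depends on scikit-learn, which is not used here), and all function names and sample data are hypothetical.

```python
import hashlib
import math
from collections import Counter


def tfidf_vectors(docs):
    """Return L2-normalised TF-IDF vectors (token -> weight) for tokenised docs."""
    n = len(docs)
    df = Counter(tok for doc in docs for tok in set(doc))
    vectors = []
    for doc in docs:
        vec = {t: c * (math.log((1 + n) / (1 + df[t])) + 1)
               for t, c in Counter(doc).items()}
        norm = math.sqrt(sum(v * v for v in vec.values())) or 1.0
        vectors.append({t: v / norm for t, v in vec.items()})
    return vectors


def coverage_bits(lines, size=512):
    """Hash 'file:line' strings to bit positions (stand-in for a MinHash fingerprint)."""
    return {int(hashlib.sha1(line.encode()).hexdigest(), 16) % size for line in lines}


def distance(sem_a, sem_b, bits_a, bits_b, w_sem=0.7, w_cov=0.3):
    """1 - weighted blend of cosine (semantic) and Jaccard (coverage) similarity."""
    cos = sum(v * sem_b.get(t, 0.0) for t, v in sem_a.items())
    union = bits_a | bits_b
    jac = len(bits_a & bits_b) / len(union) if union else 0.0
    return 1.0 - (w_sem * cos + w_cov * jac)


def dbscan(n, dist, eps, min_samples):
    """Minimal DBSCAN over indices 0..n-1; returns labels, with -1 meaning noise."""
    labels = [None] * n
    cluster = -1
    for i in range(n):
        if labels[i] is not None:
            continue
        neighbours = [j for j in range(n) if dist(i, j) <= eps]
        if len(neighbours) < min_samples:
            labels[i] = -1
            continue
        cluster += 1
        labels[i] = cluster
        queue = [j for j in neighbours if j != i]
        while queue:
            j = queue.pop()
            if labels[j] == -1:
                labels[j] = cluster  # noise point becomes a border point
            if labels[j] is not None:
                continue
            labels[j] = cluster
            reachable = [k for k in range(n) if dist(j, k) <= eps]
            if len(reachable) >= min_samples:
                queue.extend(reachable)  # j is a core point: keep expanding
    return labels


# Hypothetical data: tests 0 and 1 are duplicates, test 2 is unrelated.
docs = [["post", "assets", "assert", "status", "201"],
        ["post", "assets", "assert", "status", "201"],
        ["get", "users", "assert", "json", "count"]]
covered = [{"app/Asset.php:10", "app/Asset.php:11"},
           {"app/Asset.php:10", "app/Asset.php:11"},
           {"app/User.php:5"}]

sem = tfidf_vectors(docs)
bits = [coverage_bits(c) for c in covered]
labels = dbscan(len(docs),
                lambda i, j: distance(sem[i], sem[j], bits[i], bits[j]),
                eps=0.1, min_samples=2)
print(labels)  # [0, 0, -1]
```

The two duplicate tests land in cluster 0 and the unrelated test is labelled noise. A safety check would then run on each cluster before anything is proposed for removal.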
Performance Benefits
- 20-40% faster test execution
- 30-50% fewer redundant tests to maintain
- 15-30% CI pipeline time savings
Troubleshooting
Common Issues
Problem: "Only 1 test processed" instead of all tests
Solution: Ensure you're using .cov format, not --coverage-clover (XML). The XML format lacks per-test granularity.
Problem: "Warning: X/Y semantic vectors are zero"
Solution: This was fixed in v1.0.1+ with better parameterized test handling. Update the package.
Problem: "Python module not found" in Docker
Solution: Use the container's runtime: docker exec container_name php artisan tests:reduce
Problem: High similarity but tests aren't actually redundant
Solution: Review the generated reports carefully. High similarity doesn't always mean redundancy - especially for validation tests.
Verifying Setup
Check your configuration:
# 1. Verify .cov file exists and has content
ls -la coverage.cov
# 2. Verify Python dependencies
python3 -c "import numpy, sklearn, scipy; print('Dependencies OK')"
# 3. Test source code extraction
php artisan tests:reduce --validate
Research Basis
This project is based on the systematic mapping study:
Sebastian, A., Naseem, H., & Catal, C. (2024). Unsupervised Machine Learning Approaches for Test Suite Reduction. Applied Artificial Intelligence, 38(1), e2322336.
The systematic mapping study analyzed 34 research papers and identified key patterns in unsupervised test suite reduction approaches. Laravel Reductor implements the validated methodology while extending it for production use.
Key Research Findings Applied
- Algorithm Selection: DBSCAN and K-means clustering (research-validated)
- Feature Engineering: TF-IDF semantic analysis + coverage fingerprinting
- Evaluation Metrics: Coverage preservation and test suite size reduction
- Safety Mechanisms: Multi-layered validation to prevent dangerous test removal
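The two evaluation metrics named above can be made concrete with a short sketch. It assumes per-test coverage is available as sets of covered lines; the function name and data layout are hypothetical and not taken from the package.

```python
def evaluate_reduction(coverage_by_test, kept_tests):
    """Return (coverage_preserved, size_reduction) as fractions in [0, 1].

    coverage_by_test: dict mapping test name -> set of covered lines.
    kept_tests: names of the tests that remain after reduction.
    """
    full = set().union(*coverage_by_test.values())
    kept = set().union(*(coverage_by_test[t] for t in kept_tests))
    # Share of originally covered lines still covered by the kept tests.
    preserved = len(kept & full) / len(full) if full else 1.0
    # Share of tests removed from the suite.
    reduction = 1 - len(kept_tests) / len(coverage_by_test) if coverage_by_test else 0.0
    return preserved, reduction


# Hypothetical suite: tests "a" and "b" cover the same lines.
suite = {"a": {1, 2}, "b": {1, 2}, "c": {3}}
print(evaluate_reduction(suite, ["a", "c"]))
```

Dropping the duplicate test "b" removes a third of the suite while every covered line stays covered, which is the outcome a reduction aims for. Dropping "c" instead would also shrink the suite by a third but lose coverage of line 3.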
Laravel-Specific Adaptations
- Shared coverage filtering - Removes Laravel bootstrap/vendor code noise
- Adaptive semantic/coverage weighting - Default 70/30 split optimized for Laravel
- Assertion-aware tokenization - Special handling for assertStatus, assertJson, etc.
- Parameterized test handling - Strips #0, #1 suffixes from data providers
- Cluster safety engine - Prevents merging opposing tests (success/fail, null/empty)
- Framework integration - Native Laravel service provider and Artisan commands
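Two of the adaptations above, suffix stripping and the opposing-test check, are simple enough to sketch. The version below works on test names only and uses just the keyword pairs mentioned in the list. The function names, the keyword table, and the name-based heuristic are assumptions for illustration; the package's safety engine may use different signals.

```python
import re

# Opposing keyword pairs taken from the examples in the list above.
OPPOSING_PAIRS = [("success", "fail"), ("null", "empty")]


def base_name(test_id):
    """Strip a trailing data-provider suffix such as '#0' or '#1'."""
    return re.sub(r"#\d+$", "", test_id)


def name_tokens(test_id):
    """Split the method part of 'Class::method#n' into lowercase word tokens."""
    method = base_name(test_id).split("::")[-1]
    return set(re.findall(r"[a-z0-9]+", method.lower()))


def safe_to_merge(test_a, test_b):
    """Return False when one name carries a keyword and the other its opposite."""
    tokens_a, tokens_b = name_tokens(test_a), name_tokens(test_b)
    for left, right in OPPOSING_PAIRS:
        if (left in tokens_a and right in tokens_b) or \
           (right in tokens_a and left in tokens_b):
            return False
    return True


print(base_name("CreateTest::it_creates_an_asset_with_valid_payload#1"))
print(safe_to_merge("CreateTest::creates_asset_success#0",
                    "CreateTest::creates_asset_fail#0"))
```

This tokenizer only splits snake_case names; camelCase method names would need an extra splitting step before the keyword check is useful.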
Research Compliance
Laravel Reductor addresses key research gaps identified in the literature:
- Scalability - Memory-efficient processing for enterprise test suites
- Artifact Availability - Complete open-source implementation
- Safety Validation - Multi-layered validation framework
- Production Integration - CI/CD pipeline support and DevOps tooling
For detailed analysis of the research alignment, see docs/RESEARCH_METHODOLOGY.md.
Acknowledgments
Special thanks to the authors of the Sebastian et al. paper for the foundational methodology and systematic analysis of the field.