shipmonk / copy-paste-detector
Finds duplicated PHP code structures using AST-based analysis, inspired by CloneDR to detect Type-2 (parameterized) code clones.
dev-master
2026-05-12 10:21 UTC
Requires
- php: ^8.1
- nikic/php-parser: ^4.19 || ^5.0
- symfony/console: ^5.4 || ^6.0 || ^7.0 || ^8.0
Requires (Dev)
- editorconfig-checker/editorconfig-checker: ^10.7.0
- ergebnis/composer-normalize: ^2.19.0
- phpstan/phpstan: 2.1.46
- phpstan/phpstan-phpunit: ^2.0
- phpstan/phpstan-strict-rules: ^2.0
- phpunit/phpunit: ^10.5.62
- sebastian/diff: ^4.0 || ^5.0 || ^6.0 || ^7.0
- shipmonk/coding-standard: ^0.2.1
- shipmonk/composer-dependency-analyser: ^1.8
- shipmonk/coverage-guard: ^1.0
- shipmonk/dead-code-detector: ^1.0
- shipmonk/name-collision-detector: ^2.1.1
- shipmonk/phpstan-rules: ^4.3
Suggests
- sebastian/diff: to use the --patch option to filter clones by a git diff
This package is auto-updated.
Last update: 2026-05-12 10:21:50 UTC
README
An AST-based structural code clone detector for PHP, inspired by the CloneDR methodology. This tool efficiently detects Type-2 (parameterized) code clones using AST analysis and hash-based exact matching.
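To make the idea of a Type-2 (parameterized) clone concrete, here is a minimal sketch of "normalize, then hash, then compare hashes". It uses PHP's built-in tokenizer instead of an AST, so it is only an approximation for illustration and not how this package is implemented. The function name fingerprint and the sample snippets are made up for this example.

```php
<?php

declare(strict_types=1);

/**
 * Token-based approximation of variable anonymization (hypothetical helper):
 * every variable name becomes the same placeholder, whitespace and comments
 * are dropped, and the remaining token stream is hashed.
 * A real AST-based detector works on syntax tree nodes instead.
 */
function fingerprint(string $phpSnippet): string
{
    $normalized = '';

    foreach (token_get_all('<?php ' . $phpSnippet) as $token) {
        if (!is_array($token)) {
            $normalized .= $token; // single-character tokens such as ";" or "{"
            continue;
        }

        [$id, $text] = $token;

        if (in_array($id, [T_OPEN_TAG, T_WHITESPACE, T_COMMENT, T_DOC_COMMENT], true)) {
            continue;
        }

        // Replace every variable name with one placeholder; keep all other tokens.
        $normalized .= ($id === T_VARIABLE ? '$_' : $text) . ' ';
    }

    return hash('sha256', $normalized);
}

$a = 'function total(array $items) { $sum = 0; foreach ($items as $item) { $sum += $item; } return $sum; }';
$b = 'function total(array $rows) { $acc = 0; foreach ($rows as $row) { $acc += $row; } return $acc; }';
$c = 'function total(array $rows) { $acc = 1; foreach ($rows as $row) { $acc *= $row; } return $acc; }';

var_dump(fingerprint($a) === fingerprint($b)); // bool(true): only variable names differ
var_dump(fingerprint($a) === fingerprint($c)); // bool(false): literal and operator differ
```

Mapping every variable to a single placeholder is coarser than consistent renaming, so this sketch would also equate snippets that use variables in different patterns. It is meant only to show why exact hash matching can still find renamed copies.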
Installation
composer require --dev shipmonk/copy-paste-detector
Basic Usage
vendor/bin/copy-paste-detector src/
CLI Options
- --config=config.php or -c config.php
  - Path to configuration file
  - Defaults to copy-paste-detector.php in current directory
  - Configuration file must return a CopyPasteDetector\Config\Config instance
- --min-node-count=100 or -m 100
  - Minimum number of AST nodes for a subtree to be considered
  - Defaults to 50
- --cache-dir=cache/
  - Directory for caching parsed structures
  - Defaults to system temp directory
- --patch=changes.patch
  - Path to a git diff/patch file (extension must be .patch or .diff)
  - Only reports clone groups with at least one instance fully inside the patch's added lines.
  - Requires sebastian/diff to be installed.
- --ansi / --no-ansi
  - Force enable or disable ANSI color output
  - By default, colors are auto-detected based on the terminal
Check if MR copied code from elsewhere
git diff master...HEAD > changes.patch
vendor/bin/copy-paste-detector --patch=changes.patch src/ tests/
- A clone group is reported if at least one instance lies fully inside the patch's added lines. The other instances may be either elsewhere in the codebase or also inside the patch.
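The reporting rule above can be sketched in a few lines of PHP. This is a sketch under stated assumptions, not the package's code: the function isGroupReported, the array shapes, and the file names are hypothetical and chosen only to illustrate what "fully inside the patch's added lines" means.

```php
<?php

declare(strict_types=1);

/**
 * Hypothetical model of the --patch rule.
 *
 * @param array<string, list<int>> $addedLines file path => line numbers added by the patch
 * @param list<array{file: string, start: int, end: int}> $instances all instances of one clone group
 */
function isGroupReported(array $addedLines, array $instances): bool
{
    foreach ($instances as $instance) {
        $added = $addedLines[$instance['file']] ?? [];
        $span = range($instance['start'], $instance['end']);

        // "Fully inside" means every line of the instance is an added line.
        if (array_diff($span, $added) === []) {
            return true;
        }
    }

    return false;
}

// Assume the patch added lines 10 to 30 of src/Invoice.php.
$addedLines = ['src/Invoice.php' => range(10, 30)];

$copiedIntoPatch = [
    ['file' => 'src/Invoice.php', 'start' => 12, 'end' => 25], // entirely new code
    ['file' => 'src/Order.php', 'start' => 40, 'end' => 53],   // pre-existing original
];
$preExisting = [
    ['file' => 'src/Invoice.php', 'start' => 5, 'end' => 15],  // only partly added
    ['file' => 'src/Order.php', 'start' => 40, 'end' => 50],
];

var_dump(isGroupReported($addedLines, $copiedIntoPatch)); // bool(true)
var_dump(isGroupReported($addedLines, $preExisting));     // bool(false)
```

The second group is skipped because its only instance touching the patch begins before the added lines, so it is not fully inside them.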
Configuration File
Create a copy-paste-detector.php file in your project root to configure detection settings:
<?php

use CopyPasteDetector\Config\Config;

$config = new Config();

// Set paths to analyze
$config->setPaths(['src/', 'tests/']);

// Set the minimum node count for clone detection
$config->setMinNodeCount(50);

// Set cache directory (optional, defaults to system temp directory)
$config->setCacheDir('cache/copy-paste-detector/');

// Exclude paths from analysis
$config->setExcludePaths(['tests/_fixtures', 'src/Generated/']);

// Enable clickable links to your IDE
$config->setEditorUrl('phpstorm://open?file={file}&line={line}');

// Configure anonymization strategies
$config->setAnonymizeVariables(true); // treat variable names like `$foo` and `$bar` as equivalent
$config->setAnonymizeLiterals(false); // treat string and number literals as equivalent
$config->setAnonymizeNames(false); // treat function and class names as equivalent
$config->setAnonymizeIdentifiers(false); // treat method and constant names as equivalent

return $config;
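As a variation, the anonymization switches can be combined to widen what counts as a clone. The fragment below is a sketch that uses only the setters shown above. The chosen values are an assumption for illustration, and how many additional clone groups they surface depends on the codebase.

```php
<?php

use CopyPasteDetector\Config\Config;

$config = new Config();

$config->setPaths(['src/']);

// Larger threshold (illustrative value) to offset the looser matching below
$config->setMinNodeCount(100);

// Looser matching: ignore differences in variable names and in literals,
// but still require function, class, method and constant names to agree
$config->setAnonymizeVariables(true);
$config->setAnonymizeLiterals(true);
$config->setAnonymizeNames(false);
$config->setAnonymizeIdentifiers(false);

return $config;
```

Raising the minimum node count together with looser matching is one way to keep short, trivially similar snippets out of the report. Whether that trade-off suits a given project is a judgment call.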
Contributing
- Check your code by running composer check
- Autofix coding-style by running composer fix:cs
- All functionality must be tested