phpcpd-next scans your PHP code and reports blocks that have been copy/pasted from one place to another — the kind of duplication that's easy to miss in review and painful to keep in sync later.
It's maintained by Luciano Federico Pereira as a successor to Sebastian Bergmann's phpcpd (archived), and stays a drop-in replacement with the same phpcpd command.
What's new is that it catches more than word-for-word copies: it also flags duplicates where the lines were reordered, or where a statement was added or removed between two otherwise identical blocks:
- Three detection engines: Rabin-Karp (exact), TokenBag (reordered), and an opt-in suffix tree (gapped Type-3), with rename-insensitive --fuzzy matching on top
- Four output formats: console text, PMD-CPD XML, JSON, and SARIF 2.1.0 for GitHub Code Scanning
- A headless API for calling detection in-process, plus a PHPUnit trait that turns duplication into a test assertion
- CI features: meaningful exit codes, full result caching, and per-file incremental indexing
- Framework presets including Laravel, with CLI flags that override preset defaults
- PHP 8.5+ with zero Composer runtime dependencies and deterministic results
Three Engines, Run Together by Default
Most copy/paste detectors only find exact duplication. phpcpd-next runs Rabin-Karp (exact contiguous matches) and TokenBag (order-invariant overlap, so shuffled statements still register) together on every default run. A suffix-tree engine for gapped clones — where a statement was inserted or removed between otherwise identical blocks — is opt-in:
    # Default: exact + reordered detection
    phpcpd src/

    # Rabin-Karp only (faster, no reorder detection)
    phpcpd --rk src/

    # Gapped Type-3 clones via suffix tree
    phpcpd --algorithm=suffixtree src/
The console output points at the duplicated ranges and suggests a refactor rather than just listing line numbers:
    Found 2 code clones with 21 duplicated lines in 2 files:

      - app/Services/Billing.php:12-33 (21 lines)
        app/Services/Invoicing.php:40-61
        → Consider extracting the shared lines into a reusable method or constant.

    37.50% duplicated lines out of 56 total lines of code.
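To make the difference between the two default engines concrete, here is a minimal sketch of both ideas on plain token arrays. It is not phpcpd-next's implementation: the function names (`windowHashes`, `sharesExactWindow`, `bagOverlap`), the hash parameters, and the overlap ratio are all assumptions chosen for illustration, and it assumes 64-bit PHP.

```php
<?php
declare(strict_types=1);

/** Rabin-Karp style rolling hash of every $k-token window. */
function windowHashes(array $tokens, int $k): array
{
    $base = 257;
    $mod  = 1_000_000_007;
    $n    = count($tokens);
    if ($k < 1 || $n < $k) {
        return [];
    }

    // Give each token a numeric id so it can enter the polynomial hash.
    $ids = array_map(static fn (string $t): int => crc32($t) % $mod, $tokens);

    // base^(k-1) mod m: the weight of the token that leaves the window.
    $high = 1;
    for ($i = 1; $i < $k; $i++) {
        $high = ($high * $base) % $mod;
    }

    $hash = 0;
    for ($i = 0; $i < $k; $i++) {
        $hash = ($hash * $base + $ids[$i]) % $mod;
    }

    $hashes = [$hash];
    for ($i = $k; $i < $n; $i++) {
        // Slide by one token: drop the oldest, append the newest.
        $hash = ($hash - ($ids[$i - $k] * $high) % $mod + $mod) % $mod;
        $hash = ($hash * $base + $ids[$i]) % $mod;
        $hashes[] = $hash;
    }

    return $hashes;
}

/** True when $a and $b share at least one identical run of $k tokens. */
function sharesExactWindow(array $a, array $b, int $k): bool
{
    $seen = [];
    foreach (windowHashes($a, $k) as $posA => $hash) {
        $seen[$hash][] = $posA;
    }
    foreach (windowHashes($b, $k) as $posB => $hash) {
        foreach ($seen[$hash] ?? [] as $posA) {
            // Compare the tokens themselves to rule out hash collisions.
            if (array_slice($a, $posA, $k) === array_slice($b, $posB, $k)) {
                return true;
            }
        }
    }
    return false;
}

/** Order-invariant overlap: shared token count over the larger block size. */
function bagOverlap(array $a, array $b): float
{
    $countsA = array_count_values($a);
    $countsB = array_count_values($b);
    $shared  = 0;
    foreach ($countsA as $token => $count) {
        $shared += min($count, $countsB[$token] ?? 0);
    }
    return $shared / max(count($a), count($b), 1);
}

// Same three statements, in a different order.
$original = ['$a', '=', '1', ';', '$b', '=', '2', ';', '$c', '=', '3', ';'];
$shuffled = ['$c', '=', '3', ';', '$a', '=', '1', ';', '$b', '=', '2', ';'];

var_dump(sharesExactWindow($original, $shuffled, 12)); // false
var_dump(bagOverlap($original, $shuffled));            // 1.0
```

With a 12-token window the contiguous match fails because the statements moved, while the bag comparison still reports full overlap. That is the gap a reorder-aware engine is meant to close.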
SARIF Output for GitHub Code Scanning
Alongside PMD-CPD XML and JSON, phpcpd-next writes SARIF 2.1.0, so clones show up in the GitHub Security tab. Inconsistent (diverged) clones map to warning severity and exact clones to note:
    - name: Detect duplicated code
      run: vendor/bin/phpcpd --log-sarif=phpcpd.sarif src/ || true

    - name: Upload results
      uses: github/codeql-action/upload-sarif@v3
      with:
        sarif_file: phpcpd.sarif
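For readers unfamiliar with SARIF, the sketch below shows roughly what one clone could look like as a SARIF 2.1.0 result, including the severity mapping described above. The field names under `runs` and `results` come from the SARIF specification. The input array shape, the rule id `duplicated-code`, the driver name, and the message text are hypothetical and may differ from what phpcpd-next actually emits.

```php
<?php
declare(strict_types=1);

/** Severity mapping from the article: diverged clones warn, exact clones are notes. */
function sarifLevel(bool $diverged): string
{
    return $diverged ? 'warning' : 'note';
}

/** Build one SARIF result. The $clone array shape is an assumption. */
function cloneToSarifResult(array $clone): array
{
    return [
        'ruleId'    => 'duplicated-code', // hypothetical rule id
        'level'     => sarifLevel($clone['diverged']),
        'message'   => ['text' => 'Duplicated code also found at ' . $clone['otherLocation']],
        'locations' => [[
            'physicalLocation' => [
                'artifactLocation' => ['uri' => $clone['file']],
                'region'           => [
                    'startLine' => $clone['startLine'],
                    'endLine'   => $clone['endLine'],
                ],
            ],
        ]],
    ];
}

/** Wrap results in the minimal SARIF 2.1.0 envelope. */
function sarifLog(array $clones): array
{
    return [
        '$schema' => 'https://json.schemastore.org/sarif-2.1.0.json',
        'version' => '2.1.0',
        'runs'    => [[
            'tool'    => ['driver' => ['name' => 'phpcpd']], // assumed driver name
            'results' => array_map('cloneToSarifResult', $clones),
        ]],
    ];
}

$log = sarifLog([[
    'file'          => 'app/Services/Billing.php',
    'startLine'     => 12,
    'endLine'       => 33,
    'otherLocation' => 'app/Services/Invoicing.php:40-61',
    'diverged'      => false,
]]);

echo json_encode($log, JSON_PRETTY_PRINT | JSON_UNESCAPED_SLASHES), "\n";
```

GitHub uses the `level` of each result to decide how the alert is displayed, which is why the exact versus diverged distinction matters in the report.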
Headless API and PHPUnit Assertions
Beyond the CLI, detection runs in-process through a static detect() call — no shelling out, no report files:
    use LucianoPereira\PhpcpdNext\Phpcpd;

    $clones = Phpcpd::detect(
        paths: 'app',
        minTokens: 60,
        algorithm: null, // null = Rabin-Karp + TokenBag
        preset: 'laravel',
    );

    foreach ($clones as $clone) {
        echo $clone->numberOfLines(), " lines\n";
    }
A bundled trait turns that into a test, so duplication becomes a regression check that fails with clone locations:
    use LucianoPereira\PhpcpdNext\PHPUnit\AssertNoDuplication;
    use PHPUnit\Framework\TestCase;

    final class DuplicationTest extends TestCase
    {
        use AssertNoDuplication;

        public function test_app_is_dry(): void
        {
            $this->assertNoDuplication(__DIR__ . '/../app', minTokens: 70);
        }
    }
Incremental Caching for CI
For larger codebases, --cache stores results keyed by a configuration fingerprint and file-manifest hash, replaying the cached result when nothing changed. --incremental goes further, re-tokenizing only changed files and reusing the rest from a per-file index (Rabin-Karp only), printing a summary like (incremental index: 412 reused, 3 scanned):
    - uses: actions/cache@v4
      with:
        path: .phpcpd-cache
        key: phpcpd-${{ hashFiles('**/*.php') }}
        restore-keys: phpcpd-

    - run: vendor/bin/phpcpd --incremental --cache-dir .phpcpd-cache src/
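The two caching levels can be pictured with a small sketch. It assumes SHA-256 digests, an in-memory `path => contents` map in place of real file reads, and hypothetical function names (`configFingerprint`, `manifestHash`, `cacheKey`, `planIncremental`). The tool's actual key format and index layout are not documented here and may differ.

```php
<?php
declare(strict_types=1);

/** Fingerprint of the effective options, independent of key order. */
function configFingerprint(array $options): string
{
    ksort($options);
    return hash('sha256', json_encode($options, JSON_THROW_ON_ERROR));
}

/** One hash over every path and its content digest. $files maps path => contents. */
function manifestHash(array $files): string
{
    ksort($files);
    $ctx = hash_init('sha256');
    foreach ($files as $path => $contents) {
        hash_update($ctx, $path . "\0" . hash('sha256', $contents) . "\n");
    }
    return hash_final($ctx);
}

/** Full-result cache key: any option or file change produces a new key. */
function cacheKey(array $options, array $files): string
{
    return configFingerprint($options) . '-' . manifestHash($files);
}

/** Per-file plan: reuse entries whose digest is unchanged, rescan the rest. */
function planIncremental(array $previousIndex, array $files): array
{
    $plan = ['reused' => [], 'scanned' => []];
    foreach ($files as $path => $contents) {
        $unchanged = ($previousIndex[$path] ?? null) === hash('sha256', $contents);
        $plan[$unchanged ? 'reused' : 'scanned'][] = $path;
    }
    return $plan;
}

$options = ['minTokens' => 70, 'fuzzy' => false];
$before  = ['src/A.php' => '<?php echo 1;', 'src/B.php' => '<?php echo 2;'];
$after   = ['src/A.php' => '<?php echo 1;', 'src/B.php' => '<?php echo 3;'];

// Index written by the previous run: path => content digest.
$index = array_map(static fn (string $c): string => hash('sha256', $c), $before);
$plan  = planIncremental($index, $after);

printf(
    "(incremental index: %d reused, %d scanned)\n",
    count($plan['reused']),
    count($plan['scanned'])
);
```

The whole-result key is all or nothing: one edited file invalidates it. The per-file plan is what lets a run with a handful of changed files skip most of the tokenizing work.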
Installation
The tool requires PHP 8.5+, ext-dom, and ext-mbstring, and installs as a dev dependency:
    composer require --dev phpcpd-next/phpcpd
    vendor/bin/phpcpd src/
A Laravel preset scans app, routes, database, and config while excluding vendor code, Blade views, migrations, and IDE-helper files:
vendor/bin/phpcpd --preset=laravel app/Services --min-tokens=60
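The command above relies on CLI flags taking precedence over preset defaults. A minimal sketch of that precedence rule follows. The option names, exclusion patterns, and the default `minTokens` value are placeholders invented for illustration, not the preset's real contents.

```php
<?php
declare(strict_types=1);

// Hypothetical preset: keys, patterns, and the minTokens default are placeholders.
const LARAVEL_PRESET = [
    'paths'     => ['app', 'routes', 'database', 'config'],
    'exclude'   => ['vendor', 'resources/views', 'database/migrations', '_ide_helper.php'],
    'minTokens' => 70,
];

/** Explicit CLI values win; flags that were not passed (null) keep the preset value. */
function resolveOptions(array $preset, array $cli): array
{
    $explicit = array_filter($cli, static fn ($value): bool => $value !== null);
    return array_replace($preset, $explicit);
}

// Mirrors: phpcpd --preset=laravel app/Services --min-tokens=60
$options = resolveOptions(LARAVEL_PRESET, [
    'paths'     => ['app/Services'],
    'exclude'   => null, // not passed on the command line
    'minTokens' => 60,
]);

print_r($options);
```

Here the scan is narrowed to `app/Services` and the token threshold drops to 60, while the preset's exclusions stay in effect because no exclusion flag was given.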
You can find the source and full documentation on GitHub.