vangelis/repophp

RepoPHP is a PHP package that packs a repository into a single AI-friendly file for LLM processing.

0.7.2 2025-03-03 21:54 UTC

Last update: 2025-04-04 20:44:52 UTC


RepoPHP

RepoPHP is a PHP package that packs a repository into a single AI-friendly file for LLM processing, similar to [repomix](https://github.com/yamadashy/repomix).

Demo

[Image: RepoPHP Demo]

Installation

You can install the package via Composer as a development dependency:

composer require vangelis/repophp --dev

Configuration File

RepoPHP supports a configuration file to set default options. Create one of these files in your project directory:

  • .repophp.json
  • repophp.json
  • .repophp.config.json
  • repophp.config.json

Example configuration file:

{
  "repository": "/path/to/repository",
  "output": "packed_repo.txt",
  "format": "markdown",
  "encoding": "cl100k_base",
  "exclude": [".env.local", "*.log"],
  "no-gitignore": false,
  "compress": true,
  "max-tokens": 100000,
  "remote": false,
  "branch": "main",
  "incremental": false,
  "base-file": null
}

With this configuration file in place, you can simply run:

vendor/bin/repophp pack

Command-line arguments will override settings from the configuration file.
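
To check which configuration file a project currently provides, the four file names listed above can be probed in turn. This is a minimal sketch, not RepoPHP's own lookup code: the function name find_repophp_config is hypothetical, and the precedence (first match in the listed order) is an assumption, since the README does not say which file wins when several exist.

```shell
# Hypothetical helper: print the first RepoPHP config file found in a directory.
# Assumption: files are probed in the order the README lists them.
find_repophp_config() {
  dir="${1:-.}"
  for name in .repophp.json repophp.json .repophp.config.json repophp.config.json; do
    if [ -f "$dir/$name" ]; then
      printf '%s\n' "$dir/$name"
      return 0
    fi
  done
  return 1  # no configuration file present
}
```

For example, `find_repophp_config . || echo "no config file found"` reports whether the current directory contains any of the supported configuration files.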

Usage

Pack Command Usage

The pack command packages a local repository directory, or a remote Git repository when --remote is given, into a single file suitable for processing by AI-based systems.

Available Options for the pack Command:

Required Arguments:

  • output: The path to the output file where the packed content will be stored.

  • repository: The path to the repository directory that you want to pack or a remote repository URL if used with the --remote flag.

Optional Flags and Settings:

  • --remote, -rem: Treat the repository argument as a remote Git repository URL.

  • --branch <branch>, -bra (default: main): Branch to check out when cloning a remote repository.

  • --format <plain|markdown|json|xml>, -fmt (default: plain): Specifies the format of the output file. Supported formats:

    • plain: Plain text format.
    • markdown: Markdown format for better readability.
    • json: JSON format for structured data.
    • xml: XML format for structured data.

  • --encoding <encoding>, -enc (default: p50k_base): Token encoding to use (cl100k_base, p50k_base, r50k_base, p50k_edit).

  • --exclude <pattern>, -exc: Additional file patterns to exclude during the packing process. These patterns are added to the default exclusions (e.g., .env, composer.lock). This option can be used multiple times to add multiple patterns.

  • --max-tokens <number>, -max (default: 0): Maximum number of tokens per output file. If set to a positive number, the output is split into multiple files when the token limit is reached. The default of 0 means no limit.

  • --no-gitignore, -nog: If this flag is provided, .gitignore files will not be used to exclude files.

  • --compress, -com: Remove comments and empty lines from files for more compact output.
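
The effect of --max-tokens can be illustrated with a greedy grouping over per-file token counts. This is a sketch under stated assumptions, not RepoPHP's actual splitting logic: the function name assign_parts and the input format (one "<tokens> <path>" pair per line, paths without spaces) are hypothetical. How RepoPHP treats a single file that exceeds the limit is not documented, so here such a file simply ends up in a part of its own.

```shell
# Hypothetical sketch: assign files to numbered output parts so that each part
# stays within a token budget. Reads "<tokens> <path>" lines from stdin and
# prints "<part> <path>" lines. A budget of 0 means no limit (single part).
assign_parts() {
  awk -v max="$1" '
    BEGIN { part = 1; used = 0 }
    {
      # Start a new part when adding this file would exceed the budget.
      if (max > 0 && used > 0 && used + $1 > max) { part++; used = 0 }
      used += $1
      print part, $2
    }'
}
```

Running `printf '60 a.php\n50 b.php\n30 c.php\n' | assign_parts 100` places a.php in part 1 and both b.php and c.php in part 2.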

Example Usage

Local repository:

vendor/bin/repophp pack output.txt /path/to/repository --format=json --exclude="*.log" --exclude=".env.local" --no-gitignore --compress

Remote repository:

vendor/bin/repophp pack output.txt https://github.com/username/repository.git --remote --branch develop

With short option names:

vendor/bin/repophp pack output.txt /path/to/repo -fmt json -com -enc cl100k_base

Complex example:

vendor/bin/repophp pack output.txt https://github.com/vangelis183/repophp.git --remote --branch main --compress --encoding cl100k_base --format plain

Split large repository by token count:

vendor/bin/repophp pack output.txt /path/to/repo --max-tokens=100000 --encoding cl100k_base

Incremental packing (diff mode):

vendor/bin/repophp pack output.txt /path/to/repo --incremental --base-file=/path/to/previous/pack.txt

This will create a diff file containing only files that have changed since the previous pack, which is especially useful for large repositories where you only want to analyze recent changes.
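
One way to picture "files that have changed since the previous pack" is to compare checksum manifests of two snapshots. The sketch below is an illustration only, not a description of how RepoPHP computes its diff: the function names make_manifest and changed_files and the manifest format are hypothetical, and it relies only on the POSIX find, cksum, and awk utilities.

```shell
# Hypothetical sketch: write one "<checksum> <size> <path>" line per file.
make_manifest() {
  (cd "$1" && find . -type f -exec cksum {} +)
}

# Print paths whose manifest line is new or different compared to the old one.
# $1 = old manifest file, $2 = new manifest file
# Note: files deleted since the old manifest are not reported.
changed_files() {
  awk '
    FILENAME == ARGV[1] { seen[$0] = 1; next }   # remember every old line
    !($0 in seen) { sub(/^[0-9]+ [0-9]+ /, ""); print }
  ' "$1" "$2"
}
```

A manifest taken before and after a change, for instance with `make_manifest /path/to/repo > new.manifest`, can then be compared with `changed_files old.manifest new.manifest`.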

Breakdown:

  • Packs the repository located at /path/to/repository, or clones the remote repository URL when --remote is given.
  • Stores the packed content in the output file (output.txt in the examples above).
  • Uses the specified output format (plain, markdown, json, or xml).
  • Excludes files matching the specified patterns.
  • Ignores .gitignore rules when --no-gitignore is given.
  • Compresses the output by removing comments and empty lines when --compress is given.
  • For remote repositories, checks out the specified branch.
  • Uses the specified token encoding to calculate token counts.

Additional Behavior

  • Overwrite Handling:
    If the output file already exists, you are prompted to confirm the overwrite. If you decline, a new file is created with a timestamp appended to its name.

  • Supported Formats:
    The following formats are supported (as defined in RepoPHP):

    • plain
    • markdown
    • json
    • xml

  • Default Exclusions:
    Some files are excluded automatically during the packing process (e.g., .env, composer.lock, and other commonly ignored files). The list of default exclusions can be found in the RepoPHP class.
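
The fallback naming described under Overwrite Handling can be sketched as follows. The exact timestamp format and its position in the file name are not documented, so both are assumptions here (timestamp inserted before the extension, in YYYYmmdd_HHMMSS form), and timestamped_path is a hypothetical helper rather than part of RepoPHP.

```shell
# Hypothetical sketch: derive a non-clashing output name by adding a timestamp.
# $1 = original output path, $2 = optional timestamp (defaults to the current time)
timestamped_path() {
  path="$1"
  stamp="${2:-$(date +%Y%m%d_%H%M%S)}"
  dir=$(dirname "$path")
  base=$(basename "$path")
  case "$base" in
    ?*.*) name="${base%.*}"; ext=".${base##*.}" ;;  # split off the last extension
    *)    name="$base"; ext="" ;;                   # no extension (or a dotfile)
  esac
  printf '%s/%s_%s%s\n' "$dir" "$name" "$stamp" "$ext"
}
```

With an explicit timestamp, `timestamped_path output.txt 20250101_120000` prints `./output_20250101_120000.txt`.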

Error Handling

The pack command gracefully handles errors such as:

  • Invalid repository paths.
  • Invalid output paths.
  • Unsupported output formats.
  • Failures in creating or writing the output file.

If any error occurs, an appropriate error message will be shown in the console.
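
Some of these failures can be caught before invoking the command at all. The following pre-flight check for a local repository is a sketch, not part of RepoPHP: the function name preflight_pack and its messages are hypothetical, and treating an output path as valid when its parent directory is writable is an assumption about what "invalid output path" means.

```shell
# Hypothetical pre-flight check for a local pack run.
# $1 = output file, $2 = repository directory, $3 = format (default: plain)
preflight_pack() {
  output="$1"; repository="$2"; format="${3:-plain}"

  if [ ! -d "$repository" ]; then
    echo "repository path is not a directory: $repository" >&2
    return 1
  fi

  # Assumption: an output path is usable when its parent directory is writable.
  outdir=$(dirname "$output")
  if [ ! -d "$outdir" ] || [ ! -w "$outdir" ]; then
    echo "output directory is not writable: $outdir" >&2
    return 1
  fi

  # Only the four formats listed in this README are accepted.
  case "$format" in
    plain|markdown|json|xml) ;;
    *) echo "unsupported format: $format" >&2; return 1 ;;
  esac
  return 0
}
```

It can be chained in front of the real command, for example `preflight_pack output.txt /path/to/repository json && vendor/bin/repophp pack output.txt /path/to/repository --format=json`.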

Testing

composer test

Changelog

Please see CHANGELOG for more information on what has changed recently.

ToDos

  • Move settings to configuration
  • Git Repository Information
  • Directory structure
  • More tests
  • Token Count for each file and entire repo
  • Consider different encodings
  • Add compression (Comments etc.)
  • Add option for remote Git Repositories
  • Add option for specific branch
  • Add repository splitting for large codebases
  • Implement incremental/diff-based packing
  • Add editable custom config to override defaults
  • Add security checks for files (Keys, Passwords etc.)
  • Create advanced filtering options (by date, content)
  • Add repository analytics and metrics
  • Implement model-specific optimization profiles
  • Develop CI/CD integration options
  • Build interactive CLI mode

Ideas

Contributing

Please see CONTRIBUTING for details.

Security Vulnerabilities

If you've found a security-related bug, please use the issue tracker.

Credits

License

The MIT License (MIT). Please see License File for more information.