ivanbaric / sanigen
Declarative sanitization and attribute generators for Eloquent models.
Requires
- php: ^8.2
- illuminate/support: ^12.0
Requires (Dev)
- laravel/pint: ^1.0
- orchestra/testbench: ^10.0
- pestphp/pest: ^3.8
- phpstan/phpstan: ^1.10
Sanigen is a powerful Laravel package that provides declarative sanitization and attribute generation for Eloquent models. With Sanigen, you can:
- Automatically sanitize model attributes using predefined or custom sanitizers
- Generate values for model attributes during creation
- Define reusable sanitization pipelines through configuration aliases
- Maintain clean, consistent data across your application with minimal effort
Table of Contents
- Installation
- Basic Configuration
- Quick Start
- Sanitizers
- Generators
- Configuration
- Advanced Usage
- Contributing
- License
Installation
Requirements
- PHP 8.2 or higher
- Laravel 12.0 or higher
Via Composer
composer require ivanbaric/sanigen
The package will automatically register its service provider if you're using Laravel's package auto-discovery.
Publish Configuration
Publish the configuration file to customize sanitizers, generators, and other settings:
php artisan vendor:publish --provider="IvanBaric\Sanigen\SanigenServiceProvider" --tag="config"
This will create a config/sanigen.php file in your application.
Basic Configuration
After publishing the configuration, you can customize the package behavior in config/sanigen.php:
    return [
        // Enable or disable the package functionality
        'enabled' => true,

        // Define sanitization aliases (pipelines of sanitizers)
        'sanitization_aliases' => [
            'text:clean' => 'trim|strip_tags|remove_newlines|single_space',
            // ... more aliases
        ],

        // Configure allowed HTML tags for sanitizers that strip tags
        'allowed_html_tags' => '<p><b><i><strong><em><ul><ol><li><br><a><h1><h2><h3><h4><h5><h6><table><tr><td><th><thead><tbody><code><pre><blockquote><q><cite><hr><dl><dt><dd>',

        // Set default encoding for sanitizers
        'encoding' => 'UTF-8',
    ];
Quick Start
Add the Sanigen trait to your Eloquent model and define sanitization rules and generators:
    use Illuminate\Database\Eloquent\Model;
    use IvanBaric\Sanigen\Traits\Sanigen;

    class Post extends Model
    {
        use Sanigen;

        // Define attributes to be generated on model creation
        protected $generate = [
            'slug' => 'slugify:title',
            'uuid' => 'uuid',
        ];

        // Define sanitization rules for attributes
        protected $sanitize = [
            'title' => 'text:title',
            'content' => 'text:secure',
            'email' => 'email:clean',
        ];
    }
Now, when you create or update a Post model:
    // Creating a new post
    $post = Post::create([
        'title' => ' My First Post ',
        'content' => '<script>alert("XSS")</script><p>This is my content</p>',
        'email' => ' USER@EXAMPLE.COM ',
    ]);

    // The model will automatically:
    // 1. Generate a slug: 'my-first-post'
    // 2. Generate a UUID
    // 3. Sanitize the title: 'My first post'
    // 4. Sanitize the content by removing the script tag: '<p>This is my content</p>'
    // 5. Sanitize the email: 'user@example.com'
Sanitizers
Sanitizers clean and transform string values to ensure data consistency and security.
Using Sanitizers in Models
Define sanitization rules in your model using the $sanitize property:
    protected $sanitize = [
        'attribute_name' => 'sanitizer_name',
        'another_attribute' => 'sanitizer1|sanitizer2|sanitizer3',
        'complex_attribute' => 'text:secure', // Using a predefined alias
    ];
Sanitization is automatically applied when creating a model (if using the Sanigen trait) and when updating a model (always).
Available Sanitizers
Sanigen includes many built-in sanitizers for common use cases:
| Sanitizer | Description | Example |
|---|---|---|
| alpha_dash | Keeps letters, numbers, hyphens, underscores | "Hello-123_!" → "Hello-123_" |
| alpha_only | Keeps only letters | "Hello123" → "Hello" |
| alphanumeric_only | Keeps only letters and numbers | "Hello123!" → "Hello123" |
| ascii_only | Keeps only ASCII characters | Removes non-ASCII Unicode characters |
| decimal_only | Keeps only digits and decimal point | "Price: $123.45" → "123.45" |
| email | Sanitizes email addresses | " USER@EXAMPLE.COM " → "user@example.com" |
| emoji_remove | Removes all emoji characters | Strips Unicode emoji blocks |
| escape | Converts special characters to HTML entities | "<script>" → "&lt;script&gt;" |
| htmlspecialchars | HTML5-compatible special char conversion | Similar to escape but with HTML5 support |
| json_escape | Escapes characters for JSON | Escapes quotes, backslashes, etc. |
| lower | Converts to lowercase | "Hello" → "hello" |
| no_html | Removes all HTML tags | "<p>Hello</p>" → "Hello" |
| no_js | JavaScript removal and XSS protection | Removes scripts, event handlers, alert() functions, etc. |
| numeric_only | Keeps only digits | "Price: $123.45" → "12345" |
| phone | Sanitizes phone numbers (E.164) | "(123) 456-7890" → "+1234567890" |
| remove_newlines | Removes all line breaks | Converts multi-line text to single line |
| single_space | Normalizes whitespace to single spaces | "Hello    World" → "Hello World" |
| slug | Creates URL-friendly slug | "Hello World" → "hello-world" |
| strip_tags | Removes HTML tags except allowed ones | "<script>Hello</script>" → "Hello" |
| trim | Removes whitespace from beginning and end | " Hello " → "Hello" |
| ucfirst | Capitalizes first character | "hello" → "Hello" |
| upper | Converts to uppercase | "hello" → "HELLO" |
| url | Ensures URLs have a protocol | "example.com" → "https://example.com" |
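To make a few of the table rows concrete, here is a minimal sketch in plain PHP that approximates four of the character-filtering sanitizers with regular expressions. The function names are hypothetical stand-ins and this is not Sanigen's own code; the package's sanitizers may handle edge cases (Unicode, encoding) differently.

```php
<?php

declare(strict_types=1);

// Hypothetical stand-ins, not Sanigen's own classes: each function
// approximates one built-in sanitizer with a single regular expression.

/** Approximates numeric_only: drop everything that is not a digit. */
function numericOnly(string $value): string
{
    return preg_replace('/\D+/', '', $value) ?? '';
}

/** Approximates decimal_only: keep digits and the decimal point. */
function decimalOnly(string $value): string
{
    return preg_replace('/[^0-9.]+/', '', $value) ?? '';
}

/** Approximates alpha_dash: keep letters, digits, hyphens, underscores. */
function alphaDash(string $value): string
{
    return preg_replace('/[^\pL\pN_-]+/u', '', $value) ?? '';
}

/** Approximates single_space: collapse whitespace runs to one space. */
function singleSpace(string $value): string
{
    return preg_replace('/\s+/u', ' ', $value) ?? '';
}

echo numericOnly('Price: $123.45'), PHP_EOL;   // 12345
echo decimalOnly('Price: $123.45'), PHP_EOL;   // 123.45
echo alphaDash('Hello-123_!'), PHP_EOL;        // Hello-123_
echo singleSpace("Hello    World"), PHP_EOL;   // Hello World
```

The inputs are the same ones used in the Example column, so the printed values can be compared against the table directly.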
Sanitization Aliases
One of the most powerful features of Sanigen is the ability to define sanitization aliases - reusable pipelines of sanitizers that can be applied together.
The package comes with many predefined aliases in the configuration:
    // Example aliases from config/sanigen.php
    'sanitization_aliases' => [
        'text:clean' => 'strip_tags|remove_newlines|trim|single_space',
        'text:secure' => 'no_html|no_js|emoji_remove|trim|single_space',
        'text:title' => 'no_html|no_js|emoji_remove|remove_newlines|trim|single_space|lower|ucfirst',
        'email:clean' => 'trim|lower|email',
        'url:clean' => 'trim|remove_newlines|no_js',
        'url:secure' => 'trim|remove_newlines|no_js|url',
        'number:integer' => 'trim|numeric_only',
        'number:decimal' => 'trim|decimal_only',
        'phone:clean' => 'trim|phone',
        'text:alpha_dash' => 'trim|lower|alpha_dash',
        'json:escape' => 'trim|json_escape',
        // ... and more
    ],
Use these aliases in your models:
    protected $sanitize = [
        'title' => 'text:title',
        'content' => 'text:secure',
        'email' => 'email:clean',
        'website' => 'url:secure',
    ];
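Conceptually, an alias is just a name that expands to a pipe-delimited list of sanitizers applied left to right. The sketch below illustrates that idea in plain PHP; `applyPipeline` and the three closures are hypothetical and are not how Sanigen is implemented internally.

```php
<?php

declare(strict_types=1);

/**
 * Expands aliases, then applies each sanitizer step in order.
 * Hypothetical helper for illustration only.
 *
 * @param array<string, string>   $aliases    alias => pipe-delimited pipeline
 * @param array<string, callable> $sanitizers name => fn(string): string
 */
function applyPipeline(string $value, string $rule, array $aliases, array $sanitizers): string
{
    foreach (explode('|', $rule) as $step) {
        if (isset($aliases[$step])) {
            // An alias expands to its own pipeline (a circular alias
            // would recurse forever; this sketch does not guard against it).
            $value = applyPipeline($value, $aliases[$step], $aliases, $sanitizers);
        } elseif (isset($sanitizers[$step])) {
            $value = $sanitizers[$step]($value);
        } else {
            throw new InvalidArgumentException("Unknown sanitizer: {$step}");
        }
    }

    return $value;
}

$aliases = ['email:clean' => 'trim|lower|email'];

// Simplified stand-ins for the built-in sanitizers of the same names.
$sanitizers = [
    'trim'  => fn (string $v): string => trim($v),
    'lower' => fn (string $v): string => strtolower($v),
    'email' => fn (string $v): string => (string) filter_var($v, FILTER_SANITIZE_EMAIL),
];

echo applyPipeline(' USER@EXAMPLE.COM ', 'email:clean', $aliases, $sanitizers), PHP_EOL;
// user@example.com
```

Because each step receives the output of the previous one, order matters: running trim before email, for example, means the email step never sees the surrounding whitespace.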
Creating Custom Sanitizers
You can create your own sanitizers by implementing the Sanitizer interface:
    namespace App\Sanitizers;

    use IvanBaric\Sanigen\Sanitizers\Contracts\Sanitizer;

    class MyCustomSanitizer implements Sanitizer
    {
        public function apply(string $value): string
        {
            // Transform the value (placeholder: replace with your own logic)
            $transformed = $value;

            return $transformed;
        }
    }
Register your custom sanitizer:
    // In a service provider
    use IvanBaric\Sanigen\Registries\SanitizerRegistry;

    SanitizerRegistry::register('my_custom', \App\Sanitizers\MyCustomSanitizer::class);
Then use it in your models:
    protected $sanitize = [
        'attribute' => 'my_custom',
    ];
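As a concrete illustration, a custom sanitizer might look like the sketch below. `CollapseDashes` is a hypothetical example, and the local `Sanitizer` interface is only a stand-in that mirrors the `apply(string $value): string` signature shown above so the snippet runs without the package installed; in an application you would import the package's interface and add your own namespace instead.

```php
<?php

declare(strict_types=1);

// Stand-in for IvanBaric\Sanigen\Sanitizers\Contracts\Sanitizer so the
// sketch runs on its own. In a real app, remove this and import the
// package's interface.
interface Sanitizer
{
    public function apply(string $value): string;
}

// Hypothetical sanitizer: collapses repeated hyphens and trims them
// from both ends of the value.
final class CollapseDashes implements Sanitizer
{
    public function apply(string $value): string
    {
        $collapsed = preg_replace('/-{2,}/', '-', $value) ?? $value;

        return trim($collapsed, '-');
    }
}

echo (new CollapseDashes())->apply('--hello---world--'), PHP_EOL;
// hello-world
```

Keeping a sanitizer a pure string-to-string transformation like this makes it easy to unit test and safe to chain with other sanitizers in a pipeline.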
Generators
Generators automatically create values for model attributes during creation.
Using Generators in Models
Define generators in your model using the $generate property:
    protected $generate = [
        'attribute_name' => 'generator_name',
        'parameterized_attribute' => 'generator:parameter',
    ];
Generators are applied only when creating a model, and only if the attribute is empty.
Available Generators
Sanigen includes several built-in generators:
| Generator | Description | Example |
|---|---|---|
| autoincrement | Increments from the highest existing value | 1, 2, 3, ... |
| carbon:+7 days | Creates a date with the specified offset | Current date + 7 days |
| random_string:8 | Generates a random string of specified length | "a1b2c3d4" (random string) |
| slugify:field | Creates a unique slug from another field (ensures uniqueness with configurable suffix types) | "my-post-title", "my-post-title-2023-07-20" |
| ulid | Generates a ULID (sortable identifier) | "01F8MECHZX3TBDSZ7XR1QKR505" |
| unique_string:8 | Generates a unique random string of specified length (ensures uniqueness by checking the database) | "a1b2c3d4" (8 chars) |
| user:property | Uses a property from the authenticated user | "john@example.com" (user's email) |
| uuid | Generates a UUID (v4 by default) | "550e8400-e29b-41d4-a716-446655440000" |
| uuid:v4 | Generates a UUID v4 (random-based) | "550e8400-e29b-41d4-a716-446655440000" |
| uuid:v7 | Generates a UUID v7 (time-ordered) | "017f22e2-79b0-7cc3-98c4-dc0c0c07398f" |
| uuid:v8 | Generates a UUID v8 (custom format) | "017f22e2-79b0-8cc3-98c4-dc0c0c07398f" |
Parameter Passing
Many generators accept parameters using the colon syntax:
    protected $generate = [
        'code' => 'unique_string:6',          // 6-character unique random string (ensures uniqueness)
        'token' => 'random_string:16',        // 16-character random string (no uniqueness check)
        'slug' => 'slugify:title',            // Unique slug based on the title field (with -1, -2 suffixes if needed)
        'expires_at' => 'carbon:+30 days',    // Date 30 days in the future (carbon:now, carbon:tomorrow 14:00 etc.)
        'author_id' => 'user:id',             // Current user's ID
        'team_id' => 'user:current_team_id',  // Current user's team ID
        'author_email' => 'user:email',       // Current user's email
        'order' => 'autoincrement',           // Next available number (max + 1)
        'uuid' => 'uuid',                     // UUID v4 (default): "550e8400-e29b-41d4-a716-446655440000"
        'uuid_v7' => 'uuid:v7',               // UUID v7 (time-ordered): "017f22e2-79b0-7cc3-98c4-dc0c0c07398f"
        'uuid_v8' => 'uuid:v8',               // UUID v8 (custom format): "017f22e2-79b0-8cc3-98c4-dc0c0c07398f"
        'ulid' => 'ulid',                     // ULID: "01F8MECHZX3TBDSZ7XR1QKR505"
    ];
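The colon syntax can be read as "generator name, then everything after the first colon is the parameter". The sketch below shows one way such a rule could be split in plain PHP; `parseGeneratorRule` is a hypothetical helper, and splitting on the first colon only is an assumption made so that parameters such as `tomorrow 14:00` stay intact.

```php
<?php

declare(strict_types=1);

/**
 * Splits "name:parameter" at the first colon only, so a parameter may
 * itself contain colons. Hypothetical helper, not part of Sanigen.
 *
 * @return array{0: string, 1: ?string} generator name and raw parameter
 */
function parseGeneratorRule(string $rule): array
{
    $parts = explode(':', $rule, 2);

    return [$parts[0], $parts[1] ?? null];
}

foreach (['uuid', 'unique_string:6', 'carbon:tomorrow 14:00', 'slugify:title,date'] as $rule) {
    [$name, $parameter] = parseGeneratorRule($rule);
    printf("%-24s name=%s parameter=%s\n", $rule, $name, var_export($parameter, true));
}
```

A rule without a colon, such as `uuid`, yields a `null` parameter, which matches the idea that the generator then falls back to its default behavior.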
Creating Custom Generators
You can create your own generators by implementing the GeneratorContract interface:
    namespace App\Generators;

    use IvanBaric\Sanigen\Generators\Contracts\GeneratorContract;

    class MyCustomGenerator implements GeneratorContract
    {
        public function generate(string $field, object $model): mixed
        {
            // Generate a value (placeholder: replace with your own logic)
            $generated = null;

            return $generated;
        }
    }
Register your custom generator:
    // In a service provider
    use IvanBaric\Sanigen\Registries\GeneratorRegistry;

    GeneratorRegistry::register('my_custom', \App\Generators\MyCustomGenerator::class);
Then use it in your models:
    protected $generate = [
        'attribute' => 'my_custom',
    ];
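For a concrete picture, a custom generator could look like the following sketch. `OrderReferenceGenerator` and its "ORD-" format are hypothetical, and the local `GeneratorContract` interface is only a stand-in that mirrors the `generate(string $field, object $model): mixed` signature shown above so the snippet runs without the package installed.

```php
<?php

declare(strict_types=1);

// Stand-in for IvanBaric\Sanigen\Generators\Contracts\GeneratorContract
// so the sketch runs on its own. In a real app, remove this and import
// the package's interface.
interface GeneratorContract
{
    public function generate(string $field, object $model): mixed;
}

// Hypothetical generator: builds a reference such as "ORD-20250101-9F3A1C".
final class OrderReferenceGenerator implements GeneratorContract
{
    public function generate(string $field, object $model): mixed
    {
        // random_bytes(3) yields 3 bytes, i.e. 6 hexadecimal characters.
        $suffix = strtoupper(bin2hex(random_bytes(3)));

        return 'ORD-' . date('Ymd') . '-' . $suffix;
    }
}

echo (new OrderReferenceGenerator())->generate('reference', new stdClass()), PHP_EOL;
```

Note that this sketch does not check the database, so collisions are unlikely but possible; if the value must be unique, the generator would need its own uniqueness check.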
Configuration
Package Status
You can enable or disable the entire package functionality:
    // In config/sanigen.php
    'enabled' => true, // or false to disable
When disabled, no automatic sanitization or generation will occur.
Sanitization Aliases
The most powerful feature of Sanigen is the ability to define custom sanitization pipelines as aliases. This allows you to:
- Create reusable sanitization strategies
- Apply multiple sanitizers with a single alias
- Standardize sanitization across your application
- Make your models cleaner and more readable
Define your own aliases in the configuration:
    // In config/sanigen.php
    'sanitization_aliases' => [
        // Standard text processing
        'text:clean' => 'trim|strip_tags|remove_newlines|single_space',

        // Custom aliases for your application
        'username' => 'trim|lower|alphanumeric_only',
        'product:sku' => 'trim|upper|ascii_only',
        'address' => 'trim|single_space|no_js|htmlspecialchars',
    ],
Then use these aliases in your models:
    protected $sanitize = [
        'username' => 'username',
        'sku' => 'product:sku',
        'shipping_address' => 'address',
    ];
Allowed HTML Tags
Configure which HTML tags are allowed when using sanitizers like strip_tags or no_js:
    // In config/sanigen.php
    'allowed_html_tags' => '<p><b><i><strong><em><ul><ol><li><br><a><h1><h2><h3><h4><h5><h6><table><tr><td><th><thead><tbody><code><pre><blockquote><q><cite><hr><dl><dt><dd>',
Default Encoding
Set the default character encoding for sanitizers:
    // In config/sanigen.php
    'encoding' => 'UTF-8',
Slug Generator Configuration
The slug generator (slugify) can be configured to use different types of suffixes for ensuring uniqueness:
    // In config/sanigen.php
    'generator_settings' => [
        'slugify' => [
            // Type of suffix to use for ensuring uniqueness
            // Options: 'increment', 'date', 'uuid'
            'suffix_type' => 'increment',

            // Format for date suffix (used when suffix_type is 'date')
            'date_format' => 'Y-m-d',
        ],
    ],
Available suffix types:
- increment: Appends an incremental number (e.g., -1, -2, -3) to ensure uniqueness (default)
- date: Appends the current date in the specified format (e.g., -2023-07-20)
- uuid: Appends a UUID to ensure uniqueness (e.g., -550e8400-e29b-41d4-a716-446655440000)
You can also specify the suffix type directly in your model:
    // In your model
    protected $generate = [
        'slug' => 'slugify:title,date',        // Use date suffix
        'another_slug' => 'slugify:name,uuid', // Use uuid suffix
    ];
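To illustrate how the increment suffix type resolves collisions, here is a minimal plain-PHP sketch. Both functions are hypothetical, and the list of taken slugs is an in-memory array standing in for the database lookup the package performs; the date and uuid suffix types are not covered.

```php
<?php

declare(strict_types=1);

/** Simplified ASCII slug: lowercase, non-alphanumerics become hyphens. */
function slugify(string $text): string
{
    $slug = preg_replace('/[^a-z0-9]+/', '-', strtolower(trim($text))) ?? '';

    return trim($slug, '-');
}

/**
 * Returns the base slug, or the first "-N" variant that is not taken.
 *
 * @param list<string> $taken slugs that already exist (assumed in-memory
 *                            stand-in for a database uniqueness check)
 */
function uniqueSlug(string $text, array $taken): string
{
    $base = slugify($text);
    if (!in_array($base, $taken, true)) {
        return $base;
    }

    for ($i = 1; ; $i++) {
        $candidate = "{$base}-{$i}";
        if (!in_array($candidate, $taken, true)) {
            return $candidate;
        }
    }
}

echo uniqueSlug('My First Post', []), PHP_EOL;
// my-first-post
echo uniqueSlug('My First Post', ['my-first-post', 'my-first-post-1']), PHP_EOL;
// my-first-post-2
```

An incrementing suffix keeps slugs short and predictable, while a date or UUID suffix avoids the repeated lookups needed to find the next free number.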
Advanced Usage
Resanitizing Existing Records
If you need to apply sanitization rules to existing records in your database (for example, after adding new sanitization rules or fixing issues with existing data), you can use the sanigen:resanitize command:
php artisan sanigen:resanitize "App\Models\Post" --chunk=200
This command:
- Takes a model class name as an argument
- Processes records in chunks to prevent memory overflow (default chunk size is 200)
- Applies all sanitization rules defined in the model's $sanitize property
- Uses database transactions for safety
- Uses saveQuietly() to avoid triggering model events that might cause infinite loops
Important: The command will display a warning and ask for confirmation before proceeding, as it will modify existing records in your database. It's strongly recommended to create a backup of your database before running this command.
Options
- --chunk=<size>: Set the number of records to process at once (default: 200)
- --force: Skip the confirmation prompt
Example
    # Process all Post models with a chunk size of 200
    php artisan sanigen:resanitize "App\Models\Post" --chunk=200

    # Process all User models with the default chunk size
    php artisan sanigen:resanitize "App\Models\User"

    # Skip confirmation prompt
    php artisan sanigen:resanitize "App\Models\Post" --force
Choosing an Optimal Chunk Size
When processing large datasets with many attributes, choosing an appropriate chunk size is important to prevent memory issues:
- Too large: Processing too many records at once can lead to memory exhaustion, especially with models that have many attributes or complex sanitization rules.
- Too small: Very small chunk sizes may result in slower overall processing due to the overhead of creating many small transactions.
Best practices:
- For models with many attributes or complex sanitization (like JavaScript removal), use smaller chunk sizes (100-250)
- For simpler models with few attributes, larger chunk sizes (500-1000) may be more efficient
- If you encounter memory issues, reduce the chunk size
- For very large tables, ensure the chunk size is significantly smaller than the total number of records
- Avoid using a chunk size equal to the total number of records, as this can lead to memory exhaustion
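The trade-off above can be made tangible with a little arithmetic: a smaller chunk size means more chunks to process. The sketch below simply counts chunks for a given table size; `chunkCount` is a hypothetical helper, and treating each chunk as one transaction is an assumption about how the command batches its work.

```php
<?php

declare(strict_types=1);

/**
 * Number of chunks needed to cover all records (ceiling division).
 * Hypothetical helper for estimating batch counts.
 */
function chunkCount(int $totalRecords, int $chunkSize): int
{
    if ($totalRecords < 0 || $chunkSize < 1) {
        throw new InvalidArgumentException('Expected totalRecords >= 0 and chunkSize >= 1.');
    }

    return intdiv($totalRecords + $chunkSize - 1, $chunkSize);
}

// Compare a few chunk sizes for a table of 10,000 records.
foreach ([100, 200, 1000] as $size) {
    printf("chunk=%d -> %d chunks\n", $size, chunkCount(10000, $size));
}
// chunk=100 -> 100 chunks
// chunk=200 -> 50 chunks
// chunk=1000 -> 10 chunks
```

Fewer, larger chunks reduce per-chunk overhead but hold more models in memory at once, which is why the guidance above suggests lowering the chunk size when memory becomes the bottleneck.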
Performance Testing
The package includes a performance test that can be used to evaluate the performance of the sanitization process with a large number of records and many attributes per record. This is particularly useful for testing the sanigen:resanitize command on large datasets.
Running the Performance Test
To run the performance test:
    # Run with default settings
    php artisan test --filter=PerformanceTest

    # Run with custom configuration using environment variables
    PERFORMANCE_TEST_RECORDS=5000 PERFORMANCE_TEST_BATCH_SIZE=1000 PERFORMANCE_TEST_CHUNK_SIZE=1000 php artisan test --filter=PerformanceTest
Configuration Options
The performance test can be configured using environment variables:
| Variable | Description | Default |
|---|---|---|
| PERFORMANCE_TEST_RECORDS | Number of records to generate | 500 |
| PERFORMANCE_TEST_BATCH_SIZE | Number of records to insert in each batch | 500 |
| PERFORMANCE_TEST_CHUNK_SIZE | Number of records to process in each chunk during sanitization | Auto-calculated (half of total records, max 250) |
| PERFORMANCE_TEST_CLEANUP | Whether to clean up (delete) test data after the test | false |
Important Note on Chunk Size: The test automatically calculates an optimal chunk size to prevent memory issues. For best performance, ensure the chunk size is smaller than the total number of records. When processing large datasets with many attributes, using a chunk size equal to the total records can lead to memory exhaustion. The default calculation (half of total records, maximum 250) works well for most scenarios.
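The default rule described above (half of the total records, capped at 250) can be written as a one-line calculation. This is a sketch of that rule only; `defaultChunkSize` is a hypothetical name, and rounding down plus the lower bound of 1 are assumptions, since the test's exact rounding behavior is not documented here.

```php
<?php

declare(strict_types=1);

/**
 * Half of the total records, capped at 250.
 * Rounding down and the minimum of 1 are assumptions of this sketch.
 */
function defaultChunkSize(int $totalRecords): int
{
    return max(1, min(250, intdiv($totalRecords, 2)));
}

foreach ([100, 500, 5000] as $total) {
    printf("%d records -> chunk size %d\n", $total, defaultChunkSize($total));
}
// 100 records -> chunk size 50
// 500 records -> chunk size 250
// 5000 records -> chunk size 250
```

With the default of 500 records, this rule gives a chunk size of 250, so the test always processes at least two chunks.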
Test Process
The performance test:
- Creates a test model with dozens of attributes using different sanitizers
- Generates the specified number of records with unsanitized data
- Runs the sanigen:resanitize command on these records
- Measures and reports performance metrics
Performance Metrics
The test collects and reports the following metrics:
- Generation Time: Time taken to generate the test records
- Sanitization Time: Time taken to sanitize all records
- Total Time: Total execution time
- Memory Usage: Memory usage before, after generation, and after sanitization
- Memory Increase: Memory increase during generation and sanitization
Interpreting Results
The performance metrics can help you:
- Evaluate the performance impact of sanitization on your application
- Determine optimal chunk sizes for processing large datasets
- Identify potential memory issues with large datasets
- Compare performance across different environments or configurations
For large production databases, it's recommended to start with a small number of records and gradually increase to find the optimal configuration for your specific environment.
Combining Generators and Sanitizers
One powerful feature of Sanigen is the ability to combine generators and sanitizers on the same field. For example, you can generate a unique code and then automatically convert it to uppercase:
    class Coupon extends Model
    {
        use Sanigen;

        protected $generate = [
            'code' => 'unique_string:6', // Generate a 6-character unique random string
        ];

        protected $sanitize = [
            'code' => 'upper', // Convert the code to uppercase
        ];
    }
When you create a new Coupon model:
$coupon = Coupon::create();
The flow is:
- The unique_string generator creates a unique random 6-character string (e.g., "a1b2c3")
- The upper sanitizer converts it to uppercase (e.g., "A1B2C3")
The result is a 6-character uppercase code that is both generated and sanitized automatically.
Manual Sanitization
You can manually sanitize attributes:
$model->sanitizeAttributes();
Support for Spatie Translatable Fields
Sanigen supports Spatie's Laravel Translatable package, which allows you to store translations for your model's attributes. When using Spatie Translatable, translations are stored as arrays keyed by locale (e.g., $name['en'] = '<script>alert("xss")</script>smart').
Sanigen will automatically detect and sanitize these array values, applying the sanitization rules to each translation individually:
    use Illuminate\Database\Eloquent\Model;
    use IvanBaric\Sanigen\Traits\Sanigen;
    use Spatie\Translatable\HasTranslations;

    class Product extends Model
    {
        use Sanigen;
        use HasTranslations;

        public $translatable = [
            'name',
            'description',
        ];

        protected $sanitize = [
            'name' => 'no_js|trim',
            'description' => 'text:secure',
        ];
    }
When you set translations with potentially unsafe content:
    $product = new Product();
    $product->setTranslation('name', 'hr', '<script>alert("xss")</script>smart');
    $product->setTranslation('name', 'en', 'Smart <script>alert("xss")</script> Product');
    $product->save();

    // Or set all translations at once:
    $product->name = [
        'hr' => '<script>alert("xss")</script>smart',
        'en' => 'Smart <script>alert("xss")</script> Product',
    ];
    $product->save();
Sanigen will sanitize each translation individually:
    echo $product->getTranslation('name', 'hr'); // Outputs: "smart"
    echo $product->getTranslation('name', 'en'); // Outputs: "Smart Product"
This ensures that all your translatable content is properly sanitized, regardless of the language.
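The per-translation behavior can be pictured as mapping the same sanitizer over every entry of a locale-keyed array. The following plain-PHP sketch shows that idea; `sanitizeValue` and the `$stripScripts` closure are hypothetical, and the regular expression is a toy stand-in for no_js, not a robust XSS defense.

```php
<?php

declare(strict_types=1);

/**
 * Applies a sanitizer to a plain string, or to each string entry of a
 * locale-keyed translation array. Hypothetical helper, not Sanigen's code.
 *
 * @param string|array<string, mixed> $value
 * @return string|array<string, mixed>
 */
function sanitizeValue(string|array $value, callable $sanitizer): string|array
{
    if (is_array($value)) {
        // array_map preserves the locale keys ('hr', 'en', ...).
        return array_map(
            fn ($translation) => is_string($translation) ? $sanitizer($translation) : $translation,
            $value
        );
    }

    return $sanitizer($value);
}

// Toy sanitizer: remove <script> blocks, collapse whitespace, trim.
$stripScripts = function (string $v): string {
    $v = preg_replace('#<script\b[^>]*>.*?</script>#is', '', $v) ?? $v;
    $v = preg_replace('/\s+/', ' ', $v) ?? $v;

    return trim($v);
};

$name = [
    'hr' => '<script>alert("xss")</script>smart',
    'en' => 'Smart <script>alert("xss")</script> Product',
];

print_r(sanitizeValue($name, $stripScripts));
// hr => smart, en => Smart Product
```

Handling strings and arrays through one entry point means the same rule definition works whether or not an attribute is translatable.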
Combining with Laravel Validation
Sanigen works well with Laravel's validation:
    // In a controller
    $validated = $request->validate([
        'title' => 'required|string|max:255',
        'content' => 'required|string',
        'email' => 'required|email',
    ]);

    // Create model with validated data
    $post = Post::create($validated);

    // Sanigen will automatically:
    // 1. Generate any missing attributes
    // 2. Sanitize the input according to rules
    // 3. Apply $casts after sanitization

    // Note: After sanitization is complete, Laravel's $casts will be applied
    // to the model attributes. This means your data goes through sanitization
    // first, and then type casting occurs, ensuring both clean and properly
    // typed data in your models.
Similar Packages
If you're looking for alternatives or complementary packages to Sanigen, here are some other Laravel packages that provide similar functionality:
- Elegant Sanitizer - A Laravel package that provides an elegant way to sanitize and transform input data.
- WAAVI Sanitizer - Provides an easy way to format user input, both through the provided filters or through custom ones that can easily be added to the sanitizer library.
Each of these packages takes a slightly different approach to data sanitization and generation, so you might find one that better suits your specific needs or use them together for more comprehensive data handling.
Contributing
Contributions are welcome! Please feel free to submit a Pull Request.
License
The MIT License (MIT). Please see License File for more information.