serbanrobu / purestruct
Requires (Dev)
- phpunit/phpunit: ^9.5
This package is auto-updated.
Last update: 2025-05-21 00:23:40 UTC
README
A library that offers utility functions that consume less-structured input and produces more-structured output in order to create data structures that makes illegal states unrepresentable. This facilitates making total functions1.
If you don't value total functions, this library might not be for you.
Table of Contents
Installation π
You can install the package via composer:
composer require serbanrobu/purestruct
Credits π
This library was made with "Parse, don't validate." mantra in mind. I highly recommend reading this blog post.
Many concepts have been applied from the functional programming world such as Functor, Applicative, Semigroup, Monoid, Monad and data structures such as Pair, List (Seq), Map (Dict), Either. Although it is not necessary to know them in order to use the library, it would certainly help.
The library was highly inspired by Json.Decode module from elm/json Elm package and validation Haskell package.
The Problem π€
Let's say we have this input:
$input = [ 'name' => 'John Doe', 'email' => 'johndoe@example.com', ];
And we have some function that operate on this input:
function handleUser(array $user): mixed { // TODO }
Any input from outside the app should be validated, right? With Laravel framework you probably would have done something like:
$user = Validator::make( data: $input, rules: [ 'name' => 'required|max:255', 'email' => 'required|max:255|email', ], ) ->validate(); handleUser($user);
The function handleUser
expect to receive as a parameter a user with a certain
valid structure, although its type, array
, does not say anything about it.
The function allows it to be called with any array
value that will, probably,
cause an error to be thrown if it does not have the expected form. This means
that the function is partial.
Why not include the validation in the handleUser
function? Would that make it
a total function? What if we had more functions operating on that user?
handleUser1($user); handleUser2($user);
Sould validation be included in both functions in this case? That sounds inefficient, doesn't it? We have already validated the input but we have no proof that we did this because we discarded this information. One way to keep proof of input validation is to encapsulate the validated data in a structure that makes illegal states unrepresentable.
class User { private function __construct( public readonly string $name, public readonly string $email, ) { } public function tryFrom(array $input): ?self { // TODO Validate input return new self( name: $input['name'], email: $input['email'], ); } }
By making the constructor private
we made sure that the instantiation of the
class is possible only with validated data. Once we have an instance, the
properties no longer be invalidated because we made the structure immutable by
making the properties readonly
2.
Any instance of this class is a proof that we have validated the input. We can
pass this proof to other functions to use it. Now the handleUser
function
tells us more about what it needs to work.
function handleUser(User $user): mixed { // TODO }
We can have several functions that use the same proof without creating it again. What if we have another function that doesn't need the whole user but only his email?
function handleEmail(string $email): mixed { // TODO }
Can we do this?
handleEmail($user->email);
We could. But in doing so we lose the proof that this email is valid and the
handleEmail
function would not be total. We can fix this by breaking our proof
into smaller ones. Namely, using value objects.
class NonEmptyString { private function __construct(public readonly string $inner) { } public function tryFrom(string $input): ?self { // TODO } } class Max255NonEmptyString { private function __construct(public readonly NonEmptyString $inner) { } public function tryFrom(NonEmptyString $input): ?self { // TODO } } class Email { private function __construct(public readonly Max255NonEmptyString $inner) { } public function tryFrom(Max225NonEmptyString $input): ?self { // TODO } } class User { public function __construct( public Max255NonEmptyString $name, public Email $email, ) { } }
Lots of code, I know. The good thing is that if these classes are not very
specific, they can be reused for many structures or other value objects. Another
good thing we can see is that the User
class should no longer have the
constructor private
nor be immutable (although it would provide other
benefits3) because
any user instance is valid even if we change its state. We can now pass these
value objects anywhere in the application without losing the proof of their
validation and we can clearly express the input of a function.
function handleEmail(Email $email): mixed { // TODO }
Many value objects are string based. To work more efficiently, you may want to implement the Stringable interface.
What about validation? Can we use Laravel validator or something similar? We could, but it would be very annoying. Why? One reason not to use Laravel validator would be that the validator expected to receive an array as input not a value string/int/bool/etc. So it is not made to validate value objects. Sure, you can emulate that this way:
Validate::make( data: ['value' => $input], rules: ['value' => '...'], );
but then the errors would be specific to the value
field. And what do we do
with all the exceptions? We should catch the Validation
exception from all
validators and then collect the errors and throw a new exception with all the
errors.
throw new ValidationException::withMessages($mergedErrors);
Sounds like a lot of work and lots of try-catch
.
The Solution π‘
Decoding π¨
This is where Decoder
comes into play. A decoder knows how to "decode" a
certain type of value. Decoders already exist for most builtin PHP types but
custom decoders can be created.
For example this is a decoder that know how to decode a int
:
function decodeInt(mixed $value): Either { $decoder = Decoder::int(); return $decoder->decode($value); }
The result of decoding in this case can either be
Either::left(Error::failure('Expecting an INT'))
or
Either::right($value)
The user name and email can be decoded using the following decoders:
$nameDecoder = Decoder::string()->required()->max(255); $emailDecoder = Decoder::string()->email();
Lifting πͺ
What if we want to validate multiple fields at the same time? We can "lift" a
function in the context of Decoder
.
For example, the user could be decoded using the following decoder:
$createUser = fn ($name) => fn ($email) => new User($name, $email); $userDecoder = Decoder::pure($createUser) ->apply($nameDecoder->field('name')) ->apply($emailDecoder->field('email')); // Or simpler: $userDecoder = Decoder::lift( $createUser, $nameDecoder->field('name'), $emailDecoder->field('email'), );
The pure
function puts a value in the Decoder
context. Note that the
function must be curried. π
Decoder::pure('foo')->decode('bar'); // Same as Either::right('foo');
The apply
function will work only on a function decoder and works as follows:
Decoder::pure(fn ($x) => $x * 2)->apply(Decoder::pure(3)); // Same as Decoder::pure(6);
Validation πββοΈ
Note that when decoding, the decoder will stop at the first error. If we want to
decode a multi-field structure and collect errors from all fields, Validator
is the way to go. Validator
works very similar to Decoder
but will not stop
at the fist error it encounters, but instead will collect all the errors when
the apply
(or indirect lift
) function is used. Instead of decode
function,
you need to use validate
. This function will return a Validation
instance
which can be either
Validation::failure($errors)
or
Validation::success($value);
where errors is a Seq
(sequence) of Error
. All of the above examples work
similarly for Validator
.
If we wanted to validate an enum we could create a validator for it:
enum Suit { case Hearts; case Diamonds; case Clubs; case Spades; public static function validator(): Validator { return Validator::string() ->map(fn ($val) => Str::trim($val)) ->bind( fn ($val) => match ($val) { 'hearts' => Validator::succeed(self::Hearts), 'diamonds' => Validator::succeed(self::Diamonds), 'clubs' => Validator::succeed(self::Clubs), 'spades' => Validator::succeed(self::Spades), default => Validator::fail('It\'s not a valid suit case'), }, ); } }
We can modify the validator using map
method or chaing them using bind
. The
functions succeed
and fail
are used to ignore the input and produce a
certain value or failure.
Here is an example of a UUID class an its validator:
class Uuid { private function __construct(public readonly string $inner) { } public static function validator(): Validator { return Validator::string()->bind( fn ($val) => Str::isUuid($val) ? Validator::succeed(new self($val)) : Validator::fail('It\'s not a valid UUID'), ); } }
This is a complete example of the user class we created above and the value objects including validators.
class NonEmptyString { use Validatable; private function __construct(public readonly string $inner) { } public static function validator(): Validator { return Validator::string()->required()->map(self::curry()); } } class Max255NonEmptyString { use Validatable; private function __construct(public readonly NonEmptyString $inner) { } public static function validator(NonEmptyString $input): Validator { return NonEmptyString::validator()->max(255)->map(self::curry()); } } class Email { use Validatable; private function __construct(public readonly Max255NonEmptyString $inner) { } public static function validator(Max225NonEmptyString $input): Validator { return Max255NonEmptyString::validator()->email()->map(self::curry()); } } class User { use Validatable; public function __construct( public Max255NonEmptyString $name, public Email $email, ) { } public static function validator(): Validator { return Validator::lift( self::curry(), Max255NonEmptyString::validator()->field('name'), Email::validator()->field('email'), ); } }
The Curry
trait is included in Validatable
and provides curry
function
which returns the class constructor function curried.
User::curry(); // Same as fn ($name) => fn ($email) => new User($name, $email);
The Validatable
trait provides validate
function which is a shortcut taking
the validator from the instance and than validate.
User::validate($input); // Same as User::validator()->validate($input);
Note that the validator for a structure that has as fields other structures /
value objects that are validatable is quite straightforward so that it can be
automated using reflection.
In fact, that's exactly what the trait Validatable
does. Provides a default
implementation for the validator
function but expects all fields in the
structure to be validatable, otherwise it will give an error. So we can omit
the validator implementation for the User class. The structure may also contain
nested validatable structures. For nested structures, the Validatable
trait
does not even need to be included.
For example:
class Address { public function __construct( Max255NonEmptyString $line1, Max255NonEmptyString $line2, ) { } } class User { use Validatable; public function __construct( Uuid $id, Max255NonEmptyString $name, Email $email, Address $address, ) { } }
Maybe you want to create an array of users. Unfortunately there are no generics in PHP and this means that you will have to create a separate class for each type of collection. A simple structure that contains only validated users can be defined as follows.
class UserArray { use Validatable; private function __construct(public readonly array $inner) { } public static function validator(): Validator { return User::validator()->array()->map(self::curry()); } }
Elimination π₯
How do we get the value from Validation
? One way is to match on it.
$input = [ 'name' => 'John Doe', 'email' => 'johndoe@example.com', ]; $validation = User::validate($input); /** @var ?User */ $user = $validation->match( success: fn (User $user) => $user, failure: fn (Seq $errors) => null, );
If we are sure that the validation will be successful or we just want to
experiment with this library we can use the unwrap
function which will return
the validated value if the validation was successful or otherwire it will throw
an exception with an explicit message telling us why the validation failed.
/** * @throws Exception * @var User */ $user = $validation->unwrap();
Sum Types β
What if we needed a union of data types (aka Sum Types)? PHP does not support this but one way to emulate this is through inheritance. For example, let's say we need a structure to store an IP address that can be v4 or v6.
abstract class IpAddr { } class V4 extends IpAddr { private function __construct(public readonly string $inner) { } } class V6 extends IpAddr { private function __construct(public readonly string $inner) { } }
The example above is a common thing to do but we have a problem. If a function
were to receive an IpAddr
as a parameter we could expect anything because we
can create as many derived classes as we want and this means that it is
impossible to be sure that you cover all cases trying to check what type of
instance is with instanceof
, unless you have a fallback.
A better approach, in my opinion, would be something like this:
class V4 { use Validatable; private function __construct(public readonly string $inner) { } public static function validator(): Validator { return Validator::string()->bind( fn ($ip) => filter_var($ip, FILTER_VALIDATE_IP, FILTER_FLAG_IPV4) ? Validator::succeed(new self($ip)) : Validator::fail('It\'s not a valid IPv4 address'), ); } } class V6 { use Validatable; private function __construct(public readonly string $inner) { } public static function validator(): Validator { return Validator::string()->bind( fn ($ip) => filter_var($ip, FILTER_VALIDATE_IP, FILTER_FLAG_IPV6) ? Validator::succeed(new self($ip)) : Validator::fail('It\'s not a valid IPv6 address'), ); } } class IpAddr { use EnumValidatable; public function __construct(public V4|V6 $inner) { } }
In this way it is clear with the variants we ar dealing with. Note that you
cannot use Validatable
when dealing with union types, but EnumValidatable
is
made specifically for this purpose. We can now validate input similar to this:
IpAddr::validate(['V4' => '127.0.0.1']); IpAddr::validate(['V6' => '::1']);
The name of the trait contains the name Enum because it was inspired by Rust enums. Here is the similar example written in Rust.
If you want to enumerate 4 bytes instead of parsing a string for IP v4 variant you could do as follows.
class UInt8 { use Validatable; private function __construct(public readonly int $inner) { } public static function validator(): Validator { return Validator::int()->bind( fn (int $byte) => $byte >= 0 && $byte <= 255 ? Validator::succeed(self::curry()) : Validator::fail('The byte is out of range'), ); } } class V4 { use TupleValidatable; public function __construct( public UInt8 $first, public UInt8 $second, public UInt8 $third, public UInt8 $fourth, ) { } }
What TupleValidatable
does you may want to ask. If we had used Validatable
we would have created an IPv4 like this:
IpAddr::validate([ 'V4' => [ 'first' => 127, 'second' => 0, 'third' => 0, 'fourth' => 1, ], ]);
It would be much easier to create it the following way, wouldn't it?
IpAddr::validate(['V4' => [127, 0, 0, 1]]);
This is exactly what TupleValidatable
lets us do. Namely, to ignore the names
of the fields.
This way of representing union types as JSON was inspired by the Rust package called Serde.
Bare-bones π
Most examples involved creating a structure for representing the valid input. But this is not mandatory. For example, if you want to create an IP validator as above without creating classes you can do it as follows.
$byteValidator = Validator::int()->bind( fn ($byte) => $byte >= 0 && $byte <= 255 ? Validator::succeed($byte) : Validator::fail('The byte is out of range'), ); $ipV4Validator = Validator::lift( fn ($b0) => fn ($b1) => fn ($b2) => fn ($b3) => [$b0, $b1, $b2, $b3], $byteValidator->index(0), $byteValidator->index(1), $byteValidator->index(2), $byteValidator->index(3), ); $ipV6Validator = Validator::string()->bind( fn ($ip) => filter_var($ip, FILTER_VALIDATE_IP, FILTER_FLAG_IPV6) ? Validator::succeed($ip) : Validator::fail('It\'s not a valid IPv6 address'), ); $ipValidator = Validator::oneOf( Seq::fromArray([ $ipV4Validator->field('V4'), $ipV6Validator->field('V6'), ]), );
Usage in Laravel π§°
Let's say we want to validate the request with our validator in a user
controller. Let's assume we've already defined a validator for UserForm
.
class UserController extends Controller { public function store(Request $request): JsonResponse { $validation = UserForm::validate($request->all()); return $validation->match( success: function (UserForm $form) { $user = User::create($form->toArray()); return response()->json($user); }, failure: function (Seq $errors) { return response()->json([ 'errors' => [ // TODO ], ]); }, ); } }
We know what to do if the input has been successfully validated. But what do we
do if it fails? We have the Seq
of Error
to do what we want with them. An
error contains a lot of useful information about why the validator failed. But
you may be accustomed to the errors generated by Laravel Validator and you don't
want to bother creating custom errors. Or you may want to use the Blade
directives to display errors for a specific input field. No problem. We can
emulate our errors as if they were generated by the Laravel Validator similar
to the following example.
function unwrapValidation(Validation $validation): mixed { return $validation->unwrapOrElse( fn (Seq $errors) => throw ValidationException::withMessages( Dict::fromSeq( $errors->map(function (Error $error) { $pair = $error->toPair(); $pair->first = Str::of($pair->first) ->replace('[', '.') ->replace(']', ''); return $pair; }), ) ->toArray(), ), ); }
You can use this package similar to the way you use Laravel Validator. Note that
the Error
Seq
contains more information than the message array we converted
it to. This means that in some cases the error message may not be very specific
or clear for a particular field. But of course we can change this conversion
function as we wish.
Using the function defined above simplifies things a bit:
class UserController extends Controller { public function store(Request $request): User { $validation = UserForm::validate($request->all()); $form = unwrapValidation($validation); return User::create($form->toArray()); } }
The code for validating the input from the request and converting it into a
valid structure is likely to be the same for all Validatable
structures.
Let's automate things a bit and use the Laravel Service
Container to validate the request
input the moment when injecting a specific Validatable
instance.
class AppServiceProvider extends ServiceProvider { public function register(): void { $validatableClassNames = [ UserForm::class, ]; foreach ($validatableClassNames as $className) { $this->app->bind($className, function () { unwrapValidation($className::validate(request()->all())); }); } } }
Of course you can create a specific provider for this if you want. Now the code in the controller is nice and clean.
class UserController extends Controller { public function store(UserForm $form): User { return User::create($form->toArray()); } }
Conclusion π
In the examples above you can see how easy it is to combine different validators to create very complex structures without worrying about collecting errors if any validator has failed in the process. It's so composable that if feels like LEGO. Although not mandatory, this package helps to easily create structures that do not allow invalid states. There are already many validators for different types of data in the package but it certainly won't cover all cases. You can easily create some custom ones and combine them to form a validator for a much more complex structure, which may not even have been possible to validate with a validator like the one in Laravel framework.
License π
MIT
-
1A total function is a function which is defined for all inputs of the right type, that is, for all of a domain.
-
2You can emulate
readonly
feature in older PHP versions by making the propertyprivate
and create a getter for it. -
3Among the benefits offered by immutability are: it makes the code more testable, readable, allows you to reason more easily about the code, it helps making illegal states unrepresentable, thread-safety, etc.