Last month we talked about Zod, a TypeScript-first schema validation library with static type inference. We looked at the problems it tries to solve and how it solves them. In 2023 we used Zod extensively in our cloud, in some of our plugins, in some scripts… it’s so useful!

I love Zod and I refuse to fetch data or validate something if it’s not available–it’s proven to be indispensable in my toolbox. But sometimes… sometimes life sucks and you have to implement something in PHP and… damn it!

I really miss Zod when I’m on PHP. So much so that I decided to create my own version of it. Yup, you heard that right: I created my own. Why not? I wanted a lightweight version of Zod that would help me validate and sanitize data in our plugins easily and effectively, using a syntax I’m already familiar with. And that’s what I’d like to share with you today: how to implement your own Zod validation library in PHP.

Now, if you don’t want to do any of this and you simply want Zod in PHP, I have some good news: there’s a package you can require via Composer. But, come on, where’s the fun in that?

Our Goal

First of all, let’s quickly recap how Zod works in TypeScript. Here’s the example we used last month:

import z from 'zod';
export const editorialTaskSchema = z.object( {
  id: z.string().uuid(),
  task: z.string().min( 1 ),
  completed: z.boolean(),
  postId: z.number().positive().optional(),
} );

Basically, we import z from the zod package and this grants us access to a whole set of functions for defining the schema we want. In the example above, for instance, we define a schema for an object that’s supposed to contain several properties. For each property, we specify its own schema: id must be a string, task must be a non-empty string, and so on.

With this, we can then use the schema to parse an unknown variable:

declare const maybeTask: unknown;
// Parse with exception throwing:
try {
  const task = editorialTaskSchema.parse( maybeTask );
} catch ( e ) { ... }
// Parse safely:
const result = editorialTaskSchema.safeParse( maybeTask );
if ( result.success ) {
  const task = result.data;
} else {
  const error = result.error;
}

and convert it into the type we want.

So, my goal for today is to have something extremely similar in PHP. I want to be able to define the schema as follows:

use Nelio\Zod\Zod as Z;
$editorial_task_schema = Z::object( [
  'id'        => Z::string()->uuid(),
  'task'      => Z::string()->min( 1 ),
  'completed' => Z::boolean(),
  'postId'    => Z::number()->positive()->optional(),
] );

and then parse an unknown variable just like we did in TypeScript:

$maybe_task = [ ... ];
// Parse with exception throwing:
try {
  $task = $editorial_task_schema->parse( $maybe_task );
} catch ( \Exception $e ) { ... }
// Parse safely:
$result = $editorial_task_schema->safe_parse( $maybe_task );
if ( $result['success'] ) {
  $task = $result['data'];
} else {
  $error = $result['error'];
}

Basic Implementation

Let’s start with the basics. The first thing you’ll notice when you look at the Zod schema is that we want a single class Zod with a few static methods. This class is like a factory that allows us to instantiate whatever schemas we want: a string schema, an object schema, etc. So let’s create that basic skeleton:

namespace Nelio\Zod;
class Zod {
  public static function boolean() {
    return null;
  }
  public static function number() {
    return null;
  }
  public static function string() {
    return null;
  }
  // ...
}

Nice! The problem is that all these methods return null. But that’s not what we want. Whenever we create a schema, we want it to be an instance of “something” that allows us to parse and/or safe_parse a variable, so clearly we also need a Schema class:

namespace Nelio\Zod;
abstract class Schema {
  public function parse( $value ) {
    return $this->parse_value( $value );
  }
  public function safe_parse( $value ) {
    try {
      $result = $this->parse( $value );
      return array(
        'success' => true,
        'data'    => $result,
      );
    } catch ( \Exception $e ) {
      return array(
        'success' => false,
        'error'   => $e->getMessage(),
      );
    }
  }
  abstract protected function parse_value( $value );
}

Why abstract, you ask? Well, clearly, parsing a string is different from parsing, for example, a number. This means we need different Schemas, each with its own parsing implementation.

The abstract class Schema has a clear interface with the two methods that our consumers will use (parse and safe_parse). It also requires all its concrete subclasses to implement a method that, given a certain $value, validates whether or not it matches the schema; if it does, it’ll return the parsed value; if not, it’ll throw an exception.

Basic Schema Types

There are three basic types that need their own Schema subclass: boolean, number, and string. Let’s try to implement the first one.

A boolean value has only two possible options: either true or false, so that’s all we have to check:

namespace Nelio\Zod;
class BooleanSchema extends Schema {
  protected function parse_value( $value ) {
    if ( ! in_array( $value, [ true, false ], true ) ) {
      throw new \Exception( 'Not a boolean value' );
    }
    return $value;
  }
}

That’s it! See how easy it was? If we now rewrite our Zod class to use this schema:

class Zod {
  public static function boolean(): BooleanSchema {
    return new BooleanSchema();
  }
}

we can quickly try it out:

use Nelio\Zod\Zod as Z;
$s = Z::boolean();
print_r( [
  $s->safe_parse( true ),
  $s->safe_parse( false ),
  $s->safe_parse( 1 ),
  $s->safe_parse( 'hello' ),
] );

and get the following results (print_r shows true as 1 and false as an empty string):

Array
(
  [0] => Array
    (
      [success] => 1
      [data] => 1
    )
  [1] => Array
    (
      [success] => 1
      [data] =>
    )
  [2] => Array
    (
      [success] =>
      [error] => Not a boolean value
    )
  [3] => Array
    (
      [success] =>
      [error] => Not a boolean value
    )
)

Pretty awesome!

What about the other two schemas? Well, they’re just as simple as this one. The only difference is that you can add a few more checks. For example, let’s consider the NumberSchema. All you gotta do is check if the $value is, at least, numeric and then make sure it’s not a string:

namespace Nelio\Zod;
class NumberSchema extends Schema {
  protected function parse_value( $value ) {
    if ( ! is_numeric( $value ) ) {
      throw new \Exception( 'Not a numeric value' );
    }
    if ( is_string( $value ) ) {
      throw new \Exception( 'Numeric value is a string' );
    }
    return $value;
  }
}

But, of course, we know Zod can do more things with numbers. We can set a lower and/or upper limit, we have some convenient functions to say we want the number to be positive, or non-negative, or negative… how do we do all that? Well, we simply need to add those functions here, set some internal properties, and use those in our parse_value method:

namespace Nelio\Zod;
class NumberSchema extends Schema {
  private $min;
  private $max;
  public function min( $min ) {
    $this->min = $min;
    return $this;
  }
  public function max( $max ) {
    $this->max = $max;
    return $this;
  }
  // Note: min( 1 ) only matches "positive" for integers; a float such as 0.5 would be rejected.
  public function positive() {
    return $this->min( 1 );
  }
  // Add more helper functions...
  protected function parse_value( $value ) {
    if ( ! is_numeric( $value ) ) {
      throw new \Exception( 'Not a numeric value' );
    }
    if ( is_string( $value ) ) {
      throw new \Exception( 'Numeric value is a string' );
    }
    if ( ! is_null( $this->min ) && $value < $this->min ) {
      throw new \Exception( 'Value is too small' );
    }
    if ( ! is_null( $this->max ) && $this->max < $value ) {
      throw new \Exception( 'Value is too big' );
    }
    return $value;
  }
}

And that’s it! Notice something important: all our helper functions return $this. There are two reasons for this. On the one hand, it’s more convenient if we end up with a Schema instance afterwards. On the other hand, ending with such an instance allows us to chain several functions together:

$schema = Z::number()->min( 3 )->max( 10 );
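
One thing worth knowing: in Zod, positive() means "greater than zero", so 0.5 passes. If you want that behavior too, here’s a minimal sketch of one way to get it. The gt() helper and the $min_exclusive flag are my own additions for illustration (they’re not in the code above), and I’ve left out the namespace and max() so the snippet runs as a single file:

```php
<?php
// Compact copy of the abstract Schema, so this sketch runs on its own.
abstract class Schema {
  public function parse( $value ) {
    return $this->parse_value( $value );
  }
  public function safe_parse( $value ) {
    try {
      return array( 'success' => true, 'data' => $this->parse( $value ) );
    } catch ( \Exception $e ) {
      return array( 'success' => false, 'error' => $e->getMessage() );
    }
  }
  abstract protected function parse_value( $value );
}

class NumberSchema extends Schema {
  private $min;
  private $min_exclusive = false; // Hypothetical flag: true means the bound itself is not allowed.

  public function min( $min ) {
    $this->min           = $min;
    $this->min_exclusive = false;
    return $this;
  }
  // Hypothetical helper: strictly greater than $min.
  public function gt( $min ) {
    $this->min           = $min;
    $this->min_exclusive = true;
    return $this;
  }
  public function positive() {
    return $this->gt( 0 );
  }
  public function nonnegative() {
    return $this->min( 0 );
  }

  protected function parse_value( $value ) {
    // Same effect as the is_numeric + is_string pair: only real ints and floats pass.
    if ( ! is_int( $value ) && ! is_float( $value ) ) {
      throw new \Exception( 'Not a numeric value' );
    }
    if ( ! is_null( $this->min ) ) {
      $too_small = $this->min_exclusive ? $value <= $this->min : $value < $this->min;
      if ( $too_small ) {
        throw new \Exception( 'Value is too small' );
      }
    }
    return $value;
  }
}

$positive    = ( new NumberSchema() )->positive();
$half        = $positive->safe_parse( 0.5 ); // accepted: 0.5 > 0
$zero        = $positive->safe_parse( 0 );   // rejected: 0 is not > 0
$nonneg_zero = ( new NumberSchema() )->nonnegative()->safe_parse( 0 ); // accepted: 0 >= 0
```

With this variant, min() keeps its inclusive meaning and only positive() and gt() exclude the bound.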

Would you like to try to implement the StringSchema on your own now? I’m sure you’d be able to! Let me know in the comment section below how it went for you.
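
If you get stuck, here’s one possible take on it. Treat it as a sketch under a few assumptions of mine: lengths are counted in bytes with strlen (you may prefer mb_strlen for multibyte text), the UUID check is a loose "looks like a UUID" regular expression rather than a strict validation of version and variant, and the namespace is omitted so the snippet runs as a single file:

```php
<?php
// Compact copy of the abstract Schema, so this sketch runs on its own.
abstract class Schema {
  public function parse( $value ) {
    return $this->parse_value( $value );
  }
  public function safe_parse( $value ) {
    try {
      return array( 'success' => true, 'data' => $this->parse( $value ) );
    } catch ( \Exception $e ) {
      return array( 'success' => false, 'error' => $e->getMessage() );
    }
  }
  abstract protected function parse_value( $value );
}

class StringSchema extends Schema {
  private $min;
  private $max;
  private $must_be_uuid = false;

  public function min( $min ) {
    $this->min = $min;
    return $this;
  }
  public function max( $max ) {
    $this->max = $max;
    return $this;
  }
  public function uuid() {
    $this->must_be_uuid = true;
    return $this;
  }

  protected function parse_value( $value ) {
    if ( ! is_string( $value ) ) {
      throw new \Exception( 'Not a string' );
    }
    $length = strlen( $value ); // Assumption: byte length is good enough here.
    if ( ! is_null( $this->min ) && $length < $this->min ) {
      throw new \Exception( 'String is too short' );
    }
    if ( ! is_null( $this->max ) && $this->max < $length ) {
      throw new \Exception( 'String is too long' );
    }
    // Loose 8-4-4-4-12 hexadecimal shape check, case-insensitive.
    $uuid_pattern = '/^[0-9a-f]{8}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{12}$/i';
    if ( $this->must_be_uuid && ! preg_match( $uuid_pattern, $value ) ) {
      throw new \Exception( 'Not a UUID' );
    }
    return $value;
  }
}

$name      = ( new StringSchema() )->min( 3 );
$ok        = $name->safe_parse( 'hello' ); // accepted
$too_short = $name->safe_parse( 'hi' );    // rejected: String is too short

$id       = ( new StringSchema() )->uuid();
$good_id  = $id->safe_parse( '123e4567-e89b-12d3-a456-426614174000' ); // accepted
$bad_id   = $id->safe_parse( 'not-a-uuid' );                           // rejected: Not a UUID
```

It follows exactly the same recipe as NumberSchema: setters store the constraints and return $this, and parse_value checks them one by one.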

Advanced Stuff

What about more complicated stuff? Well, once you’ve figured out the basic pattern, it’s actually pretty easy. Each new Schema you implement is only responsible for itself and can depend on the other pieces you’ve implemented.

Arrays

Let’s say you want to build a new schema that validates (a) that a given variable is an array and (b) if it is, that all the elements in that array satisfy a certain schema (such as, for example, being a string with at least 3 characters). That is, you want the following:

$schema = Z::array( Z::string()->min( 3 ) );

How would you do it? Well, let’s follow the same recipe we did before:

namespace Nelio\Zod;
class ArraySchema extends Schema {
  protected function parse_value( $value ) {
    // TODO
  }
}

Now pay attention. This schema is a little bit more complex, because it takes a schema as an input parameter, which is the one we’ll use to parse all the elements in the array. So, let’s see:

namespace Nelio\Zod;
class ArraySchema extends Schema {
  private $schema;
  public function __construct( $schema ) {
    $this->schema = $schema;
  }
  protected function parse_value( $value ) {
    // TODO
  }
}

and now that we have the schema that all elements in the array must comply with, it’s time to implement parse_value. I’m sure you’d be able to implement it, but let me help you:

protected function parse_value( $value ) {
  if ( ! is_array( $value ) ) {
    throw new \Exception( 'Not an array' );
  }
  $result = array();
  foreach ( $value as $v ) {
    $result[] = $this->schema->parse( $v );
  }
  return $result;
}

See? All we have to do is make sure that $value is indeed an array and, if it is, parse each value in the array using our $this->schema. If all the elements of the array satisfy this inner schema, there’s nothing to worry about: we simply build the $result one item at a time. But if one of the elements doesn’t, $this->schema->parse will throw an exception. An exception that the ArraySchema’s parse_value will also throw, thus resulting in an invalid parse.

Simple. Elegant. Awesome.
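
To see that propagation in action, here’s a small self-contained sketch. It uses a pared-down NumberSchema (only min) as the inner schema instead of the string one, just to keep the snippet short, and it skips the namespace and the Zod factory so it runs as a single file:

```php
<?php
// Compact copy of the abstract Schema, so this sketch runs on its own.
abstract class Schema {
  public function parse( $value ) {
    return $this->parse_value( $value );
  }
  public function safe_parse( $value ) {
    try {
      return array( 'success' => true, 'data' => $this->parse( $value ) );
    } catch ( \Exception $e ) {
      return array( 'success' => false, 'error' => $e->getMessage() );
    }
  }
  abstract protected function parse_value( $value );
}

// Pared-down number schema standing in for the full one.
class NumberSchema extends Schema {
  private $min;
  public function min( $min ) {
    $this->min = $min;
    return $this;
  }
  protected function parse_value( $value ) {
    if ( ! is_int( $value ) && ! is_float( $value ) ) {
      throw new \Exception( 'Not a numeric value' );
    }
    if ( ! is_null( $this->min ) && $value < $this->min ) {
      throw new \Exception( 'Value is too small' );
    }
    return $value;
  }
}

class ArraySchema extends Schema {
  private $schema;
  public function __construct( $schema ) {
    $this->schema = $schema;
  }
  protected function parse_value( $value ) {
    if ( ! is_array( $value ) ) {
      throw new \Exception( 'Not an array' );
    }
    $result = array();
    foreach ( $value as $v ) {
      // An exception thrown here bubbles up to the caller untouched.
      $result[] = $this->schema->parse( $v );
    }
    return $result;
  }
}

$schema    = new ArraySchema( ( new NumberSchema() )->min( 3 ) );
$valid     = $schema->safe_parse( array( 3, 7, 10 ) ); // accepted
$bad_item  = $schema->safe_parse( array( 3, 1 ) );     // rejected: Value is too small
$not_array = $schema->safe_parse( 'nope' );            // rejected: Not an array
```

Notice that the error for the second case is the inner schema’s message, with no hint of which element failed. If you need that, you could catch the exception inside the loop and rethrow it with the index added.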

Objects

Object schemas are essentially the same as array schemas. The only difference is, object schemas define multiple keys, each with its own schema. Here’s a first attempt at implementation, based on what we did in the previous section:

namespace Nelio\Zod;
class ObjectSchema extends Schema {
  private $schemas;
  public function __construct( $schemas ) {
    $this->schemas = $schemas;
  }
  protected function parse_value( $value ) {
    if ( is_object( $value ) ) {
      $value = get_object_vars( $value );
    }
    if ( ! is_array( $value ) ) {
      throw new \Exception( 'Not an object' );
    }
    $result = array();
    foreach ( $this->schemas as $prop => $schema ) {
      $result[ $prop ] = $schema->parse(
        isset( $value[ $prop ] ) ? $value[ $prop ] : null
      );
    }
    return array_filter( $result, fn( $p ) => ! is_null( $p ) );
  }
}

Essentially, we map over all the properties defined in our inner $schemas and then we construct a result by applying the associated $schema->parse. If a property parser throws an exception, the object parser will throw it as well. If it’s successful, we have our input $value properly validated.

There’s one thing missing from this whole equation. Remember our original goal? We wanted to be able to define an object with optional properties, like postId here:

use Nelio\Zod\Zod as Z;
$editorial_task_schema = Z::object( [
  'id'        => Z::string()->uuid(),
  'task'      => Z::string()->min( 1 ),
  'completed' => Z::boolean(),
  'postId'    => Z::number()->positive()->optional(),
] );

Luckily, this one’s easy. All schemas we use in an object can be optional, so we’ll add the optional method in our abstract Schema and we’ll then be able to use it as needed:

namespace Nelio\Zod;
abstract class Schema {
  private $is_optional = false;
  public function optional() {
    $this->is_optional = true;
    return $this;
  }
  public function parse( $value ) {
    if ( $this->is_optional && is_null( $value ) ) {
      return null;
    }
    return $this->parse_value( $value );
  }
  public function safe_parse( $value ) {
    // ...
  }
  abstract protected function parse_value( $value );
}

There you go! We modify our parse method so that, if the schema is optional and there’s no input $value, it simply returns null. This way, our object schema can filter out null values in its return statement.
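
Here’s how the pieces might fit together end to end. This is a trimmed-down sketch, not the full library: the number schema has no limits, the factory only exposes the three methods the example needs, and the namespace is omitted so everything runs as a single file. Note that optional() returns $this, which is what lets us use the chained call directly as an array value:

```php
<?php
abstract class Schema {
  private $is_optional = false;
  public function optional() {
    $this->is_optional = true;
    return $this; // Without this, the chained call would evaluate to null.
  }
  public function parse( $value ) {
    if ( $this->is_optional && is_null( $value ) ) {
      return null;
    }
    return $this->parse_value( $value );
  }
  public function safe_parse( $value ) {
    try {
      return array( 'success' => true, 'data' => $this->parse( $value ) );
    } catch ( \Exception $e ) {
      return array( 'success' => false, 'error' => $e->getMessage() );
    }
  }
  abstract protected function parse_value( $value );
}

class BooleanSchema extends Schema {
  protected function parse_value( $value ) {
    if ( ! is_bool( $value ) ) {
      throw new \Exception( 'Not a boolean value' );
    }
    return $value;
  }
}

class NumberSchema extends Schema {
  protected function parse_value( $value ) {
    if ( ! is_int( $value ) && ! is_float( $value ) ) {
      throw new \Exception( 'Not a numeric value' );
    }
    return $value;
  }
}

class ObjectSchema extends Schema {
  private $schemas;
  public function __construct( $schemas ) {
    $this->schemas = $schemas;
  }
  protected function parse_value( $value ) {
    if ( is_object( $value ) ) {
      $value = get_object_vars( $value );
    }
    if ( ! is_array( $value ) ) {
      throw new \Exception( 'Not an object' );
    }
    $result = array();
    foreach ( $this->schemas as $prop => $schema ) {
      $result[ $prop ] = $schema->parse( isset( $value[ $prop ] ) ? $value[ $prop ] : null );
    }
    // Drop the nulls produced by optional properties that were missing.
    return array_filter( $result, fn( $p ) => ! is_null( $p ) );
  }
}

class Zod {
  public static function boolean() {
    return new BooleanSchema();
  }
  public static function number() {
    return new NumberSchema();
  }
  public static function object( $schemas ) {
    return new ObjectSchema( $schemas );
  }
}

$schema = Zod::object( array(
  'completed' => Zod::boolean(),
  'postId'    => Zod::number()->optional(),
) );

$without_post     = $schema->safe_parse( array( 'completed' => true ) );
$with_post        = $schema->safe_parse( array( 'completed' => false, 'postId' => 42 ) );
$missing_required = $schema->safe_parse( array( 'postId' => 42 ) );
```

The first call succeeds and its data has no postId key at all, the second keeps both keys (false is not null, so it survives the filter), and the third fails with the boolean schema’s error because completed is required.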

What now?

There are a few more things you can do and implement. Default values, transformations, union types… you name it!
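
As a taste of what those extensions could look like, here’s a rough sketch of a union type. The UnionSchema class, its constructor, and its error message are all hypothetical names I’m making up for this example. The idea is simply to try each inner schema in order and keep the first one that succeeds:

```php
<?php
// Compact copy of the abstract Schema, so this sketch runs on its own.
abstract class Schema {
  public function parse( $value ) {
    return $this->parse_value( $value );
  }
  public function safe_parse( $value ) {
    try {
      return array( 'success' => true, 'data' => $this->parse( $value ) );
    } catch ( \Exception $e ) {
      return array( 'success' => false, 'error' => $e->getMessage() );
    }
  }
  abstract protected function parse_value( $value );
}

class BooleanSchema extends Schema {
  protected function parse_value( $value ) {
    if ( ! is_bool( $value ) ) {
      throw new \Exception( 'Not a boolean value' );
    }
    return $value;
  }
}

class NumberSchema extends Schema {
  protected function parse_value( $value ) {
    if ( ! is_int( $value ) && ! is_float( $value ) ) {
      throw new \Exception( 'Not a numeric value' );
    }
    return $value;
  }
}

// Hypothetical union: the value is valid if any of the inner schemas accepts it.
class UnionSchema extends Schema {
  private $schemas;
  public function __construct( $schemas ) {
    $this->schemas = $schemas;
  }
  protected function parse_value( $value ) {
    foreach ( $this->schemas as $schema ) {
      $result = $schema->safe_parse( $value );
      if ( $result['success'] ) {
        return $result['data'];
      }
    }
    throw new \Exception( 'Value does not match any schema in the union' );
  }
}

$schema    = new UnionSchema( array( new BooleanSchema(), new NumberSchema() ) );
$as_bool   = $schema->safe_parse( true );    // accepted by the boolean schema
$as_number = $schema->safe_parse( 7 );       // accepted by the number schema
$neither   = $schema->safe_parse( 'seven' ); // rejected by both
```

Because it relies on safe_parse, the union never needs to know anything about the schemas it wraps, which is the same "each schema is only responsible for itself" idea we used for arrays and objects.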

I hope you learned something interesting here and that, if you ever need to validate your data, you’ll implement your own Zod or, well, use one of the existing libraries. Let me know in the comment section below how well you did 😁

Featured Image by Toa Heftiba on Unsplash.
