
We’ve recently been talking a lot about how to validate data in PHP. For instance, just a few weeks ago we talked about implementing Zod-like validation in PHP and then about how to use those schemas to validate and sanitize your REST endpoints, with fancy error messages and whatnot. I really like that approach because it describes the shape of the data we expect succinctly and effectively.

But if I’m being totally honest with you, Zod schemas can be too much when all you want to do is validate/sanitize a single variable. In these cases, all you need is a simple function to get the job done. And that’s what we’ll talk about today—how to clearly communicate your intent and get rid of the extra noise.

One-step Validation and Sanitization

As you may already know, you can easily register your own endpoints in WordPress by calling the register_rest_route function during the rest_api_init action, as follows.

namespace Your_Plugin\Example;
add_action( 'rest_api_init', function () {
  register_rest_route(
    'your-plugin/v1',
    '/endpoint-name/(?P<id>\d+)',
    array(
      'methods'  => 'GET',
      'callback' => __NAMESPACE__ . '\get_something_by_id',
      'args'     => array(
        'arg-1' => array(
          'required'          => true,
          'validate_callback' => __NAMESPACE__ . '\validate_arg_1',
          'sanitize_callback' => __NAMESPACE__ . '\sanitize_arg_1',
        ),
        'arg-2' => array(
          'required'          => false,
          'default'           => 'something',
          'validate_callback' => __NAMESPACE__ . '\validate_arg_2',
          'sanitize_callback' => __NAMESPACE__ . '\sanitize_arg_2',
        ),
      ),
    )
  );
} );

For each arg your endpoint expects, you can easily specify some properties:

  • if the argument is required or not,
  • what default value it should use when none was provided,
  • a validate function that will check if the received data is valid or not, and/or
  • a sanitize function that may transform the received data as needed.

More often than not, you want to run simple validation and sanitization callbacks: things like checking whether the provided value passes is_email, or sanitizing a text field with WordPress’ built-in sanitize_text_field function. In these scenarios, all you gotta do is specify that function in the appropriate callback:

'args' => array(
  'email' => array(
    'required'          => true,
    'validate_callback' => 'is_email',
  ),
  'name'  => array(
    'required'          => false,
    'sanitize_callback' => 'sanitize_text_field',
  ),
),

and everything will work just fine.

Multi-step Validation and Sanitization

Providing a simple function to validate or sanitize an argument is quite straightforward, as we’ve just seen. But what if we want to do more than one thing? What if, for example, we want to first run sanitize_text_field on the name and then trim any extra spaces around it?

Well, one obvious solution would be to create a specific function (sanitize_name) to sanitize that field in particular:

'args' => array(
  'email' => array(
    'required'          => true,
    'validate_callback' => 'is_email',
  ),
  'name'  => array(
    'required'          => false,
    'sanitize_callback' => __NAMESPACE__ . '\sanitize_name',
  ),
),
...
function sanitize_name( $value ) {
  return trim( sanitize_text_field( $value ) );
}

Is it a great solution? I don’t think so, as we end up with a ton of simple helper functions just because we’re unable to express a simple concept—we want to first sanitize the field and then trim it.

To be fair, though, we could easily rewrite the previous snippet using an arrow function and get rid of all those extra helper functions:

'args' => array(
  'email' => array(
    'required'          => true,
    'validate_callback' => 'is_email',
  ),
  'name'  => array(
    'required'          => false,
    'sanitize_callback' => fn( $v ) => trim( sanitize_text_field( $v ) ),
  ),
),

but that doesn’t quite cut it for me either. If I want to apply multiple transformations (like f1, then f2, then f3, and finally f4), I end up with a lot of nested calls that make it difficult to understand what’s happening:

'sanitize_callback' => fn( $v ) => f4( f3( f2( f1( $v ) ) ) ),

and it even looks like I’m applying the functions in the opposite order. I want f1 to be the first to apply (and that’s precisely what I’m doing), but it’s the last one that shows up? Weird.
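To see that inside-out order for yourself, here’s a tiny sketch you can run. The functions f1 to f4 are hypothetical stand-ins (they’re not part of WordPress or of any real plugin): each one just appends its own name to the input, so the result records the order in which they actually ran.

```php
<?php
// Hypothetical stand-ins for f1..f4: each one appends its own name to the
// input, so the final string records the order in which they were applied.
function f1( $v ) { return $v . ' > f1'; }
function f2( $v ) { return $v . ' > f2'; }
function f3( $v ) { return $v . ' > f3'; }
function f4( $v ) { return $v . ' > f4'; }

// Reads as "f4, f3, f2, f1" from left to right...
$nested = fn( $v ) => f4( f3( f2( f1( $v ) ) ) );

// ...but the innermost call runs first.
echo $nested( 'input' ), PHP_EOL; // input > f1 > f2 > f3 > f4
```

The code reads right to left, while the data moves left to right. That mismatch is what we’d like to get rid of.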

Running Functions in Sequence with flow

Luckily, this is actually super easy to fix. All we need is a single helper function that can run all the transformations we want over a given argument. We can achieve this by implementing a higher-order function that takes all the transformations we want to apply and produces a new function that, given an argument, will apply all the transformations in order.

Let’s say this helper function is named flow. If we call it as follows:

$result = flow( 'f1', 'f2', 'f3', 'f4' );

we’d expect $result to be equivalent to this:

$result = function( $arg ) {
  $arg = f1( $arg );
  $arg = f2( $arg );
  $arg = f3( $arg );
  $arg = f4( $arg );
  return $arg;
};

How can we build such a function in PHP? It’s actually pretty simple. We can use a foreach loop and apply one function at a time:

function flow( callable $func, callable ...$funcs ): callable {
  $funcs = array( $func, ...$funcs );
  return function( $value ) use ( &$funcs ) {
    foreach ( $funcs as $f ) {
      $value = call_user_func( $f, $value );
    }
    return $value;
  };
}

Let’s break it down:

  1. First, we have the function signature. As you can see, it’s a function that takes at least one function $func and then zero or more extra functions $funcs.
  2. Then, we combine all the functions into a single array $funcs. This array contains all the transformations we want to apply.
  3. Finally, we return the function that will apply all $funcs to any given $value.
    • As you can see, this final function simply applies each function $f in $funcs to a $value in order to get the new $value for the next iteration.
    • Once all transformations have been applied, we can easily return $value and we’re done.
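If you want to try the loop-based version outside WordPress, here’s a minimal sketch that uses only PHP built-ins (trim, strtolower, and ucfirst) as the transformations. The $normalize variable and the sample string are just illustrative, and I capture $funcs by value here, which should behave the same as the by-reference capture above since nothing ever modifies the array.

```php
<?php
// Loop-based flow helper, as described in the steps above.
function flow( callable $func, callable ...$funcs ): callable {
  // Merge the mandatory first function with the optional ones.
  $funcs = array( $func, ...$funcs );
  return function ( $value ) use ( $funcs ) {
    // Feed the output of each function into the next one.
    foreach ( $funcs as $f ) {
      $value = call_user_func( $f, $value );
    }
    return $value;
  };
}

// The functions are listed in the same order in which they run.
$normalize = flow( 'trim', 'strtolower', 'ucfirst' );

echo $normalize( "  hELLO wORLD \n" ), PHP_EOL; // Hello world
```

Notice that the list now reads in execution order: trim first, then strtolower, and finally ucfirst.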

We could have also written this function in a functional programming style, for those who are keener on it:

function flow( callable $func, callable ...$funcs ): callable {
  return fn( $value ) => array_reduce(
    array( $func, ...$funcs ),
    fn( $v, $f ) => call_user_func( $f, $v ),
    $value
  );
}

where we use array_reduce to “encode” the foreach block from the previous version. Pretty neat, huh?
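A quick way to convince yourself that both versions behave the same is to run them side by side. In this sketch I’ve renamed them flow_loop and flow_reduce (hypothetical names, used only so that both can live in the same file) and compared their results on a few sample strings.

```php
<?php
// Loop version (renamed so both variants can coexist in one file).
function flow_loop( callable $func, callable ...$funcs ): callable {
  $funcs = array( $func, ...$funcs );
  return function ( $value ) use ( $funcs ) {
    foreach ( $funcs as $f ) {
      $value = call_user_func( $f, $value );
    }
    return $value;
  };
}

// array_reduce version: the carry $v starts as $value and is replaced by the
// result of each function $f in turn.
function flow_reduce( callable $func, callable ...$funcs ): callable {
  return fn( $value ) => array_reduce(
    array( $func, ...$funcs ),
    fn( $v, $f ) => call_user_func( $f, $v ),
    $value
  );
}

$steps = array( 'trim', 'strrev', 'strtoupper' );

foreach ( array( ' abc ', 'Hello', '' ) as $input ) {
  $a = flow_loop( ...$steps )( $input );
  $b = flow_reduce( ...$steps )( $input );
  echo var_export( $a, true ), ' ', $a === $b ? 'same' : 'different', PHP_EOL;
}
```

Which one you pick is mostly a matter of taste. The loop is arguably easier to step through in a debugger, while the array_reduce version is more compact.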

What would our original example look like using flow, you ask? Here you have it:

'args' => array(
  'email' => array(
    'required'          => true,
    'validate_callback' => 'is_email',
  ),
  'name'  => array(
    'required'          => false,
    'sanitize_callback' => flow( 'sanitize_text_field', 'trim' ),
  ),
),
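Since flow accepts any callable, you can also mix named functions with arrow functions. The sketch below runs outside WordPress, so it sticks to PHP built-ins. The whitespace-collapsing step and the 20-character limit are hypothetical rules I made up for illustration. As far as I know, WordPress calls sanitize_callback with the request and the parameter name as extra arguments after the value, so the last line passes two dummy extras to show that the closure returned by flow simply ignores them.

```php
<?php
function flow( callable $func, callable ...$funcs ): callable {
  $funcs = array( $func, ...$funcs );
  return function ( $value ) use ( $funcs ) {
    foreach ( $funcs as $f ) {
      $value = call_user_func( $f, $value );
    }
    return $value;
  };
}

// Named functions and arrow functions can be mixed freely.
$sanitize_name = flow(
  'trim',
  fn( $v ) => preg_replace( '/\s+/', ' ', $v ), // collapse inner whitespace
  fn( $v ) => substr( $v, 0, 20 )               // hypothetical length limit
);

// The closure only declares $value, so any extra arguments are dropped
// before the individual steps run.
echo $sanitize_name( '  Ada   Lovelace  ', 'dummy-request', 'name' ), PHP_EOL; // Ada Lovelace
```

Each step only ever receives the value, which is handy for built-ins like trim whose second parameter means something else entirely.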

Validating Multiple Predicates

Now, what about validate_callback? Validation functions are a little bit different, in the sense that we’re not transforming a value through a sequence of functions; we’re running a sequence of checks on the same value. For example, let’s say we want to test that a certain value is a Google mail address that doesn’t contain the + character. Here’s the helper function we need:

'args' => array(
  'email' => array(
    'required'          => true,
    'validate_callback' => __NAMESPACE__ . '\is_gmail_with_no_plus',
  ),
),
...
function is_gmail_with_no_plus( $value ) {
  if ( ! is_email( $value ) ) {
    return false;
  }
  if ( ! str_ends_with( $value, '@gmail.com' ) ) {
    return false;
  }
  if ( str_contains( $value, '+' ) ) {
    return false;
  }
  return true;
}

Again, this is quite similar to the problem we had before: we need a ton of helper functions just to run a few checks on the provided $value. But, as we’ve seen, this can be easily fixed by using an arrow function with a single boolean expression:

'args' => array(
  'email' => array(
    'required'          => true,
    'validate_callback' => fn( $value ) => (
      is_email( $value ) &&
      str_ends_with( $value, '@gmail.com' ) &&
      ! str_contains( $value, '+' )
    )
  ),
),

which is pretty cool: each line checks one property and it’s pretty easy to understand what this anonymous function is doing.

But that doesn’t always work. As you may already know, your validate_callback will return a boolean value to signal if the provided $value is valid (true) or invalid (false), but it can also return a WP_Error object when the value is invalid and we want to tell the endpoint’s consumer why the provided value is invalid.

If we were to use WP_Error objects, the previous helper function would look like this:

function is_gmail_with_no_plus( $value ) {
  if ( ! is_email( $value ) ) {
    return new WP_Error( 'invalid-email' );
  }
  if ( ! str_ends_with( $value, '@gmail.com' ) ) {
    return new WP_Error( 'no-gmail' );
  }
  if ( str_contains( $value, '+' ) ) {
    return new WP_Error( 'plus-found' );
  }
  return true;
}

Now, this function can’t be “beautified” using an arrow function. We need all these if blocks, new WP_Error statements, and checks, and error codes… which, in my opinion, make it a little bit more difficult to understand at a glance what checks we’re running.

So let’s fix this.

Just like we did with our flow helper, let’s assume we have a validate helper that will take all the checks as an argument and produce a helper function that will run them all:

'args' => array(
  'email' => array(
    'required'          => true,
    'validate_callback' => validate( array(
      'invalid-email' => fn( $v ) => ! is_email( $v ),
      'no-gmail'      => fn( $v ) => ! str_ends_with( $v, '@gmail.com' ),
      'plus-found'    => fn( $v ) => str_contains( $v, '+' ),
    ) )
  ),
),

This solution makes it extremely clear:

  1. we want to test three conditions,
  2. which error codes can be triggered,
  3. how each condition is actually tested.

Now, given that input argument, how would we implement validate? It’s pretty simple:

function validate( $predicates ) {
  return function( $value ) use ( &$predicates ) {
    foreach ( $predicates as $error_code => $predicate ) {
      if ( call_user_func( $predicate, $value ) ) {
        return new WP_Error( $error_code );
      }
    }
    return true;
  };
}
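Here’s a minimal sketch that exercises validate outside WordPress. It makes two assumptions so that it can run on its own: a tiny stand-in WP_Error class, defined only when the real one isn’t loaded, and PHP’s filter_var as a rough replacement for is_email. Keep in mind that each predicate describes the failure: it returns true when the value is wrong.

```php
<?php
// Stand-in so the sketch runs without WordPress; inside WordPress the real
// WP_Error class already exists and this definition is skipped.
if ( ! class_exists( 'WP_Error' ) ) {
  class WP_Error {
    private $code;
    public function __construct( $code = '' ) { $this->code = $code; }
    public function get_error_code() { return $this->code; }
  }
}

function validate( array $predicates ): callable {
  return function ( $value ) use ( $predicates ) {
    foreach ( $predicates as $error_code => $predicate ) {
      // A predicate returning true means "this check failed".
      if ( call_user_func( $predicate, $value ) ) {
        return new WP_Error( $error_code );
      }
    }
    return true;
  };
}

$check = validate( array(
  // filter_var is an assumption here, standing in for WordPress' is_email.
  'invalid-email' => fn( $v ) => false === filter_var( $v, FILTER_VALIDATE_EMAIL ),
  'no-gmail'      => fn( $v ) => ! str_ends_with( $v, '@gmail.com' ),
  'plus-found'    => fn( $v ) => str_contains( $v, '+' ),
) );

$samples = array( 'jane@gmail.com', 'jane+news@gmail.com', 'jane@example.com', 'not-an-email' );
foreach ( $samples as $email ) {
  $result = $check( $email );
  echo $email, ': ', true === $result ? 'valid' : $result->get_error_code(), PHP_EOL;
}
```

The checks run in array order and stop at the first failure, so the consumer gets the error code of the first problem found.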

And there you have it! Yet another helper function to make your actual code easier to understand.

I hope you liked this post and, if you did, please share it with your friends and colleagues. Also, if you use a different approach to data sanitization and validation, let us know in the comment section below.

Featured Image by Marc Sendra Martorell on Unsplash.
