WordPress Cron: Background Tasks Without Making Your Server Collapse

If you need to automatically run tasks very frequently in WordPress, you may end up overloading your server or hitting timeouts (your server stops executing your code because it required too much processing time or too many resources 😱). Luckily, we can schedule this type of task using WordPress cron events. I’ll explain how to do it, but let me first explain why we had to use this functionality at Nelio.

Nelio Content’s Analytics: Simplifying a Complex Problem

If you’re a new visitor who ended up here without knowing anything about us, welcome! At Nelio we create plugins for WordPress and make a living out of it. Today I’ll tell you about Nelio Content, our plugin to help you improve both the creation and promotion of your website content.

Analytics by Nelio Content.

Since version 1.2.0, Nelio Content includes an analytics section where you can check how well (or badly 😑) your content is performing. The idea is that you can see both the effectiveness of your posts according to the number of pageviews they have and the degree of engagement they have generated on social networks (that is, how many likes they have on Facebook or LinkedIn, how many comments they received, etc.).

As you see, this is very easy to explain and understand, but adding analytics entails a big problem: how can I get the analytics data updated without causing my WordPress server to freeze? Damn, that’s tough 😓.

In an ideal world, you’d want to have all this data updated every millisecond (or even every nanosecond), but I’m sorry to tell you this is impossible. Before we started programming like crazy, Nelio’s team sat down to discuss what made sense. There were two options on the table:

  1. Every time a visitor visits a page or post, we save the visit tracking info in the database to count as a +1 in the total number of visits of that content.
  2. We access this information through an external service, such as Google Analytics.

If you use Nelio Content, you will know the option we finally implemented was the second one. Among others, the main reasons were:

  • Saving the tracking information of visits directly in WordPress is a nightmare. If you have a website with a lot of traffic, you’ll end up compromising its speed because of the database: you’ll overload it with constant writes to its tables. Bad idea!
  • Storing this information in the cloud has a (high) cost that we would have to pass on to our customers. That’s feasible, but we wanted analytics to be accessible to everyone… So it’s off the table.
  • If we track the information about visits ourselves (regardless of where we save the data), our users won’t be able to see real data right away, because they’d need to wait a few weeks while we collect information about their visitors. But we want data ASAP (not only do we develop Nelio Content, we also use it on a daily basis… and we’re some of the most demanding customers we have 😊)!
  • If most websites are already tracking this information with specialized external tools, are we going to reinvent the wheel here? It does not make much sense.

Taking all of this into account, we decided to use the Google Analytics API to get the pageviews information. Now, we still have the problem of when to update the data. Querying Google Analytics every time we need the data is not the wisest solution. In addition, we also have social network analytics, which we cannot calculate every time because some APIs have restrictions on the number of calls that can be made in a certain period of time (yes, I’m looking at you, Facebook 😞).

In the end, we came to the conclusion that we must keep this information pre-calculated in the client’s database, and then update it from time to time if it makes sense (note that if we tried to update it every minute, we would overload both the database and the WordPress server). And don’t worry about storing this information in the database: for each post, we just save its total number of pageviews and engagement metrics (that is, we only store aggregated data, not every individual tracking record). Therefore, the additional space we occupy in the client’s database is minimal (a few more postmetas, which we automatically remove if you disable the plugin and tell us to do so at that time).

Going back to the issue of updating analytics only from time to time, and only if it makes sense, what we’ve finally implemented is the following:

  1. Allow users to calculate analytic data for the latest posts when enabling analytics in Nelio Content. This way you can have valid data from the first minute of using our plugin.
  2. Use the WordPress cron to re-calculate the analytics following this pattern:
    • First, update the analytics of the content published today.
    • One hour later, pick up a few posts published a month ago and update their stats.
    • One hour later, pick up a few posts published more than a month ago and update their stats.
    • One hour later, pick up a few posts from the top of the analytics and update their stats.
    • One hour later, start the process again from the beginning.
  3. Let users update the analytics information at any time from the Nelio Content settings. This is useful for the most demanding users who might want to start the re-calculation process themselves.
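The hourly rotation in step 2 can be pictured as a small state machine. The following is only a sketch, not Nelio Content’s actual code: the step names and the function next_analytics_step() are hypothetical, chosen to mirror the four stages listed above.

```php
<?php
// Hypothetical step names mirroring the four-stage pattern described above.
const ANALYTICS_STEPS = array( 'today', 'last_month', 'older', 'top' );

/**
 * Returns the step that should run after $current, wrapping around
 * to the first step once the end of the cycle is reached.
 */
function next_analytics_step( string $current ): string {
    $index = array_search( $current, ANALYTICS_STEPS, true );
    if ( false === $index ) {
        return ANALYTICS_STEPS[0]; // Unknown state: restart the cycle.
    }
    return ANALYTICS_STEPS[ ( $index + 1 ) % count( ANALYTICS_STEPS ) ];
}
```

In a plugin, the current step would probably be kept in an option and advanced inside the hourly cron callback, so each run picks up where the previous one left off.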

As you can see, we are only running the process every hour, and in each execution we only re-calculate the analytics of a very small number of posts, so that the server load is not affected and we respect the limits of the social network APIs. In addition to this, the selection of posts to re-calculate is done wisely. For each post, the date of the last time the statistics were recalculated is stored. We use this value to sort the posts and update those with older data.
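To illustrate the “oldest data first” selection, here is a minimal sketch in plain PHP. It assumes the last-update timestamps are already available as a map of post ID to Unix timestamp; pick_stalest_posts() is a hypothetical helper, and in a real plugin those timestamps would more likely be read from post meta.

```php
<?php
/**
 * Picks the $limit post IDs whose analytics were refreshed longest ago.
 *
 * @param array $last_updated Map of post ID => Unix timestamp of the last refresh.
 * @param int   $limit        Maximum number of post IDs to return.
 * @return int[] Post IDs, stalest first.
 */
function pick_stalest_posts( array $last_updated, int $limit ): array {
    asort( $last_updated ); // Oldest timestamp first; keys (post IDs) are preserved.
    return array_slice( array_keys( $last_updated ), 0, $limit );
}
```

Keeping the batch small (a low $limit) is what keeps each hourly run cheap and within the limits of the social network APIs.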

This approach works quite well in most situations: when new content is published, it usually gets more visits, comments, and activity on social networks during its first day of life. Therefore, the entire pattern we described before makes sense. In Nelio Content we update the newest posts and the most relevant ones (those at the top of the ranking) more frequently, while still saving some time to update the rest of your content.

How Does the WordPress Cron Work?

Every time a page is loaded in WordPress (either a post, your homepage, or any other content accessible through a URL in WordPress), a list of scheduled functions is checked to see if they should be executed or not. If there are some functions, they’ll be launched asynchronously, so that their execution doesn’t affect your page load.
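Conceptually, that check boils down to “which scheduled events have a time that has already passed?”. The sketch below is a simplified model of that idea in plain PHP, not WordPress’s real implementation; due_hooks() and the shape of $schedule are assumptions made for the example.

```php
<?php
/**
 * Returns the hooks whose scheduled time has already passed.
 *
 * @param array $schedule Map of Unix timestamp => list of hook names.
 * @param int   $now      Current Unix timestamp.
 * @return string[] Hook names that are due, earliest first.
 */
function due_hooks( array $schedule, int $now ): array {
    $due = array();
    ksort( $schedule ); // Earliest events first.
    foreach ( $schedule as $timestamp => $hooks ) {
        if ( $timestamp > $now ) {
            break; // Everything from here on is still in the future.
        }
        $due = array_merge( $due, $hooks );
    }
    return $due;
}
```

One consequence of this design is that events only fire when someone loads a page: on a site with very little traffic, a task may run later than its scheduled time.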

In WordPress, tasks are managed by wp-cron.php. There are two types of tasks:

  • Tasks that run only once at a predetermined time, such as publishing a post scheduled for Thursday at 10 a.m.
  • Recurring tasks. These are tasks that have to be executed from time to time, such as checking for updates of plugins or themes.

To schedule a task at a preset time, you just need to do this:
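A minimal sketch of such a one-off event could look like this. The hook name my_task_hook and the body of my_task() are assumptions made for the example; wp_schedule_single_event(), wp_next_scheduled(), and the HOUR_IN_SECONDS constant are part of WordPress core.

```php
<?php
// The task itself. The body is just a placeholder for this example.
function my_task() {
    error_log( 'my_task() ran at ' . gmdate( 'c' ) );
}

// Cron events fire hooks, so the function has to be attached to one.
// 'my_task_hook' is a hypothetical hook name chosen for this example.
add_action( 'my_task_hook', 'my_task' );

// Schedule the hook once, one hour from now, unless it is already pending.
if ( ! wp_next_scheduled( 'my_task_hook' ) ) {
    wp_schedule_single_event( time() + HOUR_IN_SECONDS, 'my_task_hook' );
}
```

Keep in mind that, if this snippet stays in place after the event has fired, the guard will schedule a new event on the next page load. For a quick test that is fine, but in real code you would normally schedule the event in response to a specific action.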

This way, the function my_task() will run one hour after this code is executed (you can test it by putting it directly in your functions.php, for example). See the Codex entry for the wp_schedule_single_event function for more details.

And here you have an example to create a recurring task that will run every hour:
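Again as a minimal sketch: the names my_hourly_task() and my_hourly_task_hook are hypothetical, while 'hourly' is one of the recurrences WordPress ships with.

```php
<?php
// The recurring task. The body is just a placeholder for this example.
function my_hourly_task() {
    error_log( 'my_hourly_task() ran at ' . gmdate( 'c' ) );
}

// Attach the function to a (hypothetical) hook that the cron will fire.
add_action( 'my_hourly_task_hook', 'my_hourly_task' );

// Register the recurring event only once: without this check, every page
// load would try to schedule it again.
if ( ! wp_next_scheduled( 'my_hourly_task_hook' ) ) {
    wp_schedule_event( time(), 'hourly', 'my_hourly_task_hook' );
}
```

In a plugin, the scheduling part usually lives in the activation hook, with a matching call to wp_clear_scheduled_hook( 'my_hourly_task_hook' ) on deactivation, so the event doesn’t linger after the plugin is gone.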

Again, you have the details in the Codex for the wp_schedule_event function. It’s very simple, believe me. If you want a more detailed tutorial, don’t miss this one from SitePoint.

By the way, if you want to see which tasks you have scheduled in your WordPress cron, you can do it by installing the WP Crontrol plugin. It was very useful for testing during the development of Nelio Content’s analytics.
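If you’d rather take a quick look from code while debugging, the following sketch writes the pending events to the PHP error log. It relies on _get_cron_array(), a core function that WordPress marks as private, so treat it as a debugging aid rather than something to ship in a plugin.

```php
<?php
// _get_cron_array() returns the pending events grouped by timestamp and hook.
// Older WordPress versions may return false when nothing is scheduled.
$crons = _get_cron_array();
if ( ! is_array( $crons ) ) {
    $crons = array();
}

foreach ( $crons as $timestamp => $hooks ) {
    foreach ( $hooks as $hook => $events ) {
        foreach ( $events as $event ) {
            // 'schedule' is false for one-off events and a recurrence name otherwise.
            $recurrence = $event['schedule'] ? $event['schedule'] : 'once';
            error_log( sprintf( '%s | %s | %s', gmdate( 'Y-m-d H:i:s', $timestamp ), $hook, $recurrence ) );
        }
    }
}
```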

In addition to the WordPress cron, you have other options available to run tasks in the background with WordPress. The most popular are these two:

Both are libraries that offer more advanced options than the WordPress cron, and I recommend you take a look at them because they may be better suited to your specific needs.

Final Remarks

If you have reached the end of this post, congratulations! As you’ve seen, there is always a lot of work behind a concrete functionality. Surprisingly, though, most of this work is about discussing/deciding how to do things, rather than actually getting them done.

Developing plugins for WordPress has its difficulties. If you don’t have the necessary experience, you can do incredible damage. You should look for information and try to make sure that the option you choose to solve any given problem is “the best” (or at least the “least bad”). To do this, you have multiple resources, such as asking your local community, where you can share your trickiest questions, or reading the plugin development handbook itself. What would have happened if we had decided to update analytics each time a visitor comes? 🤔

Finally, please leave a comment explaining to me what you thought of this article or detailing how you use cron tasks. It won’t take you more than a minute and you’ll make me very happy! 😍

Featured image by Markus Spiske.

Antonio obtained his PhD in Computer Science at UPC. He has several publications in the field of data mining and information retrieval applied to conceptual modeling and health informatics. He specialized in the design, development, and integration of web services and cloud applications. He's an active contributor to the WordPress community and participates in meetups, seminars and WordCamps.
