Last week I shared the first steps for merging our multiple webs into the one you’re seeing right now. In that post I explained how I exported the databases of the three source sites, how I cleaned them so that they only contain relevant tables, and how I loaded all those tables in the same database without any collisions.
The database I used for the merge was the following:
As you can see, there were four blocks: three of them corresponded to our previous websites, which were WPrincipiante, Nelio A/B Testing, and (the old version of) Nelio Software. You can easily spot them because of the prefixes we used. The fourth block contained the tables we used in the final multi-site installation.
Today, I describe the mapping we defined between source and destination tables, so that we could move content from one end to the other properly. Even though this seems an easy step, it’s actually the most important one for merging sites—a clear mapping guides the whole migration process. Let’s go!
Our new website is a multi-site WordPress installation, available in English (neliosoftware.com) and Spanish (neliosoftware.com/es/). The goal of this migration was to merge our three blogs and had them all under the same domain. On the one hand, Nelio A/B Testing and (the old) Nelio Software blogs, both in English, had to be merged in the English version of our new site. On the other hand, WPrincipiante (a Spanish blog) had to be moved to the Spanish version.
There’s plenty of data that could be moved from one site to the other, but we decided we’d only move “blog” content. That is, users, posts, comments, categories, and tags. Note that the order in which I mentioned the elements is not random—some elements depend on others, so we had to start with those that don’t depend on anything else, or we would have had trouble dealing with identifiers. Users don’t refer to any other element, so they were easily moved. Then we moved posts, which hold a reference to their authors (that is, to users). Finally, we moved comments (which link to both posts and users), and categories and tags (which classify posts).
These were the tables we had to move to the destination tables of our multi-site installation:
In case you don’t know, multi-site WordPress installations use multiple table sets (one per sub-site). However, user-related tables are shared among all installations. Therefore, we only needed to consider two destination tables:
In our case, the only users we had in our source websites were 11 post authors. In order to do that, we simply needed to insert each source user into the destination table. This is as simple as running a query on the source
users table (i.e.
wprincipiante_users) and insert each match into the multi-site’s
One of the things you should take into account when merging content from different sources is identifier collisions. For example, if we have a user named john.wayne in
abtesting_users (whose ID is 3) and another user named mary.smith in
neliosoftware_users (whose ID is also 3), when I try to insert the second (mary) I’ll get an error telling me that there’s already a user (john) with that same ID.
In order to overcome this problem, we used offsets, which simply added a certain number to source IDs so that they didn’t collide with already used IDs. For instance, assume we first move
abtesting_users, and assume the highest ID used in that table is 7. Well, that’s the offset we have to apply to the next table (
neliosoftware_users). By applying this idea to the previous example, we’d first moved john.wayne, whose ID is 3, and then we’d move mary.smith, whose ID would be 10 (her source ID was 3, but we added an offset: 7+3=10).
You have to be very careful when working with offsets. As I stated a few lines above, elements are related with each other (e.g. posts hold a reference to their authors). You always need to know the exact offsets you applied to a particular table, so that you can apply the same offset to the references. For example, when we moved Nelio Software posts to the destination table, we had to add the offset we computed in the previous paragraph to the
post_author column, so that the author was linked to the proper user. Thus, for example, a post written by mary.smith had “user 3” as its author in the source database, but should have “user 10” as its author in the destination database (which is the new ID of our beloved Mary).
Finally, we had to move the following three tables:
wprincipiante_usermeta, which contain meta-information related to our users such as social profiles, description, personal data, and so on. There was plenty of data here that we wanted to keep, but there was also a lot of stuff that we wanted to get rid of.
So, how do you tell what’s relevant and what isn’t? Well, I’m afraid you have to decide by yourself. In our case, we looked at the
meta_key column, which contains the unique name of the meta-field, and we “guessed” whether that field could be relevant or not. My advice here is: focus on one user only and look at all the meta-data you have for him (the previous screenshot shows an example). When you see something that clearly makes sense, write down the meta’s name and move the data. If it doesn’t, just get rid of it. Once you’re done, just move the relevant metas. Oh! And don’t forget to add the proper offset to
Another important detail (“god is in the details”) when moving users is their permissions. Permissions are defined in two different meta-fields in
wp_user_level. Source permissions are automatically inherited for the main sub-site (in our case, neliosoftware.com). But if we wanted a user to have the same permissions in one of our sub-sites (for example, neliosoftware.com/es/, whose site ID is 3), we had to define two extra meta-fields:
And that’s all we had to do to move our users! I know it’s been slightly complex, but I really hope you got all the concepts right ?. As you’ll see shortly, we applied the same ideas over and over again, so don’t be afraid of reading the remaining paragraphs ?. Just bear this ind mind: look at the source tables and save the relevant data, whilst getting rid of anything else.
Post data is stored in the following tables:
Since we wanted to run a multi-language site, we had to move content to one sub-site or the other depending on its language. In our case, this was quite straightforward. On the one hand, we moved English content (from Nelio A/B Testing and Nelio Software) to
wp_postmeta. On the other hand, we moved Spanish content (from WPrincipiante) to
The process was exactly the same we just applied. First, we selected the published posts (
'publish') in the source
posts tables (i.e.
wprincipiante_posts). Since we were not interested in moving pages, we only focused on the entries that were posts (that is, entries whose
It’s also worth mentioning that the posts table does not only contain pages and posts, but also meta-data of the files you uploaded to your media library. Since we were also interested in moving files to our destination WordPress site (we didn’t want to lose any images, did we?), we had to select them too. Therefore, our SQL query also had to include entries whose
SELECT * FROM prefix_posts WHERE (post_type = 'post' AND post_status = 'publish') OR (post_type = 'attachment')
* Simply replace
prefix_posts with the appropriate value (i.e.
wprincipiante_posts, as required).
Again, don’t forget to apply offsets. In this particular case, we had to compute new offsets for post IDs and we had to fix
post_author‘s values applying the offsets we computed when we moved users.
Another important aspect while migrating posts is column
guid. WordPress uses the value of this column to generate your RSS feeds properly. You have to change its value so that there are no collisions in your destination database. In our case, we applied the following rules:
foreach row r in wprincipiante_posts: if r.post_type = 'post' then r.guid = 'https://neliosoftware.com/es/?p=' + r.ID if r.post_type = 'attachment' then r.guid = r.guid->replace( 'http://wprincipiante.es/wp-content/uploads/', 'https://neliosoftware.com/es/wp-content/uploads/sites/3/' ) foreach row r in abtesting_posts or neliosoftware_posts: if r.post_type = 'post' then r.guid = 'https://neliosoftware.com/?p=' + r.ID si r.post_type = 'attachment' then r.guid = r.guid->replace( '*/wp-content/uploads/', 'https://neliosoftware.com/wp-content/uploads/' )
Don’t be scared! ? The previous fragment is actually quite simple ?. Let’s break it down. If we’re fixing a post’s
post_type = 'post') we simply need to use the post’s short URL, which is built by simply appending the query string
?p=post.ID to your domain.
If, on the other hand, we’re fixing an file (
post_type = 'attachment') we simply need to make sure that the path in the URL points to the real destination path. In our case, all images from WPrincipiante would then be under the uploads of one sub-site (
https://neliosoftware.com/es/wp-content/uploads/sites/3/) and all the images from the other two sites would be under
Once this is clear, we could then move the content from the old tables to the new ones. As I said, English content was moved to
wp_posts and Spanish content was moved to
wp_3_posts. Now let’s take a look at
These tables contain meta-information of your
posts. They’re actually quite similar to user meta-information, which we already analyzed in depth. Again, the easiest way to tell what to move and what to get rid of is to focus on one post and look at each meta-field individually. An important (and standard) example of this meta-information is
_thumbnail_id, which tells WordPress the featured image used by a post.
When moving meta-data, don’t forget to add the offsets you just computed for post IDs to
post_id. Also, remember to apply this offset to your
_thumbnail_ids, or your posts might end up using either an incorrect image or none at all.
After this, you know everything you need to know to move post-related content. Once you’re done, you’ll have effectively merged all your content into a single blog! ?
Once we already moved users and posts, it was time we moved comments too. Luckily, this part was quite easy (at least, easier than the previous ones). These were our source tables:
In our case,
commentmeta tables had no relevant information, so I completely ignored them. This is the SQL query I run to get the comments I wanted to move:
SELECT * FROM prefix_comments WHERE comment_approved <> 'spam'
which basically returns all comments, excluding those marked as spam. We had to move every single comment in the result to the final tables… but, again, we couldn’t forget to apply the offsets we computed so far ?
Just as we did in the
posts tables, we first computed new offsets for our comment IDs (in the column named
comment_ID). Then, we added post offsets to
comment_post_ID (which links the comment to a certain post) and user offsets to
user_id (which links the comment to its author).
And there you go! We just moved comments and we got rid of all the spam we had in just a couple of lines!
Moving Categories and Tags
We followed a completely different approach for moving categories and tags. When we looked at our source categories and tags, we quickly realized that there were too many of both. As you may already know, categories and tags classify your content so that it can be easily found by your visitors. If there are too many (for instance, we had a lot of tags used by one post only), they aren’t very helpful.
We decided to simplify both taxonomies: our new blog would only have 4 categories (Community, Online Marketing, WordPress, and Inside Nelio) and we’d only use up to 25 tags. We also decided we wouldn’t create any new categories or tags from now on; we should be able to classify all our content within these categories and tags.
Once we knew the “final” categories and tags we wanted, we simply needed to map the old ones with the new ones. This wasn’t easy (as I said, there were a lot of categories and tags), but I think the final result is pretty good. There’s no general rule for deciding this mapping—it’s up to you! But, regardless of the mapping you decide, the steps for applying it are the same.
In the end, the mapping rules you’ll have will look similar to:
if post had category C1, C2, C3, ..., or Cn then add new category X if post had tag T1, T2, T3, ..., or Tn then add new tag Y
which tell us how posts have to categorized and tagged in the new website. In our case, we defined over 200 rules (don’t worry, I’m not sharing them ?).
To apply all these rules efficiently, you’ll need a function in your code that retrieves the current categorization and tagging of your posts (by looking at the following tables:
term_relationships) and creates the new rows in the destination tables (
wp_3_term_relationships, in our case).
Taxonomy management in WordPress is slightly complex, so make sure you read the Codex first and learn how it works before attempting to automatically generate database entries.
That’s all, folks! Well, almost…
It wasn’t that hard, was it? I warned you—you must care for details if you want to complete a successful migration. If you read the whole post, congrats! ? I’m happy you invested some of your time in learning how complex a migration can be! And I do hope you also learned how to overcome some of the common issues you might encounter ?.
In the third (and last, I promise ?) post of this series we’ll talk about post-processing. Essentially, we’ll see how to smooth all the edges and end up with an awesome website! And I’ll also detail some of the problems we encountered (yes, we had some ?), and how we overcame them.
I honestly think it’ll be an interesting post, probably easier than this one. So I want you to be there! And, as always, if you have any questions you’d like me to answer, just add a comment and let me know.
Featured image by Marc Wieland.