
301 redirects in a large site migration

Posted December 27th, 2009 in SEO tools by Tim

Today I migrated one of my sites, which has around 150k pages in the Google index, and thought I’d share a relatively quick and easy way to check that the migration went smoothly.

Migrating to a new platform or server is always a risky time for any site that relies on organic traffic. There’s a big risk of pages going missing and redirects not working properly.

The new site featured a new design plus a different CMS platform but essentially had an identical URL structure, so ensuring existing URLs still worked was the primary goal.

I wanted to do the following on both the staging site and post-migration on the live site:

  1. Get the URLs indexed by search engines
  2. Batch test the redirects

Indexed URLs

On a database-driven site with thousands of pages, it’s not always possible to get a complete list of every possible URL, so we need to prioritise the URLs that search engines already know about.

For smaller sites (under 1,000 pages), GSiteCrawler does a reasonable job. The downsides are that it puts unnecessary load on your web server and, in my experience, it crashes on larger sites.

My preferred method is to get the list from a search engine index. Grabbing index data from the major engines can be a hassle, though: scraping them is cumbersome, and sooner or later you get thrown a captcha.

I prefer to use Majestic SEO, which provides data from a smaller search engine they run. It uses similar crawl algorithms to Google, so the dataset should be very similar, and best of all it’s free to use on your own site.

Once you’ve validated your site, go to Domain URLs > Download All and all the URLs you’ll need to redirect will be in the first column.
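If you’d rather not open the download in a spreadsheet, pulling out the first column takes only a few lines. This is a minimal sketch that assumes the export is a comma-separated file with a header row; the function name and the file name in the usage note are my own placeholders, not anything defined by Majestic SEO.

```python
import csv


def urls_from_export(lines, has_header=True):
    """Return the first column of a CSV export as a list of URLs.

    `lines` is any iterable of text lines, such as an open file.
    Assumes comma-separated data; blank rows and empty cells are skipped.
    """
    reader = csv.reader(lines)
    if has_header:
        next(reader, None)  # skip the header row if there is one
    return [row[0].strip() for row in reader if row and row[0].strip()]
```

Typical use would be something like `with open("domain_urls.csv", newline="") as f: urls = urls_from_export(f)`, where `domain_urls.csv` stands in for whatever the downloaded file is called.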

Note: I recommend against relying on the XML sitemap, as it’s likely to give an incomplete picture.

Batch testing URLs

When migrating a site, the errors you don’t want to see from previously working URLs are 404 Not Found, 401 Unauthorized and 500 Internal Server Error.

I was using a sub-domain for the staging site, so once I had my list of URLs, all I needed to do was search and replace “http://www.” with “http://dev.” in Excel, pick a good cross section and run it through an HTTP header checker.
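The same search-and-replace step can be scripted. The sketch below is one way to do it, not what I actually ran: it swaps the prefix only at the start of each URL (slightly stricter than a plain find and replace), and it treats “a good cross section” as a random sample, which is an assumption. The function names and the `example.com` URLs are placeholders.

```python
import random


def to_staging(urls, live_prefix="http://www.", staging_prefix="http://dev."):
    """Point each live URL at the staging sub-domain.

    URLs that don't start with live_prefix are returned unchanged.
    """
    return [
        staging_prefix + url[len(live_prefix):] if url.startswith(live_prefix) else url
        for url in urls
    ]


def cross_section(urls, size=500, seed=None):
    """Pick up to `size` URLs at random, without repeats."""
    urls = list(urls)
    if len(urls) <= size:
        return urls
    return random.Random(seed).sample(urls, size)


# Example: build a staging test set from a list of live URLs.
live = ["http://www.example.com/page-%d" % i for i in range(2000)]
test_set = cross_section(to_staging(live), size=500)
```

Passing a `seed` makes the sample repeatable, which helps if you want to re-test exactly the same URLs after fixing something.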

I put 500 URLs at a time through my own batch HTTP header checker and fixed up any pesky 404s I found.
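My checker isn’t shown here, but the idea is simple enough to sketch with Python’s standard library. This is an illustration under stated assumptions rather than the tool I used: it sends a HEAD request per URL, does not follow redirects (so a 301 is reported as a 301), and flags the 401, 404 and 500 responses mentioned above plus any URL that couldn’t be reached. All function names are hypothetical.

```python
import http.client
from urllib.parse import urlsplit

# Status codes that a previously working URL should never return.
PROBLEM_STATUSES = {401, 404, 500}


def head_status(url, timeout=10):
    """Send a HEAD request and return the status code, without following redirects."""
    parts = urlsplit(url)
    if parts.scheme == "https":
        conn = http.client.HTTPSConnection(parts.netloc, timeout=timeout)
    else:
        conn = http.client.HTTPConnection(parts.netloc, timeout=timeout)
    try:
        path = parts.path or "/"
        if parts.query:
            path += "?" + parts.query
        conn.request("HEAD", path)
        return conn.getresponse().status
    finally:
        conn.close()


def check_batch(urls, fetch=head_status):
    """Map each URL to its status code, or None if the request failed."""
    results = {}
    for url in urls:
        try:
            results[url] = fetch(url)
        except (OSError, http.client.HTTPException):
            results[url] = None
    return results


def problems(results):
    """Keep only the URLs that failed or returned a problem status."""
    return {
        url: status
        for url, status in results.items()
        if status is None or status in PROBLEM_STATUSES
    }
```

Calling `problems(check_batch(test_set))` on a batch of 500 staging URLs gives the list to fix. The `fetch` parameter is there so the batch logic can be tried out with a stand-in function before pointing it at a real server.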

Post migration, I picked another set of URLs to test and again got positive results. To be 100% sure, I will be logging into Google Webmaster Tools tomorrow morning to check for 404s.

Good luck with your site migration!
