Optimising load time for visitors and robots

Posted April 14th, 2010 in SEO by Tim

Over the past week, many SEO blogs have reported that Google is now using page load time as a factor in its rankings algorithm. I wanted to follow up and explain specifically how to optimise for visitors and robots, because there is a distinct difference between the two.

Optimising for robots

When search robots crawl your site, they aren't downloading all the external files your HTML references, such as images, scripts, and stylesheets. Below are a few easy ways to speed up how quickly your HTML is served to robots.

Optimise your database queries

Inefficient joins and missing or misused indexes can dramatically slow page load times. If you have a CMS or database-driven site, logging the time taken to serve each template for a few days will show you which pages are being served the slowest, so you can focus on the worst offenders first.
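If your CMS doesn't make this easy, one low-effort way to approximate it (a sketch assuming an Apache web server; the log name is arbitrary) is to log the time taken to serve each request and group the results by URL pattern afterwards:

# %D records the time taken to serve each request, in microseconds
LogFormat "%h %t \"%r\" %>s %D" timed
CustomLog logs/timed.log timed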

Ensure scripting and stylesheets are in external files

Keeping your scripts and stylesheets in external files is really important in minimising the amount of markup Google has to download and cache. Inline scripts and styles also typically sit in the <head> tags, which pushes your unique content further down the HTML document.

Use GZIP compression

This will use more server processor time, but if you are serving a lot of content in each HTML document, it can deliver huge HTML file size savings.
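As a rough sketch, assuming an Apache server with mod_deflate enabled, compressing text-based responses can be a one-line change:

# Compress HTML, CSS and JavaScript before sending them over the wire
AddOutputFilterByType DEFLATE text/html text/css application/javascript

Images and other binary formats are already compressed, so there's little to gain from gzipping them as well.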

Optimising for visitors

All of the above also helps visitors, but there are a few extra things you can do to improve the user experience:

Local web hosting

Be sure to host your site domestically, in the country where most of your visitors reside. This decreases latency, making pages load faster, which also benefits your SEO.

Client side caching

Setting longer cache periods for photos and external scripts and stylesheets can significantly improve the user's experience. Whether it's a first-time visitor browsing multiple pages or a returning visitor, you can often save them from re-downloading hundreds of kilobytes per page view.
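A minimal sketch, again assuming Apache, this time with mod_expires enabled (the one-month period is only an example and should match how often your files actually change):

ExpiresActive On
# Let browsers cache images, stylesheets and scripts for a month
ExpiresByType image/jpeg "access plus 1 month"
ExpiresByType image/png "access plus 1 month"
ExpiresByType text/css "access plus 1 month"
ExpiresByType application/javascript "access plus 1 month"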

Consolidating external scripts & stylesheets

You can take externalising scripts and stylesheets one step further by consolidating all your CSS into a single file and all your scripts into another. This saves multiple HTTP requests per HTML document.
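For example, instead of half a dozen <link> and <script> tags, the <head> of each page ends up referencing just one of each (the file names here are purely illustrative):

<link rel="stylesheet" href="/css/combined.css" />
<script src="/js/combined.js"></script>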


Avoiding propagation issues when migrating web sites

Posted February 15th, 2010 in SEO by Tim

If you’ve ever migrated a domain to a new server and you’ve received errors in your browser about not being able to resolve the domain (“DNS not found”) then you’ve experienced a propagation issue.

What is happening is that your ISP's DNS servers haven't yet picked up the new record you've created, or they're using cached information that still points to the old server.

This is a big problem for sites with a lot of traffic, as you don't want half your visitors seeing the old site and the other half seeing your new one.

This is especially important for ecommerce sites because you want to avoid handling orders from both the old and new site. So how can we mitigate this issue?

Where is your staging site?

Before you read on, I want to highlight that if your staging site and live site are on the same server, this article isn’t relevant to you.

In this case all you need to do is to change the document root path to where the new staging site is being stored. All your visitors will see the new site the moment you update your configuration and restart the web server.
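As a rough illustration, assuming Apache (the hostname and path are placeholders), the change is a single line in the virtual host:

<VirtualHost *:80>
    ServerName www.yourcompany.com
    # Point the live hostname at the directory holding the new site
    DocumentRoot /var/www/new-site
</VirtualHost>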

If your staging and live sites are on different servers, keep reading.

Let’s assume you have your staging site at http://www2.yourcompany.com and you’ve had it up and running for a few weeks or more. It’s safe to assume that it will resolve for everyone because it has had time to propagate.

Let's also assume that the URLs on the new site have changed somewhat, because you took the opportunity to roll out more SEO-friendly URLs, and that you've implemented 301 redirects to handle the old site's URLs.

Tip: Read this tutorial on implementing 301 redirects on your new site.

When you make the DNS changes to put the new site live, be sure your new site has DNS A records for both "www" and "www2". We need both to be in play.
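If you manage the DNS yourself, in a BIND-style zone file that might look something like this (the IP address is a placeholder for your new server):

; Both hostnames resolve to the new server
www    IN  A   203.0.113.10
www2   IN  A   203.0.113.10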

You’ll then need to implement some redirections should anyone see your old site because the new details haven’t propagated to their ISP.

Just to reiterate: all of the 302 temporary redirections below should be implemented only on your old site, not on your new site, where 301 redirections should be in place only for the URLs that changed.

Case 1: URLs that have changed

You can basically copy the 301 rules from your new site into the old site's .htaccess, but change them to 302s and use absolute URLs pointing at www2.

The end result should be something like www.yourcompany.com/category-15 302 redirecting to www2.yourcompany.com/lcd-tvs
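A minimal sketch of such a rule in the old site's .htaccess, using the example URLs above:

RewriteEngine On
# This URL changed on the new site, so send visitors straight to its new home on www2
RewriteRule ^category-15$ http://www2.yourcompany.com/lcd-tvs [L,R=302]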

Case 2: URLs that haven’t changed

For all the URLs which remain unchanged, we will need to redirect them to www2.

e.g.
# Anything not matched by the rules above keeps its path but moves to www2
RewriteRule ^(.*)$ http://www2.yourcompany.com/$1 [L,R=302]

Avoiding duplicate content issues on www2

Because we have both www and www2 in play, we need to ensure search engine robots don’t start indexing www2 URLs.

The quickest way to fix this is to implement the rel canonical tag for every URL.

e.g. both www.yourcompany.com/lcd-tvs and www2.yourcompany.com/lcd-tvs should output:

<link rel="canonical" href="http://www.yourcompany.com/lcd-tvs" />

Note: With all the redirections being put in place, be careful not to create 301 redirection loops.

Cleaning up: Decommissioning www2

Once you’re confident no one is landing on your old site, you need to fix the canonical issues you’ve created by having www2 as an alias.

e.g.
# www2 is no longer needed, so permanently redirect it back to the same path on www
RewriteCond %{HTTP_HOST} ^www2\.yourcompany\.com [NC]
RewriteRule ^(.*)$ http://www.yourcompany.com/$1 [L,R=301]

A good way to check is to look at the old site's web server logs and see whether there has been any recent activity.

Troubleshooting: Issues with SSL certificates

If your site serves certain pages over SSL, then this method will cause certificate warnings about the domain not matching for visitors who are accessing 'www2'.

The saving grace is that forms asking for personal or payment information should be the only URLs using SSL, so not everyone will see the warnings. It may also be a lesser evil than having to deal with orders from your legacy site.

Review: Advanced Web Ranking

Posted January 19th, 2010 in SEO, SEO tools by Tim

This is the first of a series of posts where I’ll be explaining what free and paid SEO tools I use on a regular basis.

Advanced Web Ranking is a client-based application I typically use on a weekly basis to record and store search engine rankings for my own sites and client web sites.

Here’s an overview of the features:

Interface

AWR presents and compares rankings well, especially compared to Raven SEO Tools' SERP Tracker, which only lets you compare rankings to one competitor at a time.

Advanced Web Ranking screenshot

Note: That is a standard screenshot as I didn’t want to give away what keywords and sites I track!

On demand rankings

Unlike online tools such as Raven SEO Tools' SERP Tracker or SEOmoz PRO's Rank Tracker, you can run reports whenever you like.

This comes in handy when you are pitching to a new client and want to compare their web site's rankings to competitors straight away. AWR lets you create the profile and run it immediately.

Multiple proxy servers

Another great feature of AWR Enterprise is the ability to run multiple proxies in parallel. I subscribe to 10 proxy servers of my own, which allows me to run reports 10x faster than normal.

Look back on historic rankings

Month on month or week on week ranking comparisons aren’t always what I need to see. AWR has a great feature to compare two date ranges you’ve run rankings reports for in the past.

This is handy for when I need to compare year on year to make sure we can deliver equal or more traffic for seasonal events such as Christmas, Valentine’s Day, etc.

Tracking new competitors

When I notice new competitors achieving strong ranking gains across a few competitive keywords, I make sure to add them to the list of domains to track in AWR.

After adding the new competitor, I was pleasantly surprised to see that AWR stores all URLs, not just those from tracked domains.

This meant that I could get historic rankings for that new competitor for as long as I’d been tracking my client’s domain!

Visibility score: Room for improvement

I’ve never really been a fan of the visibility score which AWR uses.

Firstly, you need to manually weight the search engines, because by default they are treated equally so a big change in rankings on Yahoo will have the same effect as one on Google.

Factor in that most web sites I run get around 2% of their organic traffic from Yahoo, and the visibility score doesn't accurately reflect a change in traffic.

Secondly, it doesn't factor in competition on a keyword. The term "life insurance Australia" is nowhere near as competitive as "life insurance", yet it is treated the same in terms of visibility.

I’m not sure how to best solve this, but one way could be to measure the PageRank of pages in the SERPs, i.e. the higher the aggregate PR, the more competitive the keyword.

Keyword research: Yet to try

For most of my clients I've been fortunate enough to be able to rely on SEM data. Search query reports combined with conversion rates can't be beat.

That being said, AWR does have its own keyword research tool which I’ve yet to try, but it’s hard to go past a quick visit to the Google Keyword Tool.

Conclusion

There are plenty of other features I haven’t explored in AWR so please feel free to comment on what features you like or dislike about AWR! Future reviews will include Raven SEO Tools, SEOmoz PRO, Majestic SEO and more!

Snagit for Mac: Screenshots for OS X

Posted January 7th, 2010 in SEM tools, SEO tools by Tim

I was pleased to get an e-mail today from TechSmith announcing they have finally released a Mac OS X version of Snagit!

MacBooks are so popular amongst web developers and search marketers, yet until today I hadn't found anything on OS X like Snagit for Windows, which I use for web site audits in documents and e-mails on a daily basis.

Download Snagit Mac

Tracking Google index performance with XML sitemaps

Posted January 1st, 2010 in Google Webmaster Tools, SEO by Tim

I recently updated a script that generates XML sitemaps for a web site I run. It's a local business review directory with hundreds of thousands of pages, but with only around 150k in the Google index.

Like any database-driven site, those hundreds of thousands of pages boil down to a few distinct templates; for me they are:

  1. Home page
  2. “Article” pages (FAQs, online help, about us, contact us, etc.)
  3. Locality page (e.g. best rated Sydney metro businesses)
  4. Industry + locality page (e.g. Sydney hairdressers)
  5. Business listing page (e.g. Toni & Guy Bondi Beach)

The bottom three templates (in reverse order) represent the most unique pages per template and make up 99% of organic traffic to the site.

Currently, my URL structure is pretty systematic, so using Google search filters such as "site:abc.com inurl:business_listing" I am able to get the total number of indexed pages on a template-by-template basis.

However, in the next few months I intend to improve some URLs from, say, /business-listing/toni-guy-bondi-beach/12345-54321.html to simply /toni-guy-bondi-beach/, which will make it impossible to track total indexed pages using my current method.

But if you're in the same position there is a solution, one so simple that I feel a post dedicated to it is overkill. Because it's a database-driven site, I store all the page aliases (e.g. "toni-guy-bondi-beach") in my locality, industry and business listing database tables.

This allows me to update my script to split each template's URLs into its own sitemap. Google Webmaster Tools will then show the total number of URLs in each sitemap vs. the number indexed.
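As a sketch, the per-template split can be submitted to Google as a sitemap index along these lines (the filenames are just placeholders):

<?xml version="1.0" encoding="UTF-8"?>
<sitemapindex xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <!-- One sitemap per template so indexation can be compared template by template -->
  <sitemap><loc>http://www.example.com/sitemap-locality.xml</loc></sitemap>
  <sitemap><loc>http://www.example.com/sitemap-industry-locality.xml</loc></sitemap>
  <sitemap><loc>http://www.example.com/sitemap-business-listings.xml</loc></sitemap>
</sitemapindex>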

By dividing the numbers, I can easily see which templates aren't performing as well, and then look for any obvious factors causing pages to be treated as duplicate content, internal linking issues, and so on.

If you’re interested in learning more about large site indexation, there is an SEOmoz post by Rand on Google’s indexation cap which is an interesting read.

301 redirects in a large site migration

Posted December 27th, 2009 in SEO tools by Tim

Today I migrated one of my sites which has around 150k pages in the Google index and thought I’d share a relatively quick and easy way to check the migration went smoothly.

Migrating to a new platform or server is always a risky time for any site that relies on organic traffic. There’s a big risk of pages going missing and redirects not working properly.

The new site featured a new design plus a different CMS platform but essentially had an identical URL structure, so ensuring existing URLs still worked was the primary goal.

I wanted to do the following on both the staging site and post-migration on the live site:

  1. Get a list of the URLs indexed by search engines
  2. Batch test the redirections

Indexed URLs

On a database driven site with thousands of pages, it’s not always possible to get a complete list of possible URLs, so we need to prioritise the URLs that search engines are aware of.

For smaller sites (under 1,000 pages), GSiteCrawler does a reasonable job. The downsides are that it puts unnecessary load on your web server, and I find it crashes on larger sites.

My preferred method is to get the list from a search engine index, but grabbing index data from the major engines can be a pain: scraping them is cumbersome, and sooner or later you get thrown a captcha.

I prefer to use Majestic SEO which provides data from a smaller search engine they run. It uses similar crawl algorithms to Google, so it’s going to be a very similar dataset, and best of all it’s free to use on your own site.

Once you’ve validated your site, go to Domain URLs > Download All and all the URLs you’ll need to redirect will be in the first column.

Note: I recommend against using the sitemap XML as it’s likely to be an incomplete picture.

Batch testing URLs

When migrating a site, the errors you don't want previously working URLs to start returning are 404 Not Found, 401 Unauthorized and 500 Internal Server Error.

I was using a sub-domain for the staging site, so once I had my list of URLs all I needed to do was search and replace "http://www." with "http://dev." in Excel, take a good cross-section, and run it through an HTTP header checker.

I put 500 URLs at a time through my own batch HTTP header checker and fixed up any pesky 404s I found.

Post migration, I picked another set of URLs to test and again got positive results. To be 100% sure, I will be logging into Google Webmaster Tools tomorrow morning to check for 404s.

Good luck with your site migration!