Monday 11 August 2014

How to copy or backup a web site – for free

Finding that someone has hacked, cracked or maliciously exploited your website is not something guaranteed to put you in a good mood. When this recently happened to the Bitwise site, I have to admit that I responded with a few choice words, many of them containing precisely four letters.

Swearing at the unknown perpetrators of the aforementioned hack, crack or exploit did not, however, fix the problem. More radical measures were needed. I decided to convert the site from one that is dynamically generated by a PHP-based Content Management System (CMS) into one that is made up of static HTML pages. The problem was: how was I going to do that conversion? By their very nature, dynamically generated pages don’t actually exist until someone requests a particular web address. The request causes the CMS to go into action: it gets data from a database and a page layout from a template file, then stuffs these together along with assorted styles and images to create the requested page.
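
To make that a little more concrete, here is a minimal Python sketch of the general idea (not the Bitwise CMS itself, which was PHP-based). The `DATABASE` dictionary, the `PAGE_TEMPLATE` string and the `render_page` function are all made-up stand-ins for a real database, a real template file and a real CMS.

```python
from string import Template

# Hypothetical stand-in for the CMS database: one record per web address.
DATABASE = {
    "/articles/backup": {
        "title": "How to back up a web site",
        "body": "Use a site copier.",
    },
}

# Hypothetical stand-in for the page-layout template file.
PAGE_TEMPLATE = Template(
    "<html><head><title>$title</title></head>"
    "<body><h1>$title</h1><p>$body</p></body></html>"
)


def render_page(path):
    """Build the HTML for `path` on request; return None if there is no record."""
    record = DATABASE.get(path)
    if record is None:
        return None
    # The page only comes into existence here, when data meets template.
    return PAGE_TEMPLATE.substitute(record)


html = render_page("/articles/backup")
```

Saving the string returned by `render_page` to a file is, in essence, what turning a dynamic page into a static one amounts to.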


Initially I wasn’t even sure if it would be possible to create a static site from one that was dynamically generated. And then I stumbled upon a wonderful piece of software called HTTrack Website Copier. This lets you connect to a web site and download all the pages, with the links, styles and images intact. Better still, it can transform dynamic pages into static HTML ones. If you ever want to make a backup of your (or someone else’s) web site, I recommend this program.

But be careful! It is so good at following links – both to pages inside and outside the current site – that you may end up downloading half the Internet by accident. I was halfway through downloading Wikipedia (which was certainly not my intention!) before I noticed the problem. HTTrack does let you set options to download links only to a specific ‘depth’ or to avoid downloading certain file types, but you may need to experiment with these before you commit yourself to downloading a site. HTTrack is free (I used the Windows version but there are also Linux and OS X versions) and can be downloaded here: http://www.httrack.com/
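
To show why limits of that kind matter, here is a small Python sketch of a link-follower that stops at a given depth, stays on the starting site and skips unwanted file types. It is only an illustration of the idea, not HTTrack's own code, and HTTrack may well count depth differently. The `LINKS` table is a made-up stand-in for real pages, and `plan_download` is a hypothetical name.

```python
from collections import deque
from urllib.parse import urlparse
import posixpath

# Hypothetical link graph standing in for real pages: URL -> links on that page.
LINKS = {
    "http://example.com/": [
        "http://example.com/a.html",
        "http://en.wikipedia.org/wiki/PHP",
    ],
    "http://example.com/a.html": [
        "http://example.com/b.html",
        "http://example.com/big.zip",
    ],
    "http://example.com/b.html": ["http://example.com/c.html"],
    "http://example.com/c.html": [],
}


def plan_download(start, max_depth, skip_extensions=(".zip",)):
    """List the URLs a copier would fetch, breadth-first, within the limits."""
    start_host = urlparse(start).netloc
    seen = {start}
    queue = deque([(start, 0)])  # (url, number of link hops from the start page)
    order = []
    while queue:
        url, depth = queue.popleft()
        order.append(url)
        if depth == max_depth:
            continue  # depth limit reached: do not follow this page's links
        for link in LINKS.get(url, []):
            parts = urlparse(link)
            extension = posixpath.splitext(parts.path)[1].lower()
            if parts.netloc != start_host:
                continue  # link leaves the site (this is how Wikipedia sneaks in)
            if extension in skip_extensions or link in seen:
                continue  # unwanted file type, or already queued
            seen.add(link)
            queue.append((link, depth + 1))
    return order


pages = plan_download("http://example.com/", max_depth=2)
```

With a depth of 2, the sketch visits the start page, `a.html` and `b.html`, and never touches the Wikipedia link, the zip file or `c.html`. Remove the host check and the depth limit, and you can see how a copier might wander off across the Internet.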