Creating backup snapshots of my website over time using Page2Images and Google Apps Script

Recently I wanted to dive deeply into some trends I had spotted on my business website. I wanted to see if changes I had made to various pages over time were affecting the way users interacted with the website.

Now, I do use source control and back up the website daily, but all the content is trapped inside a CMS. Restoring from a backup just to take a look at button text from 6 months ago is too much of a hassle. Instead, I wanted an easy way to browse changes over time.

Enter the Page2Images service. It takes a screenshot of your website and lets you set parameters such as screen size and image size. It’s free to use for the first couple thousand screenshots every month. There’s also an API, so you can request screenshots from inside a program.
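To make those parameters concrete, here is a minimal sketch of how a request URL for the service could be assembled. The endpoint and the `p2i_*` parameter names are the ones used in the script later in this post; the helper name `buildPage2ImagesUrl` and its `options` argument are hypothetical, made up for illustration only.

```javascript
// Hypothetical helper: builds a Page2Images REST request URL.
// The endpoint and p2i_* names come from the script in this post;
// the default values simply mirror what that script uses.
function buildPage2ImagesUrl(pageUrl, apiKey, options) {
  var opts = options || {};
  var params = {
    p2i_url: pageUrl,
    p2i_device: opts.device || 6,
    p2i_screen: opts.screen || "1200x1024",
    p2i_size: opts.size || "1200x1024",
    p2i_key: apiKey
  };
  // URL-encode every value so the target URL survives as a query parameter.
  var query = Object.keys(params).map(function (name) {
    return name + "=" + encodeURIComponent(params[name]);
  }).join("&");
  return "http://api.page2images.com/restfullink?" + query;
}
```

Calling `buildPage2ImagesUrl("http://www.example.com", "YOUR_PAGE2IMAGES_API_KEY", { screen: "800x600" })` would request a narrower screen while keeping the other defaults.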

Great. Now I just had to set up a script to grab the images a couple of times a week. Rather than maintaining this on my own infrastructure, I found that Google has a service called Google Apps Script that lets you write JavaScript code that runs on Google’s servers. It can access and manipulate your Google Drive files. And usefully, in my case, it features time-based triggers that, in effect, let you kick off your scripts like cron jobs.
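Triggers can be added by hand in the Apps Script editor, but they can also be created from code. The following is a rough sketch of the latter, not necessarily how I set mine up: the Monday/Wednesday/Friday schedule, the 3 AM hour, and the function name `installSnapshotTriggers` are all assumptions. `getWebsiteImage` is the entry point of the script shown later in this post.

```javascript
// Sketch: install weekly time-based triggers for the snapshot function.
// The days and the hour are assumed values; pick whatever suits you.
// Run this once by hand; running it again would create duplicate triggers.
function installSnapshotTriggers() {
  var days = [
    ScriptApp.WeekDay.MONDAY,
    ScriptApp.WeekDay.WEDNESDAY,
    ScriptApp.WeekDay.FRIDAY
  ];
  days.forEach(function (day) {
    ScriptApp.newTrigger("getWebsiteImage")
      .timeBased()
      .onWeekDay(day)
      .atHour(3) // hour of day in the script's time zone
      .create();
  });
}
```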

I wrote a little script that grabs screenshots and HTML of certain key pages from the target website. I set up time-based triggers to run it 3 times a week. Everything gets written to my Google Drive account. The script is shown below:

// Entry point: this is the function the time-based trigger calls.
function getWebsiteImage() {
  storeOneUrlImage("http://www.example.com");
  storeOneUrlImage("http://www.example.com/2.html");
  // ... add more URLs here if you want
}

function storeOneUrlImage(url) {
  // Ask Page2Images for a screenshot of the page.
  var url1 = "http://api.page2images.com/restfullink?p2i_url=" + encodeURIComponent(url) + "&p2i_device=6&p2i_screen=1200x1024&p2i_size=1200x1024&p2i_key=YOUR_PAGE2IMAGES_API_KEY";
  var response = UrlFetchApp.fetch(url1);
  var responseText = response.getContentText();

  // Grab the raw HTML of the page as well.
  var responseHtml = UrlFetchApp.fetch(url);
  var responseHtmlText = responseHtml.getContentText();

  var data = JSON.parse(responseText);
  if (data.estimated_need_time) {
    // The screenshot is not ready yet. Wait twice the estimate
    // (assuming it is reported in seconds), then ask once more.
    Utilities.sleep(data.estimated_need_time * 2000);
    response = UrlFetchApp.fetch(url1);
    responseText = response.getContentText();
    data = JSON.parse(responseText);
  }
  if (data.image_url) {
    var response2 = UrlFetchApp.fetch(data.image_url);
    var blob = response2.getBlob();
    var d = new Date();
    // getFullYear() returns the four-digit year; getYear() is deprecated
    // and can return the year minus 1900.
    var yearMonth = d.getFullYear() + "-" + (d.getMonth() + 1);
    var folder1 = createOrReturnFolder(DriveApp.getRootFolder(), "Website Image Backups");
    var folder2 = createOrReturnFolder(folder1, yearMonth);
    // Store the screenshot and the dated HTML side by side in the month folder.
    folder2.createFile(blob);
    folder2.createFile(yearMonth + "-" + d.getDate() + "-" + blob.getName() + ".html", responseHtmlText);
  }
}

// Returns the named subfolder of parent, creating it if it does not exist yet.
function createOrReturnFolder(parent, folderName) {
  var folders1 = parent.getFoldersByName(folderName);
  if (!folders1.hasNext()) {
    return parent.createFolder(folderName);
  }
  return folders1.next();
}
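One possible refinement: the script above retries the Page2Images request only once, so a slow screenshot is silently skipped. A minimal sketch of a bounded polling loop is shown here instead. It assumes, as the script does, that the response carries `image_url` when the screenshot is ready and `estimated_need_time` otherwise, and that the estimate is in seconds. The names `waitForImageUrl` and `fetchScreenshotUrl`, the 5 second fallback, and the limit of 5 attempts are hypothetical.

```javascript
// Sketch: poll until the screenshot is ready, up to maxAttempts requests.
// fetchJson and sleep are passed in so the loop does not depend on
// Apps Script services directly.
function waitForImageUrl(fetchJson, sleep, maxAttempts) {
  for (var attempt = 0; attempt < maxAttempts; attempt++) {
    var data = fetchJson();
    if (data.image_url) {
      return data.image_url;
    }
    // Assumed fallback of 5 seconds when no estimate is provided.
    var seconds = data.estimated_need_time || 5;
    sleep(seconds * 1000);
  }
  return null; // gave up; the caller decides what to do
}

// How this might be wired up inside Apps Script (defined here, not called).
function fetchScreenshotUrl(requestUrl) {
  return waitForImageUrl(
    function () {
      return JSON.parse(UrlFetchApp.fetch(requestUrl).getContentText());
    },
    function (ms) { Utilities.sleep(ms); },
    5
  );
}
```

Passing the fetch and sleep functions in as arguments keeps the loop easy to try out with fake responses before it runs against the real API and uses up the monthly quota.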

					