
SEO for JavaScript-powered websites (Google IO 18 summary)

By jiathong on May 28, 2018



At the recent Google I/O '18, Tom Greenaway and John Mueller of Google presented a session on making modern JavaScript-powered websites search friendly. They discussed best practices, useful tools, and a Google policy change, and also shed some light (which Google is usually stingy about) on how the crawl and index process works.

If you don’t want to spend 40 minutes watching the recording, here’s a quick summary of the session’s key points.

A brief introduction to the presenters: Tom Greenaway is a senior developer advocate from Australia, while John Mueller (aka johnmu, ring a bell?) is Google’s webmaster trends analyst from Zurich, Switzerland.

Tom Greenaway and John Mueller from Google.

How do crawling, rendering, and indexing work for JavaScript-powered websites?

Tom started the talk by sharing a little background on search engines. The purpose of a search engine is to provide a relevant list of results that answers users’ queries. Those answers are pulled from a compiled library of web pages: the index.

Building an index starts with a crawlable URL. The crawler is designed to find content to crawl, and to do this, the content must be retrievable via a URL. When the crawler gets to a URL, it looks through the HTML to index the page and to find new links to crawl.

Here’s a diagram of how search works at Google.

So how do you make sure that your content is reachable by Googlebot?

Here’s what you need to know: Tom shared six steps to ensure your web page will be indexed.

    1. Make sure that your URL is crawlable
    – Set up robots.txt at the top-level domain of your site. Robots.txt lets Googlebot know which URLs to crawl and which to ignore.

    2. Utilize canonical tags
    – In cases of content syndication, where content is distributed on different sites to maximize exposure, the source document should be tagged as the canonical document.

    3. Make sure the URL is clean and unique
    – Don’t put session information in the URL.

    4. Provide a sitemap to Googlebot
    – That way the crawler has a list of URLs to crawl, and you can sleep better at night knowing your website is properly crawled.

    5. Use the History API
    – It replaces the hashbang (#!) URL format, which will no longer be indexed.

    6. Make sure your links are anchor tags with HREF attributes
    – Googlebot only recognizes links with BOTH an anchor tag and an HREF attribute; otherwise, they won’t be crawled and therefore never indexed.

Google has run into a list of problems with JavaScript ever since it started being used to build websites. Tom shared the most commonly faced JavaScript problems, which you should take a look at so you won’t make the same mistakes.

    1. HTML delivered from the server is devoid of any content
    – This leads Googlebot to assume that there’s nothing to index.

    2. Lazy-loaded images
    – These are only sometimes indexable. To make sure they are properly indexed, use a noscript tag or structured data.
    – Take caution: images only referenced through CSS are not indexed.

    3. Content that is only triggered via an interaction won’t be indexed
    – Googlebot is not an interactive bot, which means it won’t go around clicking tabs on your website. To make sure it can get to all your content, either preload the content and toggle its visibility with CSS, or, better yet, use separate URLs to navigate users and Googlebot to those pages individually.

    4. Rendering timeouts
    – Make sure your page is efficient and performant by limiting the number of embedded resources and avoiding artificial delays such as timed interstitials.

    5. APIs that store local data are not supported by Googlebot
    – Instead, it crawls and renders your pages statelessly.
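For the lazy-loading pitfall (point 2), the noscript fallback could look like this minimal, hypothetical sketch (file names and class names are placeholders, and the lazy-loading script itself is assumed to exist elsewhere):

```html
<!-- Lazy-loaded image: a script swaps data-src into src on scroll -->
<img class="lazy" data-src="/images/product.jpg" alt="Product photo">

<!-- Fallback so Googlebot can still find and index the image -->
<noscript>
  <img src="/images/product.jpg" alt="Product photo">
</noscript>
```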

Now, due to the increasingly widespread use of JavaScript, there is another step added between crawling and indexing: rendering, the construction of the HTML itself. As mentioned before, the crawler needs to sift through your HTML in order to index your page, so JavaScript-powered websites need to be rendered before they can be indexed. According to Tom and John, Googlebot is already rendering your JavaScript websites.

Here is what we can make out of the rendering and indexing process for a JavaScript website:

    1. Googlebot uses the Chrome 41 browser for rendering
    – Chrome 41 is from 2015, and any API added after Chrome 41 is not supported.

    2. Rendering of JavaScript websites in Search is deferred
    – Rendering web pages is a resource-heavy process, so rendering may be delayed by a few days until Google has free resources.

    3. Two-phase indexing
    – The first indexing happens before the rendering process is complete; once the final render arrives, there is a second indexing.
    – The second indexing doesn’t check for the canonical tag, so the initially served version needs to include the canonical link, or else Googlebot will miss it altogether.
    – Due to the nature of two-phase indexing, the indexability, metadata, canonical tags, and HTTP codes of your web pages could be affected.

JavaScript rendering by Googlebot is deferred.

John Mueller then takes the baton and shares some basic information on the different kinds of rendering and, most importantly, the rendering method that Google prefers.

    1. Client-side rendering
    – The traditional setup, where rendering happens in the user’s browser or on the search engine’s side.

    2. Server-side rendering
    – Your server handles the rendering and serves users and search engines alike static HTML.

    3. Hybrid rendering (the long-term recommendation)
    – Pre-rendered HTML is sent to users and search engines; the server then adds JavaScript on top of that. Search engines simply pick up the pre-rendered HTML content.

    4. Dynamic rendering (the policy change)
    – This method sends client-side rendered content to users, while search engines get server-side rendered content.
    – It works by having your site dynamically detect whether a request comes from a search engine crawler.
    – Device-focused content still needs to be served accordingly (the desktop version for the desktop crawler and the mobile version for the mobile crawler).

How hybrid rendering works.

Now it is out in the open that Google supports the (new) dynamic rendering method to help the crawling, rendering, and indexing of your site, and John also gives a few suggestions on how to implement it.

    1. Puppeteer
    – A Node.js library that uses a headless version of Google Chrome, allowing you to render pages on your own server.

    2. Rendertron
    – Can be run as software or as a service that renders and caches your content on your side.

Both of these are open-source projects with plenty of room for customization. John also advises that rendering is resource-intensive, so do it out of band from your normal web server and implement caching where needed.
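John’s caching advice could be sketched like this minimal, hypothetical example. Everything here is an assumption for illustration: `renderWithHeadlessChrome` is a stub standing in for a real Puppeteer or Rendertron call, and the one-hour TTL is an arbitrary choice.

```javascript
// Sketch: a cache in front of a (stubbed) server-side renderer, so each
// URL is re-rendered at most once per hour instead of on every request.
const renderCache = new Map();
const CACHE_TTL_MS = 60 * 60 * 1000; // hypothetical: re-render hourly

async function renderWithHeadlessChrome(url) {
  // Placeholder: a real implementation would launch headless Chrome
  // (e.g. via Puppeteer) and return the page HTML after JS execution.
  return `<html><body>Rendered snapshot of ${url}</body></html>`;
}

async function getRenderedHTML(url) {
  const hit = renderCache.get(url);
  if (hit && Date.now() - hit.time < CACHE_TTL_MS) {
    return hit.html; // cache hit: serve the stored snapshot, no re-render
  }
  const html = await renderWithHeadlessChrome(url);
  renderCache.set(url, { html, time: Date.now() });
  return html;
}

getRenderedHTML('https://example.com/').then((html) => console.log(html));
```

Running this out of band from the normal web server, as John suggests, keeps the rendering load away from the machines answering user requests.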

The most important part of dynamic rendering is the ability to tell a search engine request apart from a normal user request. So how can you recognize a Googlebot request? The first way is to look for Googlebot in the user-agent string; the second is to do a reverse DNS lookup.
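The user-agent check could be as simple as the sketch below. This is an illustration, not a complete implementation: user agents can be spoofed, which is why the session also mentions the reverse DNS lookup (in Node, for instance, `dns.promises.reverse`, checking that the hostname resolves back into Google’s domains).

```javascript
// Sketch: first-pass Googlebot detection from the user-agent string.
// A production setup would additionally verify with a reverse DNS lookup,
// since any client can claim to be Googlebot in its user agent.
function isGooglebotUA(userAgent) {
  return /Googlebot/i.test(userAgent || '');
}

console.log(isGooglebotUA(
  'Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)'
)); // true
console.log(isGooglebotUA('Mozilla/5.0 (Windows NT 10.0) Chrome/66.0')); // false
```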

John stressed during the session that implementing the suggested rendering methods is not a requirement for indexing; what it does is make the process easier for Googlebot. Considering the resources needed to run server-side rendering, you might want to weigh the toll before implementing it.

So when do you need to have dynamic rendering?


When you have a large and constantly updated website, like a news portal, because you want to be indexed quickly and correctly. Or if you rely on a lot of modern JavaScript functionality that is not supported by Chrome 41, which means Googlebot won’t be able to render it correctly. And finally, if your site relies on social media or chat applications that require access to your page’s content.

Now let’s look at when you don’t need dynamic rendering. The answer is simple: if Googlebot can index your pages correctly, you don’t need to implement anything. So how do you know whether Googlebot is doing its job correctly? You can check progressively. Keep in mind that you don’t need to run tests on every single web page; just test perhaps two pages per template to make sure they are working fine.

So here’s how to check whether your pages are properly indexed by Googlebot:

    1. Use Fetch as Google in Google Search Console after verifying ownership. This shows you the HTTP response as received by Googlebot, before any rendering.

    2. Run the Google Mobile-Friendly Test. Why? Because of the mobile-first indexing being rolled out by Google, where mobile pages become the primary focus of indexing. If a page renders well in the test, it means Googlebot can render it for Search.

    3. Keep an eye out for the new function in the Mobile-Friendly Test. It shows you the Googlebot-rendered version and full information on loading issues in case the page doesn’t render properly.

    4. You can always check the developer console when your page fails in a browser. In the developer console, you can access the console log from when Googlebot tried to render something, which lets you check for a bunch of issues.

    5. All the same diagnostics can be run in the Rich Results Test for desktop-version sites.

Future direction on Googlebot rendering

At the end of the session, John also mentioned some changes that are coming. The first piece of happy news: Google will be moving rendering closer to crawling and indexing, which I assume means the second indexing will happen much sooner than before. The second: Google will make Googlebot use a more modern version of Chrome, which means wider API support. They did make it clear that these changes will not happen until at least the end of the year.

To make things easier, the session wrapped up with a tl;dr of the four steps to make sure your JavaScript-powered website is search friendly.

With that, the session concluded. Do check out our slideshow for a quick refresher. All in all, Google is taking the mic and telling you exactly what they want. Better take some notes.


Updated: 21 June 2018

Lo Jia Thong

About Lo Jia Thong

A polyglot who plummeted into the deep blue world of copywriting, armed with a burning passion for letters and a fascination with how things roll on the world wide web.

