Posted by PierceBrelinsky
- Ensuring that web pages are discoverable by search engines through linking best practices.
- Improving page load times for pages that parse and execute JS code, for a streamlined user experience (UX).
- Rendered content
- Lazy-loaded images
- Page load times
- Meta data
This template is called an app shell and is the foundation for progressive web applications (PWAs). We’ll explore this next.
When viewed in the browser, this looks like a typical web page. We can see text, images, and links. However, let’s dive deeper and take a peek under the hood at the code:
Potential SEO issues: Any core content that is rendered to users but not to search engine bots could be seriously problematic! If search engines aren’t able to fully crawl all of your content, then your website could be overlooked in favor of competitors. We’ll discuss this in more detail later.
As a best practice, Google specifically recommends linking pages using HTML anchor tags with href attributes, as well as including descriptive anchor texts for the hyperlinks:
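For example, a crawlable link might look like this (the URL and anchor text are placeholders):

```html
<!-- Recommended: an anchor tag with an href attribute and descriptive anchor text -->
<a href="/products/blue-widgets">Shop our blue widgets</a>
```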
However, Google also recommends that developers not rely on other HTML elements — like div or span — or JS event handlers for links. These are called “pseudo” links, and they will typically not be crawled, according to official Google guidelines:
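By contrast, here is a sketch of what those “pseudo” links can look like (the goTo() handler and URLs are hypothetical):

```html
<!-- Not recommended: links that rely on JS event handlers or non-anchor elements -->
<span onclick="goTo('/products/blue-widgets')">Shop our blue widgets</span>
<a onclick="goTo('/products/blue-widgets')">Shop our blue widgets</a> <!-- no href, nothing to crawl -->
```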
Potential SEO issues: If search engines aren’t able to crawl and follow links to your key pages, then your pages could be missing out on valuable internal links pointing to them. Internal links help search engines crawl your website more efficiently and highlight the most important pages. The worst-case scenario is that if your internal links are implemented incorrectly, then Google may have a hard time discovering your new pages at all (outside of the XML sitemap).
Googlebot supports lazy-loading, but it does not “scroll” like a human user would when visiting your web pages. Instead, Googlebot simply resizes its virtual viewport to be longer when crawling web content. Therefore, the “scroll” event listener is never triggered and the content is never rendered by the crawler.
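For illustration, here’s a minimal sketch of that scroll-dependent pattern, assuming each image keeps its real URL in a data-src attribute:

```javascript
// Not SEO-friendly: images only load when a "scroll" event fires.
// Googlebot resizes its viewport instead of scrolling, so this listener may never run.
window.addEventListener('scroll', () => {
  document.querySelectorAll('img[data-src]').forEach((img) => {
    if (img.getBoundingClientRect().top < window.innerHeight) {
      img.src = img.dataset.src;        // swap in the real image URL
      img.removeAttribute('data-src');  // so we don't process it again
    }
  });
});
```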
Here’s an example of more SEO-friendly code:
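Here is a minimal sketch of the IntersectionObserver approach, again assuming each image keeps its real URL in a data-src attribute:

```javascript
// SEO-friendlier: load each image as soon as it enters the viewport,
// with no dependency on scroll events.
const imageObserver = new IntersectionObserver((entries, observer) => {
  entries.forEach((entry) => {
    if (entry.isIntersecting) {
      const img = entry.target;
      img.src = img.dataset.src;   // swap in the real image URL
      img.removeAttribute('data-src');
      observer.unobserve(img);     // stop watching once the image has loaded
    }
  });
});

document.querySelectorAll('img[data-src]').forEach((img) => imageObserver.observe(img));
```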
This code shows that the IntersectionObserver API triggers a callback when any observed element becomes visible. It’s more flexible and robust than the on-scroll event listener and is supported by modern Googlebot. This code works because of how Googlebot resizes its viewport in order to “see” your content (see below).
You can also use native lazy-loading in the browser. This is supported by Google Chrome, but note that it is still an experimental feature. Worst case scenario, it will just get ignored by Googlebot, and all images will load anyway:
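A minimal example using the loading attribute (the image path and alt text are placeholders):

```html
<!-- Native lazy-loading: the browser decides when to fetch the image, no JS required -->
<img src="/images/product-photo.jpg" alt="Product photo" loading="lazy" width="600" height="400">
```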
Potential SEO issues: As with core content, it’s important to make sure that Google is able to “see” all of the content on a page, including images. For example, on an e-commerce site with multiple rows of product listings, lazy-loading images can provide a faster experience for both users and bots, provided it’s implemented in a way Googlebot can render.
- Deferring non-critical JS until after the main content is rendered in the DOM
- Inlining critical JS
- Serving JS in smaller payloads
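To illustrate the first two points above, here’s a minimal sketch (the bundle path is hypothetical):

```html
<!-- Inline the small amount of JS that's critical for rendering above-the-fold content -->
<script>
  document.documentElement.classList.add('js-enabled');
</script>

<!-- Defer non-critical JS so it executes only after the HTML has been parsed -->
<script src="/js/app.bundle.js" defer></script>
```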
Also, it’s important to note that SPAs that utilize a router package like react-router or vue-router have to take some extra steps to handle things like changing meta tags when navigating between router views. This is usually handled with an npm package like vue-meta or react-meta-tags.
What are router views? Here’s how linking to different “pages” in a Single Page Application works in React in five steps:
- When a user visits a React website, a GET request is sent to the server for the ./index.html file.
- The server then sends the index.html page to the client, containing the scripts to launch React and React Router.
- The web application is then loaded on the client-side.
- If a user clicks a link to go to a new page (/example), a request for the new URL would normally be sent to the server.
- Instead, React Router intercepts the request before it reaches the server and handles the page change itself. This is done by locally updating the rendered React components and changing the URL client-side.
In other words, when users or bots follow links to URLs on a React website, they are not being served multiple static HTML files. Rather, the React components (like headers, footers, and body content) hosted on the root ./index.html file are simply reorganized to display different content. This is why they’re called Single Page Applications!
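As a rough illustration of those steps, here’s what a minimal React Router setup can look like (react-router-dom v6 syntax; the component names and paths are hypothetical):

```jsx
import React from "react";
import { BrowserRouter, Routes, Route, Link } from "react-router-dom";

// Hypothetical "pages": really just components swapped in and out of the same index.html
const Home = () => <h1>Home</h1>;
const Example = () => <h1>Example page</h1>;

export default function App() {
  return (
    <BrowserRouter>
      {/* React Router intercepts clicks on these links and updates the URL client-side */}
      <nav>
        <Link to="/">Home</Link> <Link to="/example">Example</Link>
      </nav>
      <Routes>
        <Route path="/" element={<Home />} />
        <Route path="/example" element={<Example />} />
      </Routes>
    </BrowserRouter>
  );
}
```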
Potential SEO issues: So, it’s important to use a package like React Helmet for making sure that users are being served unique metadata for each page, or “view,” when browsing SPAs. Otherwise, search engines may be crawling the same metadata for every page, or worse, none at all!
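A minimal sketch of per-view metadata with React Helmet (the title, description, and URL values are placeholders):

```jsx
import React from "react";
import { Helmet } from "react-helmet";

// Each "view" renders its own unique metadata into the document <head>
export default function ExamplePage() {
  return (
    <>
      <Helmet>
        <title>Example Page | My SPA</title>
        <meta name="description" content="A unique description for this view." />
        <link rel="canonical" href="https://www.example.com/example" />
      </Helmet>
      <h1>Example page</h1>
    </>
  );
}
```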
First, Googlebot crawls the URLs in its queue, page by page. The crawler makes a GET request to the server, typically using a mobile user-agent, and then the server sends the HTML document.
Then, Google decides what resources are necessary to render the main content of the page. Usually, this means only the static HTML is crawled, and not any linked CSS or JS files. Why?
In other words, Google crawls and indexes content in two waves:
- The first wave of indexing, or the instant crawling of the static HTML sent by the web server
- The second wave of indexing, or the deferred crawling of any additional content that relies on JavaScript to be rendered
The bottom line is that content dependent on JS to be rendered can experience a delay in crawling and indexing by Google. This used to take days or even weeks, and Googlebot historically ran on the outdated Chrome 41 rendering engine. However, Google has significantly improved its web crawlers in recent years.
Content can also be missed entirely if the JS resources it depends on are blocked in robots.txt, since Googlebot can’t fetch resources it isn’t allowed to crawl.
For e-commerce websites, which depend on online conversions, not having their products indexed by Google could be disastrous.
- Visualize the page with Google’s Webmaster Tools. This helps you to view the page from Google’s perspective.
- Debug using Chrome’s built-in dev tools. Compare and contrast what Google “sees” (source code) with what users see (rendered code) and ensure that they align in general.
There are also handy third-party tools and plugins that you can use. We’ll talk about these soon.
Google Webmaster Tools
The best way to determine if Google is experiencing technical difficulties when attempting to render your pages is to test your pages using Google Webmaster tools, such as:
- The URL Inspection Tool in Google Search Console
- Google’s Mobile-Friendly Test
Both of these Google Webmaster tools use the same evergreen Chromium rendering engine that Googlebot uses. This means that they can give you an accurate visual representation of what Googlebot actually “sees” when it crawls your website.
There are also third-party technical SEO tools, like Merkle’s fetch and render tool. Unlike Google’s tools, this web application actually gives users a full-size screenshot of the entire page.
Site: Search Operator
Alternatively, for a quick check on whether Google has indexed a specific piece of JS-rendered content, copy a snippet of text from the rendered page and search for it in Google alongside the site: operator (for example, site:yourdomain.com “a snippet of your rendered text”). If the page appears in the results, Google was able to crawl, render, and index that content. Here’s what this looks like in the Google SERP:
Chrome Dev Tools
Right-click anywhere on a web page to display the options menu, then click “View Source” to see the static HTML document in a new tab. You can also click “Inspect” to open Chrome Dev Tools and examine the rendered DOM, including any content that JS has injected into the page.
Compare and contrast these two perspectives to see if any core content is only loaded in the DOM, but not hard-coded in the source. There are also third-party Chrome extensions that can help do this, like the Web Developer plugin by Chris Pederick or the View Rendered Source plugin by Jon Hogg.
- Server-side rendering (SSR). This means that JS is executed on the server for each request. One way to implement SSR is with a Node.js library like Puppeteer. However, this can put a lot of strain on the server.
- Hybrid rendering. This is a combination of both server-side and client-side rendering. Core content is rendered server-side before being sent to the client. Any additional resources are offloaded to the client.
- Incremental Static Regeneration, or updating static content after a site has already been deployed. This can be done with frameworks like Next.js for React or Nuxt.js for Vue. These frameworks have a build process that will pre-render every page of your JS application to static assets that you can serve from something like an S3 bucket. This way, your site can get all of the SEO benefits of server-side rendering, without the server management!
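For example, the pre-rendering described in the last point might look roughly like this in Next.js (the API endpoint, route, and revalidation interval are hypothetical):

```jsx
// pages/products/[id].js: pre-rendered at build time, then regenerated incrementally
export async function getStaticPaths() {
  // Pre-render a known set of product pages when the site is built
  return {
    paths: [{ params: { id: "1" } }],
    fallback: "blocking",
  };
}

export async function getStaticProps({ params }) {
  const res = await fetch(`https://api.example.com/products/${params.id}`);
  const product = await res.json();

  return {
    props: { product },
    revalidate: 60, // regenerate this static page at most once every 60 seconds
  };
}

export default function ProductPage({ product }) {
  return <h1>{product.name}</h1>;
}
```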
Note, for websites built on a content management system (CMS) that already pre-renders most content, like WordPress or Shopify, this isn’t typically an issue.
The web has moved from plain HTML – as an SEO you can embrace that. Learn from JS devs & share SEO knowledge with them. JS's not going away.
— John (@JohnMu) August 8, 2017
Want to learn more about technical SEO? Check out the Moz Academy Technical SEO Certification Series, an in-depth training series that hones in on the nuts and bolts of technical SEO.