SEO, June 25, 2020 | 29 min read

The Importance Of Technical SEO: Insights Of Technical Issues Of Unibaby Project

Holistic SEO Expert at Performics

In this SEO case study, I'll show how to implement effective technical SEO and authoritative content and increase a site's organic traffic by 195% in two months. This case will be shared in a series of three articles. The first part focuses primarily on technical and analytical SEO elements and on search engine theories and principles. We will examine the Unibaby project, and I will explain the most pressing problems we identified and the processes we followed to resolve them. We'll also take a look at the effect of the May 2020 Core Algorithm Update on the brand.

Contents

1. Getting started: meet Performics Agency
2. Introducing the Unibaby SEO case
3. Analyzing sites for technical issues
4. Status code optimization and communication with search engine crawlers
4.1 Difference between 404 and 410 status code
4.2 From temporary to permanent redirect
4.3 URL typo correction and canonicalization
4.4 Index cleansing: removing duplicate and underperformed pages
4.5 Removing redirection chains (301 Inlinks) and 404 resources
5. Crawlability issues and wrong robots.txt rules
5.1 Uncertainty principle for search engines and ranking ecosystem
5.2 What we do for better crawling, rendering, and indexing
6. Conclusions and takeaways

Getting started: meet Performics Agency

My name is Koray Tuğberk GÜBÜR, and I am a Holistic SEO Expert. I also work in many different fields, from front-end development to data science and from UX to CRO. I work at Performics Marketing Agency, which serves big global brands in more than 57 countries, e.g., Loreal, Samsung, Turkish Airlines, HP, Toyota, and Nestle, in the areas of Analytics, SEO, SEA, Media Engagement, and more. Holder of the Forrester Wave Best Agency award, Performics is part of Publicis Groupe, one of the oldest and largest marketing and communication holdings globally.

“Our SEO work with Performics and Tuğberk demonstrated its benefits in the short term. We have seen how to think about even the smallest SEO problems from a broad technical, theoretical, and practical view, and the importance of time management within a team spirit. We now see Holistic SEO work as a tradition that must be owned at every level of the company.”

Ayşe Akalın, Media Manager, Eczacıbaşı Holding

“Due to the high competition in the mother and baby industry, we are delighted with the in-depth and detailed analysis that Performics and Tuğberk have provided on our content production processes and our site infrastructure, beyond the SEO analysis we have received so far. Thanks to the rapid action plans, we were able to grow over 500% from the beginning of the year.”

İlker Öztürk, Eczacıbaşı Consumer Products

Introducing the Unibaby SEO case

In this SEO case study, I'll tell you about Unibaby, one of Turkey's biggest product brands in the Mother and Baby niche. Unibaby sells its products mostly on aggregator eCommerce sites. This was mainly because of the brand's weak search engine performance, but last year they decided to change this situation.

When I started working with the project, I saw that the brand's products were being searched all over the internet, but the brand was not recorded on Google's Knowledge Graph.

As you can see, the Google Knowledge Graph API returns an empty list for the "Unibaby" search. This means that Google does not recognize the brand and its website as an entity. That's why its E-A-T and brand authority were lower than those of the aggregators despite the search demand for its products.

When I search for Eczacıbaşı in the Knowledge Graph API's database, there are lots of results for Eczacıbaşı Holding, the Eczacıbaşı family, and their other brands. Thanks to entity-based search engine algorithms, Unibaby can inherit the authority and trust of this parent entity.
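The lookup described above can be reproduced with a few lines of Python. This is a minimal sketch against the public Knowledge Graph Search API endpoint; the helper names are my own, and you need to supply your own API key. An empty list from entity_names corresponds to the "empty list" result mentioned for the Unibaby query.

```python
import json
import urllib.parse
import urllib.request

# Public endpoint of the Google Knowledge Graph Search API.
KG_ENDPOINT = "https://kgsearch.googleapis.com/v1/entities:search"


def build_kg_url(query, api_key, limit=5):
    """Build the request URL for an entity search (helper name is illustrative)."""
    params = {"query": query, "key": api_key, "limit": limit}
    return f"{KG_ENDPOINT}?{urllib.parse.urlencode(params)}"


def entity_names(response):
    """Extract entity names from a parsed API response.

    An unrecognized brand yields an empty "itemListElement" list,
    so this function returns [] in that case.
    """
    return [
        item.get("result", {}).get("name", "")
        for item in response.get("itemListElement", [])
    ]


def search_entities(query, api_key):
    """Call the API over the network and return the matching entity names."""
    with urllib.request.urlopen(build_kg_url(query, api_key)) as resp:
        return entity_names(json.load(resp))


# Usage (requires a valid key, placeholder shown here):
# print(search_entities("Unibaby", "YOUR_API_KEY"))
```

Running the same function for a brand and for its parent company is a quick way to compare how each one is represented as an entity.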

The second point to work on was the website page speed (which is still far from perfect because of the new deployments). The third thing appeared to be technical SEO and authoritative content marketing, focusing on the better page layout.

This is the original organic traffic graph for the same web entity for the last three years.

This graph shows more than two years of organic traffic data. It illustrates the effect that technical SEO and a proper content marketing strategy can have over a short period.

Analyzing sites for technical issues

We use technical website audits to identify SEO problems and to check how well a site complies with search engines' requirements.

Specialists often carry out a site audit when the site has been penalized by search engines. In this case, an audit allows you to find errors, fix them, and recover your positions.

You should also run an on-page SEO audit when everything is going well with the project: traffic is growing, and positions are gradually improving. You can further speed up the growth of the website by identifying and correcting technical issues.

You should perform a detailed site check before starting work on a site's SEO. It will help to find both minor issues and severe problems so that the optimizer can safely work on other ranking factors and promote the site.

Create a project

Go to List of projects and add the domain for analysis. Press the Add project button. In the pop-up window, enter the project name and the domain, select the group for the project, and click the Create button:

Then set the necessary parameters (check the manual for more details) and click Start.

When the scan is over, you will see an SDO score, Serpstat's internal indicator of how well your domain is optimized. It is estimated from the number of issues found and their criticality relative to the total number of possible issues on the site. The higher this indicator, the better the optimization.

Below you can see the list of technical issues with recommendations on how to fix them.

Now let's talk about some important aspects of technical SEO in detail.

Status code optimization and communication with search engine crawlers

A status code is the response a resource returns to a request. Search engine crawlers use it to decide how to process that resource according to their rules.

Wrong status codes give false signals, and they create confusion for search engines and their algorithms. To create a search engine friendly web entity, you should ensure semantic, systematic, and consistent communication with search engine crawlers in every aspect of the digital environment.

Difference between 404 and 410 status code

Google has millions of small algorithms that are continuously working to provide the most accurate search results. If you don't give the right signal at the right point, these algorithms will have to work much harder to understand your website, which can negatively impact your site's performance.

You may see that when we have started to implement technical SEO

A 404 status code means that the resource could not be found. A 410 status code indicates that the resource is gone and permanently unavailable.

If you perform log analysis, you will probably see that search engine robots crawl lots of 404 URLs. You might have deleted a web page five years ago, but Googlebot has a long memory. It keeps crawling the web and adding data to its search memory. This means that Google may not index your 404 web pages, but it will continue to crawl them with decreasing frequency to confirm that they are not available.
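If you do have access logs, a short script is enough to see which 404 URLs Googlebot keeps requesting. The sketch below assumes the common "combined" log format used by Apache and Nginx (IIS logs use a different W3C layout and would need a different pattern), and it trusts the user agent string, which can be spoofed. The function name is illustrative.

```python
import re
from collections import Counter

# Matches the request, status, size, referrer and user agent fields
# of a "combined" access log line (an assumption about the log format).
LOG_LINE = re.compile(
    r'"(?P<method>[A-Z]+) (?P<path>\S+) [^"]*" (?P<status>\d{3}) \S+ '
    r'"[^"]*" "(?P<agent>[^"]*)"'
)


def googlebot_404_hits(lines):
    """Count requests per path where Googlebot received a 404."""
    hits = Counter()
    for line in lines:
        match = LOG_LINE.search(line)
        if match and match["status"] == "404" and "Googlebot" in match["agent"]:
            hits[match["path"]] += 1
    return hits


# Usage with a real file:
# with open("access.log", encoding="utf-8") as log:
#     for path, count in googlebot_404_hits(log).most_common(20):
#         print(count, path)
```

The paths that show up most often are good candidates for a 410 or a redirect, depending on whether a relevant replacement page exists.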

For Unibaby, I didn't perform log analysis because the IT team didn't store the log files. If you want to learn more about how the 410 status code affects your log file profile, you may examine one of my previous SEO case studies, where I talked about keeping log files on IIS servers without slowing the server down. Here, without log files, we changed the old web pages, resources, and other kinds of 404 errors to 410.

Jin Liang from Google giving a presentation about robots.txt and the different meanings of status codes for Googlebot.

While this didn't create a huge file, we changed the status code of more than 600 resources from 404 to 410. For a small website, this is still a significant volume.
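The 404-to-410 change comes down to one rule: a URL that was deliberately removed should answer 410, while an unknown URL keeps answering 404. Here is a minimal sketch of that rule, independent of any web framework. The function name and the two path sets are hypothetical, not the project's actual implementation.

```python
from http import HTTPStatus


def status_for(path, live_paths, removed_paths):
    """Pick the status code a server should return for a requested path.

    live_paths:    paths that currently exist
    removed_paths: paths that were deleted on purpose and will not return
    """
    if path in live_paths:
        return HTTPStatus.OK
    if path in removed_paths:
        # Deliberately deleted: tell crawlers the resource is gone for good.
        return HTTPStatus.GONE
    # Never existed (or unknown): a plain "not found".
    return HTTPStatus.NOT_FOUND


live = {"/", "/products"}
removed = {"/old-campaign", "/old-blog-post"}
print(int(status_for("/old-campaign", live, removed)))  # deliberately removed page
print(int(status_for("/typo-url", live, removed)))      # unknown URL
```

In practice, the removed set would come from the list of deleted pages, for example an export from the CMS or from a crawl made before the cleanup.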

From temporary to permanent redirect

There is a considerable difference between 301/308 permanent redirects and 302/307 temporary redirects in terms of SEO.

301 and 308 redirects are permanent. They have a definite meaning for Googlebot: the old resource has moved to another location, and the crawler should stop requesting the old URL. The difference between 301 and 308 is how the HTTP method is handled. With a 301 redirect, the client may change the request method when it follows the redirect (for example, from POST to GET). With a 308 redirect, the request must be repeated against the target with the same HTTP method.

302 and 307 temporary redirects follow similar logic. They send users and crawlers to the redirection target, but the status code says that the redirection is temporary. So Googlebot assumes that the old URL is the main URL and the source of the content, and that one day it will be served to users again.

According to the Bing Webmaster Guidelines, you shouldn't use a temporary redirection for more than 15 days. If your temporary redirects last longer than 15 days, it means that your methodology is wrong. For Google, we don't know the time limit, but improper redirection methods may still have different meanings and outcomes in terms of crawl budget, PageRank calculation, and canonicalization.

The difference between 302 and 307 redirects is the same as between 301 and 308: a 302 redirect allows the request method to change, while a 307 does not.

301 and 302 redirects give the same amount of PageRank score for the target URL.

301 redirect has a canonical effect, unlike 302.

301 redirect makes Googlebot and other search engine crawlers forget about old URLs, unlike 302 redirect, which has a crawl budget effect on the web entity.

With 302 redirects, the source URL can stay on the search results page for months, unlike with 301 redirects. (You may also notice when Googlebot, like Bing, changes its opinion about your redirect.)

Most developers tend to use 302 redirects because they are easy to set up, even with a meta tag or JavaScript (a redirect with JavaScript is not recommended). As an SEO, I need to give this information to the developer team, with proper documentation explaining the reasons for the approach.
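When documenting this for a developer team, it can help to put the four redirect codes side by side. The sketch below encodes the permanent/temporary and method-handling distinctions described above and flags temporary redirects in a crawl export. The crawl data structure (a mapping of URL to status code) and the function names are assumptions for illustration.

```python
# (duration, request method must be preserved) for each redirect code.
REDIRECT_SEMANTICS = {
    301: ("permanent", False),
    302: ("temporary", False),
    307: ("temporary", True),
    308: ("permanent", True),
}


def describe_redirect(code):
    """Return a one-line description of a redirect status code."""
    if code not in REDIRECT_SEMANTICS:
        return f"{code}: not one of the four redirect codes discussed here"
    duration, keeps_method = REDIRECT_SEMANTICS[code]
    method_note = (
        "method and body must be preserved"
        if keeps_method
        else "clients may switch POST to GET"
    )
    return f"{code}: {duration}, {method_note}"


def temporary_redirects(crawl):
    """List URLs in a {url: status_code} crawl export that redirect temporarily."""
    return sorted(
        url
        for url, code in crawl.items()
        if REDIRECT_SEMANTICS.get(code, ("", False))[0] == "temporary"
    )


for code in (301, 302, 307, 308):
    print(describe_redirect(code))
```

The output of temporary_redirects is a starting list for the conversation with developers: each entry is either genuinely temporary or should become a 301.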

URL typo correction and canonicalization

Just as 301 and 302 redirects differ, example.com/example and example.com/eXample are different URLs in terms of SEO. The best practice is to 301 redirect these kinds of URL typos so that search engine crawlers see a consistent and semantic URL structure with strict rules, which decreases calculation time and algorithmic costs.

Self-canonicalization is another crucial issue around URL typos. Every URL variant with a capitalized letter declared itself as its own canonical URL, which gives a wrong signal to Google. The canonical URL, sitemap URL, and hreflang URL should be consistent with each other and with the indexing and robots.txt rules.

URL's shape and meaning are important for the users

Here, we fixed the self-canonicalization problems of the URL typos with the developer team. In this instance, the URL typos themselves couldn't be corrected because of the brand's high marketing intensity.
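The example.com/eXample case can be expressed as a small normalization rule. This sketch assumes a site policy where every path is lowercase, which is one common convention rather than a universal requirement. The same canonical_url value is what the canonical tag, sitemap, and hreflang entries would point to. Function names are illustrative.

```python
from urllib.parse import urlsplit, urlunsplit


def canonical_url(url):
    """Lowercase the scheme, host and path; leave query and fragment untouched."""
    parts = urlsplit(url)
    return urlunsplit(
        (
            parts.scheme.lower(),
            parts.netloc.lower(),
            parts.path.lower(),
            parts.query,
            parts.fragment,
        )
    )


def redirect_for(url):
    """Return (301, target) if the URL is a case variant, else None."""
    target = canonical_url(url)
    if target == url:
        return None
    return 301, target


print(redirect_for("https://example.com/eXample"))
print(redirect_for("https://example.com/example"))
```

If redirects are not possible, as in this project, the canonical tag on the case variant can still point to canonical_url instead of to itself.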

Index cleansing: removing duplicate and underperformed pages

On every SEO project, I recommend removing underperforming web pages or combining them within a more significant topic cluster. On Unibaby.com.tr, there were lots of underperforming web pages with overlapping information. There were also exact duplicate web pages because of the infrastructure problems mentioned above.

Every blog web page had an exact duplicate as well as a self-referencing canonical. We redirected the duplicates to the original resource.

We performed the same process for duplicate web pages.

We removed more than 500 web pages from the index.

We stopped the ranking signal division in duplicate pages.

We improved our crawl efficiency.

We also redirected some of the underperforming web pages by unifying them with related ones.

There were also other kinds of exact duplicate web pages because of the additional filters. Different colors of the same product had different URLs even though they don't have any search demand. We reported these duplications, but they couldn't be addressed before the May Core Algorithm Update.
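Exact duplicates like these can be found by fingerprinting the extracted text of each crawled page and grouping URLs that share a fingerprint. The sketch below is one simple way to do it, assuming you already have a mapping of URL to page text from a crawl. It catches only exact duplicates after whitespace and case normalization, not near-duplicates.

```python
import hashlib
import re
from collections import defaultdict


def fingerprint(text):
    """Hash the page text after collapsing whitespace and lowercasing."""
    normalized = re.sub(r"\s+", " ", text).strip().lower()
    return hashlib.sha256(normalized.encode("utf-8")).hexdigest()


def duplicate_groups(pages):
    """Group URLs whose text is identical. pages is a {url: text} mapping."""
    groups = defaultdict(list)
    for url, text in pages.items():
        groups[fingerprint(text)].append(url)
    return [sorted(urls) for urls in groups.values() if len(urls) > 1]


# Hypothetical crawl data for illustration only.
pages = {
    "/blog/baby-care": "Baby care tips for new parents",
    "/Blog/baby-care": "Baby  care tips for new parents ",
    "/blog/sleep": "How to build a sleep routine",
}
print(duplicate_groups(pages))
```

Each group then needs a decision: one URL stays as the original, and the others are redirected to it or canonicalized.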

This is the comparison between the first 24 days of May and April. As you may see, after the sudden growth, organic traffic declined by around 10% despite a nearly identical average position. You will see more detail about this situation in the last part.

Removing redirection chains (301 Inlinks) and 404 resources

When we delete lots of content and web pages along with lots of redirection changes, we need to check our content and web pages' source code to update any remaining internal links for deleted and redirected web pages. This is critical to ensure a semantic, consistent, and transparent site structure that can be easily crawled and understood by search engine crawlers.

We cleaned several redirect chains, old and missing resources, and deleted web pages' links from the source codes. I used Serpstat Site Audit to determine these errors.
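The same check can be sketched in code for readers who work from a raw crawl export. The example assumes two inputs that most crawlers can produce: a mapping of URL to (status code, redirect target) and a list of (source page, link target) pairs. All names and sample URLs are hypothetical.

```python
REDIRECT_CODES = (301, 302, 307, 308)


def resolve(url, responses, max_hops=10):
    """Follow redirects in a crawl snapshot and return (final_url, hops)."""
    hops = 0
    seen = {url}
    while hops < max_hops:
        status, location = responses.get(url, (404, None))
        if status not in REDIRECT_CODES or location is None:
            break
        url = location
        hops += 1
        if url in seen:  # redirect loop: stop following
            break
        seen.add(url)
    return url, hops


def links_to_fix(internal_links, responses):
    """Report internal links that pass through a redirect or end in 404/410."""
    report = []
    for source, href in internal_links:
        final_url, hops = resolve(href, responses)
        final_status = responses.get(final_url, (404, None))[0]
        if hops > 0 or final_status in (404, 410):
            report.append((source, href, final_url, final_status))
    return report


# Hypothetical crawl snapshot.
responses = {
    "/a": (301, "/b"),
    "/b": (301, "/c"),
    "/c": (200, None),
    "/gone": (404, None),
}
links = [("/home", "/a"), ("/home", "/c"), ("/home", "/gone")]
for row in links_to_fix(links, responses):
    print(row)
```

Links reported with a final status of 200 should be updated to point straight at the final URL, and links ending in 404 or 410 should be removed or replaced.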

Another screenshot for the Pageview changes for the last three months.

Crawlability issues and wrong robots.txt rules

A good relationship between SEO and developers is crucial to the success of an SEO project. Even the slightest difference in the robots.txt file can create a significant difference for the site.

On this project, I identified an issue with the robots.txt. When I fetched, downloaded, and rendered the homepage, I started to understand why there are paid services for only checking and automatically fixing robots.txt files. There was nothing in the robots.txt file.

This example screenshot is from the mobile-friendly test. You may also use Chrome's network conditions tab to check how a web page looks to specific user agents (such as Googlebot or Googlebot Smartphone). The latter will work unless your server performs a reverse DNS lookup.

The real problem here was with Unibaby's CDN address. The robots.txt file at img-unibaby.mncdn.com was blocking Googlebot and other crawlers from fetching Unibaby.com's CSS and JS assets.
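A blocked-asset problem like this can be caught with Python's standard library robots.txt parser. The sketch below checks a list of asset URLs against a robots.txt body for a given user agent. The robots.txt content and the cdn.example.com URLs are hypothetical stand-ins, since the actual rules on the CDN are not shown here.

```python
from urllib.robotparser import RobotFileParser


def blocked_assets(robots_txt, asset_urls, user_agent="Googlebot"):
    """Return the asset URLs that the given robots.txt body disallows."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return [url for url in asset_urls if not parser.can_fetch(user_agent, url)]


# Hypothetical robots.txt that blocks everything for every crawler.
cdn_robots = """\
User-agent: *
Disallow: /
"""

assets = [
    "https://cdn.example.com/css/main.css",
    "https://cdn.example.com/js/app.js",
]
print(blocked_assets(cdn_robots, assets))
```

Remember that each host has its own robots.txt: the rules that matter for an asset are the ones on the host that serves it, not the ones on the main domain.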

Uncertainty principle for search engines and ranking ecosystem

This means that Googlebot saw everything on Unibaby.com.tr as it would have looked on the 1990s internet. Here, we need to understand how Google thinks and ask some questions.

Why didn't Google remove a website with such a bad design and UX from the index even though it was devoid of CSS and JS?

Is this the first time Google has seen a firm that has blocked its CSS and JS resources from Googlebot on a CDN?

If Google sees a web page without CSS and JS while users see the same text and images with CSS and JS (for Googlebot, the images returned 404 because JavaScript assets were drawing them), what would Google think? Is it cloaking or some other sneaky spam move?

Or does it just simulate a real user and see that the texts and images are the same, but the design and functions of the web page are different?

Google and other search engines have developed a new principle. It is the Uncertainty Principle. As time goes by, new ranking factors and metrics have been developed.

They have also developed an algorithm hierarchy that works in harmony. The algorithm hierarchy decreased the cost and increased the effectiveness and speed of creating better SERP options.

The biggest winner from May Core Algorithm Update is Pinterest. Usually, searching for a pattern between winners and losers can help you to understand the search engine

In the old days, every on-page or off-page change rapidly showed changes in Google. But today, the impact of these changes is slow and hardly noticeable. When you lose a ranking, you can't always immediately identify the reason. That's why an auto-monitoring system for every digital aspect is essential.

In our example, Google didn't remove the whole domain from the index for its pure HTML content and design, because the Uncertainty Principle gives Google's algorithms the ability to allow for gray areas.

In the old days, a bad design could have been a reason for deindexing, but now one of the baby algorithms of Google RankBrain Ecosystem can tell that this is not spam or bad design, it is a web developing error that shouldn't hurt the user.

An example of subintents, neural matching via Uncertainty Principles on Google patents for the given query on the screenshot.

Why didn't Google give the full Rank Power to these pages if they don't hurt the user? Because of the same reason. Uncertainty Principle is a methodology for analyzing gray areas. It may not be spam or an intentional move and doesn't hurt the users, but it is still an error for crawling, rendering, and indexing along with other kinds of suspicious possibilities.

That's the beauty of the hierarchy of algorithms and millions of baby algorithms of Google. The reasoning is always vital for SEO, especially when trying to interpret a secret ranking system.

What we do for better crawling, rendering, and indexing

In this example, we have seen that using pure HTML links is a much better option than JavaScript-based links. With a minor rendering or resource fetching error, Googlebot can't crawl any links on the page and can't index the landing pages. Because of this, I recommended that any JavaScript-powered resources (in this example, the images) be present in the HTML without the need for rendering.

Now, even if JS is not rendered on Unibaby.com.tr, we always show the main content and internal links.
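One way to verify this is to parse the raw HTML response, before any JavaScript runs, and list the links and images it contains. The sketch below uses only the standard library HTML parser. The class and function names are my own, and the sample markup is hypothetical.

```python
from html.parser import HTMLParser


class LinkAndImageCollector(HTMLParser):
    """Collect <a href> and <img src> values from unrendered HTML."""

    def __init__(self):
        super().__init__()
        self.links = []
        self.images = []

    def handle_starttag(self, tag, attrs):
        attributes = dict(attrs)
        if tag == "a" and attributes.get("href"):
            self.links.append(attributes["href"])
        elif tag == "img" and attributes.get("src"):
            self.images.append(attributes["src"])


def crawlable_without_js(html):
    """Return (links, images) that a crawler can see without rendering."""
    collector = LinkAndImageCollector()
    collector.feed(html)
    return collector.links, collector.images


# A page with links and images in the HTML itself...
server_rendered = '<a href="/products">Products</a><img src="/img/wipes.jpg">'
# ...and a page that relies on JavaScript to draw everything.
js_only = '<div id="app"></div><script src="/app.js"></script>'

print(crawlable_without_js(server_rendered))
print(crawlable_without_js(js_only))
```

If the second kind of result comes back for an important template, the internal links and images on it depend entirely on successful rendering.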

I also reduced the size of web page elements, which is still an optimization, albeit a minor one. Along with that, Unibaby.com.tr address has started to use a service worker for repeated visits. These are mostly about Pagespeed, but I'm briefly mentioning them because they are also related to crawling efficiency for search engine robots.

About the CDN with problematic robots.txt file:

The development team had also copied the entire website, with the same content, images, links, and URL structure, onto the CDN. All of the duplicated pages were open to indexing. So, Google encountered an exact match copy of Unibaby at its CDN address and indexed it for the same queries. Imagine your CDN address competing with you with the same content!

That's why I have told you about the Uncertainty Principle of search engines. What would you think if you saw this as a search engine? I always ask this question to myself, and that's why I always read about Google Patents and SEO case studies together.

Our CDN always has the most of the content on our web pages

Google indexed the exact match copy web pages because they were open to indexing. These exact match copies at the CDN address were also creating ranking signal dilution. If your content, design, logo, and other page components are the same as another domain's pages and they are competing against each other, Google will split the ranking signals among the copies.

Google calls this content clustering. It collects similar or duplicate content to understand the source and determine the most influential brand for that content cluster. In this way, it also creates a more efficient SERP with unique content and publishers.

An example of slow 3G connection, 4x CPU throttling PageSpeed metrics

The most significant risk of content clustering is content hijacking (a blackhat technique, which is why I prefer not to give more details). Content hijacking is a method to decrease the ranking signal of a domain for a query with specific content.

If your CDN address has links that point to you and duplicate content, Google might think you are doing this on purpose. Because of this situation, Unibaby's CDN was taking impressions and clicks for the same queries as Unibaby.

Google chose to show the CDN address in the search results, but it didn't give the CDN all of the ranking signals. Again, thinking in the gray area helped the search engine.

That's why the Uncertainty Principle is essential for the web. There are tons of different aspects of every page and situation.

After fixing the robots.txt file, I tested the pages with the mobile-friendly test and Google Search Console. The web pages have now returned to 2020 in terms of design and functionality.

We also deleted all of the exact match copy web pages from Unibaby's CDN.

What would happen if Unibaby's CDN served all of the CSS and JS resources without blocking Googlebot, while Unibaby's main site had none of them? I wonder how Google's millions of baby algorithms, in their hierarchy, would judge that situation.

This diagram shows what the purpose of a query can be according to Google's Uncertainty Principle and a mindset for designing a SERP accordingly. Source: Evaluating semantic interpretations of a search query, Google Patents

Conclusions and takeaways

In this first article, we dealt with this case study's perspective, subject matter, and approach to SEO. We have also seen the different meanings that even the smallest SEO changes can convey to a search engine and how they can affect a web entity's SERP performance.

You can also see the click difference compared to the same period last year. The click data of the previous year is expressed in the gray area. The click increase obtained from the beginning of the year is 758%.
Design source: Databox, data source: Google Search Console

A successful SEO project aims to give the search engine precise, consistent, and organized signals by optimizing the more basic SEO requirements first. If you skip this step, you may not be able to deliver more complex and deep messages to the hierarchy of algorithms run by the search engine as quickly, inexpensively, and efficiently.

In the second article, many topics such as Page Speed optimization, User Experience, internal linking, site structure, semantic HTML usage, and PageRank distribution will be covered using the same perspective and attitude.