Google’s Search Quality Rater Guidelines provide an invaluable look into how Google judges page design, user experience, and content quality when evaluating search results. If you work in SEO in any capacity, it really is a must-read.
I went through the entire document (175+ pages) to help summarize the most important takeaways.
What are the Google Search Quality Rater Guidelines?
Google’s Search Quality Rater Guidelines (QRG) is a dense document of 175 pages and counting, updated yearly. Google created the document, and frequently revises it, to analyze and validate its search algorithm. Human Quality Raters are hired to manually rate search results based on this guidance.
To get something useful out of the Quality Rater Guidelines, you need to pay attention to the nuance in its explanations and examples. It’s easy to generalize from the main sections of the document, but the example details paint a more precise picture of best practices.
Armed with an understanding of why the Google Quality Rater Guidelines exist and why you should read them, let’s dive into the major sections and what they teach us about the Google search algorithm:
Google Quality Rater Guidelines Sections
The bulk of the QRG consists of examples that illustrate its concepts; the rest is reference material of explanations and definitions.
There are three major sections to the Quality Rater Guidelines: Page Quality ratings, mobile searcher considerations, and Needs Met ratings.
There’s an additional section that goes through how Quality Raters submit tasks, which can probably be omitted from your study without losing understanding.
Let’s go through each section and its components:
Determining Page Quality Ratings
The basis of a quality rating is purpose.
It’s hard to get a high-quality rating without a beneficial purpose, high-quality main content, a positive reputation, and strong E-A-T.
Note that E-A-T stands for Expertise, Authoritativeness, and Trustworthiness.
Therefore, to achieve the highest ratings, your main content must be of the highest quality, backed by a very positive reputation and very high E-A-T. The content has to demonstrate a high degree of time, effort, expertise, talent, and skill.
It’s worth noting that when it comes to establishing authority, you can be a hobbyist or have everyday expertise. Your everyday expertise with a topic can satisfy aspects of E-A-T — as long as you’re not giving medical advice.
Furthermore, Google says in section 11.0:
In cases where the content creator is not demonstrating formal or everyday expertise but is not doing any harm, Medium is an appropriate rating.
In section 6.3, Google indicates that page length should relate to purpose, but too little content can result in a low Page Quality rating.
In section 7.3, Google also recommends:
Use the Lowest rating for pages that promote hate or violence against a group of people, including but not limited to those grouped on the basis of race or ethnic origin, religion, disability, age, nationality, veteran status, sexual orientation, gender or gender identity. Websites advocating hate or violence can cause real-world harm.
How to Create High-Quality Content
In section 11.0, Google defines high-quality content as “content that takes time, effort, expertise, and talent/skill.” In other words, if it’s missing one of these elements, it’s probably not high enough quality.
The good news is that humans aren’t replaceable by robots as content creators just yet.
In section 7.0, Google talks about auto-generated (AI) content as worthy of a low page quality rating at this point in time.
In the same section, copied content is considered low-quality main content (MC), save for legitimate syndication. Similarly, in section 7.7, fluffy content (incorporating word bloat) is also considered low quality.
Are you noticing a trend?
If you’re whipping up new blog posts in an hour or less, they’re probably not the level of quality that Google is looking to recommend at the top of relevant search.
What Google Considers as Low Quality Content
You probably could’ve guessed this, but Google shares several specifics confirming that they consider bad spelling and grammar unprofessional.
In section 6.7, they share an example showing that meaningless statements add no value and contribute to a low page quality rating. In the same section, they discuss the need to cite sources.
In section 7.2.9, we’re given one definition of what they consider spam:
We’ll consider a comment or forum discussion to be “spammed” if there are posts with unrelated comments that are not intended to help other users, but rather to advertise a product or create a link to a website.
In other words, if you’re going to be promoting yourself, make sure you’re adding value.
E-A-T (Expertise, Authoritativeness, and Trustworthiness) comes up multiple times in the Google Quality Rater Guidelines as an important facet of page quality. Essentially, it’s a helpful acronym for remembering the elements content should demonstrate.
It’s interesting to note how Google measures authority. In section 13.3.1, Google shares examples regarding how websites can create their own authority.
Google considers a band’s official website uniquely authoritative when it comes to its own lyrics.
By contrast, Google finds a lyrics aggregator like A-Z Lyrics less authoritative about those lyrics, because scraping from other sites (a sign of low moderation/spam) can introduce inaccuracies.
From section 13.4.1:
For example, a brand can be the most authoritative source about the backpacks it sells that no one else carries.
Here’s another example of a brand domain creating its own authority in section 5.4:
According to section 5.2, comments and reviews are considered to be good for E-A-T because they prove engagement when other reputational info isn’t available:
That said, in section 6.1, Google shares that a focus on encouraging user-generated content (UGC) paired with low oversight can cause low E-A-T.
Reputation and YMYL
YMYL stands for “Your Money or Your Life.” YMYL pages discuss topics of great importance in the average person’s life, where advice can have a major impact. As such, content creators have a major responsibility to present expert advice they can stand behind. YMYL covers topics such as going to college, advancing your career, and making financial decisions.
Because these topics can be very impactful to readers, Google pays extra attention to reputation when determining page quality.
According to section 6.5, in most situations, if a website shares reputational info, it probably has a good reputation.
I found it interesting that despite how often my teachers drilled it into my head not to use Wikipedia as a source, Wikipedia is often recommended as a source for determining reputation within the Google Quality Rater Guidelines.
Besides Wikipedia, reviews can help establish reputation. Furthermore, Google points out in section 6.5 that low ratings on their own do not equal a bad reputation.
Another pattern worth commenting on is that Pulitzer Prize-winning outlets are often shared as examples of how to prove high authority. But that’s somewhat unrealistic; we can’t all have a Pulitzer.
I wish there were more examples sharing other, more realistic ways to determine reputation.
Another somewhat unattainable example is a hobbyist whose reputation is established thanks to having 20 years of experience. But what does Google think about someone with a “mere” 5 or 10 years of experience?
From a page design and information architecture note, section 6.6 concludes that multiple contact options are important, especially for YMYL sites.
In section 13.4, raters are encouraged to give content that is less up-to-date than other results a rating of “moderately meets criteria.”
In section 18.0, Google defines freshness. Think of freshness as important for things that change often, like digital marketing.
Mobile Searcher Considerations
User Experience Considerations
User experience (UX) concerns are perhaps most impactful and obvious in mobile search, which is probably why Google frames its UX guidance from the mobile perspective.
Eroding Zero-Click Search
The Diversity algorithm update on June 3, 2019 limited how often a single domain can appear in top search results. In other words, it’s rarely worth competing against yourself for the same keywords. The update also impacted sites with a low number of backlinks, leaving them “underrewarded.”
Then there was the BERT algorithm update on October 25, 2019, which impacted both search rankings and featured snippets.
In other words, a lot has been changing with the specific visual merchandising of top search results and the diminishing influence one domain can have within them.
But what’s more prevalent — brand-generated featured snippets or Google-generated generic features with answers?
In section 12.8.2, Google shares many examples of how they’re implementing Special Content Result Blocks (SCRBs) that appear on the search engine results page, alongside Web Search Result Blocks. SCRBs are frequently, but not always, the first result on the page.
Then, in section 13.1, when talking about mobile results, Google’s guidelines seem to allude to a move to zero-click search.
In section 13.2.1, zero-click search appears to be their end goal with featured snippets (a conflict of interest for an industry built around creating content).
In section 13.2, Google states that the Fully Meets rating should be reserved for results that are the “complete and perfect response or answer.”
You can probably see the theory I’m starting to develop: more and more queries that trigger results blocks are likely to be cannibalized by Google. That is, of course, unless antitrust investigation efforts make an impact.
This is something I’ve noticed too that relates to zero-click search:
Maps results blocks on Google make it harder to find a website link.
Understanding Mobile User Intent
In sections 12.7 through 12.7.1, Google defines categories of mobile user intent.
In section 16.1, Google shares that some queries may have two possible strong intents:
- Go to the website intent: in order to, for example, find out information, buy something online, make a reservation, schedule an appointment, interact with customer support, or fulfill some other need that can be satisfied online
- Visit-in-person intent: user wants to visit the store, business, etc. in person
Section 12.5 discusses how there are dominant interpretations for some/many queries.
Section 13.3.1 shares an interesting comparison of two “Trader Joes” queries/intents.
Google seems to contradict itself by allowing multiple different results for the same query with none of them fully meeting needs.
Here’s my theory:
The Google Quality Rater Guidelines seems to be studying the impact behind dominant/less dominant intents from the user perspective.
Broad user intent means no single result can fully meet the needs behind a query.
In section 19.2, Google shares a Jon Stewart vs. John Stuart example of interpreting misspelled queries.
In section 19.1, Google clarifies:
- For obviously misspelled or mistyped queries, you should base your rating on user intent, not necessarily on exactly how the query has been spelled or typed by the user.
- For queries that are not obviously misspelled or mistyped, you should respect the query as written, and assume users are looking for results for the query as it is spelled.
Determining if a Result Meets Needs
Needs Met is a score that Quality Raters assign to a search result in order to determine how effectively it met the perceived user intent of a query. It challenges the Quality Rater to consider if there was a quicker or more efficient way to get searchers to their answer.
And just as it’s hard to earn a Fully Meets rating, it’s equally hard to land at the other extreme: Fails to Meet.
In section 14.0, Google clarifies that the Page Quality rating slider does not depend on the query. They say, “Do not think about the query when assigning a Page Quality rating to the landing page (LP).”
Here are the basics of determining search intent, from section 16.0:
When giving Needs Met ratings for results involving different query interpretations, think about how likely the query interpretation is and how helpful the result is.
Furthermore, when it comes to assigning needs met ratings:
- A very helpful result for a dominant interpretation should be rated Highly Meets, because it is very helpful for many or most users. Some queries with a dominant interpretation have a FullyM result.
- A very helpful result for a common interpretation may be Highly Meets or Moderately Meets, depending on how likely the interpretation is.
- A very helpful result for a very minor interpretation may be Slightly Meets or lower because few users may be interested in that interpretation.
- There are some interpretations that are so unlikely that results should be rated FailsM. We call these “no chance” interpretations.
Quality Raters also have the opportunity to flag results as Porn, Foreign Language, Did Not Load, or Upsetting-Offensive.
In section 15.5.2, Google shares that true Did Not Load pages are useless. Google doesn’t want them in their index.
They go on to add that, “Sometimes the page partially loads or has an error message. Give Needs Met ratings based on how helpful the result is for the query. Error messages can be customized by the webmaster and are part of a well-functioning website. Sometimes these pages are helpful for the query.”
Furthermore, in section 15.6.1, they add that you should assign the Upsetting-Offensive flag, “based on the purpose, type, and/or presentation of the content on the page—not because the topic itself is sensitive or potentially upsetting.”
“For example, a result with content that encourages child abuse should be flagged as Upsetting-Offensive. However, an accurate informational page about child abuse (such as child abuse statistics, prevention, how to recognize signs of abuse, etc.) should not be flagged, even though child abuse itself is a sensitive topic that users may find upsetting.”
Porn Flag and Results
Google has special guidelines for porn. It’s certainly interesting that there’s a whole section of the QRG dedicated to rating porn results for quality/needs met.
It’s complicated because some porn sites may be fronts for phishing.
In section 15.1, we’re told that an image may be considered porn in one culture or country, but not another.
Additionally complicating things, in section 15.2.2, some queries have both non-porn and porn interpretations.
For example, the following English (US) queries have both a non-porn and an erotic or porn interpretation: [breast], [sex]. We will call these queries “possible porn intent” queries.
Google goes on to add:
For “possible porn intent” queries, please rate as if the non-porn interpretation were dominant, even though some or many users may be looking for porn. For example, please rate the English (US) query [breast] assuming a dominant health or anatomy information intent.
Final Thoughts on Google’s Search Quality Rater Guidelines
There’s certainly a lot to take away from a close reading of the Google Quality Rater Guidelines. While this article can serve as a useful summary, I challenge you to read through them yourself.
Read my highlighted copy of the Quality Rater Guidelines, then let me know what you learned in the comments below!