What is Considered Duplicate Content by Crawlers?
-
I am asking because the site audit tools I use to crawl a site I work on every week are reporting duplicate content issues. I know there is a lot of duplicate content on this site, but some of what is flagged makes no sense.
For example, the following URLs were grouped together as duplicate content:
https://www.firefold.com/contact-us
https://www.firefold.com/sale
How are these pages duplicate content? I am confused about what site audit tools consider duplicate content.
Just FYI, this is data from Moz crawl diagnostics, but the SEMrush site auditor is giving me the same type of data.
Any help would be greatly appreciated.
Ryan
-
Yeah, I just started working on this site. I haven't used Moz Analytics much, so I just wanted to see how their crawler crawls pages.
And yes, I agree, there are a lot of BIG BIG BIG issues with this site.
I've got a large workload over the next few months, haha.
-
I would add that there is no text on any of those three pages. Any "text" one would see there is actually embedded in an image, which is a huge issue for a number of reasons:
- Search engines see that there's no text - a big no-no.
- You're getting practically no SEO value from the content that would be there, even if there isn't much.
- It's heavier this way - which makes load times slower.
I want to clarify that there are many bigger issues with these pages, but as your question concerns only duplicate content, I'll leave all of that out for the time being. To summarize: Google, Yahoo, and Bing are seeing only some duplicated banners, sidebars, and so on, plus some images in the body of your pages. Hence, duplicate content.
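To see roughly what a text-only crawler is left with on a page like this, here is a minimal sketch using Python's standard `html.parser`. The sample markup and the names `VisibleText`, `visible_text`, and `sample_page` are made up for illustration; this is not how any particular search engine or audit tool actually extracts text.

```python
from html.parser import HTMLParser


class VisibleText(HTMLParser):
    """Collect text nodes, ignoring anything inside <script> or <style>."""

    SKIP = {"script", "style"}

    def __init__(self):
        super().__init__()
        self._skip_depth = 0
        self.chunks = []

    def handle_starttag(self, tag, attrs):
        if tag in self.SKIP:
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in self.SKIP and self._skip_depth:
            self._skip_depth -= 1

    def handle_data(self, data):
        # Keep only non-empty text that is outside script/style blocks.
        if not self._skip_depth and data.strip():
            self.chunks.append(data.strip())


def visible_text(html: str) -> str:
    """Return the text a reader without image or JavaScript support would get."""
    parser = VisibleText()
    parser.feed(html)
    parser.close()
    return " ".join(parser.chunks)


# Hypothetical page: shared navigation plus a body that is only an image.
sample_page = (
    "<html><head><script>var a = 1;</script></head>"
    '<body><nav>Home</nav><img src="banner.png" alt=""></body></html>'
)
print(visible_text(sample_page))
```

On a page whose body copy lives inside images, the only text that survives is the shared navigation, which is the same on every page.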
-
Thanks for that information.
It makes sense looking at the data and pages from that perspective.
-
Hi Ryan!
Our crawler flags pages that are at least 90% similar across the entire page source, not just the body.
The way to interpret the report is that the contact-us page has 35 duplicates. So "gabe" and "sale" are not duplicates of each other in this section; each is only a duplicate of "contact-us". Those URLs might appear with their own duplicates of the same pages further down in the report.
On the front end the pages do not appear to be similar, so the issue is likely the amount of JavaScript code on those pages.
Our crawler cannot read JavaScript, so we are likely only able to see the template of the page. Other search tools are probably seeing the same thing, as the pages return 79% similarity using this tool: http://www.freebulkseotools.com/similar-page-checker-tool.php
I can't provide much insight from a dev perspective but hope this helps!
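For anyone who wants to reproduce this kind of check locally, here is a rough sketch that compares the full source of two pages against a 90% threshold. It uses Python's `difflib.SequenceMatcher` as a stand-in; the thread does not say which algorithm the Moz crawler actually uses, and the sample pages and the names `source_similarity` and `is_flagged_duplicate` are hypothetical.

```python
from difflib import SequenceMatcher


def source_similarity(html_a: str, html_b: str) -> float:
    """Return a 0..1 similarity ratio computed over the whole page source."""
    return SequenceMatcher(None, html_a, html_b, autojunk=False).ratio()


def is_flagged_duplicate(html_a: str, html_b: str, threshold: float = 0.90) -> bool:
    """Flag a pair when the source similarity reaches the threshold (assumed 90%)."""
    return source_similarity(html_a, html_b) >= threshold


# Hypothetical pages: a large shared template (scripts, nav, footer)
# with a body that differs only in which image is shown.
TEMPLATE_HEAD = (
    "<html><head><script>" + "var x=1;" * 50 + "</script></head>"
    "<body><nav>Home | Sale | Contact</nav>"
)
TEMPLATE_FOOT = "<footer>Footer links</footer></body></html>"

page_a = TEMPLATE_HEAD + '<img src="contact.png">' + TEMPLATE_FOOT
page_b = TEMPLATE_HEAD + '<img src="sale.png">' + TEMPLATE_FOOT

print(round(source_similarity(page_a, page_b), 2))
print(is_flagged_duplicate(page_a, page_b))
```

When the shared template dwarfs the unique part of each page, the pair crosses the threshold even though the pages look different in a browser.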