What is Considered Duplicate Content by Crawlers?
-
I am asking this because I have a couple of site audit tools that I use to crawl a site I work on every week and they are showing duplicate content issues (which I know there is a lot on this site) but some of what is flagged as duplicate content makes no sense.
For example, the following URL's were grouped together as duplicate content:
|
https://www.firefold.com/contact-us
|
| https://www.firefold.com/sale |
|
|
How are these pages duplicate content? I am confused on what site audit tools are considering duplicate content.
Just FYI, this is data from Moz crawl diagnostics but SEMrush site auditor is giving me the same type of data.
Any help would be greatly appreciated.
Ryan
-
Yea I just started working on this site. I haven't used Moz Analytics much so just wanting to see how their crawler crawls pages.
And yes I agree, there are a lot of BIG BIG BIG issues with this site.
I got a large workload over the next few months haha.
-
I would add that there's is no text on any of those three pages - any "text" one would see there is actually just embedded in an image - which is a huge issue for a number of reasons:
- Search engines see that there's no text - a big no-no.
- You're getting practically no SEO value from the content that would be there, even if there isn't much.
- It's heavier this way - which makes load times slower.
I want to clarify that there are many, bigger issues with these pages - but as your question concerns only duplicate content, I'll leave all of that out for the time being. To summarize, Google, Yahoo, and Bing are just seeing some duplicate banners, sidebars, etc. and then some images in the body of your pages. Hence, duplicate content.
-
Thanks for that information.
It makes sense looking at the data and pages from that perspective.
-
Hi Ryan!
Our crawler will flag pages that have at least 90% similarity in the entire source code of the site so not just the body.
The way you want to interpret the report is the contact-us page has 35 duplicates, so "gabe" and "sale" are not dupes of each other in this section but are only each a duplicate of "contact-us". Those URLs might appear with their own duplicates of the same pages further down in the report.
While on the front end the pages do not appear to be similar. The issue is likely with the amount of javascript code on those pages.
Our crawler cannot read javascript so we are likely only able to see the template of the page. Other search tools are probably seeing the same thing as it returns 79% similarity using this tool: http://www.freebulkseotools.com/similar-page-checker-tool.php
I can't provide much insight from a dev perspective but hope this helps!
Browse Questions
Explore more categories
-
Moz Tools
Chat with the community about the Moz tools.
-
SEO Tactics
Discuss the SEO process with fellow marketers
-
Community
Discuss industry events, jobs, and news!
-
Digital Marketing
Chat about tactics outside of SEO
-
Research & Trends
Dive into research and trends in the search industry.
-
Support
Connect on product support and feature requests.
Related Questions
-
Why isn't the Moz crawler getting all of my item pages?
I am stumped and Moz is being terrible to work with. This site has about 40k pages 39,800 of them are item pages roughly. Moz is only finding about 2400 of my pages. So they are missing most but not all of my item pages. I do not know which item pages they are missing. The fact that they are finding about 2k but not the rest leads me to believe the crawler is struggling with pagination. The site is built on Magento 2 and uses the Amasty Layered Navigation extension. Does anyone have any ideas?
Moz Bar | | Tylerj0 -
Duplicate page found with MOZ crawl test?
When I crawl my website www.radiantguard.com, the crawl test comes back with what appears to be a duplicate of my home page: http://www.radiantguard.com and http://www.radiantguard.com/ Does the crawler indeed see two different pages and therefore, are my search engine rankings potentially affected, AND is this because of how my rel canonical is set up?
Moz Bar | | rhondafranklin0 -
Site Crawl report show strange duplicate pages
Beginning in early in Feb, we got a big bump in duplicate pages. The URLs of the pages are very odd: Example URL:
Moz Bar | | Neo4j
http://[email protected]/dir/page.php
is duplicate with http://website.com/dir/page.php I checked though the site, nginx conf files, and referral pages, and could not find what is prefixing the pages with 'http://firstname.lastname@'. Any ideas? The person whose name is 'Firstname Lastname' is stumped as well. Thanks.0 -
I update content and then craw but the MOZ spider still shows old content. Do I need to update something else?
"This shows but was replaced a day before I ran Moz crawer: | We provide a full service for low cost automated phone calls, robocalls, Bulk SMS service, Political robo calls without needing computer skills | "
Moz Bar | | ThomasDaBomb
I look in the link on website and see:
<title>Our customers talk about: Currently the tremendous growth of organi</title> Why does the craw not reflect the current content? Thanks.
Thomas0 -
Moz crawler
I have a site which is in a non production status. Crawlers are blocked vis robot txt. User-agent: *
Moz Bar | | Emanuele_Ricci
Disallow: / I WANT TO MAKE A CRAWLING TEST WITH MOZ CRAWLER (RogerBot) ,
how can I allow your crawler to get in and prevent other crawlers from indexing the site? Thanks memok0 -
Duplicate page content/page titles on tages
Hello everyone. New to the community and loving it already. Question, I am receiving an error of 6 pages with duplicate content and page titles. A majority of these are tag pages. Should I be worried about these? IN the column listed duplicate urls it is listing 0 ( screen shot: http://screencast.com/t/azvuVk0ucWt) Are these tags a problem? Will SEO be hurt because of this? What are TAG pages? Actually pages, categories, should I eliminate these?
Moz Bar | | Jasonalanmagic0 -
Duplicate content errors
Hi I am getting some errors for duplicate content errors in my crawl report for some of our products www.....com/brand/productname1.html www.....com/section/productname1.html www.....com/productname1.html we have canonical in the header for all three pages rel="canonical" href="www....com/productname1.html" />
Moz Bar | | phes0 -
Duplicate Page Content Report on MOZ
Hi, I am just wondering as to the accuracy of this report - does it pick up all the duplicate on page content? Or is there a limit? We have an ecommerce store with a lot of copied and pasted descriptions - just wondering if there is a limit on how much the moz crawler picks up? In other words, once we fix what MOZ has detected, will there be more detected because it is limited to display say up to 200?? Hope you understand what I mean. Thanks
Moz Bar | | bjs20100