What is Considered Duplicate Content by Crawlers?
-
I'm asking because I use a couple of site audit tools to crawl a site I work on every week, and they are showing duplicate content issues. I know this site has a lot of them, but some of what is flagged as duplicate content makes no sense.
For example, the following URLs were grouped together as duplicate content:
https://www.firefold.com/contact-us
https://www.firefold.com/sale
How are these pages duplicate content? I am confused about what site audit tools consider duplicate content.
Just FYI, this is data from Moz crawl diagnostics but SEMrush site auditor is giving me the same type of data.
Any help would be greatly appreciated.
Ryan
-
Yeah, I just started working on this site. I haven't used Moz Analytics much, so I just wanted to see how their crawler crawls pages.
And yes, I agree, there are a lot of BIG BIG BIG issues with this site.
I've got a large workload over the next few months haha.
-
I would add that there is no text on any of those three pages - any "text" one would see there is actually just embedded in an image - which is a huge issue for a number of reasons:
- Search engines see that there's no text - a big no-no.
- You're getting practically no SEO value from the content that would be there, even if there isn't much.
- It's heavier this way - which makes load times slower.
I want to clarify that there are many bigger issues with these pages, but as your question concerns only duplicate content, I'll leave all of that out for the time being. To summarize: Google, Yahoo, and Bing are just seeing some duplicate banners, sidebars, etc., and then some images in the body of your pages. Hence, duplicate content.
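To make the "no text" point concrete, here is a minimal Python sketch (standard library only) that pulls out the text a non-rendering crawler could read from a page's HTML. The two sample pages and the `visible_text` helper are hypothetical, made up for illustration; this is not how any particular search engine or audit tool works internally.

```python
from html.parser import HTMLParser


class VisibleTextParser(HTMLParser):
    """Collects text nodes, skipping the contents of <script> and <style>."""

    SKIPPED = {"script", "style"}

    def __init__(self):
        super().__init__()
        self._skip_depth = 0
        self.chunks = []

    def handle_starttag(self, tag, attrs):
        if tag in self.SKIPPED:
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in self.SKIPPED and self._skip_depth > 0:
            self._skip_depth -= 1

    def handle_data(self, data):
        # Keep only non-empty text that is outside script/style blocks.
        if self._skip_depth == 0 and data.strip():
            self.chunks.append(data.strip())


def visible_text(html):
    """Return the readable text of an HTML document as one string."""
    parser = VisibleTextParser()
    parser.feed(html)
    parser.close()
    return " ".join(parser.chunks)


# Hypothetical page whose "copy" exists only inside images.
IMAGE_ONLY_PAGE = """
<html><body>
<div id="banner"><img src="banner.jpg"></div>
<div id="content"><img src="contact-details.jpg"></div>
</body></html>
"""

# Hypothetical page with the same kind of content as real text.
TEXT_PAGE = """
<html><body>
<div id="content"><h1>Contact Us</h1><p>Call us or send an email.</p></div>
</body></html>
"""

print(repr(visible_text(IMAGE_ONLY_PAGE)))
print(repr(visible_text(TEXT_PAGE)))
```

The image-only page yields an empty string, so from a text point of view there is nothing to tell it apart from any other image-only page built on the same template.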
-
Thanks for that information.
It makes sense looking at the data and pages from that perspective.
-
Hi Ryan!
Our crawler flags pages that are at least 90% similar across the entire source code of the page, not just the body.
The way to interpret the report is that the contact-us page has 35 duplicates. In this section, "gabe" and "sale" are not flagged as dupes of each other; each is only a duplicate of "contact-us". Those URLs might appear with their own duplicates of the same pages further down in the report.
On the front end the pages do not appear to be similar, so the issue is likely the amount of JavaScript code on those pages.
Our crawler cannot read JavaScript, so we are likely only able to see the template of the page. Other search tools are probably seeing the same thing; for example, this tool returns 79% similarity for these pages: http://www.freebulkseotools.com/similar-page-checker-tool.php
I can't provide much insight from a dev perspective but hope this helps!
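For anyone who wants to sanity-check this kind of report, below is a rough Python sketch of a source-level similarity check. It is only an illustration, not Moz's actual algorithm (the thread gives the 90% threshold but not how the similarity is computed). The token-based comparison with `difflib.SequenceMatcher`, the helper names, and the sample pages are all assumptions.

```python
from difflib import SequenceMatcher


def source_similarity(html_a, html_b):
    """Return a 0..1 similarity ratio over whitespace-separated source tokens."""
    matcher = SequenceMatcher(None, html_a.split(), html_b.split(), autojunk=False)
    return matcher.ratio()


def duplicates_of(reference_url, pages, threshold=0.90):
    """List URLs whose raw source is at least `threshold` similar to the reference page."""
    reference = pages[reference_url]
    return [
        url
        for url, html in pages.items()
        if url != reference_url and source_similarity(reference, html) >= threshold
    ]


# Hypothetical template: a heavy, script-laden <head> shared by every page.
SHARED_HEAD = (
    "<html> <head> "
    + " ".join(f"<script src='/js/lib{i}.js'></script>" for i in range(40))
    + " </head>"
)


def make_page(image_name):
    """Build a page whose only unique content is a single image."""
    return f"{SHARED_HEAD} <body> <img src='/img/{image_name}.jpg'> </body> </html>"


# Hypothetical sample pages; "/about" does not use the shared template.
pages = {
    "/contact-us": make_page("contact"),
    "/sale": make_page("sale"),
    "/about": "<html> <body> <p>Plain text page with real copy.</p> </body> </html>",
}

print(duplicates_of("/contact-us", pages))
```

Note that `duplicates_of` compares every page against one reference page only, which mirrors how the report should be read: pages listed under contact-us are duplicates of contact-us, not necessarily of each other.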