Duplicate Content for index.html
-
In the Crawl Diagnostics Summary, it says that I have two pages with duplicate content which are:
I read in a Dreamweaver tutorial that you should name your home page "index.html" and then let www.mywebsite.com automatically direct the user to index.html. Is this a bug in SEOmoz's crawler, or is it a real problem with my site?
Thank you,
Dan
-
The code should definitely go into the .htaccess file in the website's root directory. However, .htaccess can be weird. A few days ago I ran into a similar issue with a client's website, and I was able to remedy it with a variation of the code:
# Index redirect
RewriteCond %{THE_REQUEST} ^[A-Z]{3,9}\ /([^/]+/)*index\.(php|html|htm|asp)\ HTTP/
RewriteRule ^(([^/]+/)*)index\.(php|html|htm|asp)$ http://yoursite.com/$1 [R=301,L]
If you give me the URL for the site I will take a look at it and let you know what would be feasible.
-
Hi Daniel, can you share with us the URL of your site? We can take a look at it and give you a more precise answer that way. Thanks!
-
I eventually figured out that your method was a 301 redirect, and I definitely broke my site trying to use the code you posted... haha. It's OK though. I just removed the code and it went back to normal. At first, I was editing the .htaccess file in the public_html folder, which wasn't working. Then I tried the root folder for the site (I created the .htaccess file since it did not exist). Neither of those worked. (I am using Bluehost, so I do not think that I have root access, and I am not sure whether it is a Linux server.)
If there is an easy way to explain what I am doing wrong, please do so. Otherwise, I will use canonical.
Thanks for everything!
-
@Dan
Thanks for your reply. It seems like there are lots of different ways to solve this problem. I just watched this video on Matt Cutts's blog where he discusses his preference for 301 redirects over the rel canonical tag.
Where would you say your solution fits in?
Sorry about the delay in responding; I didn't realize right away that you were asking me a question. Placing the code I provided in my previous answer causes a 301 permanent redirect to the original URL. That's what the
[R=301,L]
portion of the code states: R means redirect, and 301 is the status code returned. After reviewing the Matt Cutts video, I realize that I should have asked whether you were operating on a Linux server that you had root access to. We actually use both redirects and canonical tags, since that was recommended by the on-page optimization reports. Heck, Google uses them, I would assume because it's easier for the user to be referred to a single page URL. Obviously, though, if you don't have server header access and are not familiar with .htaccess (you can accidentally break your site), then the canonical solution is appropriate.
-
Josh,
Thanks for your reply. It seems like there are lots of different ways to solve this problem. I just watched this video on Matt Cutts's blog where he discusses his preference for 301 redirects over the rel canonical tag.
Where would you say your solution fits in?
Thanks,
Dan
-
I use the link rel tag for all my homepages, pointing to http://www.yoursite.com
-
Oddly enough, I just recently answered this question. The SEOmoz crawler is correct: without a redirect, both versions of the page are accessible in your browser.
To resolve this issue, simply redirect index.html to the root URL by placing the following code into the .htaccess file in your root directory.
Options +FollowSymlinks
RewriteEngine on
# Index Rewrite
RewriteRule ^index\.(htm|html|php) http://www.yoursite.com/ [R=301,L]
RewriteRule ^(.*)/index\.(htm|html|php) http://www.yoursite.com/$1/ [R=301,L]
You can also do the same with the index file in any subdirectories that you might create, by simply placing an .htaccess file into those subdirectories and using variations of the above code. This is how you create nice, tight URLs without the duplicate content issue, such as http://www.semclix.com/design/business/
-
It is a problem which you need to fix. You need to canonicalize your pages.
Those are all various URLs which most likely lead to the same web page. I say "most likely" because these URLs can actually lead to different pages.
You need to tell crawlers and search engines how you organize your site. There are several ways to achieve canonicalization. The method I prefer is to add the following line of code to each page:
The URL provided should be the preferred URL for your page.
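The line of code referred to in this answer appears to have been lost from this copy of the thread. As a hedged sketch only, a rel canonical tag generally takes the form below. The href value http://www.yoursite.com/ is just the placeholder used elsewhere in this thread, not the poster's original URL, and should be replaced with the preferred URL of the page in question.

```html
<!-- Goes inside the <head> of each page. The href is a placeholder (assumption). -->
<link rel="canonical" href="http://www.yoursite.com/" />
```

If the same tag is served at both www.yoursite.com/ and www.yoursite.com/index.html, crawlers are told that the root URL is the preferred version of the page.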