Duplicate content
-
I have just ran a report in seomoz on my domain and has noticed that there are duplicate content issues, the issues are:
www.domainname/directory-name/
www.domainname/directory-name/index.php
All my internal links and external links point to the first domain, as i prefer this style as it looks clear & concise, however doing this has created duplicate content as within the site itself i have an index.php page inside this /directory-name/ to show the page.
Could anyone give me some advice on what i should do please?
Kind Regards
-
Hi Gary,
Here's some code from an htaccess file I've used before that solves the issue you've got with index.php at the end of all your urls:
#remove /index.php and ensure admin works okay
RewriteCond %{REQUEST_URI} !^/administrator
RewriteCond %{THE_REQUEST} ^.*/index.php\ HTTP/
RewriteRule ^(.*)index.php$ /$1 [R=301,L]
notice the line that contains ^/administrator , in Joomla, admin login is usuall on http://site.com/administrator/index.php
so, removing the index.php from the admin url would prevent any access to the admin screens! If your cms has a similar url, be sure to replace 'administrator' with the relevant url.
-
Hi Ade,
Thanks for all your help.
I will post a new question on the Q&A Forum regarding the .htaccess rule.
Kind Regards
-
Hi Gary,
That one is a bit beyond me I'm afraid and I am not familiar with WebEdition at all.
With most CMS there are normally either built-in or add-on extensions to help with re-writing your urls but you need to be really careful that you don't end up with a completely new set of urls that don't match either of your originals.
A .htaccess rewrite rule may be your best option but I don't know what the coding for it would be.
-
We are using a CMS, its called WebEdition, is there a technical question i should ask them in what i need to do?
Kind Regards
-
Ahhhh. No definitely not practical, I thought that it was just the one url.
Are you using a content management system for your site such as Joomla?
-
Do you think that's practical to do that?
As i will need to do a 301 on literally every page if i don't want to show the /index.php
Is this what seomoz.org website does? for example:
-
In that case you can just add a 301 redirect in to your .htaccess file below the code you added earlier.
redirect 301 /football-teams/index.php http://www.mydomain.com/football-teams/
-
Hi Ade,
Yes, i tested http://www.mydomain.com/football-teams//index.php however it did not resolve to http://www.mydomain.com/football-teams/
Any ideas?
-
Hi Gary.
Have you tried visiting the url http://www.mydomain.com/football-teams/index.php to see if it now resolves to http://www.mydomain.com/football-teams/ ?
If it does then the issue is fixed, the next time SEOMoz crawls your site the error will dissapear.
Cheers.
Ade.
-
Hi Ade,
Thanks for the speedy reply.
I have now implemented this and works fantastic on the http://www.mydomain.com/
Thank you very much.
There is another issue however, i hope i can make sense here, here goes:
seomoz tool gives me back duplicate content on both these URL's
http://www.mydomain.com/football-teams/
http://www.mydomain.com/football-teams/index.php
I want to use http://www.mydomain.com/football-teams/ as this just look nice & clean.
What would be best practice to fix this issue?
Kind Regards
-
Hey Gary.
Here's the solution that I use.
All my sites are hosted on a linux server so this won't be relevant if your site is hosted on a windows server.
1. create/modify your .htaccess file in your site's root directory.
2. Add the following code to the top of the file:-
RewriteEngine On
RewriteBase /
RewriteCond %{THE_REQUEST} ^[A-Z]{3,9}\ /index.php\ HTTP/
RewriteRule ^index.php$ http://www.yourdomain.com/ [R=301,L]
RewriteCond %{REQUEST_FILENAME} !-f
RewriteCond %{REQUEST_FILENAME} !-d
RewriteRule . /index.php [L]RewriteCond %{HTTP_HOST} ^yourdomain.com [NC]
RewriteRule ^(.*)$ http://www.yourdomain.com/$1 [L,R=301]This will ensure that any requests sent to http://yourdomain.com are redirected to http://www.yourdomain.com and that the index.php part of the url is removed.
If you need more help on creating or modifying your .htaccess file then you can find more info here - http://httpd.apache.org/docs/1.3/howto/htaccess.html
All the best.
Ade.
Browse Questions
Explore more categories
-
Moz Tools
Chat with the community about the Moz tools.
-
SEO Tactics
Discuss the SEO process with fellow marketers
-
Community
Discuss industry events, jobs, and news!
-
Digital Marketing
Chat about tactics outside of SEO
-
Research & Trends
Dive into research and trends in the search industry.
-
Support
Connect on product support and feature requests.
Related Questions
-
Duplicate Content Mystery
Hi Moz community! I have an ongoing duplicate mystery going on here and I'm hoping someone here can answer my question. We have an Ecommerce site that has a variety of product pages and category pages. There are Rel canonicals in place, along with parameters in GWT, and there are also URL rewrites. Here are some scenarios, maybe you can give insight as to what’s exactly going on and how to fix it. All the duplicates look to be coming from category pages specifically. For example:
Technical SEO | | Ecom-Team-Access
This link re-writes: http://www.incipio.com/cases/tablet-cases/amazon-kindle-cases-sleeves.html?cat=407&color=152&price=20- To: http://www.incipio.com/cases/tablet-cases/amazon-kindle-cases-sleeves.html The rel canonical tag looks like this: http://www.incipio.com/cases/tablet-cases/amazon-kindle-cases-sleeves.html" /> The CONTENT is different, but the URLs are the same. It thinks that the product category view is the same as the all products view, even though there is a canonical in there telling it which one is the original. Some of them don’t have anything to do with each other. Take a look: Link identified as duplicate: http://www.incipio.com/cases/smartphone-cases/htc-smartphone-cases/htc-windows-phone-8x-cases.html?color=27&price=20- Link this is a duplicate of: http://www.incipio.com/cases/macbook-cases/macbook-pro-13in-cases.html Any idea as to what could be happening here?0 -
Hreflang and possible duplicate content SEO issue
| 0 <a class="vote-down-off" title="This question does not show any research effort; it is unclear or not useful">down vote</a> favorite | Hey community, my first question here 🙂 Imagine there is a page with video, it has hreflang tags setup, to lead let's say German visitors to /de/ folder... So, on that German version of page, everything like menus, navigation and such are in German, but the video is the same, the title of the video (H1 tag) is the same, <title></code></strong> and <strong><code>meta description</code></strong> is the same as on the original English page. It means that general (English) page and German version of it has the same key content in English.</p> <p>To me it seems to be a SEO duplicate content issue. As I know, Google doesn't think that content is duplicate, if it is properly translated to other language.</p> <p>Does my explained case mean that the content will be detected by Google as duplicate?</p> </div> </div> </td> </tr> </tbody> </table></title> |
Technical SEO | | poiseo0 -
Drupal SEO help - Duplicate content but very similar URLS?
Hi, This is a very strange problem and not sure how it has happened. I am adding packages to my website and a duplicate page & almost identical URL is being picked up by Google. E.g. the page I make is http://www.ukgirlthing.co.uk/hen-party/bristol-spa-rty-lunch-pampering-h... but then also appearing is http://www.ukgirlthing.co.uk/hen-party/bristol-spa-rty-pampering-hen-party. The node's are exactly the same, and if i edit one of them, the other also updates. You will notice that the URL's are almost exactly the same, except the words are re-organised slightly? Shall i just delete the URL alias of the duplicate entry or is there something else which is making this happen? These URL's are being picked up as duplicate content, although it's the same node! Hope you can help, Thank you!
Technical SEO | | Party_Experts0 -
Duplicate Content
SEOmoz is reporting duplicate content for 2000 of my pages. For example, these are reported as duplicate content: http://curatorseye.com/Name=“Holster-Atlas”---Used-by-British-Officers-in-the-Revolution&Item=4158
Technical SEO | | jplill
http://curatorseye.com/Name=âHolster-Atlasâ---Used-by-British-Officers-in-the-Revolution&Item=4158 The actual link on the site is http://www.curatorseye.com/Name=“Holster-Atlas”---Used-by-British-Officers-in-the-Revolution&Item=4158 Any insight on how to fix this? I'm not sure where the second version of the URL is coming from. Thanks,
Janet0 -
Duplicate Page Content Report
In Crawl Diagnostics Summary, I have 2000 duplicate page content. When I click the link, my Wordpress return "page not found" and I see it's not indexed by Google, and I could not find the issue in Google Webmaster. So where does this link come from?
Technical SEO | | smallwebsite0 -
Duplicate content error - same URL
Hi, One of my sites is reporting a duplicate content and page title error. But it is the same page? And the home page at that. The only difference in the error report is a trailing slash. www.{mysite}.co.uk www.{mysite}.co.uk/ Is this an easy htaccess fix? Many thanks TT
Technical SEO | | TheTub1 -
Strange duplicate content issue
Hi there, SEOmoz crawler has identified a set of duplicate content that we are struggling to resolve. For example, the crawler picked up that this page www. creative - choices.co.uk/industry-insight/article/Advice-for-a-freelance-career is a duplicate of this page www. creative - choices.co.uk/develop-your-career/article/Advice-for-a-freelance-career. The latter page's content is the original and can be found in the CMS admin area whilst the former page is the duplicate and has no entry in the CMS. So we don't know where to begin if the "duplicate" page doesn't exist in the CMS. The crawler states that this page www. creative-choices.co.uk/industry-insight/inside/creative-writing is the referrer page. Looking at it, only the original page's link is showing on the referrer page, so how did the crawler get to the duplicate page?
Technical SEO | | CreativeChoices0 -
50+ duplicate content pages - Do we remove them all or 301?
We are working on a site that has 50+ pages that all have duplicate content (1 for each state, pretty much). Should we 301 all 50 of the URLs to one URL or should we just completely get rid of all the pages? Are there any steps to take when completely removing pages completely? (submit sitemap to google webmaster tools, etc) thanks!
Technical SEO | | Motava0