Block search engines from URLs created by internal search engine?
-
Hey guys,
I've got a question for you all that I've been pondering for a few days now. I'm currently doing an SEO Technical Audit for a large scale directory.
One major issue that they are having is that their internal search system (Directory Search) will create a new URL everytime a search query is entered by the user. This creates huge amounts of duplication on the website.
I'm wondering if it would be best to block search engines from crawling these URLs entirely with Robots.txt?
What do you guys think? Bearing in mind there are probably thousands of these pages already in the Google index?
Thanks
Kim
-
That sounds perfect - if the user-generated URLs are getting enough traffic, make them permanent pages and 301-redirect or canonical. If not, weed them out of the index.
-
Thanks for your reply Dr. Meyers. I think you're probably right.
Yes I'm recommending they define a canonical set of pages that are the most popular searches, categories and locations which can be reached via internal links and we'll get all those duplicates re-directed back to that canonical set.
But for pages that fall outside those categories and locations, I'll recommend a meta-no-index tag.
-
It can be a complicated question on a very large site, but in most cases I'd META NOINDEX those pages. Robots.txt isn't great at removing content that's already been indexed. Admittedly, NOINDEX will take a while to work (virtually any solution will), as Google probably doesn't crawl these pages very often.
Generally, though, the risk of having your index explode with custom search pages is too high for a site like yours (especially post-Panda). I do think blocking those pages somehow is a good bet.
The only exception I would add is if some of the more popular custom searches are getting traffic and/or links. I assume you have a solid internal link structure and other paths to these listings, but if it looks like a few searches (or a few dozen) have attracted traffic and back-links, you'll want to preserve those somehow.
-
Sure, check below and some of the duplication I mean:
Capitalization Duplication
http://yellow.co.nz/yellow+pages/Car+dealer/Auckland+Region
http://yellow.co.nz/yellow+pages/Car+Dealer/Auckland+Region
With a few URL parameters
And with location duplication
http://yellow.co.nz/yellow+pages/Car+Dealer/Auckland
Let me know if you need any more info!
Cheers
Kim
-
Whats the content look like on the new url? Can you give us an example?
Browse Questions
Explore more categories
-
Moz Tools
Chat with the community about the Moz tools.
-
SEO Tactics
Discuss the SEO process with fellow marketers
-
Community
Discuss industry events, jobs, and news!
-
Digital Marketing
Chat about tactics outside of SEO
-
Research & Trends
Dive into research and trends in the search industry.
-
Support
Connect on product support and feature requests.
Related Questions
-
How To Shorten Long URLS
Hi I want to shorten some URLs, if possible, that Moz is reporting as too long. They are all the same page but different categories - the page advertises jobs but the client requires various links to types of jobs on the menu. So the menu will have: Job type 1
Intermediate & Advanced SEO | | Ann64
Job type 2
Job Type 3 I'm getting the links by going to the page, clicking a dropdown to filter the Job type, then copying the resulting URL from the address bar. Bu these are really long & cumbersome. I presume if I used a URL shortener, this would count as redirects and alsonot be good for SEO. Any thoughts? Thanks
Ann0 -
Google WMT/search console showing thousands of links in "Internal Links"
Hi, One of our blog-post has been interlinked with thousands of internal links as per search console; but lists only 2 links it got connected from. How come so many links it got connected internally? I don't see any. Thanks, Satish
Intermediate & Advanced SEO | | vtmoz0 -
Internal Linking - Can You Over Do It?
Hi, One of the sites I'm working on has a forum with thousands of pages, amongst thousands of other pages. These pages produce lots of organic search traffic... 200,000 per month. We're using a bit of custom code to link relevant words and phrases from various discussion threads to hopefully related discussion pages. This generates thousands of links and up to 8 in-context links per page. A page could have anywhere from 200 to 3000 words in one to 50+ comments. Generally, a page with 200 words would have fewer of these automatically generated links, just because there are fewer terms naturally on the page. Is there any possible problem with this, including but not limited to some kind of internal anchor text spam or anything else? We do it to knit together pages for link juice and hopefully user experience... giving them another page to go to. The pages we link to are all our pages that produce or we hope to produce organic search traffic from. Thanks! ....Darcy
Intermediate & Advanced SEO | | 945010 -
Should we use URL parameters or plain URL's=
Hi, Me and the development team are having a heated discussion about one of the more important thing in life, i.e. URL structures on our site. Let's say we are creating a AirBNB clone, and we want to be found when people search for apartments new york. As we have both have houses and apartments in all cities in the U.S it would make sense for our url to at least include these, so clone.com/Appartments/New-York but the user are also able to filter on price and size. This isn't really relevant for google, and we all agree on clone.com/Apartments/New-York should be canonical for all apartment/New York searches. But how should the url look like for people having a price for max 300$ and 100 sqft? clone.com/Apartments/New-York?price=30&size=100 or (We are using Node.js so no problem) clone.com/Apartments/New-York/Price/30/Size/100 The developers hate url parameters with a vengeance, and think the last version is the preferable one and most user readable, and says that as long we use canonical on everything to clone.com/Apartments/New-York it won't matter for god old google. I think the url parameters are the way to go for two reasons. One is that google might by themselves figure out that the price parameter doesn't matter (https://support.google.com/webmasters/answer/1235687?hl=en) and also it is possible in webmaster tools to actually tell google that you shouldn't worry about a parameter. We have agreed to disagree on this point, and let the wisdom of Moz decide what we ought to do. What do you all think?
Intermediate & Advanced SEO | | Peekabo0 -
Blog Not Ranking Well at All in Search Engines, Need Help!
Hi Mozers, Need some help on a CMS I've been working with over the last year. The CMS is built by a team of guys here in Washington State. Basically, I'm having issues with clients content on the blog system not getting ranking correctly at all. Here's a few problems I've noticed: Could you confirm and scale these problems based upon being, "not a problem" "a problem" and "critical must fix" 1. The title tag is pulling from the title of the article which is also automatically generating a URL with underscores instead of dashes. Is having a duplicate URL, Title, and Title tag spammy looking to search engines? Are underscores on long URL's confusing google? Where shorter one's are fine (i.e. domain/i_pad/
Intermediate & Advanced SEO | | Keith-Eneix
(i.e.http://www.ductvacnw.com/blog/archives/2013/05/20/5_reasons_to_hire_a_professional_to_clean_your_air_ducts_and_vents), 2. The CMS is resolving all URL's with a canonical instead of a 301 redirect (I've told webmaster tools which preferred url should be indexed). Does using a canonical over a 301 redirect cause any confusion with Google? Is one better practice then the other? 3. The H1 tags on the blog pull from "blog category" instead of the title of the blog post. Is this is a problem? 4. The URl's are quite long with the added "archives/2013/05/20/5". Does this cause problems by pushing the main target keyword further away from the domain name? 5. I'm also noticing the blog post is actually not part of the breadcrumbs where we normally would expect that to populate after the blog category name, Problem? These are some of the things I've noticed and need clarification on. If you see anything else please let me know?0 -
Internal page ranking
If I have a domain: Example.com I want this domain to rank for several keywords. I build a page on the domain called Example.com/glasses If I SEO the page Example.com/glasses with backlinks etc... will that URL come up on google or will it simply bring Example.com up on the SERPS. If I have three keywords, should I make a subpage for each page and SEO that page? Will that make the domain rank for all three keywords?
Intermediate & Advanced SEO | | JML11790 -
How can I penalise my own site in an international search?
Perhaps penalise isn't the right word, but we have two ecommerce sites. One at .com and one at .com.au. For the com.au site we would like only that site to appear for our brand name search in google.com.au. For the .com site we would like only that site to appear for our brand name search in google.com. I've targeted each site in the respective country in Google Webmaster Tools and published the Australian and English address on the respective site. What I'm concerned about is people on Google.com.au searching our brand and clicking through to the .com site. Is there anything I can do to lower the ranking of my .com site in Google.com.au?
Intermediate & Advanced SEO | | Benj250 -
Should I change wordpress urls?
Should I change my wordpress permalinks to include the keyword? For examples at the minute my url is http://www.musicliveuk.com/home/wedding-singer. Is it better to be http://www.musicliveuk.com/live-bands/wedding-singer. 'home' is not relevant so surely 'live-bands' would be better? If I change the urls won't I lose 'link juice' as external links will all point to a url that no longer exists? Or will wordpress automatically redirect the old url to the new one? Finally, if I should change the url as described how do I do it on wordpress? I can only see how to edit the last bit of the url and not the middle bit.
Intermediate & Advanced SEO | | SamCUK0