I have removed 2000+ pages but Google still says I have 3000+ pages indexed
-
Good Afternoon,
I run an office equipment website called top4office.co.uk.
My predecessor decided to make an exact copy of the content on our existing site, top4office.com, and place it on the top4office.co.uk domain, which included over 2k thin pages.
Since coming in I have hired a copywriter who has rewritten all the important content, and I have removed over 2k thin pages.
I have set up 301s, blocked the thin pages using robots.txt, and then used Google's removal tool to remove the pages from the index, which was done successfully.
But although they were removed and can no longer be found in Google, when I use site:top4office.co.uk I still have over 3k indexed pages (originally I had 3700).
Does anyone have any ideas why this is happening and, more importantly, how I can fix it?
Our ranking on this site is woeful in comparison to what it was in 2011. I have a deadline and was wondering how quickly, in your opinion, all these changes will impact my SERP rankings?
Look forward to your responses!
-
I agree with DrPete. You can't have the pages blocked in robots.txt, otherwise Google will not crawl the pages and "see" the 301s in order to update the index.
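To make that concrete, here is a minimal sketch of how you could check which of your redirected URLs are still blocked for Googlebot. It uses Python's standard `urllib.robotparser`; the robots.txt rules and the `/thin/` paths are made-up examples, not the site's actual setup, and Python's parser only approximates how Google reads robots.txt.

```python
from typing import List
from urllib.robotparser import RobotFileParser


def blocked_redirect_sources(robots_txt: str, redirected_urls: List[str],
                             user_agent: str = "Googlebot") -> List[str]:
    """Return the redirected URLs that robots.txt stops the crawler from fetching.

    A URL in this list cannot be recrawled, so its 301 is never seen.
    """
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())  # parse rules from a string, no network access
    return [url for url in redirected_urls if not parser.can_fetch(user_agent, url)]


# Hypothetical robots.txt and URLs, for illustration only.
EXAMPLE_ROBOTS = """\
User-agent: *
Disallow: /thin/
"""

EXAMPLE_URLS = [
    "https://www.top4office.co.uk/thin/old-page",
    "https://www.top4office.co.uk/desks",
]

if __name__ == "__main__":
    for url in blocked_redirect_sources(EXAMPLE_ROBOTS, EXAMPLE_URLS):
        print("Blocked, 301 will not be crawled:", url)
```

Anything this reports is a URL where the redirect and the robots.txt rule are working against each other.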
Something else to consider: on the new pages, add a canonical tag that points to the page itself. We had a site where Google was still caching old URLs whose 301 redirects had been in place for 2 years. Google was finding the new pages, new titles and new content, but was still referencing the old URLs. We were seeing this in the SERPs and also in GWT, which was reporting duplicate titles and descriptions for sets of pages that had been 301ed. Adding the self-referencing canonical helped get that cleaned up.
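If you want to verify the self-canonical on a batch of pages, something like the sketch below would do it. It only uses Python's standard `html.parser`; the function name, the sample markup and the `/desks` URL are hypothetical, and the comparison is deliberately simple (it ignores only a trailing slash).

```python
from html.parser import HTMLParser
from typing import Optional


class CanonicalFinder(HTMLParser):
    """Collects the href of the first <link rel="canonical"> in a document."""

    def __init__(self) -> None:
        super().__init__()
        self.canonical: Optional[str] = None

    def handle_starttag(self, tag, attrs):
        if tag != "link" or self.canonical is not None:
            return
        attributes = dict(attrs)
        rel_values = (attributes.get("rel") or "").lower().split()
        if "canonical" in rel_values:
            self.canonical = attributes.get("href")


def is_self_canonical(page_url: str, html: str) -> bool:
    """True if the page declares a canonical URL equal to its own URL."""
    finder = CanonicalFinder()
    finder.feed(html)
    if finder.canonical is None:
        return False
    return finder.canonical.rstrip("/") == page_url.rstrip("/")


# Hypothetical page markup, for illustration only.
SAMPLE_HTML = (
    '<html><head><link rel="canonical" '
    'href="https://www.top4office.co.uk/desks" /></head><body></body></html>'
)

if __name__ == "__main__":
    print(is_self_canonical("https://www.top4office.co.uk/desks", SAMPLE_HTML))
```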
Cheers.
-
This process can take a painfully long time, even done right, but I do have a couple of concerns:
(1) Assuming I understand the situation, I think using Robots.txt on top of 301-redirects is a bad idea. If Google doesn't recrawl the pages, they won't process the 301s, and Robots.txt is bad for removal (good for prevention, but not once something is in the index). Basically, you're telling Google not to re-crawl these pages, and if they don't re-crawl, they won't process the 301s. So, I'd drop the Robots.txt blocking for now, honestly.
(2) What's your internationalization strategy? You could potentially try rel="alternate"/hreflang to specify US vs. UK English, target each domain in webmaster tools, and leave the duplicates alone. If you 301-redirect, you're not giving the UK site a chance to rank properly on Google.co.uk (if that's your objective).
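For point (2), here is a rough sketch of what the rel="alternate"/hreflang annotations could look like for one pair of equivalent pages. The `/chairs` path and the `en-us`/`en-gb` mapping are assumptions for illustration; the same set of tags would normally go in the head of both versions, so each page references itself and its counterpart.

```python
from html import escape
from typing import Dict, List


def hreflang_tags(alternates: Dict[str, str]) -> List[str]:
    """Build one <link rel="alternate"> tag per language/region version."""
    return [
        '<link rel="alternate" hreflang="{}" href="{}" />'.format(
            escape(lang), escape(url)
        )
        for lang, url in alternates.items()
    ]


# Hypothetical pair of equivalent pages on the two domains.
EXAMPLE_ALTERNATES = {
    "en-us": "https://www.top4office.com/chairs",
    "en-gb": "https://www.top4office.co.uk/chairs",
}

if __name__ == "__main__":
    for tag in hreflang_tags(EXAMPLE_ALTERNATES):
        print(tag)
```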
-
It sounds like you have done pretty much everything you could do to remove those pages from Google, and that Google has removed them.
There are two possibilities that I can think of. First, Google may be finding new pages, or at least new URLs. These may be old pages with some sort of parameter on them, or something like that, which causes Google to find new URLs even though you're not adding any new pages.
Another possibility is that, as I have found, the site: search is not entirely accurate. Like most figures Google gives us, it's an estimate of the actual number. It's possible that the original 3700 was lower than the number of pages Google actually had, and that they're now just reporting more of the pages that were already in their index but weren't being shown before.
By the way, when I do a search for site:top4office.co.uk, I only get 2600 results.
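On the first possibility (parameter URLs), one rough way to check is to take whatever list of indexed or crawled URLs you can export and group them by their address with the query string stripped. The sketch below assumes you already have such a list; the sample URLs and parameter names are invented.

```python
from collections import defaultdict
from typing import Dict, List
from urllib.parse import urlsplit, urlunsplit


def group_by_base_url(urls: List[str]) -> Dict[str, List[str]]:
    """Group URLs that differ only by query string or fragment.

    Only bases with more than one variant are returned, since those are
    the candidates for parameter-driven duplicates.
    """
    groups: Dict[str, List[str]] = defaultdict(list)
    for url in urls:
        parts = urlsplit(url)
        base = urlunsplit((parts.scheme, parts.netloc, parts.path, "", ""))
        groups[base].append(url)
    return {base: variants for base, variants in groups.items() if len(variants) > 1}


# Hypothetical URL export, for illustration only.
EXAMPLE_INDEXED = [
    "https://www.top4office.co.uk/desks",
    "https://www.top4office.co.uk/desks?sort=price",
    "https://www.top4office.co.uk/desks?page=2",
    "https://www.top4office.co.uk/chairs",
]

if __name__ == "__main__":
    for base, variants in group_by_base_url(EXAMPLE_INDEXED).items():
        print(base, "has", len(variants), "URL variants")
```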
-
I no longer see the pages. There is no chance Google has found any additional pages, as we spend every day checking for newly indexed pages using the filter and site:top4office.co.uk.
Any ideas?
-
Just a quick question, do you see the URLs you "removed" still in the index? Or is it possible that Google has found a different set of 3000 URLs on your site?