How Google Carwler Cached Orphan pages and directory?
-
I have website www.test.com
I have made some changes in live website and upload it to "demo" directory (which is recently created) for client approval.
Now, my demo link will be www.test.com/demo/
I am not doing any type of link building or any activity which pass referral link to www.test.com/demo/
Then how Google crawler find it and cached some pages or entire directory?
Thanks
-
Try putting the URL into Google and see if you find any pages linking to it.
I knew a company that created a test site that was a copy of a live site (made with a specific hosted CMS). Didn't exclude the test site in robots because "we all know we won't link to it so it'll be ok". Site got indexed, and it was because a person at the company was having problems with the implementation of the test site, went to the help forum (which person didn't think would be indexed) and posted the URL to the test site.
I found the above by just putting in the URL of the test site into Google, and I saw the post in the help desk. You might try the same to see if somehow there is a rogue link.
-
Is google crawling our mails?
Is it possible?
-
Yup, correct.
I was certain I'd replied to this
Anyway, you ever notice how the ads in gmail are always relevant to the content of your emails? Google are totally reading them
-
The <conspiracy hat="">side of things was him commenting that Google is sometimes accused of processing everything in Gmail and could have possibly pulled your link to the demo directory from that.</conspiracy>
-
Hi Barry,
Yes, We were used Gmail for reporting.
Is it make any sense??
-
<conspiracy-hat></conspiracy-hat>
Did either you or your client use gmail when you sent him the demo link?
Regardless, Dan's advice to noindex and block the directory from spiders is the future when doing development work.
-
Hi JoelHit,
NO, There is not any single refferal link to "Demo" directory from entire website and also from third party websites.
I am aware about Google Crawling and Indexing Systems.
Thanks.
-
Hi Thetjo,
I know about it.
My question is that how Google Crawl it without any referral link?
Thanks.
-
Hi Dan,
No, i am not exclude "demo" directory from robots.txt for any search engine.
I am not using wordpress its simple stattic HTML website (Not using any type of CMS).
-
Did this actually happen or are we talking about a hypothetical situation here? It could be that there is a link to the demo directory you've overlooked? Has the /demo folder perhaps been used in the past and there were still old links to it?
As a meta-solution to this problem: prevent crawlers and nosy people from accessing the content by adding a .htpasswd login to the area used for client approval.
-
Did you block the /demo/ directory in your robots.txt file? This is step number one to try and ensure they don't get crawled. Also, are you using wordpress? If so, wordpress automatically pings search engines when you add a post and if you use the common sitemap plugin, when it creates the sitemap it submits it automatically to Google, so that's another way Google could have found it.
Browse Questions
Explore more categories
-
Moz Tools
Chat with the community about the Moz tools.
-
SEO Tactics
Discuss the SEO process with fellow marketers
-
Community
Discuss industry events, jobs, and news!
-
Digital Marketing
Chat about tactics outside of SEO
-
Research & Trends
Dive into research and trends in the search industry.
-
Support
Connect on product support and feature requests.
Related Questions
-
Please help us undertsand the things we need to improve so that google crawler visit us more often to reindex pages from our domain
we are currently in the process of a massive project which involves us migrating our domain, we realised that Google crawlwer has not been crawling our pages Quiet often. i have observed some cases where google crawled these pages about 6 months back and then never visited the pages again
Intermediate & Advanced SEO | | bhaskaran
and we had to manually submit these pages for reindexing in some geographies. can you please help us undertsand the things we need to improve so that google crawler visit us more often to reindex pages from our domain0 -
Google Detecting Real Page as Soft 404 Error
We've migrated my site from HTTP to HTTPS protocols in Sep 2017 but I noticed after migration soft 404 granularly increasing. Example of soft 404 page: https://bit.ly/2xBjy4J But these soft 404 error pages are real pages but Google still detects them as soft 404. When I checked the Google cache it shows me the cache but with HTTP page. We've tried all possible solutions but unable to figure out why Google is still indexing to HTTP pages and detecting HTTPS pages as soft 404 error. Can someone please suggest a solution or possible cause for this issue or anyone same issue like this in past.
Intermediate & Advanced SEO | | bheard0 -
Why does Google display the home page rather than a page which is better optimised to answer the query?
I have a page which (I believe) is well optimised for a specific keyword (URL, title tag, meta description, H1, etc). yet Google chooses to display the home page instead of the page more suited to the search query. Why is Google doing this and what can I do to stop it?
Intermediate & Advanced SEO | | muzzmoz0 -
Weird Page switch for a keyword in Google Rankings
Over this past weekend Google switched the page which usually showed in search results for keyword benchmarking. It went from from http://www.apqc.org/benchmarking to http://www.apqc.org/benchmarking-portal/osb. Also on Google the Rankings for the keyword 'benchmarking' sank from 15 to 47 for http://www.apqc.org/benchmarking Just looking for some theories or ideas or anyone that has had this happen to them.
Intermediate & Advanced SEO | | inhouseninja0 -
Duplicate content within sections of a page but not full page duplicate content
Hi, I am working on a website redesign and the client offers several services and within those services some elements of the services crossover with one another. For example, they offer a service called Modelling and when you click onto that page several elements that build up that service are featured, so in this case 'mentoring'. Now mentoring is common to other services therefore will feature on other service pages. The page will feature a mixture of unique content to that service and small sections of duplicate content and I'm not sure how to treat this. One thing we have come up with is take the user through to a unique page to host all the content however some features do not warrant a page being created for this. Another idea is to have the feature pop up with inline content. Any thoughts/experience on this would be much appreciated.
Intermediate & Advanced SEO | | J_Sinclair0 -
Google Places Listing Active In Two Seperate Google Places Accounts?
Hi is there any issues with having a google places listing in two seperate google places accounts. For example we have a client who cannot access their old google places account (ex-employee had their login details which they can't get) and want us to take control over the listing. If we click the "is this your listing" manage this page button - and claim the listing, will this transfer the listing to our control? Or will it create a duplicate? Are there any problems having the listing in different separate accounts. Is it a situation in which the last person who manages the listing takes control? And the listing automatically deactivates from the old account? Do all the images remain aswell? Thanks,
Intermediate & Advanced SEO | | MBASydney
Tom0 -
Getting Google in index but display "parent" pages..
Greetings esteemed SEO experts - I'm hunting for advice: We operate an accommodation listings website. We monetize by listing position in search results, i.e. you pay more to get higher placing in the page. Because of this, while we want individual detailed listing pages to be indexed to get the value of the content, we don't really want them appearing in Google search results. We ideally want the "content value" to be attributed to the parent page - and google to display this as the link in the search results instead of the individual listing. Any ideas on how to achieve this?
Intermediate & Advanced SEO | | AABAB0 -
Should the sitemap include just menu pages or all pages site wide?
I have a Drupal site that utilizes Solr, with 10 menu pages and about 4,000 pages of content. Redoing a few things and we'll need to revamp the sitemap. Typically I'd jam all pages into a single sitemap and that's it, but post-Panda, should I do anything different?
Intermediate & Advanced SEO | | EricPacifico0