Prevent Google from crawling Ajax
-
With Google figuring out how to make Ajax and JS more searchable/indexable, I am curious about thoughts or techniques to prevent this.
Here's my situation: we have a page that we never want to be indexed or crawled. Currently we use the noindex/nofollow directives, but because of technical changes to our site, the way this content will be delivered (if it is ever displayed) means those directives will no longer be able to block it from search. The business has also decided not to list the file in robots.txt because of the sensitivity of the content. Basically, this content doesn't exist unless something super important happens, and even if something super important happens, we do not want Google to know of its existence.
Since the dev team is planning on using Ajax/JS to pull in this content if the business turns it on, the concern is that it will be on the homepage and Google could index it. So the questions I was asked are: if Google can and does index it, how long could that piece of content appear in the SERPs? And can we block Google from crawling and indexing this section of content on the homepage?
Sorry for the vagueness of this question; it's very sensitive in nature and I am trying to avoid too many specifics. I am able to discuss this in a more private way if necessary.
Thanks!
-
Toby, thanks for the suggestion! I believe that this will help accomplish what we need. My dev gave the "oh S, I should've thought of that" response.
-
You may find that you have to wrap the code that gets called when Ajax fires in something to catch the user agent. For example, if you're making an Ajax request to a PHP script in order to return data, you could wrap that PHP code in something like this (please excuse the pseudocode):
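<?php
// Assumption: $knownagents holds identifying substrings (e.g. "Googlebot"), not full user-agent strings.
$agent = isset($_SERVER['HTTP_USER_AGENT']) ? $_SERVER['HTTP_USER_AGENT'] : '';
foreach ($knownagents as $spider) {
    if (stripos($agent, $spider) !== false) {
        // Known web spider or blocked agent: send an empty response.
        exit;
    }
}
// Not a known spider, so continue and output the content.
?>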
That's very generalised, but you get the idea. I put a short list together in JSON format a while back; you can find it here if it's of any use: https://www.source-control.co.uk/knownspiders/spiders.php
PM me if you need any more specific help than that with development. Hopefully someone else will have a slightly easier way of dealing with this, though, heh.