WEBMASTER console: increase in the number of URLs we were blocked from crawling due to authorization permission errors.
-
Hi guys, I received this warning in my webmaster console: "Google detected a significant increase in the number of URLs we were blocked from crawling due to authorization permission errors." So I went to the "Crawl Errors" section and found these errors under the "Access denied" status:
?page_name=Cheap+Viagra+Gold+Online&id=471
?page_name=Cheapest+Viagra+Us+Licensed+Pharmacies&id=1603
and many happy URLs like these. Does anybody know what this is and where it comes from?
Thanks in advance!
-
Thank you Tom!
-
Hi,
I am not telling you that I am 100% sure the site is infected, but to remove any chance of infection you must be certain that the original infection was removed. If it was not, and you had links created by a third party other than yourself, you are better off getting the site completely cleaned.
Use Sucuri.net to remove any chance of a hack.
Just type this into Google
- ?page_name=Cheap+Viagra+Gold+Online&id=471
- ?page_name=Cheapest+Viagra+Us+Licensed+Pharmacies&id=1603
http://www.pearsonified.com/2010/04/wordpress-pharma-hack.php
https://blog.sucuri.net/2010/07/understanding-and-cleaning-the-pharma-hack-on-wordpress.html
https://sitecheck.sucuri.net/results/www.davidandsonsjewelers.com/articles/author/carole/
I used deepcrawl.com to create the audit you referenced and Screaming Frog SEO Spider to create the site map.
I hope that helps,
Tom
-
Hello Thomas,
I really appreciate your help! You said I can look at your site's structure. What is your site address?
Unfortunately, I still don't know what I need to do to remove the pharma hack from my site. If you can point me to where I can get the answer, I'll be very grateful.
Also, what tool did you use to generate this report: http://crawl.blueprintmarketing.com/projects/reports/215533?ro=75ad0c6e4afacc428b553d449dfd281f82ec2ad6 ?
And what tool did you use to create the XML site map?
Thanks
-
No site map: I checked multiple common XML site map locations and came up with nothing, and there are no redirects either (e.g. /sitemap_index.xml might exist separately or redirect to /sitemap.xml).
http://www.davidandsonsjewelers.com/sitemap.xml shows a 404
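If you want to repeat that check yourself without a crawler, here is a minimal Python sketch. The list of paths is my assumption of the usual locations, not an exhaustive standard, and the function names are made up for illustration.

```python
from urllib.parse import urljoin
import urllib.request
import urllib.error

# Usual sitemap locations; an assumed shortlist, not a complete standard.
COMMON_SITEMAP_PATHS = ["/sitemap.xml", "/sitemap_index.xml"]


def candidate_sitemap_urls(site_root):
    """Build the sitemap URLs worth probing for a given site root."""
    return [urljoin(site_root, path) for path in COMMON_SITEMAP_PATHS]


def http_status(url, timeout=10):
    """Return the final HTTP status code for url (redirects are followed)."""
    request = urllib.request.Request(url, method="HEAD")
    try:
        with urllib.request.urlopen(request, timeout=timeout) as response:
            return response.status
    except urllib.error.HTTPError as error:
        # 4xx/5xx responses arrive as exceptions; the code is what we want.
        return error.code
```

Calling `http_status(url)` for each result of `candidate_sitemap_urls("http://www.davidandsonsjewelers.com/")` should show a 404 wherever no site map exists. Some servers reject HEAD requests, so treat an odd result as a hint rather than proof.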
Tools
deepcrawl.com https://varvy.com/mobile/ & https://varvy.com/tools/
-
detect mobile issues
-
If I were you, I would look at the site structure and make sure that it was built that way for the right reasons.
If your traffic is all right, you really do not want to change the site that much. If you do change the site, change it slowly.
(A great example of this is how FireHost.com is becoming Armor.com.)
The tool I primarily used to find out whether or not you had a site map was deepcrawl.com.
To detect mobile issues:
https://varvy.com/mobile/ & https://varvy.com/tools/
http://i.imgur.com/W7BDaq7.png
http://www.screamingfrog.co.uk/seo-spider/
http://i.imgur.com/LbCBmmW.png
I used Screaming Frog to create an XML site map for you here
I would definitely add an XML site map.
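For a small site you can also generate a basic site map by hand. This is only a rough Python sketch of the sitemaps.org format with one `<loc>` per page; `build_sitemap` is a made-up helper name, and a crawler like Screaming Frog will still do a more complete job.

```python
import xml.etree.ElementTree as ET

# Namespace defined by the sitemaps.org protocol.
SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"


def build_sitemap(urls):
    """Return a sitemap XML string with one <url><loc> entry per URL."""
    urlset = ET.Element("urlset", xmlns=SITEMAP_NS)
    for url in urls:
        entry = ET.SubElement(urlset, "url")
        # ElementTree escapes characters such as & in the URL for us.
        ET.SubElement(entry, "loc").text = url
    body = ET.tostring(urlset, encoding="unicode")
    return '<?xml version="1.0" encoding="UTF-8"?>\n' + body
```

Save the returned string as sitemap.xml in the site root and submit it in Webmaster Tools.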
Sincerely,
Thomas
-
Also, are you saying that the mobile site is blocked? And how do you see that the site doesn't have an XML site map? What tool shows you all this info?
Thanks
-
Hi Thomas,
I really appreciate your help! Can you advise me what I should do? I see all these reports but I don't know how I need to clean the site.
Thank you!
-
The URLs you are showing are definitely pharma hack. There are certain things Sucuri is unable to detect because it is a front-end tool, not the PHP tool that would be needed for the two-part WordPress and PHP version of your site.
Just type this into Google
- ?page_name=Cheap+Viagra+Gold+Online&id=471
- ?page_name=Cheapest+Viagra+Us+Licensed+Pharmacies&id=1603
http://www.pearsonified.com/2010/04/wordpress-pharma-hack.php
https://blog.sucuri.net/2010/07/understanding-and-cleaning-the-pharma-hack-on-wordpress.html
https://sitecheck.sucuri.net/results/www.davidandsonsjewelers.com/articles/author/carole/
https://www.virustotal.com/en/ip-address/216.120.237.225/information/
http://dnsbl.inps.de/query.cgi?lang=en&ip=216.120.237.225&action=check&quick=0
-
and switch everything to WordPress
view-source:http://www.davidandsonsjewelers.com/
-
Some of your links are really not supposed to be there.
Here is your report; please use the URL below to navigate the entire report.
Your URLs are relative for the most part; that should be fixed. You have a Java redirect that definitely needs to be fixed.
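To illustrate the relative URL point, this small Python sketch shows how a relative href resolves against the page it sits on. The paths in the usage note are hypothetical examples, not taken from the report.

```python
from urllib.parse import urljoin


def absolutize(page_url, hrefs):
    """Resolve each href against the URL of the page that contains it."""
    return [urljoin(page_url, href) for href in hrefs]
```

For example, on a page at `http://www.davidandsonsjewelers.com/articles/` the href `../contact.html` resolves to `http://www.davidandsonsjewelers.com/contact.html`, while an href that is already absolute comes back unchanged.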
PDF & XML outline
- http://cl.ly/d6Sv/www.davidandsonsjewelers.com_http-www-davidandsonsjewelers-com-_13-09-2015_overview_215533.pdf
- http://cl.ly/d6S7/public-report_files-215533-www.davidandsonsjewelers.com_http-www-davidandsonsjewelers-com-_13-09-2015_overview_215533.xls
You have roughly 108 indexed URLs according to Google
https://marketing.grader.com/report/www.davidandsonsjewelers.com/overall
You do not have an XML site map, unfortunately. I found that out in the first five minutes, but you can also find out these things using
https://mza.seotoolninja.com/researchtools/crawl-test
Upon a quick check with another tool I found
http://i.imgur.com/Y60WnIc.png
I love deepcrawl; however, your site is not large, so you can learn a lot about it with
http://www.screamingfrog.co.uk/seo-spider/ free
I hope this is a help, with analytics access and webmaster tool like this I cannot obviously give you a much better picture.
Tom
-
I will run the audit now. Sorry for the delay.
-
-
The best way to solve this problem is to use
Or http://screamingfrog.co.uk Seo spider
If you give me the URL I will do a quick check for you.
-
Thank you Thomas,
My site is clean, though, according to Sucuri. I spoke to the owner of this website and they said that they were hacked in the past and blocked those pages themselves. So is Google now detecting those pages again? Or what exactly is happening? Does anybody know?
Thanks
-
Remember that not every URL is in Google's index. It does not mean that your back link is not in
https://mza.seotoolninja.com/researchtools/ose/
You should very quickly make sure that your website is not still completely full of malware, like it sounds it is.
Use this tool to determine what has happened to your site if it is infected; it is free.
If it is hacked, as I believe it may be based on what you have described, I would then purchase the malware removal and web application firewall
https://sucuri.net/website-antivirus/
If you would like a much more secure hosting environment, https://armor.com is the best.
Once you have removed your site from the blacklists and removed all the badware/malware, make sure to crawl it with Google in Webmaster Tools using Fetch as Google.
Your nightmare should be short-lived. Sorry to hear that your site was hacked; hopefully this will get you back on track quickly.
-
Hi Dirk,
In Webmaster Tools, if I click those links one by one, I can see "Linked from" URLs. There are URLs like this:
http://schwagginwagon.com/?page_name=Buying+Tadalis+SX+Safely+No+Prescription+Tadalis+SX&id=1810
and there is also one URL coming from my domain. Not sure what it means.
I went through every single URL in Google index but all of them are normal URLs. Nothing related to spam. Any ideas?
Thanks
-
Try a search like viagra site:yourdomain.com and see if any pages of a suspicious nature are listed.
In the Crawl Errors section in Webmaster Tools you can also check where these URLs are coming from (external/internal links).
If your site is hacked, you can find more info on what to do next here: http://www.google.com/webmasters/hacked/
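If you export the crawl errors and their "Linked from" URLs, a small script can do the sorting for you. This is only a sketch in Python: the keyword list is a hypothetical starting point, both function names are made up, and the host comparison is strict (www and non-www count as different hosts).

```python
from urllib.parse import urlsplit, parse_qs

# Hypothetical keyword list; extend it with terms from your own crawl errors.
SPAM_TERMS = ("viagra", "tadalis", "pharmac")


def looks_like_pharma_spam(url):
    """True if any query-string value contains one of the SPAM_TERMS."""
    query = urlsplit(url).query
    values = [v.lower() for vals in parse_qs(query).values() for v in vals]
    return any(term in value for value in values for term in SPAM_TERMS)


def link_source(linking_url, own_host):
    """Label a 'Linked from' URL as internal or external to own_host."""
    host = (urlsplit(linking_url).hostname or "").lower()
    return "internal" if host == own_host.lower() else "external"
```

A spam URL whose source comes back as internal means a page on your own site is still linking to it, which is worth checking first.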
rgds,
Dirk
-
Hello Dirk,
Thank you for the fast reply! I thought so too right away. All of these URLs are forbidden when I try to access them. This is the message from Google Webmaster Tools: "Googlebot couldn't crawl your URL because your server either requires authentication to access the page, or it is blocking Googlebot from accessing your site."
Any ideas? Thanks
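For anyone landing here with the same message, one way to see what the server actually returns for such a URL is to request it with a Googlebot-style User-Agent and look at the status code. A rough Python sketch follows; the mapping from status codes to Webmaster Tools labels is my assumption of how the report groups them, and the function names are made up.

```python
import urllib.request
import urllib.error

# User-Agent string that Googlebot publishes for its desktop crawler.
GOOGLEBOT_UA = "Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)"


def fetch_status(url, user_agent=GOOGLEBOT_UA, timeout=10):
    """Request url with the given User-Agent and return the HTTP status code."""
    request = urllib.request.Request(url, headers={"User-Agent": user_agent})
    try:
        with urllib.request.urlopen(request, timeout=timeout) as response:
            return response.status
    except urllib.error.HTTPError as error:
        # 4xx/5xx responses arrive as exceptions; the code is what we want.
        return error.code


def crawl_error_bucket(status):
    """Map a status code to an assumed Webmaster Tools crawl-error label."""
    if status in (401, 403):
        return "Access denied"
    if status in (404, 410):
        return "Not found"
    if 500 <= status <= 599:
        return "Server error"
    return "OK or other"
```

A 403 on the spam URLs would be consistent with the pages having been blocked on purpose after a hack. Note that a server checking Googlebot by IP address may answer this script differently than it answers the real crawler.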
-
Hi
At first sight I would guess your site has been hacked. Do these URLs exist when you try them?
Dirk