RegEx help needed for robots.txt potential conflict
-
I've created a robots.txt file for a new Magento install, using an existing site-map that was posted on the Magento help forums, but there is something in it I can't decipher. It seems that I am both allowing and disallowing access to the same expression for pagination. My robots.txt file (and a lot of other Magento site-maps, it seems) includes both:
Allow: /*?p=
and
Disallow: /?p=&
I've searched for help on RegEx and I can't see what "&" does, but it seems to me that I'm allowing crawler access to all pagination URLs and then possibly disallowing access to any pagination URL that includes anything other than just the page number?
I've looked at several resources and there is practically no reference to what "&" does...
Can anyone shed any light on this, to ensure I am allowing suitable access to a shop?
Thanks in advance for any assistance
-
Hey James
It looks to me like you are just disallowing access to any URLs that have more than the initial p= variable. So, you are reducing the impact of potential duplication through searches and the like.
Good
?p=1
Bad
?p=1&q=search string
I am no Magento expert, but this looks like a simple attempt to reduce the myriad duplication that can happen with search pages and the like inside a complex CMS like Magento.
The SEOMoz crawler tool should give you some good insight. To be sure, try removing the 'Disallow: /?p=&' line and see if you get a bucketload of duplicate content warnings.
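Ultimately, the thing to remember here is that the & is part of the URL and not part of the regex. It is simply the character that separates query-string parameters, so the Disallow rule only comes into play when something else follows the p= value.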
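If it helps to see the two rules side by side, here is a rough Python sketch of how a crawler might weigh them. It rests on a few assumptions that are not in the thread: that the Disallow rule was meant to carry wildcards (written here as `/*?p=*&`, since the version printed above has none), that `*` matches any run of characters, and that the longest matching rule wins with Allow winning ties, which is how Google documents its own behaviour. Other crawlers may differ. The function names and sample paths are made up for illustration.

```python
import re


def pattern_to_regex(pattern: str) -> re.Pattern:
    """Turn a robots.txt path pattern into a regex.

    Only '*' (any run of characters) and a trailing '$' (end of URL)
    are special; everything else, including '&' and '?', is literal.
    """
    anchored = pattern.endswith("$")
    if anchored:
        pattern = pattern[:-1]
    body = ".*".join(re.escape(part) for part in pattern.split("*"))
    return re.compile(body + ("$" if anchored else ""))


def is_allowed(url_path: str, rules: list[tuple[str, str]]) -> bool:
    """Apply longest-match precedence; Allow wins a tie (assumed policy)."""
    best = None  # (pattern length, is_allow)
    for directive, pattern in rules:
        if pattern_to_regex(pattern).match(url_path):
            candidate = (len(pattern), directive.lower() == "allow")
            if best is None or candidate > best:
                best = candidate
    # No matching rule means the URL is crawlable by default.
    return True if best is None else best[1]


# Assumed wildcard form of the two rules from the question.
RULES = [("Allow", "/*?p="), ("Disallow", "/*?p=*&")]

print(is_allowed("/shoes?p=2", RULES))           # True: only the Allow rule matches
print(is_allowed("/shoes?p=2&q=search", RULES))  # False: the longer Disallow rule wins
```

Under those assumptions the two lines do not conflict: plain pagination stays crawlable, and only URLs that carry an extra parameter after p= are blocked.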
Hope that helps!
Marcus