Mod rewrite question
-
Sorry in advance if this isn't the best place to ask this question.
Google Webmaster Tools has recently identified a ton of "Not Found" pages, which are actual pages with some digits appended at the end.
For example, suppose an actual page on my blog is:
(A) http://www.example.com/blog/2012/09/my-post-title/
This page works just fine.
However, GWT has identified the following page as a "not found" page:
(B) http://www.example.com/blog/2012/09/my-post-title/9157586677/1846732913010
This appears to be happening to hundreds of posts on my site. In each case, the "9157586677" portion of the URL is identical, but the remaining 13 digits change from page to page.
I haven't been able to determine exactly what is causing this to happen - it's probably a social plug-in for Wordpress, or perhaps Disqus, but I'm not sure which one. I'll go through a process of elimination to narrow it down over the coming week.
As a quick fix, I'd like to create a ModRewrite rule so that requests for (B) get 301 redirected to (A). Since there are hundreds of posts, I need to do this in a way that works regardless of what's in the "/2012/09/my-post-title/" part of the URL.
Unfortunately, mod-rewrite is outside of my area of expertise. Can somebody please suggest how I can handle this? Thanks in advance.
PS - As for tracking down the cause, I've looked at the source of the pages in the "Linked From" area of GWT and the Not Found link is nowhere to be found. That is why I assume the bad link is being generated by some javascript that is a part of one of my plug-ins.
Update: It seems like Disqus is the source of these phantom links. There's considerable discussion here. I'll continue searching for a long-term solution. Meanwhile, I'd still appreciate help with the mod-rewrite question above. Thanks again.
-
I've found a solution and am posting it here in case anybody else is having the same problem:
RewriteRule ^([0-9]{4})/([0-9]{2})/([^/]+)/[0-9]+ /blog/$1/$2/$3/ [L,R=301]
-
I hadnt seen the update over Disquss at the end of the post.
Please, post all your advances on this topic Ahirai
Best regards!
-
Hi ahirai,
I was gonna say you should check the linked from tab in GWT but since you actually did it, for me its pretty sure that a plugin that drives content is creating this issue from scratch.
Since i´m neither an apache expert, i can´t give you a method to do the dirty work, but i can tell you the problem is created by some 3rd party plugin driving content of site.
Please, post your advances in the topic!
Good luck!!
Browse Questions
Explore more categories
-
Moz Tools
Chat with the community about the Moz tools.
-
SEO Tactics
Discuss the SEO process with fellow marketers
-
Community
Discuss industry events, jobs, and news!
-
Digital Marketing
Chat about tactics outside of SEO
-
Research & Trends
Dive into research and trends in the search industry.
-
Support
Connect on product support and feature requests.
Related Questions
-
Question about Unpredictability with the Knowledge panel showing up for the same search
The people in my client's office get different results when they search for their company name in Google. For example one person ALWAYS gets the right rail knowledge panel with full details about the company while her boss NEVER sees it. They are both on desktop search. Rosemary
Technical SEO | | RosemaryB0 -
Page Speed/Website Optimization Question
We recently relaunched our website and after running multiple page speed tests (GT Metrix, Google, etc.) our results aren't great. We would love any suggestions on how to improve our site as we are not experts in what exactly these results mean - https://gtmetrix.com/reports/loyalty360.org/DKRN0hKg. Thanks!
Technical SEO | | carlystemmer0 -
Meta Title Tags - Quick question!
Hi all, Our category Meta Title Tags are a little woeful and so I'm in the process of rewriting them. Let's say you have a product for sale.... some inkjet cartridges for a Canon BJ10V printer for example. In an effort to keep things concise I was thinking that for this category I should have the meta title set simply as: 'Canon BJ10V Inkjet Cartridges' and perhaps our company name after this text (and a pipe delimiter) This takes us just under 50 characters which is ideal but doesn't include any real keyword variation and will result in the company name being duplicated at the tail of the title tag on 6,000 odd pages. A large number of my competitors have title tags along the lines of: 'Canon BJ10V Cheap Inkjet Cartridges for Canon BJ-10V Ink Printers' I understand the reasoning behind this but does the variation of keywords compensate for the fact that the title looks spammy (to both humans and Search Engines). What would you do? Keep it clean and concise or stuff the title full of keywords. In the event of the former would you include the company name in each title in the knowledge they would be well under 50 characters without? Thanks for your help.
Technical SEO | | ChrisHolgate1 -
Redirecting a questionable domain to a trusted domain
I have a question!
Technical SEO | | FDFPres
We have 2 domains operating within the same retail sector. One of them is for our bricks and mortar business and the other is a new brand we launched as a nationwide e-retailer. We aggressively built links for the new one and achieved some very good search positioning, where we remained for about 4 months until the google updates of the first half of this year started biting. The domain never received a warning from google or anything, but the links have clearly been devalued to a point where the domain is now virtually buried for the most competitive terms. However, the domain does still get around 100-200 visitors per day, and has a DA of 38. We're thinking about a reshuffle that would involve putting the products in to our brick and mortar business website, and redirecting the brand domain to the bricks and mortar domain. Thank you for reading this far! the question is then, is there a danger of the bricks and mortar domain being tarnished by this? as i said the brand domain hasn't had any notices of penalty from google but it has definitely been hit by updates.0 -
Wordpress Categories and Over-Optimization Question
I would like to switch my sidebar from listing Category Name with posts listed below each- to a concise custom menu. This custom menu would list the top three products I am promoting first, and then go on to list the categories on my site. Currently it looks like this (but with 6 categories, with between 7-10 items in each - this is on EVERY page) Widgets
Technical SEO | | PrivatePartners
-Green Widget
-Blue Widget Gidwets
-Big Gidwet
-Small Gidwet I rank well in google right now, but I am concerned that changing my sidebar will result in a penalty. Maybe for over-optimizing my top three products I promote, or possibly for trying to control the flow of link juice. Can anyone chime in here who has adjusted their site structure within wordpress, and tell me what you found worked best? ** Before anyone asks**, this structure does work much better for the user. My sidebar now is massive, and is confusing even to me.0 -
How to find original URLS after Hosting Company added canonical URLs, URL rewrites and duplicate content.
We recently changed hosting companies for our ecommerce website. The hosting company added some functionality such that duplicate content and/or mirrored pages appear in the search engines. To fix this problem, the hosting company created both canonical URLs and URL rewrites. Now, we have page A (which is the original page with all the link juice) and page B (which is the new page with no link juice or SEO value). Both pages have the same content, with different URLs. I understand that a canonical URL is the way to tell the search engines which page is the preferred page in cases of duplicate content and mirrored pages. I also understand that canonical URLs tell the search engine that page B is a copy of page A, but page A is the preferred page to index. The problem we now face is that the hosting company made page A a copy of page B, rather than the other way around. But page A is the original page with the seo value and link juice, while page B is the new page with no value. As a result, the search engines are now prioritizing the newly created page over the original one. I believe the solution is to reverse this and make it so that page B (the new page) is a copy of page A (the original page). Now, I would simply need to put the original URL as the canonical URL for the duplicate pages. The problem is, with all the rewrites and changes in functionality, I no longer know which URLs have the backlinks that are creating this SEO value. I figure if I can find the back links to the original page, then I can find out the original web address of the original pages. My question is, how can I search for back links on the web in such a way that I can figure out the URL that all of these back links are pointing to in order to make that URL the canonical URL for all the new, duplicate pages.
Technical SEO | | CABLES0 -
Oh! best community of seo the seomoz team! question:
Hi the best of the best on the seo world. I like to ask something, i like to know: So, from your strongest-bestest-proffessionallll experience, the web-scripts you use,must be the best of best scripts, i like to ask: for your newsletter, are you using external service? client managment,pro members,are you using any external service from another website or have you do the programming itself? and dont forget to say, your services without any comment are number one on the world,but your community now and your design,and your big heart to leave us talking together here and unlimited questiond to ask, you have attested that you are the best. But tell us more of your services what you use on your encyclopedy of seo, we like to know more from you. Thanks Meti.
Technical SEO | | leadsprofi0 -
Pages not ranking - Linkbuilding Question
It has been about 3 months since we made some new pages, with new, unique copy, but alot of pages (even though they have been indexed) are not ranking in the SERPS I tested it by taking a long snippet of the unique copy form the page and searching for it on Google. Also I checked the ranking using http://arizonawebdevelopment.com/google-page-rank
Technical SEO | | Impact-201555
Which may no be accurate, I know, but would give some indication. The interesting thing was that for the unique copy snippets, sometimes a different page of our site, many times the home page, shows up in the SERP'sSo my questions are: Is there some issue / penalty / sandbox deal with the pages that are not indexed? How can we check that? Or has it just not been enough time? Could there be any duplicate copy issue going on? Shouldn't be, as they are all well written, completely unique copy. How can we check that? Flickr image details - Some of the pages display the same set of images from flickr. The details (filenames, alt info, titles) are getting pulled form flickr and can be seen on the source code. Its a pretty large block of words, which is the same on multiple pages, and uses alot of keywords. Could this be an issue considered duplication or keyword stuffing, causing this. If you think so , we will remove it right away. And then when do we do to improve re-indexing? The reason I started this was because we have a few good opportunities right now for links, and I was wondering what pages we should link to and try to build rankings for. I was thinking about pointing one to /cast-bronze-plaques, but the page is not ranking. The home page, obviously is the oldest page, and ranked the best. The cast bronze plaques page is very new. Would linking to pages that are not ranking well be a good idea? Would it help them to get indexed / ranking? Or would it be better to link to the pages that are already indexed / ranking? If you link to a page that does not seem to be indexed, will it help the domains link profile? Will the link juice still flow through the site0