WordPress URL weirdness - why is Google registering non-pretty URLs?
-
I've noticed in my stats that Google is indexing some non-pretty URLs from my WordPress-based blog.
For instance, this URL is appearing in Google search: http://www.admissionsquest.com/onboardingschools/index.php?p=439
It should be:
Last week I added the Redirection plugin in order to consolidate categories & tags. Any chance that this has something to do with it? Any recommendations on how to solve this?
FYI - I've been using pretty URLs with WordPress from the very beginning and this is the first time that I've seen this issue. Thanks in advance for your help!
-
An additional thought: besides the Redirection plugin, last week I also added Platinum SEO Pack. Any chance either is causing the issue?
-
Thanks, I checked the file and this is what we have:
# BEGIN WordPress
RewriteBase /onboardingschools/
RewriteCond %{REQUEST_METHOD} !=POST
RewriteCond %{QUERY_STRING} !.*=.*
RewriteCond %{HTTP_COOKIE} !^.*(comment_author_|wordpress|wp-postpass_).*$
RewriteCond %{HTTP:Accept-Encoding} gzip
RewriteCond %{HTTP_user_agent} !^.*(2.0\ MMP|240x320|AvantGo|BlackBerry|Blazer|Cellphone|Danger|DoCoMo|Elaine/3.0|EudoraWeb|hiptop|IEMobile|iPhone|iPod|KYOCERA/WX310K|LG/U990|MIDP-2.0|MMEF20|MOT-V|NetFront|Newt|Nintendo\ Wii|Nitro|Nokia|Opera\ Mini|Palm|Playstation\ Portable|portalmmm|Proxinet|ProxiNet|SHARP-TQ-GX10|Small|SonyEricsson|Symbian\ OS|SymbianOS|TS21i-10|UP.Browser|UP.Link|Windows\ CE|WinWAP).*
RewriteCond %{DOCUMENT_ROOT}/wp-content/cache/supercache/%{HTTP_HOST}/$1/index.html.gz -f
RewriteRule ^(.*) /wp-content/cache/supercache/%{HTTP_HOST}/$1/index.html.gz [L]

RewriteCond %{REQUEST_METHOD} !=POST
RewriteCond %{QUERY_STRING} !.*=.*
RewriteCond %{QUERY_STRING} !.*attachment_id=.*
RewriteCond %{HTTP_COOKIE} !^.*(comment_author_|wordpress|wp-postpass_).*$
RewriteCond %{HTTP_user_agent} !^.*(2.0\ MMP|240x320|AvantGo|BlackBerry|Blazer|Cellphone|Danger|DoCoMo|Elaine/3.0|EudoraWeb|hiptop|IEMobile|iPhone|iPod|KYOCERA/WX310K|LG/U990|MIDP-2.0|MMEF20|MOT-V|NetFront|Newt|Nintendo\ Wii|Nitro|Nokia|Opera\ Mini|Palm|Playstation\ Portable|portalmmm|Proxinet|ProxiNet|SHARP-TQ-GX10|Small|SonyEricsson|Symbian\ OS|SymbianOS|TS21i-10|UP.Browser|UP.Link|Windows\ CE|WinWAP).*
RewriteCond %{DOCUMENT_ROOT}/wp-content/cache/supercache/%{HTTP_HOST}/$1/index.html -f
RewriteRule ^(.*) /wp-content/cache/supercache/%{HTTP_HOST}/$1/index.html [L]

RewriteCond %{REQUEST_FILENAME} !-f
RewriteCond %{REQUEST_FILENAME} !-d
RewriteRule . index.php [L]
# END WordPress
-
You appear to have a duplicate content issue on your hands. If you visit both URLs, each one resolves on its own rather than one redirecting to the other. I'm not sure why your site is creating duplicate URLs, but do you have this directive in your .htaccess?
# BEGIN WordPress
<IfModule mod_rewrite.c>
RewriteEngine On
RewriteBase /
RewriteRule ^index\.php$ - [L]
RewriteCond %{REQUEST_FILENAME} !-f
RewriteCond %{REQUEST_FILENAME} !-d
RewriteRule . /index.php [L]
</IfModule>
# END WordPress
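To get a feel for how many non-pretty URLs are showing up, one rough option is to scan the URLs exported from your stats for WordPress's default query-string form (index.php?p=<id>). This is only a minimal Python sketch: the function names are hypothetical, it assumes you already have the URLs as a plain list, and the "pretty" URL in the usage example is made up for illustration.

```python
from urllib.parse import urlparse, parse_qs


def is_plain_permalink(url):
    """Return True if the URL carries a numeric post ID in a ?p= parameter."""
    query = urlparse(url).query
    values = parse_qs(query).get("p", [])
    # WordPress's default (non-pretty) permalink looks like ?p=123.
    return len(values) == 1 and values[0].isdigit()


def find_plain_permalinks(urls):
    """Filter a list of URLs down to the ones in the ?p=<id> form."""
    return [u for u in urls if is_plain_permalink(u)]


if __name__ == "__main__":
    sample = [
        "http://www.admissionsquest.com/onboardingschools/index.php?p=439",
        # Hypothetical pretty URL, for illustration only.
        "http://www.admissionsquest.com/onboardingschools/example-post/",
    ]
    for url in find_plain_permalinks(sample):
        print(url)
```

This only flags candidates. Whether each one redirects to its pretty equivalent still has to be checked by requesting it.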