Help with site structure needed - any assistance welcomed!
-
Hi all,
I am currently tasked with finding a better way to optimise our website ukdocumentstorage dot com.
For starters, I would like to know what our site structure actually is at present. I would like to be able to see which pages link to which at the moment, and which pages contain broken links that I need to remove from the content. Hopefully I'd then be able to tidy up any errors that the site already has in its internal linking.
Is there a way to do this easily? Or to have a graphical representation of the site's structure?
I have just signed into our Webmaster Tools account and I am faced with a list of 10 'Crawl Errors', all of which are 404 errors. Some of these pages do not actually exist anymore, but are still being linked to from a few pages according to WMT.
For example, /industries_served_legal.htm is still being linked to from 5 of our pages (including /industries_served_local_authority.htm)
However, this doesn't seem to be the case at all on the page, as I can't find a link to /industries_served_legal.htm on /industries_served_local_authority.htm. Any advice as to why this is happening? Is there an easy way to find out where these broken links are situated on the page? And if I do actually manage to find our broken links, how would I go about removing them?
The page /document_security.htm doesn't exist in our Sitewizard list of pages anymore, yet still exists online. How do I go about deleting this unnecessary page properly? And does it harm our rankings?
The document_security page also has an extra link on the top toolbar to a Document Management page, an addition which is no longer present on our up-to-date pages. This page (and the extra dropdown page that appears when you hover over it) still exists in our list of Sitewizard pages at the moment, but we obviously no longer want these online. How should I remove them?
I understand that this is a lot of information, and so I would appreciate any help that can be given on these!
Many thanks
-
Perfect sense, thank you! I'll now research how to actually set up this redirect.
-
If this is an internal link on your website, you would want to change the actual path to point to the newer secure-document-storage page.
If this is an external link from another website, you'd create a redirect that will take the incoming request for the old document-security page and push the visitor to the new secure-document-storage page.
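For the internal-link case, here is a minimal sketch of swapping an old path for a new one in a page's HTML. It assumes you can get at the raw HTML of each page, which may not be true in a site builder such as Sitewizard. The file name /secure-document-storage.htm and the function name are my own placeholders, not something taken from your site.

```python
import re

# Hypothetical mapping of retired paths to their replacements.
# The new file name is an assumption; use whatever the real page is called.
REPLACEMENTS = {
    "/document_security.htm": "/secure-document-storage.htm",
}

# Matches href="..." or href='...' and captures the quote style and the target.
HREF = re.compile(r'(href\s*=\s*)(["\'])(.*?)\2', re.IGNORECASE)


def rewrite_internal_links(html, replacements):
    """Return html with every href that exactly matches an old path replaced."""
    def swap(match):
        prefix, quote, target = match.groups()
        # Leave the link untouched unless it is one of the retired paths.
        return f"{prefix}{quote}{replacements.get(target, target)}{quote}"

    return HREF.sub(swap, html)
```

Only exact matches are rewritten, so a relative link such as document_security.htm (no leading slash) would need its own entry in the mapping.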
Make sense?
Mike
-
So even though the text is different, I should re-direct people clicking on the link to the old document-security to the newer secure-document-storage page?
-
Here is an example that may help:
You have the following pages on your site - /product1.html, /product2.html, and /product3.html.
An external site (externalsite.com) links to the product 2 page on your site (yoursite.com/product2.html).
You decide to no longer sell product 2, so you remove /product2.html from your website; however, externalsite.com is still linking to yoursite.com/product2.html. You see a 404 warning in Google Webmaster Tools referencing this error.
You then have two options:
-
You recently started selling product 4, which is not the same product, but still offers the same solution to a potential customer. You create a /product4.html page and set up a 301 redirect from yoursite.com/product2.html to yoursite.com/product4.html, so that visitors following the link on externalsite.com land on the new page.
-
You no longer sell this product or solutions like it, because it was not needed by visitors. The link from externalsite.com is no longer applicable to your site; therefore, you disregard the warning in Google Webmaster Tools and the link will eventually not be followed by Google.
Now, if the /product2.html page was still accessible online, but you no longer linked to it via yoursite.com, that is kind of a problem, because if externalsite.com is still linking there, visitors could stumble upon your old/outdated/not-used page. You do not need to actively worry about removing the link, but you should work on removing the page if it is no longer used.
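A rough way to spot pages in that situation (still published, but no longer linked from anywhere on your own site) is sketched below. It assumes you already have a list of published pages and a map of which page links to which; the function name, the entry-point default, and the sample paths are all illustrative.

```python
def find_orphans(published, link_map, entry_points=("/index.html",)):
    """Return published pages that no other page on the site links to.

    published    -- iterable of page paths that are live on the server
    link_map     -- dict mapping each page path to the set of paths it links to
    entry_points -- pages visitors reach directly (assumed: the home page),
                    which are never reported as orphans
    """
    linked = {
        target
        for source, targets in link_map.items()
        for target in targets
        if target != source  # a page linking to itself does not count
    }
    return sorted(set(published) - linked - set(entry_points))
```

With the product example above, a site whose remaining pages link only to /product1.html and /product3.html would report /product2.html as an orphan, which is the page to consider removing or redirecting.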
Does that help and did I understand your question correctly?
Mike
-
-
Apologies for the overload!
So my takeaway from this is that any pages that I have deleted but that can still be found on the internet (e.g. /document_security) I don't need to worry about actively trying to remove, as they will be removed by Google automatically in the future? And having these pages still existing on the internet (despite no current links going to them from pages I haven't deleted) will not harm my site?
Thank you for all of your help so far!
-
To add to Mike's answer
2: If the page is deleted and isn't coming back, you may want to 301 it to its new equivalent if possible, or even return a 410 status code to tell search engines the page has been permanently removed.
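As a rough illustration of that choice, the sketch below decides which status a retired URL should get. The two lookup tables and the function name are hypothetical; how you actually configure this depends on your server or site builder.

```python
from http import HTTPStatus

# Hypothetical tables of retired URLs.
MOVED = {"/product2.html": "/product4.html"}  # old path -> closest replacement
GONE = {"/old-promo.html"}                    # removed for good, nothing to replace it


def retired_response(path):
    """Return (status, headers) for a retired path, or None if the page is not retired."""
    if path in MOVED:
        # 301: the content has a permanent new home.
        return HTTPStatus.MOVED_PERMANENTLY, {"Location": MOVED[path]}
    if path in GONE:
        # 410: the content was removed on purpose and is not coming back.
        return HTTPStatus.GONE, {}
    return None  # let the site serve the page (or its normal 404) as usual
```

Keeping the two tables separate makes the decision explicit for each URL: either it has a replacement worth redirecting to, or it is simply gone.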
For more info on Status codes see the following article
http://www.seomoz.org/learn-seo/http-status-codes
-
Whoa! Information overload!!!
-
1. I don't know of anything that shows you a graphical representation of your site's linking structure; however, I do know of a program that will list out all of the linking pages on your site and the number of in and out links, including anchor text, etc. The number of in links can be an indicator of how your site is organized structurally.
-
2. 404 errors are not bad as long as they are known. If you no longer have a page and you decide not to redirect from the old page to a new one, that is fine. Google is just giving you a heads up that your site or someone else's is linking to a nonexistent page. If you do nothing to fix these 404 errors, the page will eventually be removed from Google's index and not be a problem.
-
3. /document_security.htm looks like it is being linked to from /services_storage_fast_retrieval.htm and /services_archive_storage.htm
I would recommend downloading and installing Screaming Frog; that is the program I was referencing in my response to #1, and it is how I found the issue in #3.
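If you want to see roughly what a crawler like this does under the hood, here is a small sketch using only the Python standard library. It is not how Screaming Frog itself works; it assumes you have already saved each page's HTML into a dict keyed by path, and it treats any link with a scheme or host (including absolute links to your own domain) as external. All names are illustrative.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin, urlparse


class LinkCollector(HTMLParser):
    """Collect the href of every <a> tag in one HTML document."""

    def __init__(self):
        super().__init__()
        self.hrefs = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.hrefs.append(value)


def build_link_map(pages):
    """Map each page path to the set of same-site paths it links to.

    pages -- dict of {path: html}; fetching the HTML is left out of this sketch.
    """
    link_map = {}
    for path, html in pages.items():
        collector = LinkCollector()
        collector.feed(html)
        targets = set()
        for href in collector.hrefs:
            resolved = urlparse(urljoin(path, href))
            # Skip anything with a scheme or host: external sites, mailto:, etc.
            if resolved.scheme or resolved.netloc or not resolved.path:
                continue
            targets.add(resolved.path)
        link_map[path] = targets
    return link_map


def broken_links(link_map):
    """Return {missing_target: [pages linking to it]} for targets that are not known pages."""
    missing = {}
    for page, targets in link_map.items():
        for target in targets:
            if target not in link_map:
                missing.setdefault(target, []).append(page)
    return {target: sorted(sources) for target, sources in missing.items()}
```

The output of broken_links is the same kind of "linked from" list that Webmaster Tools shows for each 404, but built from the HTML as it is today, so it can help tell a stale report apart from a link that is genuinely still on the page.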
Seer Interactive also wrote a great blog on all of the things this tool can do.
Hope this helps.
Mike
-