How do I diagnose a site that has had a corrupted database restored?
-
Here's the situation:
Downloaded a backup of a full database from cPanel, because we needed to perform some queries on it.
Found out that after restoring it, cPanel had not been able to zip up the full database because the server memory was set so low (some posts weren't showing up after the restore).
So, how would I go about determining exactly what content is missing from the site? What search engine queries would you perform? Is there a plugin I could use to establish the inconsistencies between the database content and the search results?
Your help is appreciated in advance!
-
Thanks a lot, Ryan. This response was really helpful.
-
Prior to SEO, my time was spent as a Microsoft Database Administrator. Your site is almost certainly using a MySQL database, which uses a different SQL dialect than Microsoft's T-SQL, but the comparison likely holds.
Whenever I set up a MS SQL database, a task would be scheduled to automatically shrink the database. There are temp tables which are added and removed, files within the database which can have unused pages removed, etc. A database size reduction of 5% does not indicate to me any data loss.
At a high level, you need to compare the "corrupted" database with the one you restored on a file or table level to determine any differences. That is the only reasonable way to achieve your goal. The work should ideally be performed by a professional who is highly experienced in MySQL.
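As a rough illustration of that table-level comparison, here is a minimal Python sketch. It assumes you have already pulled per-table row counts and the post IDs out of both copies of the database yourself, for example with `SELECT COUNT(*) FROM wp_posts` and `SELECT ID FROM wp_posts` run against each copy. The `wp_` prefix is only the WordPress default and may differ on your install. The function names and the numbers below are made up for the example.

```python
from typing import Dict, Iterable, List, Tuple


def compare_row_counts(
    old: Dict[str, int], new: Dict[str, int]
) -> List[Tuple[str, int, int]]:
    """Return (table, old_count, new_count) for tables that shrank or vanished."""
    shrunk = []
    for table, old_count in sorted(old.items()):
        # A table missing from the newer copy is treated as having zero rows.
        new_count = new.get(table, 0)
        if new_count < old_count:
            shrunk.append((table, old_count, new_count))
    return shrunk


def missing_ids(old_ids: Iterable[int], new_ids: Iterable[int]) -> List[int]:
    """Return IDs present in the older copy but absent from the newer one."""
    return sorted(set(old_ids) - set(new_ids))


if __name__ == "__main__":
    # Hypothetical numbers, for illustration only.
    old_counts = {"wp_posts": 1200, "wp_comments": 340, "wp_options": 410}
    new_counts = {"wp_posts": 1100, "wp_comments": 352, "wp_options": 410}

    for table, before, after in compare_row_counts(old_counts, new_counts):
        print(f"{table}: {before} -> {after}")

    # Hypothetical post IDs exported from each copy.
    print(missing_ids([1, 2, 3, 5, 8], [1, 2, 8, 13]))
```

A table that shrank is only a place to start looking. Rows deleted on purpose between the two backups will show up in the same way, so each missing ID still needs to be checked by hand.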
If you desire further assistance, I recommend pursuing the topic on a MySQL forum as this does not really pertain to SEO nor even WordPress.
-
Hi Ryan,
Apologies for not mentioning the software. The website is built on WordPress.
Here's a bit of extra information regarding the issue: after seeing that the database was corrupted, we compared the file size of the most recent backup with that of a backup taken three days earlier. The most recent backup was around 5% smaller than the earlier one; if anything, it should be larger.
Regarding timestamps, the latest posts are there, and the comments seem to be there, so determining what exactly IS missing is something I'm going to need help with.
I can't restore the earlier version of the database either, because important data has been added since then.
The site works fine, for now. I'm just worried somewhere down the road we're going to find that there are 100 posts missing that are now turning up as 404 pages and lost links.
Does that make sense? Thanks for your help.
-
Based on your inquiry, it seems likely you are using specific software to run your site. It is unclear what type of software is being used, which is a critical factor. It could be a CMS such as WordPress, a shopping cart such as ZenCart, a forum such as vBulletin, etc.
You would likely receive the fastest and most accurate response by using the support site of the specific software in use.
Based on your questions, you are in far over your head and should ideally step aside and find a programmer who can resolve the issue. With that said, I'll try to answer your questions.
"how would I go about determining exactly what content is missing from the site?"
In order to determine what is missing, you need a baseline. You need to understand the site's function and activity. For example, if you are running an ecommerce site, what is the timestamp of the last order placed on the site?
"What search engine queries would you perform?"
None. You have an onsite issue. That is where your attention needs to be focused.
"Is there a plugin I could use to establish the inconsistencies between the database content and the search results?"
No. The search results should not even be a consideration. Search engines may choose to index or not index your content based on numerous factors including the robots.txt file, the meta tags on each page, the content on each page and so forth. Asking this question indicates you are grasping at straws. If your site is important to you, hire a professional developer to fix the problem. If the site is not of great importance (i.e. it does not generate revenue) then you can visit the site of the software in use and spend a day or two reading various articles, forums and such, then making various setting changes in an attempt to restore the site.
Another option: contact the web host and ask them to restore a full backup of the entire site. This option would likely be best, but you would lose all data added after the backup was taken.