Screaming Frog Advice
-
Hi
I am trying to crawl my site with Screaming Frog and it keeps crashing.
My sysadmins keep upgrading the virtual box it sits on. It now has 8GB of memory, but it still crashes.
It gets to around 200k pages crawled and dies.
Any tips on how I can crawl my whole site? Can you use Screaming Frog to crawl just part of a site?
Thanks in advance for any tips.
Andy
-
Thanks. I tried all the tips on the Screaming Frog site, and I have now limited the crawl to 2 pages a second, so let's hope that works.
-
Hi Andy. There are quite a few settings you can adjust to reduce the load on the server while the crawl is running. They are listed with descriptions here: http://www.screamingfrog.co.uk/seo-spider/user-guide/configuration/
For example, by not checking Images, CSS, SWF, and JavaScript you can reduce the load substantially. If you'd like to crawl just a portion of the site, you can set it to not check links outside of the start folder.
For even more control over the crawl, you can use regular expressions to exclude pages or whole sections that match a given pattern. The page above is fairly thorough, so it should help you dial back the crawler to be friendlier to your server. Cheers!
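If it helps, here is a rough way to sanity-check exclude patterns before pasting them into the crawler. It is only a sketch in Python, not anything Screaming Frog itself runs: the example.com URLs, the pattern list, and the helper names are all made up for illustration. It also assumes each pattern has to match the whole URL, and whether the tool does full or partial matching may depend on the version, so check the user guide.

```python
import re

# Hypothetical exclude patterns. The domain and paths are invented for
# illustration; swap in sections of your own site.
EXCLUDE_PATTERNS = [
    r"https?://www\.example\.com/forum/.*",  # everything under /forum/
    r".*\?sessionid=.*",                     # any URL carrying a session parameter
    r".*\.pdf",                              # any URL ending in .pdf
]


def is_excluded(url, patterns=EXCLUDE_PATTERNS):
    """Return True if the whole URL matches at least one exclude pattern."""
    # Assumption: full-URL matching (re.fullmatch), not partial matching.
    return any(re.fullmatch(pattern, url) for pattern in patterns)


def filter_urls(urls, patterns=EXCLUDE_PATTERNS):
    """Return only the URLs that no exclude pattern matches."""
    return [url for url in urls if not is_excluded(url, patterns)]


if __name__ == "__main__":
    sample = [
        "http://www.example.com/products/widget",
        "http://www.example.com/forum/thread-1",
        "http://www.example.com/page?sessionid=abc",
        "http://www.example.com/docs/guide.pdf",
    ]
    for url in filter_urls(sample):
        print("would crawl:", url)
```

Running a handful of real URLs from the site through something like this makes it easier to spot a pattern that is too greedy (or not greedy enough) before starting a long crawl.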
-
Hey there mate,
Sorry to hear that you are having issues. You can actually tell Screaming Frog to use more RAM. If you haven't done that yet, please give it a go.
You can find more here: http://www.screamingfrog.co.uk/seo-spider/user-guide/general/
If you want to crawl just part of your site, it can certainly do that. You can exclude pages or whole sections.
Find more here: http://www.screamingfrog.co.uk/seo-spider/user-guide/configuration/
Hope this helps!