Not getting foreign characters in crawl diagnostics .csv
-
The crawl diagnostics .csv file shows garbled high-ASCII characters instead of the correct text for foreign-language websites, e.g. Vietnamese, Chinese (both kinds), etc. Is there a way to get this right?
-
Glad it helped! I think the issue might be with Excel more than Moz; its handling of UTF-8 CSVs has been terrible since day one. I think you can get the same result with Excel's import data function, but I never had much luck with it, and the Open Office trick seemed less painful.
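One more option that avoids a round trip through another application: Excel generally detects UTF-8 when the file starts with a byte-order mark (BOM). Below is a minimal Python sketch that re-saves a CSV with a BOM. It assumes the export is valid UTF-8 to begin with; the function name `add_utf8_bom` and the file names are made up for illustration.

```python
def add_utf8_bom(src, dst):
    """Re-save a UTF-8 CSV so that it starts with a byte-order mark."""
    # "utf-8-sig" accepts input with or without a BOM when reading
    # and always writes one. newline="" leaves line endings untouched.
    with open(src, encoding="utf-8-sig", newline="") as f:
        text = f.read()
    with open(dst, "w", encoding="utf-8-sig", newline="") as f:
        f.write(text)


if __name__ == "__main__":
    # Hypothetical file names; replace with your own export.
    add_utf8_bom("crawl_diagnostics.csv", "crawl_diagnostics_bom.csv")
```

If the characters are still garbled after this, the original file is probably not UTF-8, and this sketch will not help.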
-
Open Office did the trick! Thank you. Would be nice if the Moz app could do UTF-8 natively.
-
Hi Ash,
I had this problem too and here is how I solved it (there might be better ways).
If the characters are in the page titles, meta tags, etc., you can open the CSV file in Open Office, choose Save As and pick XLS. It will save an Excel file that you can then open in Excel, and the UTF-8 characters will read OK. This method works great for titles etc. but does not decode foreign characters in the URLs themselves.
If the characters are in the URL, one way I have found is to download this pretty awesome Excel add-on (the site is in German; I used Google Translate to figure out what was going on). You then get some new functions in Excel: create a second column next to the URL column, apply the URL decode function to the first column, and you get readable URLs in the second. This add-on saved me sooo much time and trouble! It works for Greek, which is what I need it for, and I assume it will work for Chinese too. Let me know if you need more detailed instructions; it took a bit of trial and error to figure out the exact moves needed to get the results you want.
Hope that helps!
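For anyone who would rather script the URL step than install an add-on, here is a rough Python sketch of the same idea: read the CSV, percent-decode the URL column, and write the result into a new column next to it. It assumes the file is UTF-8 and has a header row. The column name `URL`, the function name `decode_url_column` and the file names are placeholders, so check them against your own export.

```python
import csv
from urllib.parse import unquote


def decode_url_column(src, dst, column="URL"):
    """Copy a CSV, appending a column with the percent-decoded URLs."""
    with open(src, encoding="utf-8-sig", newline="") as fin, \
         open(dst, "w", encoding="utf-8-sig", newline="") as fout:
        reader = csv.reader(fin)
        writer = csv.writer(fout)
        header = next(reader)        # assumes the first row is a header
        idx = header.index(column)   # raises ValueError if the column is missing
        writer.writerow(header + [column + " (decoded)"])
        for row in reader:
            # unquote() turns %CE%B5 into the character it encodes (UTF-8 by default)
            writer.writerow(row + [unquote(row[idx])])


if __name__ == "__main__":
    # Hypothetical file names; replace with your own export.
    decode_url_column("crawl_diagnostics.csv", "crawl_diagnostics_decoded.csv")
```

The output is written with a UTF-8 byte-order mark, which Excel generally needs in order to display the decoded characters correctly.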