How long does it take for customized Google Site Search to show results from pdf files?
-
The site in question is http://www.ejmh.eu
I am pretty unsatisfied with the results I am getting from the Site Search provided by Google.
We have over 160 pdf files in this subfolder: http://www.ejmh.eu/mellekletek
The files are the digital versions of articles. When I search for content in those PDF files, Google does not show results. It does show results from older pages, dating back 1-2 years, but it shows nothing from the PDF files that I put up 3 weeks ago.
My questions:
If I place a Google Search on a site, does it not automatically display results from ALL the content in the root domain?
Is there any correlation between how the Site Search is indexing the files and how Google is indexing the urls in general?
Should I just wait and see whether site search performance improves or should I switch to another Search software like Zoom Search?
It is vital to have a proper, high-quality search functioning on that site in the very near future.
What are your experiences? Any tips are greatly appreciated.
-
Hi, everyone: problem solved.
Here is what I did: I created a separate sitemap XML file and listed all the new PDFs in it.
I updated the general sitemap.xml and linked to the new sitemap as well.
I (re)submitted both sitemaps via the Webmaster Tools.
Within a few hours, most of the PDFs got indexed and the overall quality of search improved dramatically. Thanks for all your help.
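For anyone who wants to script the sitemap step, here is a minimal sketch in Python. It assumes the standard sitemaps.org format, where a separate file lists the PDF URLs and a sitemap index points to each sitemap file. The PDF file names, the `sitemap-pdf.xml` name, and the function names are hypothetical placeholders, not the files actually used on the site.

```python
from xml.etree import ElementTree as ET

# Namespace defined by the sitemaps.org protocol.
SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"
XML_DECLARATION = '<?xml version="1.0" encoding="UTF-8"?>\n'


def build_urlset(urls):
    """Return a sitemap (as a string) with one <url><loc> entry per URL."""
    urlset = ET.Element("urlset", xmlns=SITEMAP_NS)
    for url in urls:
        ET.SubElement(ET.SubElement(urlset, "url"), "loc").text = url
    return XML_DECLARATION + ET.tostring(urlset, encoding="unicode")


def build_sitemap_index(sitemap_urls):
    """Return a sitemap index (as a string) pointing to each sitemap file."""
    index = ET.Element("sitemapindex", xmlns=SITEMAP_NS)
    for url in sitemap_urls:
        ET.SubElement(ET.SubElement(index, "sitemap"), "loc").text = url
    return XML_DECLARATION + ET.tostring(index, encoding="unicode")


# Hypothetical file names, for illustration only.
pdf_urls = [
    "http://www.ejmh.eu/mellekletek/example-article-1.pdf",
    "http://www.ejmh.eu/mellekletek/example-article-2.pdf",
]
print(build_urlset(pdf_urls))
print(build_sitemap_index([
    "http://www.ejmh.eu/sitemap.xml",      # assumed location of the general sitemap
    "http://www.ejmh.eu/sitemap-pdf.xml",  # hypothetical name for the PDF sitemap
]))
```

The output of `build_urlset` would be saved as the PDF sitemap, and both files submitted through Webmaster Tools as described above.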
-
It may be a good idea to include all the pdf files on the sitemap, even if it is a troublesome process.
Otherwise it just takes too long for Google to index them.
What still surprises me is that even for a site search, you need to win the 'indexing battle'. I thought that Google indexes everything within the map for the 'sake of the site search' and displays the results when a visitor is searching within the site. Less fancy software actually does the job. I thought Google Site Search would provide something even better.
-
Last crawl - thanks, great info.
Yes, all new PDFs are linked from the HTML files.
This is the summary page of one article: http://www.ejmh.eu/5archives_ppr_jaggle_061.html
In the middle of the page, you see 'download full text' - this is from where the individual papers (pdf) are linked.
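To double-check that every summary page really exposes its PDF through a plain link a crawler can follow, a rough sketch like the one below could be run over the saved HTML of each page. The function and class names are made up for this example, and the sample HTML is an invented stand-in for a real summary page.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin, urlparse


class PdfLinkCollector(HTMLParser):
    """Collect absolute URLs of <a href> links whose path ends in .pdf."""

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.pdf_links = []

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        href = dict(attrs).get("href")
        if href and urlparse(href).path.lower().endswith(".pdf"):
            # Resolve relative links against the page's own URL.
            self.pdf_links.append(urljoin(self.base_url, href))


def find_pdf_links(html, base_url):
    """Return the PDF links found in an HTML document, in page order."""
    collector = PdfLinkCollector(base_url)
    collector.feed(html)
    return collector.pdf_links


# Invented sample markup; the real page would be downloaded or read from disk.
sample_html = '<p><a href="mellekletek/example-article-1.pdf">download full text</a></p>'
print(find_pdf_links(sample_html, "http://www.ejmh.eu/5archives_ppr_jaggle_061.html"))
```

A page that returns an empty list here would be one where the PDF is reachable only through a script or form, which a crawler may not follow.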
-
Do you have the new PDFs linked from pages, like the old ones?
Try creating a page that lists all the new PDFs. Google might take time to recrawl your site and add these new PDFs (by the way, the last copy saved in Google Cache is from Feb 11).
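A listing page like that can be generated rather than written by hand. This is only a sketch: the page title, the function name, and the PDF file names are hypothetical.

```python
from html import escape
from posixpath import basename
from urllib.parse import urlparse


def build_listing_page(pdf_urls, title="Full-text articles (PDF)"):
    """Return a simple HTML page with one plain link per PDF URL."""
    items = "\n".join(
        '    <li><a href="{href}">{label}</a></li>'.format(
            href=escape(url, quote=True),
            label=escape(basename(urlparse(url).path)),
        )
        for url in pdf_urls
    )
    return (
        "<!DOCTYPE html>\n<html>\n"
        "<head><title>{t}</title></head>\n"
        "<body>\n  <h1>{t}</h1>\n  <ul>\n{items}\n  </ul>\n</body>\n</html>\n"
    ).format(t=escape(title), items=items)


# Hypothetical file names, for illustration only.
print(build_listing_page([
    "http://www.ejmh.eu/mellekletek/example-article-1.pdf",
    "http://www.ejmh.eu/mellekletek/example-article-2.pdf",
]))
```

Linking the generated page from the home page or the archive page gives the crawler a single place to discover all the new files.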
-
You are great, thanks for your time. Yeah, I did check things out with this Google command: there are PDFs listed, but these are all old PDFs I put up a long time ago. None of the PDFs I have put up recently are among those indexed.
Do you think that a customized site search only returns URLs that are already indexed by Google? Does Google not crawl the site and make a list of URLs purely for the sake of the site search? (Zoom Search does it, for example.) In theory, there could be two different types of 'crawls': one for the site search and one for the larger world, searching in the browser.
As for the settings... can you please help me further: what exactly would you change?
-
If you check here, all the PDFs are indexed in Google,
so I will check the settings on CSE.
Reference here: http://www.google.com/cse/docs/resultsxml.html#wsQueryTerms
-
Thanks for the tip, it's a good one. But they are all 100% text.
-
If a search engine cannot read the text because it is stored as a graphic rather than as text, it won't be able to fully index the words in the document.
So make sure all your PDFs are 100% text that was converted to PDF, and not a "scan" (image) of the original document that was saved as a PDF.
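One rough way to spot scanned PDFs in bulk is to try extracting text from each file and flag those that yield almost nothing. The sketch below assumes the third-party `pypdf` package is installed for the extraction step. The 20-characters-per-page threshold and the function names are arbitrary choices for illustration, and a PDF with very little text (a cover page, say) could be flagged wrongly.

```python
def looks_scanned(page_texts, min_chars_per_page=20):
    """Return True when the pages contain too little extractable text.

    page_texts is a list with the text extracted from each page.
    The threshold is an arbitrary assumption, not an established rule.
    """
    if not page_texts:
        return True
    total_chars = sum(len(text.strip()) for text in page_texts)
    return total_chars / len(page_texts) < min_chars_per_page


def extract_page_texts(pdf_path):
    """Extract the text of each page using pypdf (third-party package)."""
    from pypdf import PdfReader  # imported here so looks_scanned works without it

    reader = PdfReader(pdf_path)
    return [page.extract_text() or "" for page in reader.pages]


# Usage, assuming a local copy of one of the PDFs (hypothetical file name):
# print(looks_scanned(extract_page_texts("example-article-1.pdf")))
```

Files flagged this way would need OCR or re-export from the original text document before a search engine could index their content.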