
Andrew J. Walsh | Writer, Librarian

What Google Can’t Find: An Intro to the Deep Web

September 13, 2011 by Andrew Walsh

With its intuitive single search box and ever-improving algorithms, Google (or another search engine) makes it easy to rely on it entirely when finding information online. You simply type in a few words to instantly view the webpages identified as most relevant to that search.

The beauty of Google lies in its automation. No employee has to manually add new websites to the database; Google's web crawlers are programmed to process the code of vast numbers of webpages and follow every link they encounter, adding new pages to Google's index as they go.
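To make the idea concrete, here is a minimal sketch of how link-following crawling works, written in Python against a tiny in-memory "web" rather than the real internet (the URLs and pages are invented for illustration; a real crawler like Googlebot is vastly more sophisticated):

```python
from html.parser import HTMLParser
from urllib.parse import urljoin

class LinkExtractor(HTMLParser):
    """Collects the href of every <a> tag, resolved against a base URL."""
    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.links.append(urljoin(self.base_url, value))

def crawl(pages, start_url):
    """Breadth-first crawl over an in-memory {url: html} web.
    Returns the set of URLs the crawler managed to index."""
    indexed, frontier = set(), [start_url]
    while frontier:
        url = frontier.pop(0)
        if url in indexed or url not in pages:
            continue  # already seen, or the page doesn't resolve
        indexed.add(url)
        parser = LinkExtractor(url)
        parser.feed(pages[url])       # discover this page's outgoing links
        frontier.extend(parser.links)  # and queue them for crawling
    return indexed

# A tiny three-page web. Note that c.html has no links pointing to it,
# so the crawler never discovers it: it is effectively "deep web."
web = {
    "http://example.com/": '<a href="/a.html">A</a>',
    "http://example.com/a.html": '<a href="/">home</a>',
    "http://example.com/c.html": "<p>No links point here.</p>",
}
print(crawl(web, "http://example.com/"))
```

Notice that the unlinked page is simply never found, which previews one of the deep web categories discussed below.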

These web crawlers do a great job, at least for publicly accessible webpages, which are known collectively as the visible web. But it’s important to understand that there’s more out there. A lot more.

The Deep Web Explained

Anything on the web that cannot be found by search engine bots is part of something known as the “deep web” or “invisible web.”

The two biggest categories of the deep web are private resources (content that requires a login and password) and dynamic pages that are generated on the fly, usually in response to user input.

The first category should make sense: many content providers do not allow free access to their material but want it available online for those who pay. The login screen serves as a dead end for web crawlers, which cannot index any of the content that lives behind it.

Second, because the possible search terms in something like an online catalog or other database are effectively infinite, a search engine can't (and wouldn't want to) index every combination.
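A short sketch of what "generated on the fly" means: the results page below does not exist anywhere until someone submits a query, and a different query produces a different page. (The catalog titles and function name here are invented for illustration.)

```python
# Hypothetical catalog: a few records standing in for a library database.
CATALOG = [
    "Moby-Dick",
    "Moby-Dick in Popular Culture",
    "Oceanography Basics",
]

def search_results_page(query):
    """Build a results page on demand; it exists only for this request.

    A crawler would have to guess every possible query string to index
    every such page, which is why these pages stay in the deep web.
    """
    hits = [title for title in CATALOG if query.lower() in title.lower()]
    rows = "".join(f"<li>{title}</li>" for title in hits)
    return f"<h1>Results for '{query}'</h1><ul>{rows}</ul>"

print(search_results_page("moby"))
```

Since any string at all can be a query, the space of possible pages is unbounded even for this three-record catalog.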

Staggering Statistics: Deep Web vs. Visible Web

According to one study, back in 2006 Google had indexed 25 billion pages. In contrast, the deep web contained some 900 billion pages! That's an astonishing gap, and a strong reminder that although Google is becoming more and more powerful, it still has its limits. (Source: NYTimes Bits Tech Talk Podcast)

There are other types of deep web resources as well, including websites that have no links pointing to them, file formats that search engines can't handle, and sites whose owners have intentionally blocked crawler access for various reasons.
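That last case, intentional blocking, is usually done with a robots.txt file placed at a site's root, which asks well-behaved crawlers to stay out of particular paths. A minimal example (the directory name is illustrative):

```
# robots.txt at the root of a site.
# "*" applies the rule to all crawlers; the Disallow line asks
# them not to crawl anything under /private/.
User-agent: *
Disallow: /private/
```

Note that this is a voluntary convention, not an enforcement mechanism; major search engines honor it, but it doesn't actually lock anything.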

A Deep Web Example

A good example of the deep web is the body of resources available on a library website. When you search the online library catalog, for example, it queries a database and generates a list of results on the fly, producing pages that Google and other search engines cannot index.

But more importantly, when you search the site for an article you need to read, you pass through as an authenticated user of the library. Because the library has paid to provide the article for you, the full text is accessible. The URL in your browser bar might look ordinary enough, but an unaffiliated searcher who types the article title into Google will not be able to read it.

This setup often causes confusion: because the library tries to make the integration as seamless as possible, many users do not realize they are accessing a "deep web" subscription resource.

In an era of rising costs, with more and more knowledge hidden behind subscription logins, it's becoming increasingly clear that Google does not find everything, no matter how quickly it throws back a tidy list of results for any query.

When you’re thinking about finding information on the web today, remember that Google still can’t solve all your problems!


Filed Under: Technology Tagged With: Google, Search Engines
