Web search query

A web search query is a query that a user enters into a web search engine to satisfy an information need. Web search queries are distinctive in that they are usually plain text, unstructured, and often ambiguous; they differ greatly from standard query languages, which are governed by strict syntax rules.

Types

There are three broad categories that cover most web search queries[1]:

  • Informational queries – Queries that cover a broad topic (e.g., colorado or trucks) for which there may be thousands of relevant results.
  • Navigational queries – Queries that seek a single website or web page of a single entity (e.g., youtube or delta air lines).
  • Transactional queries – Queries that reflect the intent of the user to perform a particular action, like purchasing a car or downloading a screen saver.

Search engines often support a fourth type of query that is used far less frequently:

  • Connectivity queries – Queries that report on the connectivity of the indexed web graph (e.g., Which links point to this URL?, and How many pages are indexed from this domain name?).

Characteristics

Most commercial web search engines do not disclose their search logs, so information about what users search for on the Web is difficult to come by.[2] Nevertheless, a 2001 study[3] that analyzed queries from the Excite search engine found several notable characteristics of web search:

  • The average length of a search query was 2.4 terms.
  • About half of the users entered a single query while a little less than a third of users entered three or more unique queries.
  • Close to half of the users examined only the first one or two pages of results (10 results per page).
  • Less than 5% of users used advanced search features (e.g., Boolean operators like AND, OR, and NOT).
  • The four most frequently used terms were the empty string (an empty search), "and", "of", and "sex".
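
Statistics like those above can be derived from a query log with a few lines of code. The following is a minimal sketch, assuming a log that is simply a list of query strings; the sample queries and the helper names (average_length, boolean_share, top_terms) are hypothetical and are not taken from the Excite study.

```python
from collections import Counter

# Hypothetical toy log: one query string per entry (not real Excite data).
queries = [
    "colorado",
    "used trucks",
    "cheap flights AND denver",
    "",
    "youtube",
    "history of the web",
]

BOOLEAN_OPERATORS = {"AND", "OR", "NOT"}


def average_length(log):
    """Mean number of whitespace-separated terms per query."""
    return sum(len(q.split()) for q in log) / len(log)


def boolean_share(log):
    """Fraction of queries containing an upper-case Boolean operator."""
    hits = sum(1 for q in log if BOOLEAN_OPERATORS & set(q.split()))
    return hits / len(log)


def top_terms(log, n):
    """The n most frequent lower-cased terms across the log."""
    counts = Counter(term.lower() for q in log for term in q.split())
    return counts.most_common(n)
```

Treating only upper-case AND, OR, and NOT as operators is an assumption of this sketch; a real analysis would need to follow the operator syntax of the search engine being studied.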

A study of the same Excite query logs revealed that 19% of the queries contained a geographic term (e.g., place names, zip codes, geographic features, etc.).[4]

A 2005 study of Yahoo's query logs revealed that 33% of the queries from the same user were repeat queries and that 87% of the time the user would click on the same result.[5] This suggests that many users use repeat queries to revisit or re-find information. The finding is consistent with a Bing search engine blog post stating that about 30% of queries are navigational.[6]
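
One way a repeat-query share could be measured is sketched below. It assumes a log of (user, query) pairs in chronological order and treats queries as identical after trimming and lower-casing; both the log format and the function name repeat_share are illustrative assumptions, not the method used in the cited study.

```python
def repeat_share(log):
    """Fraction of (user, query) entries where the user issued that query before."""
    seen = set()
    repeats = 0
    for user, query in log:
        key = (user, query.strip().lower())
        if key in seen:
            repeats += 1
        else:
            seen.add(key)
    return repeats / len(log) if log else 0.0


# Hypothetical sample log, oldest entry first.
sample_log = [
    ("u1", "bbc news"),
    ("u2", "bbc news"),
    ("u1", "BBC News"),
    ("u1", "weather"),
]
```

In the sample log only the third entry counts as a repeat, because the same query issued by a different user is not a repeat for that user.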

In addition, much research has shown that query term frequency distributions conform to a power law, or long-tail distribution. That is, a small portion of the terms observed in a large query log (e.g., more than 100 million queries) account for most of the usage, while the remaining terms are each used far less often.[7] This example of the Pareto principle (or 80-20 rule) allows search engines to employ optimization techniques such as index or database partitioning, caching, and pre-fetching.
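
The practical effect of such a skewed distribution can be illustrated by measuring how much of a log is covered by its most frequent terms. This is a minimal sketch under the assumption that the log is a list of query strings; head_coverage is a hypothetical helper, not a function from any search engine or library.

```python
from collections import Counter


def head_coverage(log, k):
    """Fraction of all term occurrences accounted for by the k most frequent terms."""
    counts = Counter(term for query in log for term in query.lower().split())
    total = sum(counts.values())
    if total == 0:
        return 0.0
    head = sum(count for _, count in counts.most_common(k))
    return head / total


# Hypothetical sample log.
sample_queries = ["weather", "weather today", "weather denver", "cheap flights"]
```

If a small k already yields a high coverage, caching or pre-fetching results for just those head terms would serve a large share of the traffic, which is the intuition behind the optimizations mentioned above.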

Structured queries

With search engines that support Boolean operators and parentheses, a technique traditionally used by librarians can be applied. A user who is looking for documents that cover several topics or facets may want to describe each of them by a disjunction of characteristic words, such as vehicles OR cars OR automobiles. A faceted query is a conjunction of such facets; e.g. a query such as (electronic OR computerized OR DRE) AND (voting OR elections OR election OR balloting OR electoral) is likely to find documents about electronic voting even if they omit one of the words "electronic" and "voting", or even both.[8]
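
The construction of a faceted query can be sketched as follows. The function names build_faceted_query and matches are hypothetical, and the whitespace-based matcher is a deliberately simple stand-in for a real search engine's Boolean evaluation.

```python
def build_faceted_query(facets):
    """Join each facet's terms with OR, then join the facets with AND."""
    groups = ["(" + " OR ".join(terms) + ")" for terms in facets]
    return " AND ".join(groups)


def matches(facets, document):
    """True if the document contains at least one term from every facet."""
    words = set(document.lower().split())
    return all(any(term.lower() in words for term in terms) for terms in facets)


facets = [
    ["electronic", "computerized", "DRE"],
    ["voting", "elections", "election", "balloting", "electoral"],
]
query = build_faceted_query(facets)
```

With these facets, a document such as "computerized balloting systems were audited" matches even though it contains neither "electronic" nor "voting", while "electronic music festival" does not match because no term from the second facet is present.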

References

  1. ^ Christopher D. Manning, Prabhakar Raghavan, and Hinrich Schütze (2007), Introduction to Information Retrieval, Ch. 19.
  2. ^ Dawn Kawamoto and Elinor Mills (2006), AOL apologizes for release of user search data
  3. ^ Amanda Spink, Dietmar Wolfram, Major B. J. Jansen, Tefko Saracevic (2001). "Searching the web: The public and their queries". Journal of the American Society for Information Science and Technology 52 (3): 226–234. doi:10.1002/1097-4571(2000)9999:9999<::AID-ASI1591>3.3.CO;2-I. 
  4. ^ Mark Sanderson and Janet Kohler (2004). "Analyzing geographic queries". Proceedings of the Workshop on Geographic Information (SIGIR '04). http://www.geo.unizh.ch/~rsp/gir/abstracts/sanderson.pdf. 
  5. ^ Jaime Teevan, Eytan Adar, Rosie Jones, Michael Potts (2005). "History repeats itself: Repeat Queries in Yahoo's query logs". Proceedings of the 29th Annual ACM Conference on Research and Development in Information Retrieval (SIGIR '06). pp. 703–704. doi:10.1145/1148170.1148326. http://www.csail.mit.edu/~teevan/work/publications/posters/sigir06.pdf. 
  6. ^ http://www.bing.com/community/site_blogs/b/search/archive/2011/02/10/making-search-yours.aspx
  7. ^ Ricardo Baeza-Yates (2005). Applications of Web Query Mining. 3408. Springer Berlin / Heidelberg. pp. 7–22. ISBN 978-3-540-25295-5. http://www.springerlink.com/content/kpphaktugag5mbv0/. 
  8. ^ Vojkan Mihajlović, Djoerd Hiemstra, Henk Ernst Blok, Peter M.G. Apers. "Exploiting Query Structure and Document Structure to Improve Document Retrieval Effectiveness". http://eprints.eemcs.utwente.nl/6918/01/TR-CTIT-06-57.pdf 

Wikimedia Foundation. 2010.
