# Distributed search engine

> Mediated Wiki article. Canonical URL: https://mediated.wiki/source/Distributed_search_engine
> Markdown URL: https://mediated.wiki/source/Distributed_search_engine.md
> Source: https://en.wikipedia.org/wiki/Distributed_search_engine
> Source revision: 1317792357
> License: Creative Commons Attribution-ShareAlike 4.0 International (https://creativecommons.org/licenses/by-sa/4.0/)

A **distributed search engine** is a [search engine](/source/Search_engine) with no central server. Unlike in traditional centralized search engines, work such as [crawling](/source/Web_crawler), [data mining](/source/Data_mining), indexing, and [query](/source/Web_search_query) processing is [distributed](/source/Distributed_computing) among several peers in a decentralized manner, with no single point of control.
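As a rough illustration of how indexing and query processing might be spread across peers, the sketch below assigns each search term to a peer by hashing it, so that no single node holds the whole index. This is only one possible arrangement, assumed here for illustration; the `Peer` and `Network` classes, the `peer_for` routing rule, and the example URLs are hypothetical and do not describe any of the engines discussed in this article.

```python
import hashlib
from collections import defaultdict


class Peer:
    """One node holding a slice of the global inverted index (hypothetical)."""

    def __init__(self, name):
        self.name = name
        self.postings = defaultdict(set)  # term -> set of document URLs


class Network:
    """Hypothetical peer network; each term is assigned to a peer by hash."""

    def __init__(self, peer_names):
        self.peers = [Peer(n) for n in peer_names]

    def peer_for(self, term):
        # A stable hash, so every participant routes a term to the same peer.
        digest = hashlib.sha256(term.encode("utf-8")).digest()
        return self.peers[int.from_bytes(digest[:8], "big") % len(self.peers)]

    def index(self, url, text):
        # Store the URL under each distinct term, on the peer owning that term.
        for term in set(text.lower().split()):
            self.peer_for(term).postings[term].add(url)

    def search(self, query):
        # Ask the responsible peer for each term, then intersect the results.
        terms = query.lower().split()
        if not terms:
            return set()
        results = [self.peer_for(t).postings.get(t, set()) for t in terms]
        return set.intersection(*results)


net = Network(["peer-a", "peer-b", "peer-c"])
net.index("https://example.org/p2p", "peer to peer search")
net.index("https://example.org/web", "web search engine")
print(sorted(net.search("search")))       # both example URLs
print(sorted(net.search("peer search")))  # only the p2p example URL
```

The modulo-based routing keeps the sketch short, but it reshuffles most terms whenever a peer joins or leaves; a real peer-to-peer network would need a scheme that tolerates such churn, as well as replication so that a departing peer does not take its slice of the index with it.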

## History

### Presearch

Main article: [Presearch (search engine)](/source/Presearch_(search_engine))

Started in 2017, [Presearch](/source/Presearch_(search_engine)) is a search engine that uses an [ERC20](/source/Ethereum#Applications) token (PRE) and is powered by a distributed network of community-operated nodes that aggregate results from a variety of sources. This network serves the searches at presearch.com. It is planned as a precursor to a design in which the nodes collaborate on a global decentralised index.[1] Presearch averages 5 million searches per day and has 2.2 million registered users. On September 1, 2021, Presearch was added as a default option to the search engine list on Android in the EU.[2] On May 27, 2022, Presearch officially transitioned from its Testnet to a Mainnet, meaning that all search traffic through the service now runs over Presearch's decentralized network of volunteer-run nodes.[3]

### [YaCy](/source/YaCy)

On December 15, 2003, Michael Christen announced development of a [P2P](/source/Peer-to-peer)-based search engine, eventually named [YaCy](/source/YaCy), on the [heise online](/source/Heise_online) forums.[4][5]

### [Seeks](/source/Seeks)

Seeks was an open-source web search proxy and collaborative distributed tool for web search. It ceased to have a usable release in 2016.

### InfraSearch

In April 2000, several programmers (including [Gene Kan](/source/Gene_Kan) and Steve Waterhouse) built a prototype [P2P](/source/Peer-to-peer) web search engine based on [Gnutella](/source/Gnutella) called [InfraSearch](/source/InfraSearch). It was meant to run inside the participating websites' databases, creating a [P2P](/source/Peer-to-peer) network that could be accessed through the InfraSearch website.[7][8][9] The technology was later acquired by Sun Microsystems and incorporated into the [JXTA](/source/JXTA) project.[6]

### OpenCOLA

On May 31, 2000, Steelbridge Inc. announced the development of OpenCOLA, a collaborative, distributed, open-source search engine.[10] It runs on the user's computer, crawls the web pages and links that the user puts in their opencola folder, and shares the resulting index over its [P2P](/source/Peer-to-peer) network.[11]

### Faroo

In February 2001, Wolf Garbe published the idea of a [peer-to-peer](/source/Peer-to-peer) search engine.[12] He started the Faroo prototype in 2004[13] and released it in 2005.[14][15]

## Goals

The goals of building a distributed search engine include:

1. to create an independent search engine powered by the community;

2. to make the search operation open and transparent by relying on open-source software;

3. to distribute the advertising revenue to node maintainers, which may help create more robust web infrastructure;

4. to allow researchers to contribute to the development of open-source, publicly maintainable ranking algorithms and to oversee the training of the algorithm parameters.

## Challenges

1. The amount of data to be processed is enormous. The size of the visible web is estimated at 5 PB, spread across roughly 10 billion pages.

2. The latency of the distributed operation must be competitive with that of commercial search engines.

3. A mechanism needs to be developed that prevents malicious users from corrupting the distributed data structures or the ranking.
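To give a feel for the scale in the first challenge, the following back-of-the-envelope sketch divides the quoted 5 PB by the quoted 10 billion pages, then estimates the raw storage each peer would carry if the data were split evenly. The decimal definition of a petabyte, the peer count of one million, and the replication factor of three are assumptions chosen only for illustration.

```python
# Figures quoted in the list above.
TOTAL_BYTES = 5 * 10**15   # 5 PB, assuming decimal petabytes
TOTAL_PAGES = 10 * 10**9   # 10 billion pages


def bytes_per_page():
    """Average page size implied by the two estimates."""
    return TOTAL_BYTES / TOTAL_PAGES


def bytes_per_peer(peer_count, replicas=1):
    """Raw storage per peer if the data were split evenly (hypothetical)."""
    return TOTAL_BYTES * replicas / peer_count


print(bytes_per_page())                               # 500000.0 bytes, i.e. 500 kB
print(bytes_per_peer(1_000_000, replicas=3) / 10**9)  # 15.0 GB per peer
```

Under these assumptions the average page is about 500 kB, and a million peers holding three copies of everything would each store about 15 GB of raw data, before any index structures are counted.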

## See also

- [List of search engines § P2P search engines](/source/List_of_search_engines#P2P_search_engines)

- [Distributed processing](/source/Distributed_processing)

## References

1. ["Presearch is a Decentralized Search Engine"](https://www.presearch.io/).

2. ["Google Adds Presearch As A Default Option on Android Devices in EU"](https://www.searchenginejournal.com/google-adds-presearch-as-a-default-option-on-android-in-eu/418184/). *Search Engine Journal*. 2021-09-01. Retrieved 2021-11-10.

3. Kan, Michael (2022-05-26). ["The Next Google? Decentralized Search Engine 'Presearch' Exits Testing Phase"](https://www.pcmag.com/news/the-next-google-decentralized-search-engine-presearch-exits-testing-phase). *PC Magazine*.

4. ["YaCy: News"](https://web.archive.org/web/20051124084140/http://www.yacy.net/yacy/News.html). Archived from [the original](http://www.yacy.net/yacy/News.html) on 2005-11-24.

5. Michael Christen. ["Ich entwickle eine P2P-basierende Suchmaschine. Wer macht mit?"](http://www.heise.de/newsticker/foren/S-Ich-entwickle-eine-P2P-basierende-Suchmaschine-Wer-macht-mit/forum-50682/msg-4744034/read/) (in German). [heise online](/source/Heise_online).

6. Justin Hibbard. ["Can peer-to-peer grow up?"](http://www.redherring.com/Home/9528). [Red Herring](/source/Red_Herring_(magazine)). *(permanent dead link)*

7. Simon Foust. ["Move Over Yahoo, Here Comes InfraSearch"](https://web.archive.org/web/20001013141235/http://www.dmusic.com/news/news.php?id=2614). *Dmusic*. Archived from [the original](http://www.dmusic.com/news/news.php?id=2614) on 2000-10-13.

8. Sean M. Dugan. ["Peer-to-peer networking is poised to revolutionize the Internet once again"](https://web.archive.org/web/20001018022633/http://www.infoworld.com/articles/op/xml/00/07/17/000717opprophet.xml). *[InfoWorld](/source/InfoWorld)*. Archived from [the original](http://www.infoworld.com/articles/op/xml/00/07/17/000717opprophet.xml) on 2000-10-18.

9. John Borland. ["Napster-like technology takes Web search to new level"](https://news.cnet.com/2100-1023-241223.html). [Cnet](/source/Cnet).

10. [David Akin](/source/David_Akin). ["Software launched with a little pop"](https://nationalpost.com/financialpost.asp?f=000531/303636.html/17/000717opprophet.xml). *[Financial Post](/source/Financial_Post)*. *(dead link)*

11. Paul Heltzel. ["OpenCola-Have Some Code and a Smile"](http://www.techreview.com/web/12360/?a=f). *[Technology Review](/source/Technology_Review)*.

12. Wolf Garbe. ["BINGOOO - Die Transformation des World Wide Web zur virtuellen Datenbank"](https://web.archive.org/web/20140202093532/http://www.pubzone.org/dblp/journals/wi/Garbe01) (in German). Wirtschaftinformatik. Archived from [the original](http://www.pubzone.org/dblp/journals/wi/Garbe01) on 2014-02-02. Retrieved 2010-12-21. ... Wir setzen dem das Konzept einer verteilten Peer-to-Peer-Suchmaschine entgegen [We counter with the concept of a distributed peer-to-peer search engine] ...

13. Bernard Lunn. ["Technical Q&A With FAROO Founder"](https://web.archive.org/web/20110214194656/http://www.readwriteweb.com/start/2009/12/technical-qa-with-faroo-founder.php). [ReadWriteWeb](/source/ReadWriteWeb). Archived from [the original](http://www.readwriteweb.com/start/2009/12/technical-qa-with-faroo-founder.php) on 2011-02-14. ... When I started to work on the first prototype in 2004 ...

14. ["FAROO: History"](https://web.archive.org/web/20080322000927/http://www.faroo.com/english/download/history.html). Archived from [the original](http://www.faroo.com/english/download/history.html) on 2008-03-22.

15. ["Revisited: Deriving crawler start points from visited pages by monitoring HTTP traffic"](http://blog.faroo.com/2010/01/03/revisited-deriving-crawler-start-points-from-visited-pages-by-monitoring-http-traffic/). Faroo.

---
Adapted from the Wikipedia article [Distributed search engine](https://en.wikipedia.org/wiki/Distributed_search_engine) by Wikipedia contributors ([contributor history](https://en.wikipedia.org/wiki/Distributed_search_engine?action=history)). Available under [Creative Commons Attribution-ShareAlike 4.0 International](https://creativecommons.org/licenses/by-sa/4.0/). Changes may have been made.
