{{Short description|AI startup}}
{{Lowercase title}}
{{Infobox company
| name = deepset, makers of Haystack
| logo = deepset.svg
| type = [[Private company|Private]]
| industry = [[AI]]
| founded = {{Start date and age|2018|06|22}}
| founders = {{hlist|Milos Rusic|Malte Pietsch}}
| location_city = [[Berlin]]
| location_country = [[Germany]]
| products = Haystack (Open Source), Haystack Enterprise Platform
| num_employees = > 70
| homepage = {{URL|https://www.deepset.ai/}}
}}
'''deepset''' is an enterprise software vendor that provides developers with tools to build production-ready artificial intelligence (AI) and [[natural language processing|natural language processing (NLP)]] systems, using architectures such as agents, retrieval-augmented generation (RAG) and multimodal AI. It was founded in 2018 in [[Berlin]] by Milos Rusic, Malte Pietsch, and Timo Möller.<ref name=":0">{{Cite news |last=Wiggers |first=Kyle |date=April 28, 2022 |title=Deepset raises $14M to help companies build NLP apps |work=[[TechCrunch]] |url=https://techcrunch.com/2022/04/28/deepset-raises-14m-to-help-companies-build-nlp-apps/ |access-date=August 31, 2022}}</ref> deepset develops and maintains the [[open-source software|open source software]] Haystack<ref name=":1" /> and offers Haystack Enterprise Platform (formerly known as deepset Cloud and deepset AI Platform), a commercial product available as [[SaaS]] or self-hosted (VPC, on-premises, or air-gapped).<ref name=":2" />
== History ==
In June 2018, Milos Rusic, Malte Pietsch, and Timo Möller co-founded deepset in [[Berlin]], [[Germany]].<ref name=":0"/> In the same year, the company served its first customers, who wanted to implement [[natural language processing|NLP]] services by tailoring [[BERT (language model)|BERT]] language models to their domain.
In July 2019, the company released the initial version of the [[open-source software|open source software]] FARM.<ref name=":3">{{Cite web |title=deepset-ai/FARM |url=https://github.com/deepset-ai/FARM |access-date=August 31, 2022 |website=[[GitHub]]}}</ref>
In November 2019, the company released the initial version of the [[open-source software|open source software]] Haystack.<ref name=":1">{{Cite web |title=deepset-ai/haystack |url=https://github.com/deepset-ai/haystack |access-date=August 31, 2022 |website=[[GitHub]]}}</ref>
Throughout 2020 and 2021, deepset published several applied research papers at [[Empirical Methods in Natural Language Processing|EMNLP]], [[International Committee on Computational Linguistics|COLING]] and [[Association for Computational Linguistics|ACL]], leading conferences in [[natural language processing|NLP]]. In 2020, the research contributions comprised German language models named GBERT and GELECTRA,<ref>{{Cite book |last1=Chan |first1=Branden |last2=Schweter |first2=Stefan |last3=Möller |first3=Timo |title=Proceedings of the 28th International Conference on Computational Linguistics |chapter=German's Next Language Model |date=2020 |chapter-url=https://www.aclweb.org/anthology/2020.coling-main.598 |language=en |location=Barcelona, Spain (Online) |publisher=International Committee on Computational Linguistics |pages=6788–6796 |doi=10.18653/v1/2020.coling-main.598|doi-access=free }}</ref> and COVID-QA, a [[question answering]] dataset addressing the [[COVID-19 pandemic]] that was created in collaboration with [[Intel]] and annotated by biomedical experts.<ref>{{Cite journal |last1=Möller |first1=Timo |last2=Reina |first2=Anthony |last3=Jayakumar |first3=Raghavan |last4=Pietsch |first4=Malte |date=2020-07-09 |title=COVID-QA: A Question Answering Dataset for COVID-19 |url=https://aclanthology.org/2020.nlpcovid19-acl.18 |journal=Proceedings of the 1st Workshop on NLP for COVID-19 at ACL 2020 |location=Online |publisher=Association for Computational Linguistics}}</ref>
In 2021, the research contributions comprised German models and datasets for [[question answering]] and [[Document retrieval|passage retrieval]] named GermanQuAD and GermanDPR,<ref>{{Cite journal |last1=Möller |first1=Timo |last2=Risch |first2=Julian |last3=Pietsch |first3=Malte |date=2021 |title=GermanQuAD and GermanDPR: Improving Non-English Question Answering and Passage Retrieval |url=https://aclanthology.org/2021.mrqa-1.4 |journal=Proceedings of the 3rd Workshop on Machine Reading for Question Answering |language=en |location=Punta Cana, Dominican Republic |publisher=Association for Computational Linguistics |pages=42–50 |doi=10.18653/v1/2021.mrqa-1.4|doi-access=free |arxiv=2104.12741 }}</ref> a semantic answer [[Similarity measure|similarity metric]],<ref>{{Cite journal |last1=Risch |first1=Julian |last2=Möller |first2=Timo |last3=Gutsch |first3=Julian |last4=Pietsch |first4=Malte |date=2021 |title=Semantic Answer Similarity for Evaluating Question Answering Models |url=https://aclanthology.org/2021.mrqa-1.15 |journal=Proceedings of the 3rd Workshop on Machine Reading for Question Answering |language=en |location=Punta Cana, Dominican Republic |publisher=Association for Computational Linguistics |pages=149–157 |doi=10.18653/v1/2021.mrqa-1.15|doi-access=free |arxiv=2108.06130 }}</ref> and an approach for multimodal retrieval of texts and tables to enable question answering on tabular data.<ref>{{Cite journal |last1=Kostić |first1=Bogdan |last2=Risch |first2=Julian |last3=Möller |first3=Timo |date=2021 |title=Multi-modal Retrieval of Tables and Texts Using Tri-encoder Models |url=https://aclanthology.org/2021.mrqa-1.8 |journal=Proceedings of the 3rd Workshop on Machine Reading for Question Answering |language=en |location=Punta Cana, Dominican Republic |publisher=Association for Computational Linguistics |pages=82–91 |doi=10.18653/v1/2021.mrqa-1.8|doi-access=free |arxiv=2108.04049 }}</ref> Haystack contains implementations of all three contributions, enabling 
the use of the research through the open source framework.
In November 2021, the development of the FARM framework was discontinued and its main features were integrated into the Haystack framework.<ref name=":3" />
In April 2022, the company announced its commercial [[SaaS]] offering deepset Cloud,<ref name=":2">{{Cite web |title=deepset Cloud |url=https://www.deepset.ai/deepset-cloud |access-date=August 31, 2022 |website=deepset}}</ref> which was rebranded in 2025 as Haystack Enterprise Platform and supports both SaaS and on-premises deployment.
As of August 2023, the most popular fine-tuned language model created by deepset had been downloaded more than 52 million times.<ref>{{Cite web |title=deepset/roberta-base-squad2 · Hugging Face |url=https://huggingface.co/deepset/roberta-base-squad2 |access-date=October 12, 2022 |website=huggingface.co}}</ref>
In 2024, deepset was named a Gartner Cool Vendor in AI Engineering.<ref>{{Cite web |title=deepset {{!}} deepset Recognized in 2024 Gartner® Cool Vendors in AI Engineering |url=https://www.deepset.ai/news/deepset-2024-gartner-cool-vendors-ai-engineering |access-date=2025-09-23 |website=www.deepset.ai |language=en}}</ref>
In 2025, deepset was recognized for its growth by WirtschaftsWoche<ref>{{Cite web |title=WirtschaftsWoche |url=https://www.wiwo.de/technologie/digitale-welt/kuenstliche-intelligenz-das-sind-die-vielversprechendsten-ki-start-ups-in-deutschland/100151531.html |access-date=2025-09-23 |website=www.wiwo.de}}</ref> and Sifted.<ref>{{Cite web |title=deepset {{!}} deepset Rises on Sifted's 2025 “Rising 100” for B2B SaaS |url=https://www.deepset.ai/news/deepset-2025-b2b-saas-rising-100 |access-date=2025-09-23 |website=www.deepset.ai |language=en}}</ref> In the same year, it announced partnerships and integrations with Meta Llama Stack,<ref>{{Cite web |title=deepset {{!}} deepset Accelerates Enterprise Adoption of Domain-Specific Sovereign AI with Llama Stack |url=https://www.deepset.ai/news/deepset-sovereign-ai-integration-meta-llama-stack |access-date=2025-09-23 |website=www.deepset.ai |language=en}}</ref> MongoDB,<ref>{{Cite web |title=MongoDB and deepset Pave the Way for Effortless AI App Creation {{!}} MongoDB Blog |url=https://www.mongodb.com/company/blog/innovation/mongodb-deepset-pave-way-for-effortless-ai-app-creation |access-date=2025-09-23 |website=MongoDB |language=en-us}}</ref> NVIDIA,<ref>{{Cite web |title=deepset {{!}} deepset Brings Custom AI Agent Orchestration to NVIDIA Enterprise AI Factory |url=https://www.deepset.ai/news/deepset-custom-ai-agent-orchestration-nvidia-enterprise-ai-factory |access-date=2025-09-23 |website=www.deepset.ai |language=en}}</ref> Amazon Web Services (AWS),<ref>{{Cite web |title=deepset {{!}} deepset Signs Strategic Collaboration Agreement with AWS to Deliver Custom Generative AI Solutions at Scale |url=https://www.deepset.ai/news/deepset-aws-partnership-strategic-collaboration-agreement |access-date=2025-09-23 |website=www.deepset.ai |language=en}}</ref> and PwC.<ref>{{Cite web |title=deepset {{!}} deepset and PwC Announce Partnership to Accelerate Gen AI Adoption through Custom Agents and Apps |url=https://www.deepset.ai/news/deepset-and-pwc-partnership |access-date=2025-09-23 |website=www.deepset.ai |language=en}}</ref>
As of September 2025, the Haystack open source AI orchestration framework has more than 24,000 GitHub stars.<ref>{{Citation |last=Pietsch |first=Malte |title=Haystack: the end-to-end NLP framework for pragmatic builders |date=November 2019 |url=https://github.com/deepset-ai/haystack |access-date=2025-09-23 |last2=Möller |first2=Timo |last3=Kostic |first3=Bogdan |last4=Risch |first4=Julian |last5=Pippi |first5=Massimiliano |last6=Jobanputra |first6=Mayank |last7=Zanzottera |first7=Sara |last8=Cerza |first8=Silvano |last9=Blagojevic |first9=Vladimir}}</ref>
== Products and applications ==
Haystack is an open source [[Python (programming language)|Python]] AI orchestration framework for building custom AI agents and applications with [[large language model]]s. Software developers and AI engineers combine its modular components into pipelines that operate over large document and multimodal data collections. Supported architectures and tasks include agents, retrieval-augmented generation (RAG), intelligent document processing (IDP), text-to-SQL, [[document retrieval]], [[semantic search]], [[Natural language generation|text generation]], [[question answering]], and [[Automatic summarization|summarization]].
Haystack emphasizes ''[https://www.deepset.ai/blog/context-engineering-the-next-frontier-beyond-prompt-engineering context engineering]'', an approach to AI system design that focuses on explicit control over how contextual information is retrieved, structured, routed to language models, and evaluated after generation. This allows developers to build AI systems with transparent data flow, tool usage, and configurable reasoning processes.
Haystack integrates with more than 90 model and technology providers, including [[Hugging Face|Hugging Face Transformers]], [[Elasticsearch]], [[OpenSearch (specification)|OpenSearch]], [[OpenAI]], [[Cohere]], [[Anthropic]], and [[Mistral AI|Mistral]]. Developers can extend these integrations with their own [https://docs.cloud.deepset.ai/docs/custom-components custom components]. The [[Software framework|framework]] has an active community on [[Discord]], with more than 4,000 members, and on [[GitHub]], where more than 300 people have contributed to its development;<ref>{{Cite web |title=Contributors to deepset-ai/haystack |url=https://github.com/deepset-ai/haystack |access-date=August 31, 2022 |website=[[GitHub]]|language=en}}</ref> community members also meet through [[Meetup]].<ref>{{Cite web |title=Open NLP Group |url=https://www.meetup.com/open-nlp-meetup/ |access-date=August 31, 2022 |website=[[Meetup]] |language=en}}</ref> Thousands of organizations use the framework, including public sector bodies such as the [[European Commission]] and companies and organizations such as [[Airbus]], [[Intel]], [[Nvidia|NVIDIA]], [[Lufthansa]], [[Netflix]], [[Apple Inc.|Apple]], [[Infineon]], [[Alcatel-Lucent Enterprise]], [[BetterUp]], Etalab, Sooth.ai, and [[Lego]].<ref>{{Cite news |last=Laughlin |first=Eleni |date=April 28, 2022 |title=deepset Raises $14 Million Series A Led By GV for Advanced NLP Platform |work=[[Business Wire]] |url=https://www.businesswire.com/news/home/20220428005187/en/ |access-date=August 31, 2022}}</ref><ref>{{Cite web |title=Who uses Haystack |url=https://github.com/deepset-ai/haystack#who-uses-haystack |access-date=August 31, 2022 |website=[[GitHub]]|language=en}}</ref>
In addition to the Haystack open source framework, deepset sells two enterprise products.
Haystack Enterprise Starter provides enterprise support for the open source framework from the Haystack engineering team, as well as access to a private GitHub repository with production use case templates and Kubernetes deployment guides.<ref>{{Cite web |title=Announcing Haystack Enterprise: Best Practices and Support |url=https://haystack.deepset.ai/blog/announcing-haystack-enterprise/ |access-date=2025-09-23 |website=Haystack |language=en}}</ref>
The Haystack Enterprise Platform supports customers in building scalable AI applications by covering the entire process of prototyping, experimentation, deployment, monitoring, and governance.<ref>{{Cite web |title=deepset Cloud |url=https://venturebeat.com/ai/open-source-nlp-company-deepset-nabs-14m-to-power-plain-english-enterprise-search/ |access-date=November 1, 2022 |website=[[VentureBeat]] |date=28 April 2022 |language=en}}</ref> It is built on the Haystack open source framework and can be hosted in the cloud or self-hosted in VPC, on-premises, or air-gapped environments. deepset's enterprise tools are used by organizations including the European Commission, The Economist, Oxford University Press, the German Federal Ministry of Research, Technology, and Space (BMFTR), Manz Verlag, and the German Armed Forces.<ref>{{Cite web |title=Gen AI Case Studies in Government, Banking, Business & More |url=https://www.deepset.ai/case-studies |access-date=2025-09-23 |website=www.deepset.ai |language=en}}</ref>
FARM was an earlier [[Software framework|framework]] for adapting representation models.<ref name=":3" /> One of its core concepts was the adaptive model, which combined a language model with an arbitrary number of prediction heads. FARM supported domain adaptation and fine-tuning of these models with advanced options such as gradient accumulation, [[Cross-validation (statistics)|cross-validation]] and [[Mixed-precision arithmetic|automatic mixed-precision training]]. Its main features were integrated into Haystack in November 2021, and its development was discontinued at that time.<ref>{{Cite book |chapter=Finding A Needle in a Haystack: Automated Mining of Silent Vulnerability Fixes |doi=10.1109/ase51524.2021.9678720|s2cid=246081539 |title=2021 36th IEEE/ACM International Conference on Automated Software Engineering (ASE) |date=2021 |last1=Zhou |first1=Jiayuan |last2=Pacheco |first2=Michael |last3=Wan |first3=Zhiyuan |last4=Xia |first4=Xin |last5=Lo |first5=David |last6=Wang |first6=Yuan |last7=Hassan |first7=Ahmed E. |pages=705–716 |isbn=978-1-6654-0337-5 }}</ref>
== Funding ==
On August 9, 2023, deepset announced a Series B investment round of $30 million led by [[Balderton Capital]], with participation from existing investors [[GV (company)|GV]], System.One, Lunar Ventures and Harpoon Ventures.<ref>{{Cite web |title=Deepset raises $30M to help enterprises unlock the value of LLMs|url=https://venturebeat.com/ai/deepset-raises-30m-to-help-enterprises-unlock-the-value-of-llms/ |access-date=August 22, 2023 |website=[[VentureBeat]] |date=9 August 2023 |language=en}}</ref><ref>{{Cite web |title=Deepset secures $30M to expand its LLM-focused MLOps offerings |url=https://techcrunch.com/2023/08/09/deepset-secures-30m-to-expand-its-llm-focused-mlops-offerings/ |access-date=August 22, 2023 |website=[[TechCrunch]] |date=9 August 2023 |language=en}}</ref><ref>{{Cite web |title=Deepset, an AI startup that helps companies build apps with LLMs, just raised $30 million with this 12-slide pitch deck |url=https://www.businessinsider.com/deepset-german-ai-startup-raises-30m-balderton-to-expand-llms-2023-8 |access-date=August 22, 2023 |website=[[Business Insider]] |language=en}}</ref><ref>{{Cite web |title=Deepset raises $30 million to help the world's biggest companies leverage LLM promise |url=https://www.balderton.com/news/deepset-raises-30-million-to-help-the-worlds-biggest-companies-leverage-llm-promise/ |access-date=August 22, 2023 |website=Balderton |date=9 August 2023 |language=en}}</ref>
On April 28, 2022, deepset announced a Series A investment round of $14 million led by GV, with the participation of Harpoon Ventures, Acequia Capital and a group of founders of commercial [[open-source software|open source software]] and [[machine learning]] companies, including Alex Ratner (Snorkel AI), [[Mustafa Suleyman]] ([[DeepMind]]), [[Spencer Kimball (computer programmer)|Spencer Kimball]] ([[Cockroach Labs]]), [[Jeff Hammerbacher]] ([[Cloudera]]) and Emil Eifrem ([[Neo4j]]).<ref name=":0" />
A previous pre-seed investment round of $1.6 million on March 8, 2021, was led by System.One and Lunar Ventures, who also participated in the subsequent Series A round.
== References ==
{{reflist}}
== External links ==
* {{Official website|https://www.deepset.ai/}}
* {{GitHub|deepset-ai}}
[[Category:Natural language processing software]]
[[Category:Software companies of Germany]]