# Data Commons

> Mediated Wiki article. Canonical URL: https://mediated.wiki/source/Data_Commons
> Markdown URL: https://mediated.wiki/source/Data_Commons.md
> Source: https://en.wikipedia.org/wiki/Data_Commons
> Source revision: 1322884219
> License: Creative Commons Attribution-ShareAlike 4.0 International (https://creativecommons.org/licenses/by-sa/4.0/)

{{short description|Knowledge repository integrating open datasets}}
{{Infobox website
| name = Data Commons
| logo = Data Commons logo 2025.png
| screenshot = Data Commons screenshot.png
| screenshot_alt = Screenshot of a query in Data Commons
| caption = Results for a query in Data Commons
| founder = [Ramanathan V. Guha](/source/Ramanathan_V._Guha)
| parent = [Google](/source/Google)
| url = {{URL|Datacommons.org}}
| launch_date = {{start date and age|2018|05}}
| key_people = Prem Ramaswami (Head of Data Commons)
}}
'''Data Commons''' is an open-source platform<ref>{{cite web |title=Custom Data Commons |url=https://docs.datacommons.org/custom_dc/ |website=Docs - Data Commons |access-date=16 July 2024}}</ref> created by [Google](/source/Google)<ref name="Google 0923">{{cite news |title=Data Commons is using AI to make the world's public data more accessible and helpful |url=https://blog.google/technology/ai/google-data-commons-ai/ |access-date=16 July 2024 |work=Google |date=13 September 2023 |language=en-us}}</ref> that provides an [open knowledge](/source/Open_data) graph, combining economic, scientific and other public datasets into a unified view.<ref name=":0">{{Citation|last1=Fensel|first1=Dieter|title=Introduction: What Is a Knowledge Graph?|date=2020|url=http://link.springer.com/10.1007/978-3-030-37439-6_1|work=Knowledge Graphs|pages=1–10|place=Cham|publisher=Springer International Publishing|language=en|doi=10.1007/978-3-030-37439-6_1|isbn=978-3-030-37438-9|access-date=2020-10-16|last2=Şimşek|first2=Umutcan|last3=Angele|first3=Kevin|last4=Huaman|first4=Elwin|last5=Kärle|first5=Elias|last6=Panasiuk|first6=Oleksandra|last7=Toma|first7=Ioan|last8=Umbrich|first8=Jürgen|last9=Wahler|first9=Alexander|s2cid=213620389|author-link=Dieter Fensel|url-access=subscription}}</ref> [Ramanathan V. Guha](/source/Ramanathan_V._Guha), a creator of web standards including [RDF](/source/Resource_Description_Framework),<ref>{{cite journal |last1=Guns |first1=Raf |date=2013 |title=Tracing the origins of the semantic web |journal=Journal of the American Society for Information Science and Technology |volume=64 |issue=10 |pages=2173–2181 |doi=10.1002/asi.22907 |hdl-access=free |hdl=10067/1111170151162165141}}</ref> [RSS](/source/RSS), and [Schema.org](/source/Schema.org),<ref>{{cite news |last1=Funke |first1=Daniel |date=7 December 2017 |title=This website helps you find related fact checks - and it was built by a 17-year-old |url=https://www.poynter.org/fact-checking/2017/this-website-helps-you-find-related-fact-checks-%C2%97-and-it-was-built-by-a-17-year-old/ |access-date=16 July 2024 |work=Poynter}}</ref> founded the project,<ref>{{Cite web |last=Guha |first=Ramanathan V. |author-link=Ramanathan V. Guha |date=15 October 2020 |title=Data Commons, now accessible on Google Search |url=https://docs.datacommons.org/2020/10/15/search_launch.html |access-date=2020-10-16 |website=docs.datacommons.org}}</ref> which is now led by Prem Ramaswami.<ref>{{cite news |last1=O'Donnell |first1=James |date=12 September 2024 |title=Google's new tool lets large language models fact-check their responses |url=https://www.technologyreview.com/2024/09/12/1103926/googles-new-tool-lets-large-language-models-fact-check-their-responses/ |access-date=17 September 2024 |work=MIT Technology Review |language=en}}</ref>

The Data Commons website was launched in May 2018 with an initial dataset consisting of [fact-checking](/source/fact-checking) data published in [Schema.org](/source/Schema.org) "ClaimReview" format by several fact checkers from the [International Fact-Checking Network](/source/Poynter_Institute).<ref>{{cite web |url=http://www.datacommons.org/factcheck/
 |title=Fact Checks |date=29 March 2019 |website=datacommons.org |access-date=14 October 2020}}</ref><ref>{{Cite book|last1=Jiang|first1=Shan|last2=Baumgartner|first2=Simon|last3=Ittycheriah|first3=Abe|last4=Yu|first4=Cong|title=Proceedings of the Web Conference 2020 |chapter=Factoring Fact-Checks: Structured Information Extraction from Fact-Checking Articles |date=2020-04-20|chapter-url=https://dl.acm.org/doi/10.1145/3366423.3380231|series=WWW '20|language=en|location=Taipei Taiwan|publisher=ACM|pages=1592–1603|doi=10.1145/3366423.3380231|isbn=978-1-4503-7023-3|s2cid=215882520}}</ref> Google has worked with partners such as the [United Nations](/source/United_Nations) (UN) to populate the repository,<ref name="Google 0923"/> which also includes data from the [United States Census](/source/United_States_Census), the [World Bank](/source/World_Bank), the [US Bureau of Labor Statistics](/source/Bureau_of_Labor_Statistics),<ref>{{Cite web|last=Raghavan|first=Prabhakar|author-link=Prabhakar Raghavan|date=2020-10-15|title=How AI is powering a more helpful Google|url=https://blog.google/products/search/search-on/|access-date=2020-10-16|website=Google|language=en}}</ref> [Wikipedia](/source/Wikipedia), the [National Oceanic and Atmospheric Administration](/source/National_Oceanic_and_Atmospheric_Administration) and the [Federal Bureau of Investigation](/source/Federal_Bureau_of_Investigation).<ref name=":1">{{Cite journal|last1=Sheth|first1=Amit|last2=Padhee|first2=Swati|last3=Gyrard|first3=Amelie|last4=Sheth|first4=Amit|date=2019-07-01|title=Knowledge Graphs and Knowledge Networks: The Story in Brief|journal=IEEE Internet Computing|volume=23|issue=4|pages=67–75|doi=10.1109/MIC.2019.2928449|arxiv=2003.03623|bibcode=2019IIC....23d..67S |s2cid=204820800|issn=1089-7801}}</ref>

The service expanded during 2019 to include an [RDF-style](/source/Resource_Description_Framework) [knowledge graph](/source/knowledge_graph) populated from a number of largely statistical open datasets. The service was announced to a wider audience in 2019.<ref>{{cite web|last1=Luong|first1=Daphne|last2=Chou|first2=Charina|date=5 March 2019|title=Doing our part to share open data responsibly|url=https://www.blog.google/technology/ai/sharing-open-data/|access-date=14 October 2020|website=The Keyword}}</ref> In 2020 the service improved its coverage of non-US datasets, while also increasing its coverage of [bioinformatics](/source/bioinformatics) and [coronavirus](/source/Coronavirus_disease_2019).<ref>{{cite news |last=Ramasubramanian |first=Sowmya |date=21 September 2020 |title=Google's open source data to study impact of COVID-19 |url=https://www.thehindu.com/sci-tech/technology/googles-open-source-data-to-study-impact-of-covid-19/article32660642.ece |work=[The Hindu](/source/The_Hindu) | access-date=14 October 2020}}</ref> In 2023, the service relaunched with a natural-language front end powered by a [large language model](/source/large_language_model).<ref name="Google 0923"/> It also launched as the back end to the UN data portal with [Sustainable Development Goals](/source/Sustainable_Development_Goals) data.<ref>{{cite news |last1=Manyika |first1=James |title=Using data and AI to track progress toward the UN Global Goals |url=https://blog.google/technology/ai/google-ai-data-un-global-goals/ |access-date=22 July 2024 |work=Google |date=19 September 2023 |language=en-us}}</ref>

== Features ==
Data Commons places more emphasis on statistical data than is common for [linked data](/source/linked_data) and [knowledge graph](/source/knowledge_graph) initiatives. It includes geographical, demographic, weather and real estate data alongside other categories,<ref name=":0" /> describing states, Congressional districts, and cities in the United States as well as biological specimens, power plants, and elements of the [human genome](/source/human_genome) via the [Encyclopedia of DNA Elements (ENCODE)](/source/ENCODE) project.<ref name=":1" /> It represents data as [semantic triple](/source/semantic_triple)s, each of which can have its own provenance.<ref name=":0" /> It centers on the entity-oriented integration of statistical observations from a variety of public datasets. Although it supports a subset of the W3C [SPARQL query language](/source/SPARQL),<ref>{{cite web |url=https://docs.datacommons.org/api/python/query.html |title=Query the Data Commons Knowledge Graph using SPARQL |website=datacommons.org |access-date=14 October 2020}}</ref> its [API](/source/API)s<ref>{{cite web |url=https://docs.datacommons.org/api/ |title=Overview |website=datacommons.org |access-date=14 October 2020}}</ref> also include tools, such as a [Pandas](/source/Pandas_(software)) dataframe interface, oriented towards data science, statistics and data visualization.
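
The idea of triples that each carry their own provenance can be illustrated with a minimal Python sketch. This is not Data Commons code or its API: the `Triple` class, the `match` helper, and every identifier, value and source label below are hypothetical and chosen only for illustration. The wildcard lookup is a very rough analogue of a single triple pattern in a SPARQL-style query.

```python
from dataclasses import dataclass
from typing import Iterable, Iterator, Optional


@dataclass(frozen=True)
class Triple:
    """One (subject, predicate, object) statement plus the source it came from."""
    subject: str
    predicate: str
    obj: str
    provenance: str  # hypothetical label naming the source dataset


# Illustrative statements only: IDs, values and sources are made up.
TRIPLES = [
    Triple("place/A", "typeOf", "City", "example-census"),
    Triple("place/A", "name", "Exampleville", "example-census"),
    Triple("place/A", "population", "12000", "example-census"),
    Triple("place/A", "population", "12500", "example-survey"),
    Triple("place/B", "typeOf", "City", "example-census"),
]


def match(
    triples: Iterable[Triple],
    subject: Optional[str] = None,
    predicate: Optional[str] = None,
    obj: Optional[str] = None,
) -> Iterator[Triple]:
    """Yield triples that fit the pattern; None acts as a wildcard."""
    for t in triples:
        if subject is not None and t.subject != subject:
            continue
        if predicate is not None and t.predicate != predicate:
            continue
        if obj is not None and t.obj != obj:
            continue
        yield t


# Two sources disagree on the same property; provenance keeps them apart.
population = list(match(TRIPLES, subject="place/A", predicate="population"))
for t in population:
    print(f"{t.subject} {t.predicate} {t.obj} (source: {t.provenance})")
```

Because provenance is attached to each statement rather than to the entity, conflicting values from different sources can coexist in the same graph and be filtered or compared later.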

Data Commons is integrative: rather than acting as a hosting platform for separate datasets, it attempts to consolidate much of the information those datasets provide into a single data graph.
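
A small Python sketch may make the integrative approach more concrete. It assumes, purely for illustration, that two datasets name the same places differently and that a lookup table resolves those local names to shared entity identifiers; the table, the datasets and the `integrate` function are all hypothetical and do not reflect how Data Commons itself resolves entities.

```python
from collections import defaultdict

# Hypothetical lookup from (dataset, local place label) to a shared entity ID.
ENTITY_IDS = {
    ("weather", "Exampleville, EX"): "place/A",
    ("housing", "EXAMPLEVILLE"): "place/A",
    ("housing", "SAMPLETON"): "place/B",
}

# Two made-up source datasets with different key columns and naming styles.
WEATHER_ROWS = [{"station_city": "Exampleville, EX", "mean_temp_c": 14.2}]
HOUSING_ROWS = [
    {"city": "EXAMPLEVILLE", "median_price": 310000},
    {"city": "SAMPLETON", "median_price": 275000},
]


def integrate(dataset, rows, key_field):
    """Yield (entity_id, property, value, dataset) for every non-key column."""
    for row in rows:
        entity = ENTITY_IDS[(dataset, row[key_field])]
        for prop, value in row.items():
            if prop != key_field:
                yield entity, prop, value, dataset


# One graph keyed by entity, instead of one table per hosted dataset.
graph = defaultdict(dict)
for entity, prop, value, source in [
    *integrate("weather", WEATHER_ROWS, "station_city"),
    *integrate("housing", HOUSING_ROWS, "city"),
]:
    graph[entity][prop] = (value, source)

for entity, props in sorted(graph.items()):
    print(entity, props)
```

The point of the sketch is the shape of the result: facts about `place/A` from both sources end up on one node, which is what distinguishes an integrated graph from a catalogue of separately hosted files.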

== Technology ==
Data Commons is built on a [graph data model](/source/Graph_database). The graph can be accessed through a browser interface and several APIs,<ref name=":0" /><ref name=":1" /> and is expanded by loading data (typically CSV and [MCF](/source/Meta_Content_Framework)-based templates).<ref>{{cite web |title=Contributing to Data Commons – Adding datasets |url=https://docs.datacommons.org/contributing/adding_datasets.html |website=datacommons.org |publisher=Data Commons |access-date=2020-10-14 |archive-date=2020-09-19 |archive-url=https://web.archive.org/web/20200919001318/https://docs.datacommons.org/contributing/adding_datasets.html |url-status=dead }}</ref> The graph can also be accessed by natural language queries in [Google Search](/source/Google_Search).<ref>{{Cite web|last=Guha|first=Ramanathan V.|author-link=Ramanathan V. Guha|date=15 October 2020|title=Data Commons, now accessible on Google Search|url=https://docs.datacommons.org/2020/10/15/search_launch.html|access-date=2020-10-16|website=docs.datacommons.org}}</ref> The data vocabulary used to define the datacommons.org graph is based upon [Schema.org](/source/Schema.org).<ref name=":0" /> In particular, the Schema.org terms StatisticalPopulation<ref>{{cite web |url=https://schema.org/StatisticalPopulation |title=StatisticalPopulation type at Schema.org |website=schema.org |access-date=14 October 2020}}</ref> and Observation<ref>{{cite web |url=https://schema.org/Observation |title=Observation type at Schema.org |website=schema.org |access-date=14 October 2020}}</ref> were proposed to Schema.org to support Data Commons-like use cases.<ref>{{cite web |url=https://github.com/schemaorg/schemaorg/issues/2291 |title=Proposal for representing Aggregate Statistical Data |date=25 June 2019 |website=GitHub – Schema.org repository |access-date=14 October 2020}}</ref>
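
As a rough illustration of loading tabular data into observation-style nodes, the Python sketch below turns CSV rows into dictionaries, one per observation. It is not the MCF template syntax and not the Data Commons import tooling: the column mapping, the sample data and the `rows_to_observations` function are hypothetical, and the keys only loosely echo property names associated with the Schema.org Observation type.

```python
import csv
import io

# Made-up input standing in for a contributed CSV file.
CSV_TEXT = """place_id,year,population
place/A,2019,12000
place/A,2020,12500
place/B,2020,8300
"""

# Hypothetical mapping from observation fields to CSV columns.
# This plays the role a template would play; it is not MCF syntax.
MAPPING = {
    "observedNode": "place_id",
    "observationDate": "year",
    "value": "population",
}
MEASURED_PROPERTY = "population"


def rows_to_observations(csv_text, mapping, measured_property):
    """Convert each CSV row into one observation-like dictionary."""
    reader = csv.DictReader(io.StringIO(csv_text))
    observations = []
    for row in reader:
        observations.append({
            "typeOf": "Observation",
            "measuredProperty": measured_property,
            "observedNode": row[mapping["observedNode"]],
            "observationDate": row[mapping["observationDate"]],
            "value": int(row[mapping["value"]]),
        })
    return observations


observations = rows_to_observations(CSV_TEXT, MAPPING, MEASURED_PROPERTY)
for obs in observations:
    print(obs)
```

Keeping the column mapping separate from the data means the same conversion can be reused for any file with the same layout, which is the general motivation for pairing a CSV with a template.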

Software from the project is available on [GitHub](/source/GitHub) under the [Apache 2 license](/source/Apache_License).<ref>{{cite web |url=https://github.com/datacommonsorg/ |title=datacommons.org GitHub|website=[GitHub](/source/GitHub) }}</ref>

== References ==
{{reflist}}

== External links ==
* {{Official website}}
* [https://github.com/datacommonsorg/ GitHub repository]


---
Adapted from the Wikipedia article [Data Commons](https://en.wikipedia.org/wiki/Data_Commons) by Wikipedia contributors ([contributor history](https://en.wikipedia.org/wiki/Data_Commons?action=history)). Available under [Creative Commons Attribution-ShareAlike 4.0 International](https://creativecommons.org/licenses/by-sa/4.0/). Changes may have been made.
