Registry of data portals, catalogs, and data repositories, including a dataset of data catalogs and a catalog description standard
Registry of data portals, catalogs, data repositories, etc.
This is a transitional repository for building a registry of all existing open data portals and repositories.
This is the first pillar of the open search engine project. Other pillars include:
Please take a look at the project mindmap to see its goals and structure.
This registry includes descriptions of the following data catalogs:
This project is inspired by the Re3Data and FAIRsharing projects. The key difference is its focus on open data as a broad topic, not just open research data.
The final version of this repository will be reorganized as a database with a publicly available open API and bulk data dumps.
Warning: this description is temporary and subject to change.
Data catalog descriptions are YAML files in the data/entities folder. Files are organized into country/territory folders, and each country folder contains category folders such as scientific, opendata, microdata, geo, search, marketplace, and other.
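The folder layout described above can be explored with a short script. The following is a minimal sketch, assuming files follow the pattern data/entities/<country>/<category>/.../<file>.yaml; the .yaml extension, the nesting depth, and the function name count_catalog_files are assumptions for illustration, not part of the repository.

```python
from collections import Counter
from pathlib import Path


def count_catalog_files(entities_dir):
    """Count YAML files per (country, category) pair under entities_dir.

    Assumes the layout <entities_dir>/<country>/<category>/.../<file>.yaml.
    """
    root = Path(entities_dir)
    counts = Counter()
    for path in root.rglob("*.yaml"):
        parts = path.relative_to(root).parts
        if len(parts) < 3:
            # Skip files that are not nested under country/category.
            continue
        country, category = parts[0], parts[1]
        counts[(country, category)] += 1
    return counts


if __name__ == "__main__":
    # Path is relative to the repository root (an assumption).
    totals = count_catalog_files("data/entities")
    for (country, category), n in sorted(totals.items()):
        print(f"{country}/{category}: {n}")
```

Run from the repository root, this would print one line per country and category pair with the number of catalog descriptions found there.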
Data.gov YAML file
access_mode:
- open
api: true
api_status: active
catalog_type: Open data portal
content_types:
- dataset
coverage:
- location:
    country:
      id: US
      name: United States
    level: 1
endpoints:
- type: ckanapi
  url: https://catalog.data.gov/api/3
export_standard: CKAN API
id: catalogdatagov
identifiers:
- id: wikidata
  url: https://www.wikidata.org/wiki/Q5227102
  value: Q5227102
- id: re3data
  url: https://www.re3data.org/repository/r3d100010078
  value: r3d100010078
- id: fairsharing
  url: https://fairsharing.org/FAIRsharing.6069e1
  value: FAIRsharing.6069e1
langs:
- EN
link: https://catalog.data.gov
name: Data.gov
owner:
  location:
    country:
      id: US
      name: United States
    level: 1
  name: U.S. General Services Administration
  type: Central government
software: CKAN
status: active
tags:
- government
- has_api
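Hand-edited records are prone to misspelled keys. As a minimal sketch, the check below takes a record that has already been parsed into a Python dict (for example by a YAML parser) and reports identifier entries whose keys differ from id, url, and value. Treating these three keys as the expected set is an assumption inferred from the example record, not a documented schema. The function name find_identifier_problems and the deliberately misspelled key val in the sample data are hypothetical.

```python
# Keys each identifier entry is assumed to carry.
# Inferred from the example record, not taken from a published schema.
EXPECTED_IDENTIFIER_KEYS = ("id", "url", "value")


def find_identifier_problems(record):
    """Return human-readable problems found in record["identifiers"]."""
    problems = []
    for index, entry in enumerate(record.get("identifiers", [])):
        missing = [key for key in EXPECTED_IDENTIFIER_KEYS if key not in entry]
        unexpected = [key for key in entry if key not in EXPECTED_IDENTIFIER_KEYS]
        if missing or unexpected:
            problems.append(
                f"identifiers[{index}]: missing={missing}, unexpected={unexpected}"
            )
    return problems


# Sample record; the second entry uses the misspelled key "val" on purpose.
record = {
    "id": "catalogdatagov",
    "identifiers": [
        {
            "id": "wikidata",
            "url": "https://www.wikidata.org/wiki/Q5227102",
            "value": "Q5227102",
        },
        {
            "id": "re3data",
            "url": "https://www.re3data.org/repository/r3d100010078",
            "val": "r3d100010078",
        },
    ],
}

for problem in find_identifier_problems(record):
    print(problem)
```

A check of this kind could run before a pull request is opened, so that key-name mistakes are caught before they reach the generated dataset.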
Datasets are kept in the data/datasets folder. Right now it contains the catalogs.jsonl file, which is generated by the builder.py script in the scripts folder.
Run python builder.py build
in the scripts folder to regenerate the catalogs.jsonl file from the YAML files.
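Once generated, catalogs.jsonl can be read line by line with the standard library. The sketch below assumes that each line is one JSON object carrying the same fields as the YAML records (for example software); that correspondence is an assumption, as are the function names iter_catalogs and filter_by_software.

```python
import json


def iter_catalogs(lines):
    """Yield one dict per non-empty line of JSON Lines input."""
    for line in lines:
        line = line.strip()
        if line:
            yield json.loads(line)


def filter_by_software(lines, software):
    """Return the catalog records whose "software" field equals software."""
    return [c for c in iter_catalogs(lines) if c.get("software") == software]


if __name__ == "__main__":
    # Path is relative to the repository root (an assumption).
    with open("data/datasets/catalogs.jsonl", encoding="utf-8") as fh:
        ckan = filter_by_software(fh, "CKAN")
    print(len(ckan), "CKAN catalogs")
```

Because the functions accept any iterable of lines, they work equally well on an open file or on a list of strings in a test.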
If you find a mistake or have an additional data catalog to add, please open a pull request or file an issue.
The following data sources are used:
Source code is licensed under the MIT license. Data is licensed under the CC-BY 4.0 license.