STADATA is a Python package that simplifies access to statistical data provided by BPS - Statistics Indonesia, the national statistics office of Indonesia. BPS offers a WebAPI (https://webapi.bps.go.id/developer/) that allows users to programmatically access various types of data, including publications, press releases, static tables, and dynamic tables.
With STADATA, Python users can retrieve data from this WebAPI directly in their scripts through a convenient, easy-to-use interface. The package aims to facilitate public access to the data generated by BPS - Statistics Indonesia and eliminate the need for manual downloads from the BPS website, https://www.bps.go.id/.
The key features of STADATA include:
To install STADATA, use the following pip command:
pip install stadata
STADATA is designed for Python 3.7 and above. To use the package, the following dependencies are required:
With the necessary requirements in place, you can easily start utilizing STADATA to access the WebAPI BPS and retrieve statistical data from BPS - Statistics Indonesia directly in your Python scripts.
To begin using STADATA, first install the package and its requirements as described in the previous section. You can then access statistical data from BPS - Statistics Indonesia through the WebAPI BPS.
To get started with STADATA, you will need an API token from WebAPI BPS. Once you have obtained your token, you can use it to set up the STADATA client in your Python script:
import stadata
# Replace 'token' with your actual API token obtained from WebAPI BPS - https://webapi.bps.go.id/developer/
client = stadata.Client('token')
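Hard-coding a token in a script makes it easy to leak by accident. As a minimal sketch, the token can be read from an environment variable instead. The variable name BPS_API_TOKEN and the helper load_token are illustrative choices for this example, not part of STADATA:

```python
import os


def load_token(env=None, name="BPS_API_TOKEN"):
    """Return the API token stored under `name`, or raise if it is missing.

    BPS_API_TOKEN is a hypothetical variable name chosen for this example.
    """
    env = os.environ if env is None else env
    token = env.get(name, "").strip()
    if not token:
        raise RuntimeError(
            f"Set the {name} environment variable to your WebAPI BPS token"
        )
    return token


# Usage (requires the stadata package and a valid token):
#   import stadata
#   client = stadata.Client(load_token())
```

The helper only reads and checks the value; the client itself is still created with stadata.Client as shown above.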
Parameter:
token (str, required): Your personal API token provided by the WebAPI BPS Developer portal. This token is necessary to authenticate and access the API. Make sure to replace 'token' with your actual API token.
The STADATA package provides the following API methods:
This method returns a list of BPS's webpage domains from the national level to the district level. Domains are used to specify the region from which data is requested.
client.list_domain()
Returns:
domains: A list of domain IDs for different regions, e.g., provinces, districts, or national.
This method returns a list of all static tables available on the BPS website. You can specify whether to get static tables from all domains or only from specific domains.
# Get all static tables from all domains
client.list_statictable(all=True)
# Get static tables from specific domains
client.list_statictable(all=False, domain=['domain_id-1', 'domain_id-2'])
Parameters:
all (bool, optional): A boolean indicating whether to get static tables from all domains (True) or only from specific domains (False).
domain (list of str, required if all is False): A list of domain IDs from which you want to retrieve static tables.
Returns:
data: A list of static table information with the following fields:
table_id|title|subj_id|subj|updt_date|size|domain
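Because domain is required whenever all is False, it can help to check the arguments before making a request. The following is a small sketch of such a check; list_kwargs is a hypothetical helper written for this example, not a STADATA function:

```python
def list_kwargs(all_domains=True, domain=None):
    """Build keyword arguments for the list_* methods.

    Hypothetical helper: it enforces the documented rule that `domain`
    must be given when `all` is False.
    """
    if all_domains:
        return {"all": True}
    if not domain:
        raise ValueError("domain is required when all=False")
    if isinstance(domain, str):
        domain = [domain]  # accept a single domain ID for convenience
    return {"all": False, "domain": list(domain)}


# Usage with an existing client:
#   client.list_statictable(**list_kwargs(False, ['domain_id-1', 'domain_id-2']))
```

The same keyword arguments fit the other list methods, since they share the all and domain parameters.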
This method returns a list of all dynamic tables available on the BPS website. You can specify whether to get dynamic tables from all domains or only from specific domains.
# Get all dynamic tables from all domains
client.list_dynamictable(all=True)
# Get dynamic tables from specific domains
client.list_dynamictable(all=False, domain=['domain_id-1', 'domain_id-2'])
Parameters:
all (bool, optional): A boolean indicating whether to get dynamic tables from all domains (True) or only from specific domains (False).
domain (list of str, required if all is False): A list of domain IDs from which you want to retrieve dynamic tables.
Returns:
data: A list of dynamic table information with the following fields:
var_id|title|sub_id|sub_name|subcsa_id|subcsa_name|notes|vertical|unit|graph_id|graph_name|domain
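A listing is typically searched by title to find the var_id needed later. The sketch below assumes each row can be treated as a mapping keyed by the field names above; the exact return type is not specified here, so that shape is an assumption, and search_titles is a hypothetical helper:

```python
def search_titles(rows, keyword, id_field="var_id"):
    """Return (id, title) pairs whose title contains `keyword`, ignoring case.

    Assumes each row is a mapping with `id_field` and "title" keys.
    """
    needle = keyword.casefold()
    return [
        (row[id_field], row["title"])
        for row in rows
        if needle in str(row["title"]).casefold()
    ]


# Placeholder rows for illustration only; these are not real BPS records.
sample_rows = [
    {"var_id": "1", "title": "Example population table"},
    {"var_id": "2", "title": "Example price index"},
]
matches = search_titles(sample_rows, "POPULATION")
```

If the method returns a pandas DataFrame instead, data.to_dict("records") would produce rows in this shape. For static tables, pass id_field="table_id".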
This method returns a list of all publications available on the BPS website. You can specify whether to get publications from all domains or only from specific domains. You can also specify the month and year of publication to narrow the results.
# Get all publications from all domains
client.list_publication(all=True)
# Get publications from specific domains
client.list_publication(all=False, domain=['domain_id-1', 'domain_id-2'])
# Get publications from specific domains, year, and month
client.list_publication(all=False, domain=['domain_id-1', 'domain_id-2'], month="4", year="2022")
Parameters:
all (bool, optional): A boolean indicating whether to get publications from all domains (True) or only from specific domains (False).
domain (list of str, required if all is False): A list of domain IDs from which you want to retrieve publications.
month (str, optional): The month in which the publication was published.
year (str, required): The year in which the publication was published.
Returns:
data: A list of publications with the following fields:
pub_id|title|issn|sch_date|rl_date|updt_date|size|domain
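Both month and year are passed as strings, and the example above uses an unpadded month ("4"). A minimal sketch of a helper that converts numeric input into that form follows; period_kwargs is hypothetical, and the unpadded format is inferred from the example rather than from a documented rule:

```python
def period_kwargs(year, month=None):
    """Return `year` and optional `month` keyword arguments as strings.

    Hypothetical helper; month is emitted without zero padding ("4"),
    matching the example call in this document.
    """
    kwargs = {"year": str(int(year))}
    if month is not None:
        month = int(month)
        if not 1 <= month <= 12:
            raise ValueError(f"month must be between 1 and 12, got {month}")
        kwargs["month"] = str(month)
    return kwargs


# Usage with an existing client:
#   client.list_publication(all=False, domain=['domain_id-1'], **period_kwargs(2022, 4))
```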
This method returns a list of all press releases available on the BPS website. You can specify whether to get press releases from all domains or only from specific domains. You can also specify the month and year of release to narrow the results.
# Get all press releases from all domains
client.list_pressrelease(all=True)
# Get press releases from specific domains
client.list_pressrelease(all=False, domain=['domain_id-1', 'domain_id-2'])
# Get press releases from specific domains, year, and month
client.list_pressrelease(all=False, domain=['domain_id-1', 'domain_id-2'], month="4", year="2022")
Parameters:
all (bool, optional): A boolean indicating whether to get press releases from all domains (True) or only from specific domains (False).
domain (list of str, required if all is False): A list of domain IDs from which you want to retrieve press releases.
month (str, optional): The month in which the press release was published.
year (str, required): The year in which the press release was published.
Returns:
data: A list of press releases with the following fields:
brs_id|subj_id|subj|title|rl_date|updt_date|size|domain
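Since each call covers one month, collecting several months takes one request per month. The sketch below loops over months and keys each result by month, so it makes no assumption about the type list_pressrelease returns; pressreleases_by_month is a hypothetical helper:

```python
def pressreleases_by_month(client, domain, year, months):
    """Call client.list_pressrelease once per month and key results by month.

    Hypothetical helper; `client` is an object exposing list_pressrelease
    with the parameters documented above.
    """
    domains = [domain] if isinstance(domain, str) else list(domain)
    results = {}
    for month in months:
        results[month] = client.list_pressrelease(
            all=False, domain=domains, month=str(month), year=str(year)
        )
    return results


# Usage with an existing client:
#   first_quarter = pressreleases_by_month(client, ['domain_id-1'], 2022, [1, 2, 3])
```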
This method returns data from a specific static table. You need to provide the domain ID and the table ID, which you can get from the list of static tables.
# View static table in Indonesian language (default)
client.view_statictable(domain='domain_id', table_id='table_id', lang='ind')
Parameters:
domain (str, required): The domain ID where the static table is located.
table_id (str, required): The ID of the specific static table you want to retrieve data from.
lang (str, optional, default: ind): The language in which the table data should be displayed (ind for Indonesian, eng for English).
Returns:
data: The static table data in the specified language.
This method returns data from a specific dynamic table. You need to provide the domain ID, variable ID, and the period (year) for the dynamic table.
# View dynamic table with a specific period
client.view_dynamictable(domain='domain_id', var='variable_id', th='year')
Parameters:
domain (str, required): The domain ID where the dynamic table is located.
var (str, required): The ID of the specific variable in the dynamic table you want to retrieve data from.
th (str, optional, default: ''): The period (year) of the dynamic table data you want to retrieve.
Returns:
data: The dynamic table data for the specified variable and period.
This method returns data from a specific publication. You need to provide the domain ID and the publication ID.
# View a specific publication
client.view_publication(domain='domain_id', idx='publication_id')
Parameters:
domain (str, required): The domain ID where the publication is located.
idx (str, required): The ID of the specific publication, taken from the list of publications, that you want to retrieve data from.
Returns:
Material: Object interface for publication and press release content.
Methods:
desc(): Show all details of the specific publication.
download(url): Download the publication content as a PDF.
This method returns data from a specific press release. You need to provide the domain ID and the press release ID.
# View a specific press release
client.view_pressrelease(domain='domain_id', idx='press_release_id')
Parameters:
domain (str, required): The domain ID where the press release is located.
idx (str, required): The ID of the specific press release, taken from the list of press releases, that you want to retrieve data from.
Returns:
Material: Object interface for publication and press release content.
Methods:
desc(): Show all details of the specific press release.
download(url): Download the press release content as a PDF.
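When several publications or press releases are needed, one failing ID should not stop the rest. The following sketch fetches Material objects one by one and records failures separately. fetch_materials is a hypothetical helper, and because the exceptions STADATA may raise are not documented here, it catches Exception broadly as an assumption:

```python
def fetch_materials(client, domain, ids, kind="publication"):
    """Fetch Material objects for several IDs, recording failures per ID.

    Hypothetical helper; `kind` selects view_publication or view_pressrelease.
    Returns a (materials, errors) pair of dicts keyed by ID.
    """
    viewers = {
        "publication": client.view_publication,
        "pressrelease": client.view_pressrelease,
    }
    view = viewers[kind]
    materials, errors = {}, {}
    for idx in ids:
        try:
            materials[idx] = view(domain=domain, idx=idx)
        except Exception as exc:  # assumption: exception types are not documented
            errors[idx] = exc
    return materials, errors


# Usage with an existing client:
#   materials, errors = fetch_materials(client, 'domain_id', ['publication_id'])
#   for material in materials.values():
#       material.desc()
```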