PheKnowLator Versions

PheKnowLator: Heterogeneous Biomedical Knowledge Graphs and Benchmarks Constructed Under Alternative Semantic Models

v3.1.2

6 months ago

Release: v3.1.2

Website: https://github.com/callahantiff/PheKnowLator/wiki/v3-Build-Details
Data Access: Archived Builds
Docker Container: DockerHub Dedicated Project Container
PyPI: pkt-kg 3.1.2

Description

This release provides bug fixes for merging ontologies and addresses issue #140. Thanks to @GuarinoValentina for helping point out this error!

v3.1.1

9 months ago

Release: v3.1.1

Website: https://github.com/callahantiff/PheKnowLator/wiki/v3-Build-Details
Data Access: Archived Builds
Docker Container: DockerHub Dedicated Project Container
PyPI: pkt-kg 3.1.1

Description

This release provides minor bug repairs for the updates that were made to the OWL-NETS workflow in v3.1.0. Thanks to @sanyabt for helping point out this error!

v3.1.0

1 year ago

Release: v3.1.0

Website: https://github.com/callahantiff/PheKnowLator/wiki/v3-Build-Details
Data Access: Archived Builds
Docker Container: DockerHub Dedicated Project Container
PyPI: pkt-kg 3.1.0

Description

This release includes updates to the OWL-NETS workflow, addresses deprecated functions associated with NetworkX v3.0, and removes the pkt namespace from the final OWL files.

Updated Jupyter Notebooks:

Updated Scripts:

  • .github/workflows/build-qa.yml
  • pkt_kg/__version__.py
  • pkt_kg/metadata.py
  • pkt_kg/construction_approach.py
  • pkt_kg/downloads.py
  • pkt_kg/edge_list.py
  • pkt_kg/knowledge_graph.py
  • pkt_kg/owlnets.py
  • pkt_kg/utils/kg_utils.py
  • pkt_kg/utils/data_utils.py
  • tests/test_metadata.py
  • tests/test_owlnets.py

v3.0.2

2 years ago

Release: v3.0.2

Website: https://github.com/callahantiff/PheKnowLator/wiki/v2.0.0
Data Access: Archived Builds
Docker Container: DockerHub Dedicated Project Container
PyPI: pkt-kg 3.0.2

Updated Jupyter Notebooks:

Updated Scripts:

  • builds/data_preprocessing.py
  • pkt_kg/metadata.py
  • pkt_kg/utils/kg_utils.py
  • builds/data_to_download.txt
  • pkt_kg/utils/data_utils.py
  • tests/test_data_utils_downloading.py

Updates

  • Addresses issue #118 (PR: #119) by patching the prior functionality related to obtaining labels and definitions from ontologies. Specifically, it now ensures that, whenever possible, the language encoding for these fields is English. Please see the details below for how to address nodes containing foreign characters in builds prior to this release.

    Solution for Builds Prior to v3.0.2: The attached bad_node_patch.json file (bad_node_patch.json.zip) contains a dictionary in which each outer key is an entity_uri and each outer value is an inner dictionary. The inner keys are label and description/definition, and the inner values are the updated strings without foreign characters. An example entry is shown below:

    key = '<http://purl.obolibrary.org/obo/UBERON_0000468>'
    print(bad_node_patch[key])

    {'label': 'multicellular organism', 'description/definition': 'Anatomical structure that is an individual member of a species and consists of more than one cell.'}
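
For builds made before v3.0.2, the patch dictionary could be applied to a downloaded label file roughly as sketched below. This is a minimal sketch and not part of the release: the helper name apply_node_patch is hypothetical, the column names entity_uri, label, and description/definition are assumed from the dictionary description above, and the sample rows are made up for illustration.

```python
import csv
import io

def apply_node_patch(rows, patch):
    """Return copies of rows with patched fields swapped in by entity_uri."""
    patched_rows = []
    for row in rows:
        fixed = dict(row)
        # rows whose entity_uri is missing from the patch pass through unchanged
        for field, value in patch.get(row['entity_uri'], {}).items():
            if field in fixed:
                fixed[field] = value
        patched_rows.append(fixed)
    return patched_rows

# made-up stand-in for a tab-delimited NodeLabels.txt (column names are assumptions)
sample = (
    "entity_uri\tlabel\tdescription/definition\n"
    "<http://purl.obolibrary.org/obo/UBERON_0000468>\t\u591a\u7ec6\u80de\u751f\u7269\tplaceholder text\n"
    "<http://example.org/unaffected>\tunaffected node\tleft as-is\n"
)
rows = list(csv.DictReader(io.StringIO(sample), delimiter='\t'))

# in practice this dictionary would be read from the unzipped
# bad_node_patch.json with json.load
bad_node_patch = {
    '<http://purl.obolibrary.org/obo/UBERON_0000468>': {
        'label': 'multicellular organism',
        'description/definition': 'Anatomical structure that is an individual '
                                  'member of a species and consists of more than one cell.',
    }
}

patched = apply_node_patch(rows, bad_node_patch)
print(patched[0]['label'])
```

Rows are copied rather than modified in place, so the original labels remain available for comparison.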


The code to identify the nodes with erroneous foreign characters is shown below:

```python
import re
import pandas as pd

# path to the downloaded `NodeLabels.txt` file
input_file = 'NodeLabels.txt'

# load data as Pandas DataFrame
nodedf = pd.read_csv(input_file, sep='\t', header=0)

# flag labels containing characters in the range U+4E00 to U+9FFF (CJK ideographs)
nodedf['bad'] = nodedf['label'].apply(
    lambda x: bool(re.search("[\u4e00-\u9FFF]", x)) if isinstance(x, str) else False)

# filter the DataFrame so it only contains the flagged rows
nodedf_bad_nodes = nodedf[nodedf['bad']].drop_duplicates()
```

v3.0.1

2 years ago

Release: v3.0.1

Website: https://github.com/callahantiff/PheKnowLator/wiki/v2.0.0
Data Access: Archived Builds
Docker Container: DockerHub Dedicated Project Container
PyPI: pkt-kg 3.0.1

Updated Jupyter Notebooks:

Updated Scripts:

  • pkt_kg/metadata.py

Updates

  • Addresses issue #116 (PR: #117) by patching the prior functionality related to processing metadata from a dictionary (node_metadata_dict.pkl) to a flat file. The patch ensures that all potentially faulty newline characters (\n) are removed before the file is written out.
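
The idea behind this patch can be illustrated with a small sketch. It assumes a metadata dictionary keyed by node URI with string fields. The real layout of node_metadata_dict.pkl and the code in pkt_kg/metadata.py may differ, and the names clean_field and write_metadata are hypothetical.

```python
import io
import re

def clean_field(value):
    """Replace runs of newline/carriage-return characters with one space."""
    return re.sub(r'[\r\n]+', ' ', str(value)).strip()

def write_metadata(node_metadata, handle):
    """Write one tab-delimited line per node to an open text handle."""
    for uri, fields in node_metadata.items():
        cells = [uri] + [clean_field(v) for v in fields.values()]
        handle.write('\t'.join(cells) + '\n')

# made-up entry with a stray newline inside the definition
demo = {'<http://example.org/node_1>': {'label': 'demo node',
                                         'description': 'first line\nsecond line'}}
buffer = io.StringIO()
write_metadata(demo, buffer)
lines = buffer.getvalue().splitlines()
print(len(lines))  # 1: the record stays on a single line
```

Without the cleaning step, the embedded newline would split this record across two lines of the flat file.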

v3.0.0

2 years ago

Release: v3.0.0

Website: https://github.com/callahantiff/PheKnowLator/wiki/v2.0.0
Data Access: Archived Builds
Docker Container: DockerHub Dedicated Project Container
PyPI: pkt-kg 3.0.0

Updated Jupyter Notebooks:

Updated Scripts:

  • pkt_kg/utils/kg_utils.py
  • builds/data_preprocessing.py
  • builds/deploy/triple-store/docker-compose.yml

Updates

  • The gets_ontology_class_dbxrefs() and gets_ontology_class_synonyms() functions were updated to account for classes and object properties that may share the same synonym or dbXref and/or have multiple synonyms and dbXrefs. Originally, these functions returned dictionaries keyed by a synonym or dbXref string with a class or object property URL as the value. This change keeps the same keys, but each key now maps to a list of potential class and object property URLs.
  • Both notebooks and the builds/data_preprocessing.py script have been updated to reflect and account for this change.
  • Updated the docker-compose.yml file to account for changes made in the DBCLS SPARQL proxy.
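
The change in return structure for the synonym and dbXref functions can be pictured with the sketch below. It is not the pkt_kg implementation: group_uris is a hypothetical helper and the strings and URLs are made up. It only shows a key that maps to a list of URLs instead of a single URL.

```python
from collections import defaultdict

def group_uris(pairs):
    """Map each synonym/dbXref string to the list of URLs that use it."""
    grouped = defaultdict(list)
    for key, uri in pairs:
        if uri not in grouped[key]:  # avoid repeating a URL under one key
            grouped[key].append(uri)
    return dict(grouped)

# made-up (string, URL) pairs; 'heart' is shared by a class and an object property
pairs = [
    ('heart', 'http://example.org/class/A'),
    ('heart', 'http://example.org/property/B'),
    ('cardiac muscle', 'http://example.org/class/C'),
]
lookup = group_uris(pairs)
print(lookup['heart'])
```

Code written against the earlier structure, where each value was a single URL, would need to iterate over the list instead.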

v2.1.1

2 years ago

Release: v2.1.1

Website: https://github.com/callahantiff/PheKnowLator/wiki/v2.0.0
Data Access: Archived Builds
Docker Container: DockerHub Dedicated Project Container
PyPI: pkt-kg 2.1.1

Updated Jupyter Notebooks:

Updated Scripts:

  • pkt_kg/owlnets.py
  • pkt_kg/utils/kg_utils.py

Updates

  • For the owlnets.py script, three new hyperparameters were added to provide users with more flexibility in terms of what support, top, and relation ontology objects are included in an OWL-NETS graph. The pruning functions were also improved to make sure that metadata are not getting through (i.e., obsolete classes and XML Schema).
  • For the OWLNETS_Example_Application.ipynb Jupyter Notebook, new functionality was added to include node and relation definitions.
  • Added a new function to pkt_kg/utils/kg_utils.py that obtains the definitions for all owl:Class and owl:ObjectProperty objects.

Example of New Functionality:

[Screenshot: Screen Shot 2021-09-02 at 01 28 24]

v2.1.0

3 years ago

Release: v2.1.0

Website: https://github.com/callahantiff/PheKnowLator/wiki/v2.0.0
Data Access: Archived Builds
Docker Container: DockerHub Dedicated Project Container
PyPI: pkt-kg 2.1.0

New Jupyter Notebooks:

Updates

  • Parallelized edge_list.py, knowledge_graph.py, and owlnets.py using Ray.
  • Moderate updates to the logic for how non-ontology data are added to the merged set of base ontologies. Please see resources/construction_approach/README.md for additional details and updated examples.
  • New functionality added for splitting the logical core of a graph from its annotation assertions.
  • Changed the output files: .owl files are no longer generated.
  • Cleaned up OWL-NETS helper functions and modified the logic for filtering OWL-specific annotations and axioms. Also added logic to enforce that each OWL-NETS graph is a single connected component.
  • Added more extensive statistics, which are logged and printed at run time.
  • Added arguments for progress/logging verbosity.
  • New method added for better load balancing of the input passed to Ray.
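
The single connected component requirement noted above can be checked with a plain graph traversal. The release notes do not describe how pkt_kg enforces it, so the following is only an illustrative sketch: the function name and the triples are hypothetical, and edges are treated as undirected.

```python
from collections import defaultdict, deque

def connected_components(triples):
    """Return the node set of each component, treating edges as undirected."""
    neighbors = defaultdict(set)
    for subj, _pred, obj in triples:
        neighbors[subj].add(obj)
        neighbors[obj].add(subj)
    seen, components = set(), []
    for start in neighbors:
        if start in seen:
            continue
        component, queue = {start}, deque([start])
        while queue:  # breadth-first search from an unvisited node
            node = queue.popleft()
            for nxt in neighbors[node] - component:
                component.add(nxt)
                queue.append(nxt)
        seen |= component
        components.append(component)
    return components

# made-up triples forming two disconnected pieces
triples = [('a', 'r1', 'b'), ('b', 'r2', 'c'), ('x', 'r1', 'y')]
components = connected_components(triples)
largest = max(components, key=len)

# one possible policy: keep only triples that fall inside the largest component
kept = [t for t in triples if t[0] in largest and t[2] in largest]
print(len(components), sorted(largest))
```

Keeping the largest component is just one possible policy. Whether the library drops, reconnects, or reports the smaller components is not stated in these notes.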

Performance Stats used in Testing

GCP Instance:

  • Machine Type: custom (24 vCPU, 500 GB memory)
  • CPU Platform: Intel Haswell
  • Image OS: Debian, Debian GNU/Linux, 10 (buster), amd64 built on 20210217
  • Boot Disk: Balanced persistent disk (150 GB)

Graph Build Statistics:

[Screenshot: Screen Shot 2021-04-13 at 15 11 33]

Maximum Memory Use (GiB):

[Screenshot: Screen Shot 2021-04-13 at 23 23 11]

Runtime (minutes):

[Screenshot: Screen Shot 2021-04-13 at 23 22 59]

v2.0.0

3 years ago

First Official Release

Website: https://github.com/callahantiff/PheKnowLator/wiki/v2.0.0
Data Access: Archived Builds
Docker Container: DockerHub Dedicated Project Container
PyPI: pkt-kg 2.0.1

Jupyter Notebooks:

All changes between this and the last release are thoroughly documented on the project Wiki under the v2.0.0 release. Please see that page for all described changes and updates between this and the prior release. This page also contains a description of the data used for the build as well as the data files generated as part of the build.

Note: The version on PyPI has been bumped to v2.0.1 instead of v2.0.0. This is the result of a testing error that caused an early release of the software on PyPI. Please use the latest version of the library available on PyPI. This issue will be resolved to equate the version on GitHub and PyPI in a future release.

v1.0.0

5 years ago

Release: v1.0.0

This is the first release of the PheKnowLator project. Additional information can be found here.

Data Sources

Ontologies

Classes

Instances

Knowledge Representation

Results