🐍 Python Implementation and Extension of RDF2Vec
- `skip_verify` attribute to the `KG` class to control whether the existence of entities is verified against remote Knowledge Graphs (default to `skip_verify=False`).
- `WideSampler` as a new sampling strategy.
- `SplitWalker` as a new walking strategy.
- `poetry`.
- `HALKWalker` walking strategy.
- `RandomWalker` and `CommunityWalker` to return duplicate walks and prevent a different number of walks for the entities.
- `with_reverse` parameter for the different walking strategies.
- `_post_extract` private method in the `Walker` class for post-processing of walks by a walking strategy.
- `HALKWalker` (0.001 -> 0.01).
- `negative=20` and `vector_size=500` for `Word2Vec`.
- `size` hyperparameter by `vector_size` of the default dictionary in the `Word2Vec` class.
- `_update` private method in the `RDF2VecTransformer` class.
- `md5_bytes` attribute in the `CommunityWalker`, `HALKWalker`, `RandomWalker`, and `WLWalker` classes to control whether an object is hashed with MD5 and how many bytes of the hash to keep.
- `extract` method in the `Walker` to return a list of entities with their walks instead of a list of walks.
- Fix the issue with `nest-asyncio` as dependency.
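The `md5_bytes` entry above concerns hashing an object with MD5 and keeping only part of the digest. A minimal sketch of that idea using only the standard library follows; the helper name `hash_label` and its convention that `None` disables hashing are assumptions for illustration, not pyRDF2Vec's actual code.

```python
import hashlib
from typing import Optional


def hash_label(label: str, md5_bytes: Optional[int] = 8) -> str:
    """Hypothetical helper: shorten a label via a truncated MD5 digest.

    When md5_bytes is None the label is returned unchanged; otherwise only
    the first md5_bytes bytes of the 16-byte digest are kept (hex-encoded).
    """
    if md5_bytes is None:
        return label
    digest = hashlib.md5(label.encode("utf-8")).digest()
    return digest[:md5_bytes].hex()


# Long URIs collapse to short fixed-size tokens (2 hex characters per byte).
token = hash_label("http://example.org/resource/Alice", md5_bytes=8)
print(len(token))
```

Keeping fewer bytes saves memory when many walks are stored, at the cost of a higher chance that two different labels map to the same token.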
- `cache` (default to `cachetools.TTLCache(maxsize=1024, ttl=1200)`) attribute to the `KG` class to significantly speed up the walks extraction through caching.
- `is_update` (default to `False`) hyper-parameter in the `fit` method of the `Embedder` and `Word2Vec` classes to update an existing vocabulary.
- `literals` (default to `[]`) attribute in the `KG` class to support a basic literal extraction.
- `mul_req` (default to `False`) attribute to the `KG` class to speed up the extraction of walks and literals for remote Knowledge Graphs by sending asynchronous requests.
- `n_jobs` (default to `None`) attribute to the `Walker` class to speed up the extraction of walks with multiprocessing.
- `random_state` (default to `None`) parameter for the `Walker` class to better handle random determinism with walking and sampling strategies.
- `verbose` (default to `0`) attribute to the `RDF2VecTransformer` class to display useful debugging information and to measure the time of extraction, fit and generation of embeddings and literals.
- `with_reverse` (default to `False`) parameter for the `Walker` class to generate more walks and improve the accuracy with `Word2Vec`, by including the parents of the entities in the walks.
- `load` and `save` methods in the `RDF2VecTransformer` class.
- `Connector` generic class to simplify the implementation of new connectors.
- `SPARQLConnector` class to delegate the connection part to the SPARQL endpoint server.
- `Vertex` class with slots to reduce RAM usage.
- `WalkerNotSupported` and `SamplerNotSupported` exceptions in the `Walker` and `Sampler` classes when a walking strategy or a sampling strategy is not supported.
- `_cast_literals` private method to the `KG` class to convert the raw literals of an entity according to their real types.
- `_embeddings`, `_entities`, `_literals`, and `_walks` attributes in the `RDF2VecTransformer` class to be able to get all the embeddings, entities, literals, and walks after the online training of a model.
- `_fill_hops` private method in the `KG` class to fill the entity hops in cache when `mul_req=True` is provided for a remote Knowledge Graph.
- `_get_hops` private method in the `KG` class to get the hops of a vertex for a local Knowledge Graph.
- `_is_support_remote` (default to `False`) private attribute in the `Walker` and `Sampler` classes to restrict the use of walking and sampling strategies for some remote/local Knowledge Graphs.
- `_res2hops` private method in the `KG` class to convert a JSON response from a SPARQL endpoint server to hops.
- `add_walk` method to the `KG` class to simplify the addition of a walk to a Knowledge Graph.
- `examples/online-training` and `examples/literals` files to illustrate the use of online training and literals with `pyRDF2Vec`.
- `fetch_hops` method to the `KG` class to fetch the hops of a vertex on a remote Knowledge Graph.
- `get_pliterals` method to the `KG` class to get the literals for an entity and a local KG based on a chain of predicates.
- `get_walks` method in the `RDF2VecTransformer` class to get the walks of given entities in a Knowledge Graph.
- `get_weights` method in the `Sampler` class to get the hops weights.
- `pyrdf2vec.typings` file to contain the aliases of the most commonly used types with mypy.
- `get_weight` method in the `PageRankSampler` to raise an error if the method is called before the `fit` method.
- `remove_edge` method of the `KG` class to also remove the edge of a child for a parent node.
- `_counts` dictionary with the `PredFreqSampler` and `ObjPredFreqSampler` classes.
- `_get_shops` and `_get_rhops` functions in the `KG` class.
- `id` attribute of the `Vertex` class.
- `print_walks` method of the `Walker` class.
- `read_file` method in the `KG` class.
- `visualise` method in the `KG` class.
- `HalkWalker` class by `HALKWalker`.
- `SPARQLWrapper` library in favor of using `requests` for synchronous requests and `aiohttp` for asynchronous requests.
- `WeisfeilerLehmanWalker` class by `WLWalker`.
- `add_edge`, `add_vertex`, and `remove_edge` methods in the `KG` class to return a boolean value indicating that the addition/removal of an edge/vertex has been performed.
- `depth` parameter with `max_depth` for the `Walker` class.
- `extract_random_community_walks`, `extract_random_community_walks_bfs`, and `extract_random_community_walks_dfs` methods in the `CommunityWalker` class by `extract_walks`, `_bfs`, and `_dfs` methods.
- `extract_random_walks`, `extract_random_walks_bfs`, and `extract_random_walks_dfs` methods in the `RandomWalker` class by `extract_walks`, `_bfs`, and `_dfs` methods.
- `file_type` attribute in the `KG` class by `fmt`.
- `get_inv_neighbors` method in the `KG` class by an `is_reverse` (default to `False`) parameter in the `get_neighbors` method.
- `initialize` method in the `Sampler` class by the use of `@property`.
- `is_remote` parameter in the `KG` class for automatic link detection based on the http and https prefix.
- `last` parameter with `is_last_depth` in the `sample_neighbor` method of the `Sampler` class.
- `label_predicates` attribute in the `KG` class by `skip_predicates`, which now uses a set instead of a list.
- `pyrdf2vec.graphs.kg.Vertex` class with `pyrdf2vec.graphs.Vertex`.
- `fit_transform` and `transform` functions in the `RDF2VecTransformer` class to return a tuple containing the list of embeddings and literals.
- `RDF2VecTransformer` class for `Word2Vec`.
- `Word2Vec` class to `size=500`, `min_count=0`, and `negative=20`.
- `RDF2VecTransformer` class to `[RandomWalker(2)]`.
- Removes default prints from `rdf2vec`.
- Fix the README in PyPI.
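The entry about putting the `Vertex` class in a slot to reduce RAM usage refers to a standard Python memory optimization. The sketch below shows the general technique with a hypothetical `SlimVertex` class; the class name and its fields are assumptions for illustration, not the library's implementation.

```python
class SlimVertex:
    """Hypothetical vertex type showing the effect of __slots__."""

    # Declaring __slots__ fixes the set of attributes and removes the
    # per-instance __dict__, which is where most of the memory saving comes from.
    __slots__ = ("name", "predicate")

    def __init__(self, name: str, predicate: bool = False) -> None:
        self.name = name
        self.predicate = predicate


vertex = SlimVertex("http://example.org/Alice")
print(hasattr(vertex, "__dict__"))  # slotted instances carry no __dict__

try:
    vertex.color = "red"  # not declared in __slots__
except AttributeError as err:
    print("rejected:", err)
```

The trade-off is that attributes can no longer be attached to instances at runtime, which is usually acceptable for small value objects created in very large numbers, such as graph vertices.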
- `verbose` (default to `False`) hyper-parameter for the `fit` method.
- `Embedder` abstract class (currently only Word2Vec is included).
- (… `UniformSampler`) from Cochez et al. to better deal with larger Knowledge Graphs.
- `extract_random_walks_dfs` and `extract_random_walks_bfs` methods for the `RandomWalker` class.
- `get_hops` method along with the private `_get_rhops` and `_get_shops` methods in the `KG` class.
- (… `examples/countries.py`, `examples/mutag.py` and `examples/samplers.py`) for `pyRDF2Vec`.
- `graph` for `kg` in the `fit` and `fit_transform` methods of the `RDF2VecTransformer` class.
- `instance` for `entities` in the `transform` and `fit_transform` methods of the `RDF2VecTransformer` class.
- `gensim` implementation.
- `KnowledgeGraph` class for `KG`.
- `Walker` class to be abstract.
- `_rdf2vec.py` file for `rdf2vec.py`.
- `extract_random_community_walks` method in the `CommunityWalker` to be private.
- `extract` methods in `walkers` to be private.
- `graph.py` file for `graphs/kg.py`.
- `rdf2vec` module for `pyrdf2vec`.
- `graph` hyper-parameter in the `transform` method of the `RDF2VecTransformer` class.
- `RDF2VecTransformer` for `embedder` and `walkers` ones.
- `WildcardWalker` walking strategy.
- `converter.py` file.
- `create_kg`, `endpoint_to_kg`, `rdflib_to_kg` functions for the `location`, `file_type`, `is_remote` hyper-parameters in `KG` with the `read_file` private method.
- `Vertex.vertex_count` for `itertools.count` in the `Vertex` class.
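The last entry mentions `itertools.count` in connection with the `Vertex` class. One common way to hand out sequential identifiers with it is sketched below; the class `CountedVertex` and its `id` attribute are hypothetical and only illustrate the pattern, not how the library does it.

```python
import itertools


class CountedVertex:
    """Hypothetical vertex that receives a sequential id on creation."""

    # A single shared iterator; each next() call yields the next integer,
    # so no manual class-level counter has to be incremented.
    _ids = itertools.count()

    def __init__(self, name: str) -> None:
        self.name = name
        self.id = next(CountedVertex._ids)


first = CountedVertex("http://example.org/Alice")
second = CountedVertex("http://example.org/Bob")
print(second.id - first.id)  # consecutive instances get consecutive ids
```

Compared with a hand-maintained class attribute, the iterator keeps the increment logic in one place and cannot be forgotten in `__init__`.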