Self-contained Machine Learning and Natural Language Processing library in Go
- `nlp.embeddings.syncmap` package.
- `ml.nn.recurrent.srnn.BiModel`, which implements a bidirectional variant of the Shuffling Recurrent Neural Networks (SRNN).
- `docker-entrypoint` can reuse all other `cli.App` objects, instead of just running separate executables. By extension, now the Dockerfile builds a single executable file, and the final image is way smaller.
- `fmt.Errorf` is used instead of functions from `github.com/pkg/errors`.
- Serialization now relies on `encoding.gob`. Many specific functions and methods are now replaced by fewer and simpler encoding/decoding methods compatible with `gob`. A list of important related changes follows.
  - `utils.kvdb.KeyValueDB` is no longer an interface, but a struct which directly implements the former "badger backend".
  - `utils.SerializeToFile` and `utils.DeserializeFromFile` now handle generic `interface{}` objects, instead of values implementing `Serializer` and `Deserializer`.
  - `mat32` and `mat64` custom serialization functions (e.g. `MarshalBinarySlice`, `MarshalBinaryTo`, ...) are replaced by implementations of `BinaryMarshaler` and `BinaryUnmarshaler` interfaces on `Dense` and `Sparse` matrix types.
  - `PositionalEncoder.Cache` and `AxialPositionalEncoder.Cache` fields (from `ml.encoding.pe` package) are now public.
  - Types implementing the `nn.Model` interface are registered for gob serialization (in init functions).
  - `embeddings.Model.UsedEmbeddings` type is now `nlp.embeddings.syncmap.Map`.
  - `sequencelabeler.Model.LoadParams` has been renamed to `Load`.
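- Removed, in relation to the gob serialization changes:
  - `nn.ParamSerializer` and related functions
  - `nn.ParamsSerializer` and related functions
  - `utils.Serializer` and `utils.Deserializer` interfaces
  - `utils.ReadFull` function
- `sequencelabeler.Model.LoadVocabulary` has been removed.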
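As a rough illustration of the `BinaryMarshaler` / `BinaryUnmarshaler` approach mentioned in the gob-related changes above, the following sketch shows how a matrix-like type can be made gob-compatible. The `Dense` struct, its fields, the byte layout, and the `roundTrip` helper are all hypothetical stand-ins invented for this example; they are not the library's actual types or wire format.

```go
package main

import (
	"bytes"
	"encoding/binary"
	"encoding/gob"
	"fmt"
)

// Dense is a hypothetical dense matrix used only for this sketch.
// It is NOT the library's Dense type.
type Dense struct {
	rows, cols int
	data       []float32
}

// MarshalBinary implements encoding.BinaryMarshaler.
// Assumed layout: two little-endian int64 dimensions, then the raw values.
func (d *Dense) MarshalBinary() ([]byte, error) {
	var buf bytes.Buffer
	hdr := []int64{int64(d.rows), int64(d.cols)}
	if err := binary.Write(&buf, binary.LittleEndian, hdr); err != nil {
		return nil, err
	}
	if err := binary.Write(&buf, binary.LittleEndian, d.data); err != nil {
		return nil, err
	}
	return buf.Bytes(), nil
}

// UnmarshalBinary implements encoding.BinaryUnmarshaler and reverses
// the layout written by MarshalBinary.
func (d *Dense) UnmarshalBinary(b []byte) error {
	r := bytes.NewReader(b)
	hdr := make([]int64, 2)
	if err := binary.Read(r, binary.LittleEndian, hdr); err != nil {
		return err
	}
	d.rows, d.cols = int(hdr[0]), int(hdr[1])
	d.data = make([]float32, d.rows*d.cols)
	return binary.Read(r, binary.LittleEndian, d.data)
}

// roundTrip (hypothetical helper) encodes m with gob and decodes it back.
// gob picks up the two methods above automatically.
func roundTrip(m *Dense) (*Dense, error) {
	var buf bytes.Buffer
	if err := gob.NewEncoder(&buf).Encode(m); err != nil {
		return nil, err
	}
	out := new(Dense)
	if err := gob.NewDecoder(&buf).Decode(out); err != nil {
		return nil, err
	}
	return out, nil
}

func main() {
	m := &Dense{rows: 2, cols: 2, data: []float32{1, 2, 3, 4}}
	out, err := roundTrip(m)
	if err != nil {
		panic(err)
	}
	fmt.Println(out.rows, out.cols, out.data) // prints: 2 2 [1 2 3 4]
}
```

Because `gob` recognizes any type implementing these two standard interfaces, no dedicated serializer type is needed on the caller's side, which is presumably why the bespoke helpers could be dropped.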
- `docker-entrypoint` sub-command `hugging-face-importer` has been renamed to `huggingface-importer`, just like the main command itself.
- `docker-entrypoint` sub-command can be correctly specified without leading `./` or `/` when run from a Docker container.
- Static analysis (`golint` and `gocyclo`) added to the `Go` GitHub workflow.
- Concurrency limit: see `ml.ag.ConcurrentComputations()` (`GraphOption`) and `ml.ag.Graph.ConcurrentComputations()`. If no option is specified, by default the limit is set to `runtime.NumCPU()`.
- Concurrent heavyweight computations of `ml.optimizers.gd.GradientDescent` (e.g. params update step).
- `utils.processingqueue`.
- `mat32` package, which operates on `float32` data type.
- Switching between `float32` and `float64` as default floating-point data type, using the script `change-float-type.sh`.
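To give an idea of why a single switch can change the precision of a whole code base, here is a minimal single-file sketch. It uses a type alias named `Float` to stand in for a project-wide floating-point type; in the library itself the switch is made through package imports rather than by editing an alias, so this is only an analogy. The `Dot` function is a hypothetical example and assumes both slices have the same length.

```go
package main

import "fmt"

// Float is the floating-point type used throughout this sketch.
// Changing this one alias to float64 changes the precision of all
// code written against Float. (Assumption: this mimics, in one file,
// the effect of swapping a float32 package for a float64 one.)
type Float = float32

// Dot (hypothetical) returns the dot product of a and b.
// It assumes len(a) == len(b).
func Dot(a, b []Float) Float {
	var s Float
	for i := range a {
		s += a[i] * b[i]
	}
	return s
}

func main() {
	x := []Float{1, 2, 3}
	y := []Float{4, 5, 6}
	fmt.Printf("%T %v\n", Dot(x, y), Dot(x, y)) // prints: float32 32
}
```

Code that never mentions `float32` or `float64` directly, only `Float`, does not need to be touched when the underlying type changes.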
- `Go` GitHub workflow has been adapted to run tests using both `float32` and `float64` as main floating-point data type.
- `ml.ag.ConcurrentComputations` (`GraphOption`) expects the maximum number of concurrent computations handled by heavyweight Graph operations (e.g. forward and backward steps).
- `ml.nn.linear.Model` and `ml.nn.convolution.Model` read the concurrent computations limit set on the model's Graph, thus `SetConcurrentComputations()` methods have been removed.
- `mat` has been renamed to `mat64` and some functions have been renamed.
- `float32` is now the floating-point data type used by default, through the package `mat32`.
- When imported, `mat32` is always aliased as `mat`. Then, explicit usages of `float64` type have been replaced with `mat.Float`. Moreover, bitsize-specific functions have been made more generic (i.e. operating with `mat.Float` type) or split into separate implementations, in `mat32` and `mat64`. In this way, switching the whole project between `float32` and `float64` is just a matter of changing all imports, from `mat32` to `mat64`, or vice-versa (see also the new file `change-float-type.sh`).
- `nlp.sequencelabeler.Convert()` now loads and converts original Flair models, instead of pre-processed dumps.
- Changes to the `nn.Model` and `nn.Processor` interfaces:
  - the `nn.Model` interface can be reified to become a neural processor. See `nn.Reify()`;
  - the `Forward` method has been removed from the `nn.Model` interface, gracefully increasing flexibility in the implementation of a model.