SOMBER

somber (Somber Organizes Maps By Enabling Recurrence) is a collection of numpy/Python implementations of various kinds of Self-Organizing Maps (SOMs), with a focus on SOMs for sequence data.

To the best of my knowledge, there are no other open-source implementations of the sequential SOM algorithms in this package. If you do find any, please let me know, so I can compare and link to them.

The package currently contains implementations of:

  • Regular SOM (SOM) (Kohonen, various publications)
  • Recursive SOM (RecSOM) (`Voegtlin, 2002 <http://www.sciencedirect.com/science/article/pii/S0893608002000722>`_)
  • Neural Gas (NG) (`Martinetz & Schulten, 1991 <https://www.ks.uiuc.edu/Publications/Papers/PDF/MART91B/MART91B.pdf>`_)
  • Recursive Neural Gas (Voegtlin, 2002)
  • Parameterless SOM (`Berglund & Sitte, 2007 <https://arxiv.org/abs/0705.0199>`_)

Because these sequential SOMs rely on their internal dynamics for convergence (i.e., unlike a regular Recurrent Neural Network, they are not fit to some external label), processing in a sequential SOM is currently strictly online. This means that every example is processed separately, and weight updates happen after every example; a conceptual sketch of such an online update is shown below. Research into the development of batching and/or multi-threading is currently underway.
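To make the online regime concrete, the following is a minimal numpy sketch of what a single online SOM update conceptually looks like: find the best matching unit for one sample, then pull every weight towards that sample, scaled by a neighbourhood function. This is an illustration only, not somber's actual implementation; the function name and parameters are placeholders.

.. code-block:: python

    import numpy as np

    def online_som_step(weights, grid, sample, learning_rate=0.3, sigma=1.0):
        """One conceptual online SOM update (illustrative, not somber's code).

        weights: (num_neurons, num_features) array of unit weights.
        grid:    (num_neurons, 2) array of unit positions on the map.
        sample:  (num_features,) single input vector.
        """
        # 1. Find the best matching unit (BMU) for this single sample.
        distances = np.linalg.norm(weights - sample, axis=1)
        bmu = np.argmin(distances)

        # 2. Compute a Gaussian neighbourhood around the BMU on the grid.
        grid_dist = np.linalg.norm(grid - grid[bmu], axis=1)
        neighbourhood = np.exp(-(grid_dist ** 2) / (2 * sigma ** 2))

        # 3. Update every weight immediately, before seeing the next sample.
        weights += learning_rate * neighbourhood[:, None] * (sample - weights)
        return weights

Because every sample triggers an immediate weight update, the order in which data is presented matters, which is precisely what the sequential models exploit.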

If you need a fast regular SOM, check out `SOMPY <https://github.com/sevamoo/SOMPY>`_, which is a direct port of the MATLAB SOM toolbox.

Usage

Care has been taken to make SOMBER easy to use and to function like a drop-in replacement for sklearn-like systems. The non-recurrent SOMs take [M * N] arrays as input, where M is the number of samples and N is the number of features. The recurrent SOMs take [M * S * N] arrays as input, where M is the number of sequences, S is the number of items per sequence, and N is the number of features.
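For example, assuming random placeholder data, arrays of the expected shapes can be constructed like this:

.. code-block:: python

    import numpy as np

    # 100 samples with 3 features each: input for the non-recurrent SOMs.
    X_flat = np.random.rand(100, 3)    # shape (M, N)

    # 20 sequences of 5 items with 3 features each: input for the recurrent SOMs.
    X_seq = np.random.rand(20, 5, 3)   # shape (M, S, N)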

Examples

Colors

Color clustering is a kind of Hello, World for SOMs, because it nicely demonstrates how SOMs create a continuous mapping. The color dataset comes from `this nice blog <https://codesachin.wordpress.com/2015/11/28/self-organizing-maps-with-googles-tensorflow>`_.

.. code-block:: python

    import numpy as np

    from somber import Som

    X = np.array(
        [[0., 0., 0.],
         [0., 0., 1.],
         [0., 0., 0.5],
         [0.125, 0.529, 1.0],
         [0.33, 0.4, 0.67],
         [0.6, 0.5, 1.0],
         [0., 1., 0.],
         [1., 0., 0.],
         [0., 1., 1.],
         [1., 0., 1.],
         [1., 1., 0.],
         [1., 1., 1.],
         [.33, .33, .33],
         [.5, .5, .5],
         [.66, .66, .66]])

    color_names = ['black', 'blue', 'darkblue', 'skyblue',
                   'greyblue', 'lilac', 'green', 'red',
                   'cyan', 'violet', 'yellow', 'white',
                   'darkgrey', 'mediumgrey', 'lightgrey']

    # initialize
    s = Som((10, 10), learning_rate=0.3)

    # train
    # 10 updates with 10 epochs = 100 updates to the parameters.
    s.fit(X, num_epochs=10, updates_epoch=10)

    # predict: get the index of each best matching unit.
    predictions = s.predict(X)

    # quantization error: how well do the best matching units fit?
    quantization_error = s.quantization_error(X)

    # inversion: associate each node with the exemplar that fits best.
    inverted = s.invert_projection(X, color_names)

    # mapping: get weights, mapped to the grid points of the SOM.
    mapped = s.map_weights()

    import matplotlib.pyplot as plt

    plt.imshow(mapped)
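As a possible follow-up (not part of the original example, and assuming the code above has been run), you can pair each color name with the index of its best matching unit to see where the named colors end up on the map:

.. code-block:: python

    # Assumed follow-up: inspect which unit each named color is mapped to.
    for name, bmu in zip(color_names, predictions):
        print(f"{name:>12} -> unit {bmu}")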

Sequences

In this example, we show that the recursive SOM is able to memorize short sequences generated by a Markov chain. We also demonstrate that it can generate sequences which are consistent with the sequences on which it was trained.

.. code-block:: python

    import numpy as np

    from somber import RecursiveSom
    from string import ascii_lowercase

    # Dumb sequence generator.
    def seq_gen(num_to_gen, probas):

        symbols = ascii_lowercase[:probas.shape[0]]
        identities = np.eye(probas.shape[0])
        seq = []
        ids = []
        r = 0
        choices = np.arange(probas.shape[0])
        for x in range(num_to_gen):
            r = np.random.choice(choices, p=probas[r])
            ids.append(symbols[r])
            seq.append(identities[r])

        return np.array(seq), ids

    # Transfer probabilities.
    # after an A, we have a 50% chance of B or C.
    # after B, we have a 100% chance of A.
    # after C, we have a 50% chance of B or C.
    # therefore, we will never expect sequential A or B, but we do expect
    # sequential C.
    probas = np.array(((0.0, 0.5, 0.5),
                       (1.0, 0.0, 0.0),
                       (0.0, 0.5, 0.5)))

    X, ids = seq_gen(10000, probas)

    # initialize
    # alpha = contribution of non-recurrent part to the activation.
    # beta = contribution of recurrent part to the activation.
    # higher alpha to beta ratio.
    s = RecursiveSom((10, 10), learning_rate=0.3, alpha=1.2, beta=.9)

    # train
    # show a progressbar.
    s.fit(X, num_epochs=100, updates_epoch=10, show_progressbar=True)

    # predict: get the index of each best matching unit.
    predictions = s.predict(X)

    # quantization error: how well do the best matching units fit?
    quantization_error = s.quantization_error(X)

    # inversion: associate each node with the exemplar that fits best.
    inverted = s.invert_projection(X, ids)

    # find which sequences are mapped to which neuron.
    receptive_field = s.receptive_field(X, ids)

    # generate some data by starting from some position.
    # the position can be anything, but must have a dimensionality
    # equal to the number of weights.
    starting_pos = np.ones(s.num_neurons)
    generated_indices = s.generate(50, starting_pos)

    # turn the generated indices into a sequence of symbols.
    generated_seq = inverted[generated_indices]
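To check the generated output against the transfer probabilities above, a small follow-up like the sketch below could be used (not part of the original README; it assumes generated_seq holds the single-character symbols produced by the inversion step):

.. code-block:: python

    # Assumed follow-up: join the generated symbols into a string.
    generated_str = "".join(str(sym) for sym in generated_seq)
    print(generated_str[:60])

    # Under the training chain, "aa" and "bb" should (almost) never occur,
    # while "cc" is expected to occur.
    print("aa" in generated_str, "bb" in generated_str, "cc" in generated_str)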

TODO

See issues for TODOs/enhancements. If you use SOMBER, feel free to send me suggestions!

Contributors

  • Stéphan Tulkens

LICENSE

MIT
