Spiceai Versions Save

A unified SQL query interface and portable runtime to locally materialize, accelerate, and query datasets from any database, data warehouse, or data lake.

v0.3-alpha

2 years ago

Spice.ai v0.3-alpha

We are excited to announce the release of Spice.ai v0.3-alpha! 🎉

This release adds support for ingestion, automatic encoding, and training of categorical data, enabling more use-cases and datasets beyond just numerical measurements. For example, perhaps you want to learn from data that includes a category of t-shirt sizes, with discrete values, such as small, medium, and large. The v0.3 engine now supports this and automatically encodes the categorical string values into numerical values that the AI engine can use. Also included is a preview of data visualizations in the dashboard, which is helpful for developers as they author Spicepods and dataspaces.

A special acknowledgment to @sboorlagadda, who submitted the first Spice.ai feature contribution from the community ever! He added the ability to list pods from the CLI with the new spice pods list command. Thank you, @sboorlagadda!!!

If you are new to Spice.ai, check out the getting started guide and star spiceai/spiceai on GitHub.

Highlights in v0.3-alpha

Categorical data

In v0.1, the runtime and AI engine only supported ingesting numerical data. In v0.2, tagged data was accepted and automatically encoded into fields available for learning. In this release, v0.3, categorical data can now also be ingested and automatically encoded into fields available for learning. This is a breaking change with the format of the manifest changing separating numerical measurements and categorical data.

Pre-v0.3, the manifest author specified numerical data using the fields node.

In v0.3, numerical data is now specified under measurements and categorical data under categories. E.g.

dataspaces:
  - from: event
    name: stream
    measurements:
      - name: duration
        selector: length_of_time
        fill: none
      - name: guest_count
        selector: num_guests
        fill: none
    categories:
      - name: event_type
        values:
          - dinner
          - party
      - name: target_audience
        values:
          - employees
          - investors
    tags:
      - tagA
      - tagB

Data visualizations preview

A top piece of community feedback was the ability to visualize data. After first running Spice.ai, we'd often hear from developers, "how do I see the data?". A preview of data visualizations is now included in the dashboard on the pod page.

Listing pods

Once the Spice.ai runtime has started, you can view the loaded pods on the dashboard and fetch them via API call localhost:8000/api/v0.1/pods. To make it even easier, we've added the ability to list them via the CLI with the new spice pods list command, which shows the list of pods and their manifest paths.

Coinbase data connector

A new Coinbase data connector is included in v0.3, enabling the streaming of live market ticker prices from Coinbase Pro. Enable it by specifying the coinbase data connector and providing a list of Coinbase Pro product ids. E.g. "BTC-USD". A new sample which demonstrates is also available with its associated Spicepod available from the spicerack.org registry. Get it with spice add samples/trader.

Tweet Recommendation Quickstart

A new Tweet Recommendation Quickstart has been added. Given past tweet activity and metrics of a given account, this app can recommend when to tweet, comment, or retweet to maximize for like count, interaction rates, and outreach of said given Twitter account.

Trader Sample

A new Trader Sample has been added in addition to the Trader Quickstart. The sample uses the new Coinbase data connector to stream live Coinbase Pro ticker data for learning.

New in this release

  • Adds support for ingesting, encoding, and training on categorical data. v0.3 uses one-hot-encoding.
  • Changes Spicepod manifest fields node to measurements and add the categories node.
  • Adds the ability to select a field from the source data and map it to a different field name in the dataspace. See an example for measurements in docs.
  • Adds support for JSON content type when fetching from the /observations API. Previously, only CSV was supported.
  • Adds a preview version of data visualizations to the dashboard. The grid has several limitations, one of which is it currently cannot be resized.
  • Adds the ability to select which learning algorithm to use via the CLI, the API, and specified in the Spicepod manifest. Possible choices are currently "vpg", Vanilla Policy Gradient and "dql", Deep Q-Learning. Shout out to @corentin-pro, who added this feature on his second day on the team!
  • Adds the ability to list loaded pods with the CLI command spice pods list.
  • Adds a new coinbase data connector for Coinbase Pro market prices.
  • Adds a new Tweet Recommendation Quickstart.
  • Adds a new Trader Sample.
  • Fixes bug where the /observations endpoint was not providing fully qualified field names.
  • Fixes issue where debugging messages were printed when using spice add.

v0.3-alpha-rc

2 years ago

This is the release candidate 0.3-alpha-rc

v0.2.1-alpha

2 years ago

Spice.ai v0.2.1-alpha

Announcing the release of Spice.ai v0.2.1-alpha! 🚚

This point release focuses on fixes and improvements to v0.2-alpha. Highlights include the ability to specify how missing data should be treated and a new production mode for spiced.

This release supports the ability to specify how the runtime should treat missing data. Previous releases filled missing data with the last value (or initial value) in the series. While this makes sense for some data, i.e., market prices of a stock or cryptocurrency, it does not make sense for discrete data, i.e., ratings. In v0.2.1, developers can now add the fill parameter on a dataspace field to specify the behavior. This release supports fill types previous and none. The default is previous.

Example in a manifest:

dataspaces:
  - from: twitter
    name: tweets
    fields:
      - name: likes
        fill: none # The new fill parameter

spiced now defaults to a new production mode when run standalone (not via the CLI), with development mode now explicitly set with the --development flag. Production mode does not activate development time features, such as the Spicepod file watcher. The CLI always runs spiced in development mode as it is not expected to be used in production deployments.

New in this release

  • Adds a fill parameter to dataspace fields to specify how missing values should be treated.
  • Adds the ability to specify the fill behavior of empty values in a dataspace.
  • Simplifies releases with a single spiceai release instead of separate spice and spiced releases.
  • Adds an explicit development mode to spiced. Production mode does not activate the file watcher.
  • Fixes a bug when the pod parameter epoch_time was not set which would cause data not to be sent to the AI engine.
  • Fixes a bug where the User-Agent was not set correctly from CLI calls to api.spicerack.org

v0.2.1-alpha-rc

2 years ago

This is the release candidate 0.2.1-alpha-rc

v0.2.1-alpha-rc-spiceai

2 years ago

This is the release candidate 0.2.1-alpha-rc

v0.2.1-alpha-rc-spiced

2 years ago

This is the release candidate 0.2.1-alpha-rc

v0.2.1-alpha-rc-spice

2 years ago

This is the release candidate 0.2.1-alpha-rc

v0.2-alpha-spice

2 years ago

v0.2-alpha-spiced

2 years ago

Spice.ai v0.2-alpha

We are excited to announce the release of Spice.ai v0.2-alpha! 🎉

This release is the first major version since the initial v0.1 announcement and includes significant improvements based upon community and early customer feedback. If you are new to Spice.ai, check out the getting started guide and star spiceai/spiceai on GitHub.

Highlights in v0.2-alpha

Tagged data

In the first release, the runtime and AI engine could only ingest numerical data. In v0.2, tagged data is accepted and automatically encoded into fields available for learning. For example, it's now possible to include a "liked" tag when using tweet data, automatically encoded to a 0/1 field for training. Both CSV and the new JSON observation formats support tags. The v0.3 release will add additional support for sets of categorical data.

Streaming data

Previously, the runtime would trigger each data connector to fetch on a 15-second interval. In v0.2, we upgraded the interface for data connectors to a push/streaming model, which enables continuous streaming data into the environment and AI engine.

Interpreted data

Spice.ai works together with your application code and works best when it's provided continuous feedback. This feedback could be from the application itself, for example, ratings, likes, thumbs-up/down, profit from trades, or external expertise. The interpretations API was introduced in v0.1.1, and v0.2 adds AI engine support providing a way to give meaning or an interpretation of ranges of time-series data, which are then available within reward functions. For example, a time range of stock prices could be a "good time to buy," or perhaps Tuesday mornings is a "good time to tweet," and an application or expert can teach the AI engine this through interpretations providing a shortcut to it's learning.

New in this release

  • Adds core runtime and AI engine tagged data support
  • Adds tagged data support to the CSV processor
  • Adds streaming data support to the engine and data connectors
  • Adds a new JSON data processor for ingesting JSON data
  • Adds a new Twitter data connector with JSON processor support
  • Adds a new /pods//dataspaces API
  • Adds support for using interpretations in reward functions Learn more.
  • Adds support for downloading zipped pods from the spicerack.org registry
  • Adds support for adding data along with the pod manifest when adding a pod from the spicerack.org registry
  • Adds basic /pods//diagnostics API
  • Fixes pod period, interval, and granularity not being correctly set when trying to use a "d" format
  • Fixes the color scheme of action counts in the dashboard to improve readability

v0.1.1-alpha-spice

2 years ago

alpha