Kedro is a toolbox for production-ready data science. It uses software engineering best practices to help you create data engineering and data science pipelines that are reproducible, maintainable, and modular.
- `kedro run`
- `--telemetry` flag to `kedro new`, allowing the user to register consent to have user analytics collected at the same time as the project is created.
- `Pipeline` object creation and summing.
- `toposort` in favour of the built-in `graphlib` module.
- `--verbose` flag.
- `kedro pipeline create` and `kedro pipeline delete` to read the base environment from the project settings.
- `kedro catalog resolve` to read credentials properly.
- `kedro pipeline create` from `<project root>/src/tests/pipelines/<pipeline name>` to `<project root>/tests/pipelines/<pipeline name>`.
- `.gitignore` to prevent pushing the MLflow local runs folder to a remote forge when using MLflow and git.
- `node`-creation allowing self-dependencies when using transcoding, that is, datasets named like `name@format`.
- `_is_project` and `_find_kedro_project` have been moved to `kedro.utils`. We recommend not using private methods in your code, but if you do, please update your code to use the new location.
- `merge_strategy` argument in OmegaConfigLoader.

Many thanks to the following Kedroids for contributing PRs to this release:
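
For the `merge_strategy` argument in OmegaConfigLoader noted above, a minimal sketch of how it might be set in a project's `settings.py` through `CONFIG_LOADER_ARGS`. The `"parameters"` key and the `"soft"` value are assumptions based on common Kedro usage; check the configuration documentation for your installed version before relying on them.

```python
# settings.py (sketch; assumes OmegaConfigLoader is the project's config loader)

# Assumption: "merge_strategy" maps a configuration key to either "soft"
# (merge entries across environments) or "destructive" (the overriding
# environment replaces the base entries entirely).
CONFIG_LOADER_ARGS = {
    "merge_strategy": {
        "parameters": "soft",
    },
}
```

With a soft merge, parameters defined only in the base environment would be kept when another environment overrides a subset of them.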
- `%load_node` for Jupyter Notebook and Jupyter Lab.
- `%load_node` and minimal support for Databricks.
- `%load_node`.
- `kedro catalog resolve` to work with dataset factories that use `PartitionedDataset`.
- `_EPHEMERAL` attribute to `AbstractDataset` and other Dataset classes that inherit from it.
- `kedro-telemetry` and the data collected by it.

Many thanks to the following Kedroids for contributing PRs to this release:
- `tools`.
- `source_dir` explicitly in `pyproject.toml` for non-src layout project.
- `MemoryDataset` entries are now included in free outputs.
- `ruff format`.
- `SequentialRunner` and `ParallelRunner`.
- `bootstrap_project` and `configure_project`.
- `kedro run` and hook execution order.

Release 0.19.1
- `kedro-telemetry` by @merelcht in https://github.com/kedro-org/kedro/pull/3417

:rocket: Major Features and improvements
- `--starter` flag.
- `--conf-source` option to `%reload_kedro`, allowing users to specify a source for project configuration.
- `_ProjectSettings`. This enables the use of config loader as a standalone class without affecting existing Kedro Framework users.

:beetle: Bug fixes and other changes
:boom: Breaking changes
- `kedro.io` (import them from kedro-datasets instead)
- `KEDRO_LOGGING_CONFIG`.
- `data_set` and DataSet to dataset and Dataset everywhere.
- `create_default_data_set()` method in the Runner in favour of using dataset factories to create default dataset instances.

:writing_hand: Documentation changes
New Contributors
Full Changelog: https://github.com/kedro-org/kedro/compare/0.18.14...0.19.0
:rotating_light: If you are upgrading from Kedro 0.18, have a look at the migration guide.
We welcome every community contribution, large or small. See what we're working on now and report bugs or suggest future features. Until next time, The Kedro Team :yellow_heart:
- `--template` flag for `kedro pipeline create` or via `template/pipeline` folder.
- `runtime_params` resolver with `OmegaConfigLoader`.
- `OmegaConfigLoader` to handle paths containing dots outside of `conf_source`.
- `settings.py` optional.
- `standalone-datacatalog` starter into its README file.
- `kedro.extras.datasets`). Install and import them from the `kedro-datasets` package instead.
- `DataSet` are deprecated and will be removed in Kedro `0.19.0` and `kedro-datasets` `2.0.0`. Instead, use the updated class names ending with `Dataset`.
- `pandas-iris`, `pyspark-iris`, `pyspark`, and `standalone-datacatalog` are deprecated and will be archived in Kedro 0.19.0.
- `PartitionedDataset` and `IncrementalDataset` have been moved to `kedro-datasets` and will be removed in Kedro `0.19.0`. Install and import them from the `kedro-datasets` package instead.

Many thanks to the following Kedroids for contributing PRs to this release:
- `OmegaConfigLoader` features:
- `OmegaConfigLoader` through `CONFIG_LOADER_ARGS`.
- `OmegaConfigLoader`.
- `kedro catalog resolve` CLI command that resolves dataset factories in the catalog with any explicit entries in the project pipeline.
- `conf/` structure for modular pipelines, and accordingly, updated the `kedro pipeline create` and `kedro catalog create` command.
- `OmegaConfigLoader`.
- `setup.py` in new Kedro project template and Kedro starters to `pyproject.toml` and moved flake8 configuration to dedicated file `.flake8`.
- `conf/` structure.
- `OmegaConfigLoader` to ignore config from hidden directories like `.ipynb_checkpoints`.
- `data` section to restructure beginner and advanced pages about the Data Catalog and datasets.
- `ConfigLoader` and the `TemplatedConfigLoader` to the `OmegaConfigLoader`. The `ConfigLoader` and the `TemplatedConfigLoader` are deprecated and will be removed in the `0.19.0` release.
- `pytables` to `3.8.0` due to compatibility issues.
- `pyspark` at <3.4 due to breaking changes in 3.4.
- `moto` version now supports parallel test execution for Python 3.10, resolving previous issues.
- `kedro.io`; only the module where they are defined is listed as the location.

Type | Deprecated Alias | Location |
---|---|---|
`AbstractDataset` | `AbstractDataSet` | `kedro.io.core` |
`AbstractVersionedDataset` | `AbstractVersionedDataSet` | `kedro.io.core` |

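
To illustrate how deprecated aliases like those in the table can keep old names working while steering users to the new ones, here is a minimal, self-contained sketch. It is not Kedro's actual implementation: the classes are empty stand-ins, and `resolve_deprecated_alias` and `_DEPRECATED_ALIASES` are hypothetical names introduced only for this example.

```python
import warnings


class AbstractDataset:
    """Stand-in for the renamed class (not the real Kedro class)."""


class AbstractVersionedDataset(AbstractDataset):
    """Stand-in for the renamed versioned class (not the real Kedro class)."""


# Hypothetical lookup table: deprecated alias -> renamed class.
_DEPRECATED_ALIASES = {
    "AbstractDataSet": AbstractDataset,
    "AbstractVersionedDataSet": AbstractVersionedDataset,
}


def resolve_deprecated_alias(name: str) -> type:
    """Return the renamed class for a deprecated alias, emitting a warning."""
    try:
        target = _DEPRECATED_ALIASES[name]
    except KeyError:
        raise AttributeError(f"no deprecated alias named {name!r}") from None
    warnings.warn(
        f"{name!r} has been renamed to {target.__name__!r}; "
        "the alias will be removed in a future release",
        DeprecationWarning,
        stacklevel=2,
    )
    return target
```

In user code, the practical takeaway is simply to switch to the names in the "Type" column, for example `AbstractDataset` instead of `AbstractDataSet`.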
- `layer` attribute at the top level is deprecated; it will be removed in Kedro version 0.19.0. Please move `layer` inside the `metadata` -> `kedro-viz` attributes.

Thanks to Laíza Milena Scheid Parizotto and Jonathan Cohen.
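
The `layer` change described above amounts to moving one key in each catalog entry. The sketch below shows that move on a catalog entry represented as a plain Python dictionary. The helper `move_layer_to_metadata` and the sample entry (dataset type and file path) are hypothetical and only illustrate the shape of the change; in a real project the entry lives in the catalog YAML file.

```python
from copy import deepcopy


def move_layer_to_metadata(entry: dict) -> dict:
    """Return a copy of a catalog entry with a top-level `layer`
    moved to `metadata` -> `kedro-viz` -> `layer`."""
    migrated = deepcopy(entry)
    if "layer" in migrated:
        layer = migrated.pop("layer")
        viz = migrated.setdefault("metadata", {}).setdefault("kedro-viz", {})
        # Keep an existing nested value if one is already present.
        viz.setdefault("layer", layer)
    return migrated


# Hypothetical catalog entry using the deprecated top-level attribute.
old_entry = {
    "type": "pandas.CSVDataset",
    "filepath": "data/01_raw/companies.csv",
    "layer": "raw",
}

new_entry = move_layer_to_metadata(old_entry)
```

After the call, `new_entry` carries the layer under `metadata` and `kedro-viz`, while `old_entry` is left unchanged.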
- `OmegaConfigLoader` except for `oc.env`.
- `kedro catalog rank` CLI command that ranks dataset factories in the catalog by matching priority.
- `pyproject.toml`.
- `kedro catalog list` to show datasets generated with factories.
- `ruff` as the linter and removed mentions of `pylint`, `isort`, `flake8`.

Thanks to Laíza Milena Scheid Parizotto and Chris Schopp.
- `ConfigLoader` and `TemplatedConfigLoader` will be deprecated. Please use `OmegaConfigLoader` instead.