Open source platform for the machine learning lifecycle
We are happy to announce the availability of MLflow 1.30.0!
MLflow 1.30.0 includes several major features and improvements
Features:
Delta tables as a datasource in the ingest step (#7010, @sunishsheth2009)
run_name attribute for create_run, get_run and update_run APIs (#6782, #6798, @apurva-koti)
creation_time and last_update_time for the search_experiments API (#6979, @harupy)
run_id IN and run ID NOT IN for the search_runs API (#6945, @harupy)
user_id and end_time for the search_runs API (#6881, #6880, @subramaniam02)
run_name and run_id for the search_runs API (#6899, @harupy; #6952, @alexacole)
name attribute and mlflow.runName tag (#6971, @BenWilson2)
update_run() API for modifying the status and name attributes of existing runs (#7013, @gabrielfu)
mlflow gc cli API (#6977, @shaikmoeed)
evaluate() API (#6728, @jerrylian-db)
evaluate() API (#7077, @dbczumar)
BooleanType to mlflow.pyfunc.spark_udf() (#6913, @BenWilson2)
Pool class options for SqlAlchemyStore (#6883, @mingyu89)
Bug fixes:
SparkSession if one does not exist (#6846, @prithvikannan)
bool column types in Step Card data profiles (#6907, @sunishsheth2009)
mlflow.pyspark.ml.autolog() (#6831, @harupy)
mlflow-skinny package to serve as base requirement in MLmodel requirements (#6974, @BenWilson2)
pos_label to sklearn.metrics.precision_recall_curve in mlflow.evaluate() (#6854, @dbczumar)
SqlAlchemyStore where set_tag() updates the incorrect tags (#7027, @gabrielfu)
Documentation updates:
Keras serialization format (#7022, @balvisio)
Small bug fixes and documentation updates:
#7093, #7095, #7092, #7064, #7049, #6921, #6920, #6940, #6926, #6923, #6862, @jerrylian-db; #6946, #6954, #6938, @mingyu89; #7047, #7087, #7056, #6936, #6925, #6892, #6860, #6828, @sunishsheth2009; #7061, #7058, #7098, #7071, #7073, #7057, #7038, #7029, #6918, #6993, #6944, #6976, #6960, #6933, #6943, #6941, #6900, #6901, #6898, #6890, #6888, #6886, #6887, #6885, #6884, #6849, #6835, #6834, @harupy; #7094, #7065, #7053, #7026, #7034, #7021, #7020, #6999, #6998, #6996, #6990, #6989, #6934, #6924, #6896, #6895, #6876, #6875, #6861, @prithvikannan; #7081, #7030, #7031, #6965, #6750, @bbarnes52; #7080, #7069, #7051, #7039, #7012, #7004, @dbczumar; #7054, @jinzhang21; #7055, #7037, #7036, #6949, #6951, @apurva-koti; #6815, @michaguenther; #6897, @chaturvedakash; #7025, #6981, #6950, #6948, #6937, #6829, #6830, @BenWilson2; #6982, @vadim; #6985, #6927, @kriscon-db; #6917, #6919, #6872, #6855, @WeichenXu123; #6980, @utkarsh867; #6973, #6935, @wentinghu; #6930, @mingyangge-db; #6956, @RohanBha1; #6916, @av-maslov; #6824, @shrinath-suresh; #6732, @oojo12; #6807, @ikrizanic; #7066, @subramaniam20jan; #7043, @AvikantSrivastava; #6879, @jspablo
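The run_id IN and run ID NOT IN filters for the search_runs API, listed under the 1.30.0 features, can be sketched roughly as follows. This is an illustration only, not code from the MLflow repository: the helper names build_run_id_filter and search_by_run_ids are hypothetical, and the "attributes.run_id IN (...)" spelling of the filter is assumed from the MLflow search syntax documentation.

```python
import re

# Run IDs are assumed to be plain alphanumeric strings (MLflow generates
# 32-character hex IDs), so anything else is rejected rather than escaped.
_RUN_ID_PATTERN = re.compile(r"[A-Za-z0-9]+")


def build_run_id_filter(run_ids, negate=False):
    """Build a search_runs filter string that matches (or excludes) run IDs."""
    run_ids = list(run_ids)
    if not run_ids:
        raise ValueError("at least one run ID is required")
    for run_id in run_ids:
        if not _RUN_ID_PATTERN.fullmatch(run_id):
            raise ValueError(f"unexpected characters in run ID: {run_id!r}")
    quoted = ", ".join(f"'{run_id}'" for run_id in run_ids)
    operator = "NOT IN" if negate else "IN"
    return f"attributes.run_id {operator} ({quoted})"


def search_by_run_ids(experiment_ids, run_ids, negate=False):
    """Apply the filter through mlflow.search_runs (assumes MLflow >= 1.30.0)."""
    import mlflow  # imported lazily so build_run_id_filter works without MLflow

    return mlflow.search_runs(
        experiment_ids=experiment_ids,
        filter_string=build_run_id_filter(run_ids, negate=negate),
    )
```

Building the filter string separately from the search call keeps the validation easy to check without a tracking server.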
We are happy to announce the availability of MLflow 1.29.0!
MLflow 1.29.0 includes several major features and improvements
Features:
[Pipelines] Improve performance and fidelity of dataset profiling in the scikit-learn regression Pipeline (#6792, @sunishsheth2009)
[Pipelines] Add an mlflow pipelines get-artifact CLI for retrieving Pipeline artifacts (#6517, @prithvikannan)
[Pipelines] Introduce an option for skipping dataset profiling to the scikit-learn regression Pipeline (#6456, @apurva-koti)
[Pipelines / UI] Display an mlflow pipelines CLI command for reproducing a Pipeline run in the MLflow UI (#6376, @hubertzub-db)
[Tracking] Automatically generate friendly names for Runs if not supplied by the user (#6736, @BenWilson2)
[Tracking] Add load_text(), load_image() and load_dict() fluent APIs for convenient artifact loading (#6475, @subramaniam02)
[Tracking] Add creation_time and last_update_time attributes to the Experiment class (#6756, @subramaniam02)
[Tracking] Add official MLflow Tracking Server Dockerfiles to the MLflow repository (#6731, @oojo12)
[Tracking] Add searchExperiments API to Java client and deprecate listExperiments (#6561, @dbczumar)
[Tracking] Add mlflow_search_experiments API to R client and deprecate mlflow_list_experiments (#6576, @dbczumar)
[UI] Make URLs clickable in the MLflow Tracking UI (#6526, @marijncv)
[UI] Introduce support for CSV data preview within the artifact viewer pane (#6567, @nnethery)
[Model Registry / Models] Introduce mlflow.models.add_libraries_to_model() API for adding libraries to an MLflow Model (#6586, @arjundc-db)
[Models] Add model validation support to mlflow.evaluate() (#6582, @zhe-db, @jerrylian-db)
[Models] Introduce sample_weights support to mlflow.evaluate() (#6806, @dbczumar)
[Models] Add pos_label support to mlflow.evaluate() for identifying the positive class (#6696, @harupy)
[Models] Make the metric name prefix and dataset info configurable in mlflow.evaluate() (#6593, @dbczumar)
[Models] Add utility for validating the compatibility of a dataset with a model signature (#6494, @serena-ruan)
[Models] Add predict_proba() support to the pyfunc representation of scikit-learn models (#6631, @skylarbpayne)
[Models] Add support for Decimal type inference to MLflow Model schemas (#6600, @shitaoli-db)
[Models] Add new CLI command for generating Dockerfiles for model serving (#6591, @anuarkaliyev23)
[Scoring] Add /health endpoint to scoring server (#6574, @gabriel-milan)
[Scoring] Support specifying a variant_name during SageMaker deployment (#6486, @nfarley-soaren)
[Scoring] Support specifying a data_capture_config during SageMaker deployment (#6423, @jonwiggins)
Bug fixes:
[Tracking] Make Run and Experiment deletion and restoration idempotent (#6641, @dbczumar)
[UI] Fix an alignment bug affecting the Experiments list in the MLflow UI (#6569, @sunishsheth2009)
[Models] Fix a regression in the directory path structure of logged Spark Models that occurred in MLflow 1.28.0 (#6683, @gwy1995)
[Models] No longer reload the main module when loading model code (#6647, @Jooakim)
[Artifacts] Fix an mlflow server compatibility issue with HDFS when running in --serve-artifacts mode (#6482, @shidianshifen)
[Scoring] Fix an inference failure with 1-dimensional tensor inputs in TensorFlow and Keras (#6796, @LiamConnell)
Documentation updates:
[Tracking] Mark the SearchExperiments API as stable (#6551, @dbczumar)
[Tracking / Model Registry] Deprecate the ListExperiments, ListRegisteredModels, and list_run_infos() APIs (#6550, @dbczumar)
[Scoring] Deprecate mlflow.sagemaker.deploy() in favor of SageMakerDeploymentClient.create() (#6651, @dbczumar)
Small bug fixes and documentation updates:
#6803, #6804, #6801, #6791, #6772, #6745, #6762, #6760, #6761, #6741, #6725, #6720, #6666, #6708, #6717, #6704, #6711, #6710, #6706, #6699, #6700, #6702, #6701, #6685, #6664, #6644, #6653, #6629, #6639, #6624, #6565, #6558, #6557, #6552, #6549, #6534, #6533, #6516, #6514, #6506, #6509, #6505, #6492, #6490, #6478, #6481, #6464, #6463, #6460, #6461, @harupy; #6810, #6809, #6727, #6648, @BenWilson2; #6808, #6766, #6729, @jerrylian-db; #6781, #6694, @marijncv; #6580, #6661, @bbarnes52; #6778, #6687, #6623, @shraddhafalane; #6662, #6737, #6612, #6595, @sunishsheth2009; #6777, @aviralsharma07; #6665, #6743, #6573, @liangz1; #6784, @apurva-koti; #6753, #6751, @mingyu89; #6690, #6455, #6484, @kriscon-db; #6465, #6689, @hubertzub-db; #6721, @WeichenXu123; #6722, #6718, #6668, #6663, #6621, #6547, #6508, #6474, #6452, @dbczumar; #6555, #6584, #6543, #6542, #6521, @dsgibbons; #6634, #6596, #6563, #6495, @prithvikannan; #6571, @smurching; #6630, #6483, @serena-ruan; #6642, @thinkall; #6614, #6597, @jinzhang21; #6457, @cnphil; #6570, #6559, @kumaryogesh17; #6560, #6540, @iamthen0ise; #6544, @Monkero; #6438, @ahlag; #3292, @dolfinus; #6637, @ninabacc-db; #6632, @arpitjasa-db
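The pos_label option added to mlflow.evaluate() in 1.29.0 identifies which label counts as the positive class when binary classification metrics are computed. The sketch below illustrates that idea in plain Python. The precision_recall helper is hypothetical and is not MLflow's implementation; it only shows why the choice of positive class changes the reported numbers.

```python
def precision_recall(y_true, y_pred, pos_label=1):
    """Compute precision and recall, treating `pos_label` as the positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == pos_label and p == pos_label)
    fp = sum(1 for t, p in pairs if t != pos_label and p == pos_label)
    fn = sum(1 for t, p in pairs if t == pos_label and p != pos_label)
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return precision, recall


y_true = ["spam", "ham", "spam", "ham", "ham"]
y_pred = ["spam", "spam", "ham", "ham", "ham"]

# The same predictions score differently depending on the positive class.
spam_scores = precision_recall(y_true, y_pred, pos_label="spam")
ham_scores = precision_recall(y_true, y_pred, pos_label="ham")
```

With string labels such as these there is no natural default, which is the situation an explicit positive label is meant to resolve.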
MLflow 1.28.0 includes several major features and improvements:
Features:
pipeline.yaml configurations to specify the Model Registry backend used for model registration (#6284, @sunishsheth2009)
transform step of the scikit-learn regression pipeline (#6362, @sunishsheth2009)
mlflow.search_experiments() API for searching experiments by name and by tags (#6333, @WeichenXu123; #6227, #6172, #6154, @harupy)
--older-than flag to mlflow gc for removing runs based on deletion time (#6354, @Jason-CKY)
MLFLOW_SQLALCHEMYSTORE_POOL_RECYCLE environment variable for recycling SQLAlchemy connections (#6344, @postrational)
MlflowClient importable as mlflow.MlflowClient (#6085, @subramaniam02)
stage parameter to set_model_version_tag() (#6185, @subramaniam02)
--registry-store-uri flag to mlflow server for specifying the Model Registry backend URI (#6142, @Secbone)
model_uri optional in mlflow models build-docker to support building generic model serving images (#6302, @harupy)
Bug fixes and documentation updates:
xdg-open instead of open for viewing Pipeline results on Linux systems (#6326, @strangiato)
mlflow.pyspark.ml.autolog() to only log model signatures for supported input / output data types (#6365, @harupy)
mlflow.tensorflow.autolog() to log TensorFlow early stopping callback info when log_models=False is specified (#6170, @WeichenXu123)
mlflow.sklearn.autolog() for models containing transformers (#6230, @dbczumar)
mlflow gc that occurred when removing a run whose artifacts had been previously deleted (#6165, @dbczumar)
sqlparse library to MLflow Skinny client, which is required for search support (#6174, @dbczumar)
mlflow server bug that rejected parameters and tags with empty string values (#6179, @dbczumar)
--serve-artifacts enabled (#6355, @abbas123456)
mlflow deployments predict CLI (#6323, @dbczumar)
mlflow.pyfunc.spark_udf() (#6244, @harupy)
MlflowClient from mlflow.tracking to mlflow.client (#6405, @dbczumar)
CONTRIBUTING.rst (#6330, @ahlag)
Small bug fixes and doc updates (#6322, #6321, #6213, @KarthikKothareddy; #6409, #6408, #6396, #6402, #6399, #6398, #6397, #6390, #6381, #6386, #6385, #6373, #6375, #6380, #6374, #6372, #6363, #6353, #6352, #6350, #6351, #6349, #6347, #6287, #6341, #6342, #6340, #6338, #6319, #6314, #6316, #6317, #6318, #6315, #6313, #6311, #6300, #6292, #6291, #6289, #6290, #6278, #6279, #6276, #6272, #6252, #6243, #6250, #6242, #6241, #6240, #6224, #6220, #6208, #6219, #6207, #6171, #6206, #6199, #6196, #6191, #6190, #6175, #6167, #6161, #6160, #6153, @harupy; #6193, @jwgwalton; #6304, #6239, #6234, #6229, @sunishsheth2009; #6258, @xanderwebs; #6106, @balvisio; #6303, @bbarnes52; #6117, @wenfeiy-db; #6389, #6214, @apurva-koti; #6412, #6420, #6277, #6266, #6260, #6148, @WeichenXu123; #6120, @ameya-parab; #6281, @nathaneastwood; #6426, #6415, #6417, #6418, #6257, #6182, #6157, @dbczumar; #6189, @shrinath-suresh; #6309, @SamirPS; #5897, @temporaer; #6251, @herrmann; #6198, @sniafas; #6368, #6158, @jinzhang21; #6236, @subramaniam02; #6036, @serena-ruan; #6430, @ninabacc-db)
Note: Version 1.28.0 of the MLflow R package has not yet been released. It will be available on CRAN within the next week.
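The --older-than flag to mlflow gc listed in the 1.28.0 features takes a time limit. As a rough illustration of how such a limit might be interpreted, the sketch below converts a duration string into a timedelta. It assumes the #d#h#m#s format (for example 1d2h3m4s, with float values allowed) described in the MLflow CLI documentation; the parse_older_than helper is hypothetical and is not MLflow's own parser.

```python
import re
from datetime import timedelta

# Assumed format: optional day, hour, minute and second parts in that order,
# each a non-negative number followed by d, h, m or s (e.g. "1d2h3m4s").
_NUMBER = r"\d+(?:\.\d+)?"
_DURATION = re.compile(
    rf"(?:(?P<days>{_NUMBER})d)?"
    rf"(?:(?P<hours>{_NUMBER})h)?"
    rf"(?:(?P<minutes>{_NUMBER})m)?"
    rf"(?:(?P<seconds>{_NUMBER})s)?"
)


def parse_older_than(value):
    """Convert a duration string such as "1d2h3m4s" into a timedelta."""
    match = _DURATION.fullmatch(value)
    if not value or match is None:
        raise ValueError(f"invalid duration: {value!r}")
    # Keep only the units that were present and pass them to timedelta.
    parts = {unit: float(num) for unit, num in match.groupdict().items() if num}
    return timedelta(**parts)
```

A cutoff timestamp can then be obtained by subtracting the result from the current time, for example datetime.now() - parse_older_than("7d").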
MLflow 1.27.0 includes several major features and improvements:
[Pipelines] With MLflow 1.27.0, we are excited to announce the release of MLflow Pipelines, an opinionated framework for structuring MLOps workflows that simplifies and standardizes machine learning application development and productionization. MLflow Pipelines makes it easy for data scientists to follow best practices for creating production-ready ML deliverables, allowing them to focus on developing excellent models. MLflow Pipelines also enables ML engineers and DevOps teams to seamlessly deploy models to production and incorporate them into applications. To get started with MLflow Pipelines, check out the docs at https://mlflow.org/docs/latest/pipelines.html. (#6115)
[UI] Introduce UI support for searching and comparing runs across multiple Experiments (#5971, @r3stl355)
More features:
ndarray and tensor instances as metrics via the mlflow.log_metric() API (#5756, @ntakouris)
CatBoostRanker models to the mlflow.catboost flavor (#6032, @danielgafni)
KernelExplainer with mlflow.evaluate(), enabling model explanations on categorical data (#6044, #5920, @WeichenXu123)
mlflow.evaluate() to automatically log the score() outputs of scikit-learn models as metrics (#5935, #5903, @WeichenXu123)
Bug fixes and documentation updates:
sqlalchemy>=1.4.0 upon MLflow installation, which is necessary for usage of SQL-based MLflow Tracking backends (#6024, @sniafas)
mlflow server to reject LogParam API requests containing empty string values (#6031, @harupy)
matplotlib was not installed on the host system (#5995, @fa9r)
tf.data.Dataset inputs (#6061, @dbczumar)
mlflow.sklearn.model() did not properly restore bundled model code (#6037, @WeichenXu123)
mlflow.evaluate() that caused input data objects to be mutated when evaluating certain scikit-learn models (#6141, @dbczumar)
mlflow.pyfunc.spark_udf that occurred when the UDF was invoked on an empty RDD partition (#6063, @WeichenXu123)
mlflow models build-docker that occurred when env-manager=local was specified (#6046, @bneijt)
master branch (#5889, @harupy)
Small bug fixes and doc updates (#6041, @drsantos89; #6138, #6137, #6132, @sunishsheth2009; #6144, #6124, #6125, #6123, #6057, #6060, #6050, #6038, #6029, #6030, #6025, #6018, #6019, #5962, #5974, #5972, #5957, #5947, #5907, #5938, #5906, #5932, #5919, #5914, #5888, #5890, #5886, #5873, #5865, #5843, @harupy; #6113, @comojin1994; #5930, @yashaswikakumanu; #5837, @shrinath-suresh; #6067, @deepyaman; #5997, @idlefella; #6021, @BenWilson2; #5984, @Sumanth077; #5929, @krunal16-c; #5879, @kugland; #5875, @ognis1205; #6006, @ryanrussell; #6140, @jinzhang21; #5983, @elk15; #6022, @apurva-koti; #5982, @EB-Joel; #5981, #5980, @punitkashyup; #6103, @ikrizanic; #5988, #5969, @SaumyaBhushan; #6020, #5991, @WeichenXu123; #5910, #5912, @Dark-Knight11; #6005, @Asinsa; #6023, @subramaniam02; #5999, @Regis-Caelum; #6007, @CaioCavalcanti; #5943, @kvaithin; #6017, #6002, @NeoKish; #6111, @T1b4lt; #5986, @seyyidibrahimgulec; #6053, @Zohair-coder; #6146, #6145, #6143, #6139, #6134, #6136, #6135, #6133, #6071, #6070, @dbczumar; #6026, @rotate2050)
MLflow 1.26.1 is a patch release containing the following bug fixes:
protobuf >= 4.21.0 (#5945, @harupy)
get_model_dependencies behavior for models: URIs containing artifact paths (#5921, @harupy)
artifacts persistence in mlflow.pyfunc.log_model() that was introduced in MLflow 1.25.0 (#5891, @kyle-jarvis)
EvaluationArtifact outputs from mlflow.evaluate() are garbage collected (#5900, @WeichenXu123)
Small bug fixes and updates (#5874, #5942, #5941, #5940, #5938, @harupy; #5893, @PrajwalBorkar; #5909, @yashaswikakumanu; #5937, @BenWilson2)
MLflow 1.26.0 includes several major features and improvements:
Features:
mlflow.set_tracking_uri to add support for paths defined as pathlib.Path in addition to existing str path declarations (#5824, @cacharle)
pos_label argument for eval_and_log_metrics API to support accurate binary classifier evaluation metrics (#5807, @yxiong)
input_example and signature logging for pyspark ml flavor when using autologging (#5719, @bali0019)
virtualenv environment manager support for mlflow models docker-build CLI (#5728, @harupy)
virtualenv environment manager support for MLflow projects (#5631, @harupy)
virtualenv environment manager support for MLflow Models (#5380, @harupy)
virtualenv environment manager support for mlflow.pyfunc.spark_udf (#5676, @WeichenXu123)
input_example and signature logging for tensorflow flavor when using autologging (#5510, @bali0019)
endpoint interface for mlflow deployments (#5378, @trangevi)
End Time and Duration fields to run comparison page (#3378, @RealArpanBhattacharya)
Bug fixes and documentation updates:
ag-grid and implement getRowId to improve performance in the runs table visualization (#5725, @adamreeve)
tf-serving parsing to support columnar-based formatting (#5825, @arjundc-db)
log_artifact to support models larger than 2GB in HDFS (#5812, @hitchhicker)
lightgbm metric names with "@" symbols within their names (#5785, @mengchendd)
virtualenv environment manager support for MLflow projects (#5727, @harupy)
tensorflow flavor (#5683, @MarkYHZhang)
SqlAlchemyStore.log_batch implementation to make it log data in batches (#5460, @erensahin)
Small bug fixes and doc updates (#5858, #5859, #5853, #5854, #5845, #5829, #5842, #5834, #5795, #5777, #5794, #5766, #5778, #5765, #5763, #5768, #5769, #5760, #5727, #5748, #5726, #5721, #5711, #5710, #5708, #5703, #5702, #5696, #5695, #5669, #5670, #5668, #5661, #5638, @harupy; #5749, @arpitjasa-db; #5675, @Davidswinkels; #5803, #5797, @ahlag; #5743, @kzhang01; #5650, #5805, #5724, #5720, #5662, @BenWilson2; #5627, @cterrelljones; #5646, @kutal10; #5758, @davideli-db; #5810, @rahulporuri; #5816, #5764, @shrinath-suresh; #5869, #5715, #5737, #5752, #5677, #5636, @WeichenXu123; #5735, @subramaniam02; #5746, @akaigraham; #5734, #5685, @lucalves; #5761, @marcelatoffernet; #5707, @aashish-khub; #5808, @ketangangal; #5730, #5700, @shaikmoeed; #5775, @dbczumar; #5747, @zhixuanevelynwu)
Note: Version 1.26.0 of the MLflow R package has not yet been released. It will be available on CRAN within the next week.
MLflow 1.25.1 is a patch release containing the following bug fixes:
pyfunc artifact overwrite bug when multiple artifacts are saved in sub-directories (#5657, @kyle-jarvis)
Note: Version 1.25.1 of the MLflow R package has not yet been released. It will be available on CRAN within the next week.
MLflow 1.25.0 includes several major features and improvements:
Features:
mlflow.last_active_run() that provides the most recent fluent active run (#5584, @MarkYHZhang)
experiment_names argument to the mlflow.search_runs() API to support searching runs by experiment names (#5564, @r3stl355)
description parameter to mlflow.start_run() (#5534, @dogeplusplus)
log_every_n_step parameter to mlflow.pytorch.autolog() to control metric logging frequency (#5516, @adamreeve)
pyspark.ml.param.Params values as MLflow parameters during PySpark autologging (#5481, @serena-ruan)
pyspark.ml.Transformers to PySpark autologging (#5466, @serena-ruan)
mlflow.diviner flavor for large-scale time series forecasting (#5553, @BenWilson2)
pyfunc.get_model_dependencies() API to retrieve reproducible environment specifications for MLflow Models with the pyfunc flavor (#5503, @WeichenXu123)
code_paths argument to all model flavors to support packaging custom module code with MLflow Models (#5448, @stevenchen-db)
mlflow.evaluate() (#5405, #5476, @MarkYHZhang)
mlflow_version field to MLModel specification (#5515, #5576, @r3stl355)
--env-manager configuration for specifying environment restoration tools (e.g. conda) and deprecate --no-conda (#5567, @harupy)
mlflow.pyfunc.spark_udf() to ensure accurate predictions (#5487, #5561, @WeichenXu123)
numpy.ndarray type inputs to the TensorFlow pyfunc predict() function (#5545, @WeichenXu123)
mlflow.artifacts.download_artifacts() API mirroring the functionality of the mlflow artifacts download CLI (#5585, @dbczumar)
Bug fixes and documentation updates:
run_uuid for PostgreSQL to improve query performance (#5446, @harupy)
split orientation for DataFrame inputs to SageMaker deployment predict() API to preserve column ordering (#5522, @dbczumar)
mlflow-skinny client that caused mlflow --version to fail (#5573, @BenWilson2)
mlflow-azureml package (#5491, @santiagxf)
Small bug fixes and doc updates (#5591, #5629, #5597, #5592, #5562, #5477, @BenWilson2; #5554, @juntai-zheng; #5570, @tahesse; #5605, @guelate; #5633, #5632, #5625, #5623, #5615, #5608, #5600, #5603, #5602, #5596, #5587, #5586, #5580, #5577, #5568, #5290, #5556, #5560, #5557, #5548, #5547, #5538, #5513, #5505, #5464, #5495, #5488, #5485, #5468, #5455, #5453, #5454, #5452, #5445, #5431, @harupy; #5640, @nchittela; #5520, #5422, @Ark-kun; #5639, #5604, @nishipy; #5543, #5532, #5447, #5435, @WeichenXu123; #5502, @singankit; #5500, @Sohamkayal4103; #5449, #5442, @apurva-koti; #5552, @vinijaiswal; #5511, @adamreeve; #5428, @jinzhang21; #5309, @sunishsheth2009; #5581, #5559, @Kr4is; #5626, #5618, #5529, @sisp; #5652, #5624, #5622, #5613, #5509, #5459, #5437, @dbczumar; #5616, @liangz1)
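The split orientation mentioned in the 1.25.0 bug fixes is the pandas JSON layout that stores column names once, in an ordered list, alongside the row data. The sketch below shows that layout using only the standard library. The to_split_orient helper is hypothetical and is not part of MLflow or pandas, and the column names are made up for illustration.

```python
import json


def to_split_orient(columns, rows):
    """Serialize tabular data as {"columns": [...], "data": [[...], ...]}."""
    for row in rows:
        if len(row) != len(columns):
            raise ValueError("each row must have one value per column")
    return json.dumps({"columns": list(columns), "data": [list(r) for r in rows]})


# Hypothetical feature columns; the order here is the order a consumer sees.
payload = to_split_orient(
    ["sepal_length", "sepal_width"],
    [[5.1, 3.5], [4.9, 3.0]],
)
decoded = json.loads(payload)
```

Because column order is carried by a JSON array rather than by object keys, a consumer can rebuild the table with its columns in the original order.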
MLflow 1.24.0 includes several major features and improvements:
Features:
mlflow server --serve-artifacts (#5320, @BenWilson2, @harupy)
registered_model_name argument to mlflow.autolog() for automatic model registration during autologging (#5395, @WeichenXu123)
mlflow.pmdarima flavor for pmdarima models (#5373, @BenWilson2)
mlflow.evaluate() (#5389, @MarkYHZhang)
Bug fixes and documentation updates:
--serve-artifacts mode (#5409, @dbczumar)
--serve-artifacts mode (#5370, @TimNooren)
--serve-artifacts mode (#5384, #5385, @mert-kirpici)
mlflow.log_figure() was used without matplotlib.figure imported (#5406, @WeichenXu123)
@ symbol during autologging (#5403, @maxfriedrich)
mlflow.spark.log_model() is called (#5355, @szczeles)
mlflow.pyfunc.load_model() (#5317, @ecm200)
mlflow.evaluate() (#5333, @WeichenXu123)
Small bug fixes and doc updates (#5298, @wamartin-aml; #5399, #5321, #5313, #5307, #5305, #5268, #5284, @harupy; #5329, @Ark-kun; #5375, #5346, #5304, @dbczumar; #5401, #5366, #5345, @BenWilson2; #5326, #5315, @WeichenXu123; #5236, @singankit; #5302, @timvink; #5357, @maitre-matt; #5347, #5344, @mehtayogita; #5367, @apurva-koti; #5348, #5328, #5310, @liangz1; #5267, @sunishsheth2009)
Note: Version 1.24.0 of the MLflow R package has not yet been released. It will be available on CRAN within the next week.