Seaborn Versions

Statistical data visualization in Python

v0.13.2

3 months ago

This is a minor release containing internal changes that adapt to upcoming deprecations in pandas. All users are encouraged to update.

v0.13.1

3 months ago

This is a minor release with some bug fixes and a couple new features. All users are encouraged to update.

  • |Feature| Added support for weighted mean estimation (with bootstrap CIs) in lineplot, barplot, pointplot, and objects.Est (#3580, #3586).

  • |Feature| Added the extent option in objects.Plot.layout (#3552).

  • |Fix| Fixed a regression in v0.13.0 that triggered an exception when working with non-numpy data types (#3516).

  • |Fix| Fixed a bug in objects.Plot so that tick labels are shown for wrapped axes that aren't in the bottom-most row (#3600).

  • |Fix| Fixed a bug in catplot where a blank legend would be added when hue was redundantly assigned (#3540).

  • |Fix| Fixed a bug in catplot where the edgecolor parameter was ignored with kind="bar" (#3547).

  • |Fix| Fixed a bug in boxplot where an exception was raised when using the matplotlib bootstrap option (#3562).

  • |Fix| Fixed a bug in lineplot where an exception was raised when hue was assigned with an empty dataframe (#3569).

  • |Fix| Fixed a bug in multiple categorical plots that raised with hue=None and dodge=True; this combination now has no effect (#3605).

v0.13.0

6 months ago

See the online docs for an annotated version of these notes with working links.

This is a major release with a number of important new features and changes. The highlight is a major overhaul to seaborn's categorical plotting functions, providing them with many new capabilities and better aligning their API with the rest of the library. There is also provisional support for alternate dataframe libraries like polars, a new theme and display configuration system for objects.Plot, and many smaller bugfixes and enhancements.

Updating is recommended, but users are encouraged to carefully check the outputs of existing code that uses the categorical functions, and they should be aware of some deprecations and intentional changes to the default appearance of the resulting plots (see the notes below).

Major enhancements to categorical plots

Seaborn's categorical functions have been completely rewritten for this release. This provided the opportunity to address some longstanding quirks as well as to add a number of smaller but much-desired features and enhancements.

Support for numeric and datetime data

The categorical functions have historically treated all data as categorical, even when it has a numeric or datetime type. This can now be controlled with the new native_scale parameter. The default remains False to preserve existing behavior. But with native_scale=True, values will be treated as they would by other seaborn or matplotlib functions. Element widths will be derived from the minimum distance between two unique values on the categorical axis.

Additionally, while seaborn previously determined the mapping from categorical values to ordinal positions internally, this is now delegated to matplotlib. The change should mostly be transparent to the user, but categorical plots (even with native_scale=False) will better align with artists added by other seaborn or matplotlib functions in most cases, and matplotlib's interactive machinery will work better.

Changes to color defaults and specification

The categorical functions now act more like the rest of seaborn in that they will produce a plot with a single main color unless the hue variable is assigned. Previously, there would be an implicit redundant color mapping (e.g., each box in a boxplot would get a separate color from the default palette). To retain the previous behavior, explicitly assign a redundant hue variable (e.g., boxplot(data, x="x", y="y", hue="x")).

Two related idiosyncratic color specifications are deprecated, but they will continue to work (with a warning) for one release cycle:

  • Passing a palette without explicitly assigning hue is no longer supported (add an explicitly redundant hue assignment instead).
  • Passing a color while assigning hue to produce a gradient is no longer supported (use palette="dark:{color}" or palette="light:{color}" instead).

Finally, like other seaborn functions, the default palette now depends on the variable type, and a sequential palette will be used with numeric data. To retain the previous behavior, pass the name of a qualitative palette (e.g., palette="deep" for seaborn's default). Accordingly, the functions have gained a parameter to control numeric color mappings (hue_norm).

Other features, enhancements, and changes

The following updates apply to multiple categorical functions.

  • All functions now accept a legend parameter, which can be a boolean (to suppress the legend) or one of {"auto", "brief", "full"} to control the amount of information shown in the legend for a numerical color mapping.
  • All functions now accept a callable formatter parameter to control the string representation of the data.
  • All functions that draw a solid patch now accept a boolean fill parameter, which when set to False will draw line-art elements.
  • All functions that support dodging now have an additional gap parameter that can be set to a non-zero value to leave space between dodged elements.
  • The boxplot, boxenplot, and violinplot functions now support a single linecolor parameter.
  • The default value for dodge has changed from True to "auto". With "auto", elements will dodge only when at least one set of elements would otherwise overlap.
  • When the value axis of the plot has a non-linear scale, the statistical operations (e.g. an aggregation in pointplot or the kernel density fit in violinplot) are now applied in that scale space.
  • All functions now accept a log_scale parameter. With a single argument, this will set the scale on the "value" axis (opposite the categorical axis). A tuple will set each axis directly (although setting a log scale categorical axis also requires native_scale=True).
  • The orient parameter now accepts "x"/"y" to specify the categorical axis, matching the objects interface.
  • The categorical functions are generally more deferential to the user's additional matplotlib keyword arguments.
  • Using "gray" to select an automatic gray value that complements the main palette is now deprecated in favor of "auto".

The following updates are function-specific.

  • In pointplot, a single matplotlib.lines.Line2D artist is now used rather than adding a separate matplotlib.collections.PathCollection artist for the points. As a result, it is now possible to pass additional keyword arguments for complete customization of the appearance of both the lines and markers; additionally, the legend representation is improved. Accordingly, parameters that previously allowed only partial customization (scale, join, and errwidth) are now deprecated. The old parameters will now trigger detailed warning messages with instructions for adapting existing code.
  • The bandwidth specification in violinplot better aligns with kdeplot, as the bw parameter is now deprecated in favor of bw_method and bw_adjust.
  • In boxenplot, the boxen are now drawn with separate patch artists in each tail. This may have consequences for code that works with the underlying artists, but it produces a better result for low-alpha / unfilled plots and enables proper area/density scaling.
  • In barplot, the errcolor and errwidth parameters are now deprecated in favor of a more general err_kws dictionary. The existing parameters will continue to work for two releases.
  • In violinplot, the scale and scale_hue parameters have been renamed to density_norm and common_norm for clarity and to reflect the fact that common normalization is now applied over both hue and faceting variables in catplot.
  • In boxenplot, the scale parameter has been renamed to width_method as part of a broader effort to de-confound the meaning of "scale" in seaborn parameters.
  • When passing a vector to the data parameter of barplot or pointplot, a bar or point will be drawn for each entry in the vector rather than plotting a single aggregated value. To retain the previous behavior, assign the vector to the y variable.
  • In boxplot, the default flier marker now follows the matplotlib rcparams so that it can be globally customized.
  • When using split=True and inner="box" in violinplot, a separate mini-box is now drawn for each split violin.
  • In boxenplot, all plots now use a consistent luminance ramp for the different box levels. This leads to a change in the appearance of existing plots, but reduces the chances of a misleading result.
  • The "area" scaling in boxenplot now approximates the density of the underlying observations, including for asymmetric distributions. This produces a substantial change in the appearance of plots with width_method="area", although the existing behavior was poorly defined.
  • In countplot, the new stat parameter can be used to apply a normalization (e.g. to show a "percent" or "proportion").
  • The split parameter in violinplot is now more general and can be set to True regardless of the number of hue variable levels (or even without hue). This is probably most useful for showing half violins.
  • In violinplot, the new inner_kws parameter allows additional control over the interior artists.
  • It is no longer required to use a DataFrame in catplot, as data vectors can now be passed directly.
  • In boxplot, the artists that comprise each box plot are now packaged in a BoxPlotContainer for easier post-plotting access.

Support for alternate dataframe libraries

  • Nearly all functions / objects now use the dataframe exchange protocol to accept DataFrame objects from libraries other than pandas (e.g. polars). Note that seaborn will still convert the data object to pandas internally, but this feature will simplify code for users of other dataframe libraries (3369).

Improved configuration for the objects interface

  • Added control over the default theme to objects.Plot (3223).
  • Added control over the default notebook display to objects.Plot (3225).
  • Added the concept of a "layer legend" in objects.Plot via the new label parameter in objects.Plot.add (3456).
  • In objects.Plot.scale, objects.Plot.limit, and objects.Plot.label the x / y parameters can be used to set a common scale / limit / label for paired subplots (3458).

Other updates

  • Improved the legend display for relational and categorical functions to better represent the user's additional keyword arguments (3467).
  • In ecdfplot, stat="percent" is now a valid option (3336).
  • Data values outside the scale transform domain (e.g. non-positive values with a log scale) are now dropped prior to any statistical operations (3488).
  • In histplot, infinite values are now ignored when choosing the default bin range (3488).
  • There is now generalized support for performing statistics in the appropriate space based on axes scales; previously support for this was spotty and at best worked only for log scales (3440).
  • Updated load_dataset to use an approach more compatible with pyodide (3234).
  • Support for array-typed palettes is now deprecated. This was not previously documented as supported, but it worked by accident in a few places (3452).
  • In histplot, treatment of the binwidth parameter has changed such that the actual bin width will be only approximately equal to the requested width when that value does not evenly divide the bin range. This fixes an issue where the largest data value was sometimes dropped due to floating point error (3489).
  • Fixed objects.Bar and objects.Bars widths when using a nonlinear scale (3217).
  • Worked around an issue in matplotlib that caused incorrect results in move_legend when labels were provided (3454).
  • Fixed a bug introduced in v0.12.0 where histplot added a stray empty BarContainer (3246).
  • Fixed a bug where objects.Plot.on would override a figure's layout engine (3216).
  • Fixed a bug introduced in v0.12.0 where lineplot with a list of tuples for the keyword argument dashes caused a TypeError (3316).
  • Fixed a bug in PairGrid that caused an exception when the input dataframe had a column multiindex (3407).
  • Improved a few edge cases when using pandas nullable dtypes (3394).

v0.13.0rc0

7 months ago

This is a release candidate for seaborn v0.13.0, a major release with a complete overhaul of seaborn's categorical plotting functions.

Please test the release candidate, especially the categorical plots. The internals of these functions have been completely rewritten to provide new functionality and to better align with the rest of the library. There are some intentional changes to default behavior / deprecations, but also the potential for unintentional breakage. Please help surface any examples of the latter prior to final release.

See the Release Notes for more information about the new features and changes.

Please open a GitHub issue with a reproducible example demonstrating any problems that you encounter. The final release is targeted for the end of September.

v0.12.2

1 year ago

v0.12.2 (December 2022)

This is an incremental release that is a recommended upgrade for all users. It is very likely the final release of the 0.12 series and the last version to support Python 3.7.

  • |Feature| Added the objects.KDE stat (#3111).

  • |Feature| Added the objects.Boolean scale (#3205).

  • |Enhancement| Improved user feedback for failures during plot compilation by catching exceptions and re-raising with a PlotSpecError that provides additional context. (#3203).

  • |Fix| Improved calculation of automatic mark widths with unshared facet axes (#3119).

  • |Fix| Improved robustness to empty data in several components of the objects interface (#3202).

  • |Fix| Fixed a bug where legends for numeric variables with large values would be incorrectly shown (i.e. with a missing offset or exponent; #3187).

  • |Fix| Fixed a regression in v0.12.0 where manually-added labels could have duplicate legend entries (#3116).

  • |Fix| Fixed a bug in histplot with kde=True and log_scale=True where the curve was not scaled properly (#3173).

  • |Fix| Fixed a bug in relplot where inner axis labels would be shown when axis sharing was disabled (#3180).

  • |Fix| Fixed a bug in objects.Continuous to avoid an exception with boolean data (#3190).

v0.12.1

1 year ago

This is an incremental release that is a recommended upgrade for all users. It addresses a handful of bugs / regressions in v0.12.0 and adds several features and enhancements to the new objects interface.

  • Added the objects.Text mark (#3051).
  • Added the objects.Dash mark (#3074).
  • Added the objects.Perc stat (#3063).
  • Added the objects.Count stat (#3086).
  • The objects.Band and objects.Range marks will now cover the full extent of the data if min / max variables are not explicitly assigned or added in a transform (#3056).
  • The objects.Jitter move now applies a small amount of jitter by default (#3066).
  • Axes with an objects.Nominal scale now appear like categorical axes in classic seaborn, with fixed margins, no grid, and an inverted y axis (#3069).
  • The objects.Continuous.label method now accepts base=None to override the default formatter with a log transform (#3087).
  • Marks that sort along the orient axis (e.g. objects.Line) now use a stable algorithm (#3064).
  • Added a label parameter to pointplot, which addresses a regression in 0.12.0 when pointplot is passed to FacetGrid (#3016).
  • Fixed a bug that caused an exception when more than two layers with the same mappings were added to objects.Plot (#3055).
  • Made objects.PolyFit robust to missing data (#3010).
  • Fixed a bug in objects.Plot that occurred when data assigned to the orient coordinate had zero variance (#3084).
  • Fixed a regression in kdeplot where passing cmap for an unfilled bivariate plot would raise an exception (#3065).
  • Addressed a performance regression in lineplot with a large number of unique x values (#3081).
  • Seaborn no longer contains doctest-style examples, simplifying the testing infrastructure (#3034).

v0.12.0

1 year ago

Introduction of the objects interface

This release debuts the seaborn.objects interface, an entirely new approach to making plots with seaborn. It is the product of several years of design and 16 months of implementation work. The interface aims to provide a more declarative, composable, and extensible API for making statistical graphics. It is inspired by Wilkinson's grammar of graphics, offering a Pythonic API that is informed by the design of libraries such as ggplot2 and vega-lite along with lessons from the past 10 years of seaborn's development.

For more information and numerous examples, see the tutorial chapter and API reference.

This initial release should be considered "experimental". While it is stable enough for serious use, there are definitely some rough edges, and some key features remain to be implemented. It is possible that breaking changes may occur over the next few minor releases. Please be patient with any limitations that you encounter and help the development by reporting issues when you find behavior surprising.

Keyword-only arguments

Seaborn's plotting functions now require explicit keywords for most arguments, following the deprecation of positional arguments in v0.11.0. With this enforcement, most functions have also had their parameter lists rearranged so that data is the first and only positional argument. This adds consistency across the various functions in the library. It also means that calling func(data) will do something for nearly all functions (those that support wide-form data) and that pandas.DataFrame can be piped directly into a plot. It is possible that the signatures will be loosened a bit in future releases so that x and y can be positional, but minimal support for positional arguments after this change will reduce the chance of inadvertent mis-specification (2804).

Modernization of categorical scatterplots

This release begins the process of modernizing the categorical plots, beginning with stripplot and swarmplot. These functions are sporting some enhancements that alleviate a few long-running frustrations (2413, 2447):

  • The new native_scale parameter allows numeric or datetime categories to be plotted with their original scale rather than converted to strings and plotted at fixed intervals.
  • The new formatter parameter allows more control over the string representation of values on the categorical axis. There should also be improved defaults for some types, such as dates.
  • It is now possible to assign hue when using only one coordinate variable (i.e. only x or y).
  • It is now possible to disable the legend.

The updates also harmonize behavior with functions that have been more recently introduced. This should be relatively non-disruptive, although a few defaults will change:

  • The functions now hook into matplotlib's unit system for plotting categorical data. (Seaborn's categorical functions actually predate support for categorical data in matplotlib.) This should mostly be transparent to the user, but it may resolve a few edge cases. For example, matplotlib interactivity should work better (e.g., for showing the data value under the cursor).
  • A color palette is no longer applied to levels of the categorical variable by default. It is now necessary to explicitly assign hue to see multiple colors (i.e., assign the same variable to x/y and hue). Passing palette without hue will continue to be honored for one release cycle.
  • Numeric hue variables now receive a continuous mapping by default, using the same rules as scatterplot. Pass palette="deep" to reproduce previous defaults.
  • The plots now follow the default property cycle; i.e. calling an axes-level function multiple times with the same active axes will produce different-colored artists.
  • Currently, assigning hue and then passing a color will produce a gradient palette. This is now deprecated, as it is easy to request a gradient with, e.g. palette="light:blue".

Similar enhancements / updates should be expected to roll out to other categorical plotting functions in future releases. There are also several function-specific enhancements:

  • In stripplot, a "strip" with a single observation will be plotted without jitter (2413).
  • In swarmplot, the points are now swarmed at draw time, meaning that the plot will adapt to further changes in axis scaling or tweaks to the plot layout (2443).
  • In swarmplot, the proportion of points that must overlap before issuing a warning can now be controlled with the warn_thresh parameter (2447).
  • In swarmplot, the order of the points in each swarm now matches the order in the original dataset; previously they were sorted. This affects only the underlying data stored in the matplotlib artist, not the visual representation (2443).

More flexible errorbars

Increased the flexibility of what can be shown by the internally-calculated errorbars for lineplot, barplot, and pointplot.

With the new errorbar parameter, it is now possible to select bootstrap confidence intervals, percentile / predictive intervals, or intervals formed by scaled standard deviations or standard errors. The parameter also accepts an arbitrary function that maps from a vector to an interval. There is a new user guide chapter demonstrating these options and explaining when you might want to use each one.

As a consequence of this change, the ci parameter has been deprecated. Note that regplot retains the previous API, but it will likely be updated in a future release (2407, 2866).

Other updates

  • It is now possible to aggregate / sort a lineplot along the y axis using orient="y" (2854).
  • Made it easier to customize FacetGrid / PairGrid / JointGrid with a fluent (method-chained) style by adding apply / pipe methods. Additionally, fixed the tight_layout and refline methods so that they return self (2926).
  • Added FacetGrid.tick_params and PairGrid.tick_params to customize the appearance of the ticks, tick labels, and gridlines of all subplots at once (2944).
  • Added a width parameter to barplot (2860).
  • It is now possible to specify estimator as a string in barplot and pointplot, in addition to a callable (2866).
  • Error bars in regplot now inherit the alpha value of the points they correspond to (2540).
  • When using pairplot with corner=True and diag_kind=None, the top left y axis label is no longer hidden (2850).
  • It is now possible to plot a discrete histplot as a step function or polygon (2859).
  • It is now possible to customize the appearance of elements in a boxenplot with box_kws/line_kws/flier_kws (2909).
  • Improved integration with the matplotlib color cycle in most axes-level functions (2449).
  • Fixed a regression in 0.11.2 that caused some functions to stall indefinitely or raise when the input data had a duplicate index (2776).
  • Fixed a bug in histplot and kdeplot where weights were not factored into the normalization (2812).
  • Fixed two edge cases in histplot when only binwidth was provided (2813).
  • Fixed a bug in violinplot where inner boxes/points could be missing with unpaired split violins (2814).
  • Fixed a bug in PairGrid where an error would be raised when defining hue only in the mapping methods (2847).
  • Fixed a bug in scatterplot where an error would be raised when hue_order was a subset of the hue levels (2848).
  • Fixed a bug in histplot where dodged bars would have different widths on a log scale (2849).
  • In lineplot, allowed the dashes keyword to set the style of a line without mapping a style variable (2449).
  • Improved support in relplot for "wide" data and for faceting variables passed as non-pandas objects (2846).
  • Subplot titles will no longer be reset when calling FacetGrid.map or FacetGrid.map_dataframe (2705).
  • Added a workaround for a matplotlib issue that caused figure-level functions to freeze when plt.show was called (2925).
  • Improved robustness to numerical errors in kdeplot (2862).
  • Fixed a bug where rugplot was ignoring expand_margins=False (2953).
  • The patch.facecolor rc param is no longer set by set_palette (or set_theme). This should have no general effect, because the matplotlib default is now "C0" (2906).
  • Made scipy an optional dependency and added pip install seaborn[stats] as a method for ensuring the availability of compatible scipy and statsmodels libraries at install time. This has a few minor implications for existing code, which are explained in the GitHub pull request (2398).
  • Example datasets are now stored in an OS-specific cache location (as determined by appdirs) rather than in the user's home directory. Users should feel free to remove ~/seaborn-data if desired (2773).
  • The unit test suite is no longer part of the source or wheel distribution. Seaborn has never had a runtime API for exercising the tests, so this should not have workflow implications (2833).
  • Following NEP29, dropped support for Python 3.6 and bumped the minimally-supported versions of the library dependencies.
  • Removed the previously-deprecated factorplot along with several previously-deprecated utility functions (iqr, percentiles, pmf_hist, and sort_df).
  • Removed the (previously-unused) option to pass additional keyword arguments to pointplot.

v0.12.0rc0

1 year ago

This is the first release candidate for seaborn v0.12, a major update introducing an entirely new interface along with numerous features, enhancements, and fixes for existing functionality.

To install for testing, run

pip install seaborn==0.12.0rc0

There were several renamings and API changes from the final beta release. See the referenced PRs for more information on each change.

Mark renamings

  • Scatter -> Dots (#2942)
  • Ribbon -> Band (#2945)
  • Interval -> Range (#2945)

Plot API changes

  • The stat= and move= parameters were removed from Plot.add, which now has the following signature: Plot.add(mark, *transforms, ...). (#2948)
  • The Plot.configure method was renamed to Plot.layout, with the figsize parameter changed to size. The share{x,y} parameters were removed from Plot.layout, with that functionality now supported by the new Plot.share method. (#2954)

Additionally, the install extra for including statistical packages was changed from seaborn[all] to seaborn[stats]. (#2939)

v0.12.0b3

1 year ago

This is the third and final beta release for seaborn v0.12, a major update introducing an entirely new interface along with numerous features, enhancements, and fixes for existing functionality.

To install for testing, run

pip install seaborn==0.12.0b3

Changes from the second beta release:

Objects interface

  • Added Est stat for aggregating with a flexible error bar interval (#2912)
  • Added Interval mark for drawing lines perpendicular to the orient axis (#2912)
  • Added Plot.theme for basic control over figure appearance (#2929)
  • Expanded Plot.label to control plot titles (#2934)
  • Fixed Plot.scale so that it applies to variables added during the stat transform (#2915)
  • Fixed a bug where the Plot.configure spec would not persist after further method calls (#2917)
  • Fixed a bug where dot marks ignored the artist_kws parameter (#2921)

Function interface

  • Added .apply and .pipe methods to FacetGrid/PairGrid/JointGrid for fluent customization (#2928)
  • Added a workaround for an issue in matplotlib that caused figure-level plots to freeze or close (#2925)

v0.12.0b2

1 year ago

This is the second beta release for seaborn v0.12, a major update introducing an entirely new interface along with numerous features, enhancements, and fixes for existing functionality.

To install for testing, run

pip install seaborn==0.12.0b2

Changes from the first beta release:

Objects interface

  • Added Plot.label method for controlling axis labels/legend titles (#2902)
  • Added Plot.limit method for controlling axis limits (#2898)
  • Added Bars, a more efficient mark for histograms, and improved performance of Bar mark as well (#2893)
  • Improved the visual appearance of the Area and Ribbon marks and used a simpler matplotlib artist class for them (#2896)
  • Improved the visual appearance of the Bar mark (#2889)