Ibis Versions

the portable Python dataframe library

4.1.0

1 year ago

4.1.0 (2023-01-25)

Features

add ibis.get_backend function (2d27df8)
add py.typed to allow mypy to type check packages that use ibis (765d42e)
api: add ibis.set_backend function (e7fabaf)
api: add selectors for easier selection of columns (306bc88)
bigquery: add JS UDF support (e74328b)
bigquery: add SQL UDF support (db24173)
bigquery: add to_pyarrow method (30157c5)
bigquery: implement bitwise operations (55b69b1)
bigquery: implement ops.Typeof (b219919)
bigquery: implement ops.ZeroIfNull (f4c5607)
bigquery: implement struct literal (c5f2a1d)
clickhouse: properly support native boolean types (31cc7ba)
common: add support for annotating with coercible types (ae4a415)
common: make frozendict truly immutable (1c25213)
common: support annotations with typing.Literal (6f89f0b)
common: support generic mapping and sequence type annotations (ddc6603)
dask: support connect() with no arguments (67eed42)
datatype: add optional timestamp scale parameter (a38115a)
datatypes: add as_struct method to convert schemas to structs (64be7b1)
duckdb: add read_json function for consuming newline-delimited JSON files (65e65c1)
mssql: add a bunch of missing types (c698d35)
mssql: implement inference for DATETIME2 and DATETIMEOFFSET (aa9f151)
nicer repr for Backend.tables (0d319ca)
pandas: support connect() with no arguments (78cbbdd)
polars: allow ibis.polars.connect() to function without any arguments (d653a07)
polars: handle casting to scaled timestamps (099d1ec)
postgres: add Map(string, string) support via the built-in HSTORE extension (f968f8f)
pyarrow: support conversion to pyarrow map and struct types (54a4557)
snowflake: add more array operations (8d8bb70)
snowflake: add more map operations (7ae6e25)
snowflake: any/all/notany/notall reductions (ba1af5e)
snowflake: bitwise reductions (5aba997)
snowflake: date from ymd (035f856)
snowflake: fix array slicing (bd7af2a)
snowflake: implement ArrayCollect (c425f68)
snowflake: implement NthValue (0dca57c)
snowflake: implement ops.Arbitrary (45f4f05)
snowflake: implement ops.StructColumn (41698ed)
snowflake: implement StringSplit (e6acc09)
snowflake: implement StructField and struct literals (286a5c3)
snowflake: implement TimestampFromUNIX (314637d)
snowflake: implement TimestampFromYMDHMS (1eba8be)
snowflake: implement typeof operation (029499c)
snowflake: implement exists/not exists (7c8363b)
snowflake: implement extract millisecond (3292e91)
snowflake: make literal maps and params work (dd759d3)
snowflake: regex extract, search and replace (9c82179)
snowflake: string to timestamp (095ded6)
sqlite: implement _get_schema_using_query in SQLite backend (7ff84c8)
trino: compile timestamp types with scale (67683d3)
trino: enable ops.ExistsSubquery and ops.NotExistsSubquery (9b9b315)
trino: map parameters (53bd910)
ux: improve error message when column is not found (b527506)

Bug Fixes

backend: read the default backend setting in _default_backend (11252af)
bigquery: move connection logic to do_connect (42f2106)
bigquery: remove invalid operations from registry (911a080)
bigquery: resolve deprecation warnings for StructType and Schema (c9e7078)
clickhouse: fix position call (702de5d)
correctly visualize array type (26b0b3f)
deps: make sure pyarrow is not an implicit dependency (10373f4)
duckdb: make read_csv on URLs work (9e61816)
duckdb: only try to load extensions when necessary for csv (c77bde7)
duckdb: remove invalid operations from registry (ba2ec59)
fallback to default backend with to_pyarrow/to_pyarrow_batches (a1a6902)
impala: remove broken alias elision (32b120f)
ir: error for order_by on nonexistent column (57b1dd8)
ir: ops.Where output shape should consider all arguments (6f87064)
mssql: infer bit as boolean everywhere (24f9d7c)
mssql: pull nullability from column information (490f8b4)
mysql: fix mysql query schema inference (12f6438)
polars: remove non-working Binary and Decimal literal inference (0482d15)
postgres: use permanent views to avoid connection pool defeat (49a4991)
pyspark: fix substring constant translation (40d2072)
set ops: raise if no tables passed to set operations (bf4bdde)
snowflake: bring back bitwise operations (260facd)
snowflake: don't always insert a cast (ee8817b)
snowflake: implement working TimestampNow (42d95b0)
snowflake: make sqlalchemy 2.0 compatible (8071255)
snowflake: re-enable ops.TableArrayView (a1ad2b7)
snowflake: remove invalid operations from registry (2831559)
sql: add typeof test and bring back implementations (7dc5356)
sqlalchemy: 2.0 compatibility (837a736)
sqlalchemy: fix view creation with select stmts that have bind parameters (d760e69)
sqlalchemy: handle correlated exists sanely (efa42bd)
sqlalchemy: handle generic geography/geometry by name instead of geotype (23c35e1)
sqlalchemy: use exec_driver_sql in view teardown (2599c9b)
sqlalchemy: use the backend's compiler instead of AlchemyCompiler (9f4ff54)
sql: fix broken call to ibis.map (045edc7)
sqlite: interpolate pathlib.Path correctly in attach (0415bd3)
trino: ensure connecting works with trino 0.321 (07cee38)
trino: remove invalid operations from registry (665265c)
ux: remove extra trailing newline in expression repr (ee6d58a)

Documentation

add BigQuery backend docs (09d8995)
add streamlit app for showing the backend operation matrix (3228f64)
allow deselecting geospatial ops in backend support matrix (012da8c)
api: document more public expression APIs (337018f)
backend-info: prevent app from trying install duckdb extensions (3d94082)
clean up gen_matrix.py after adding streamlit app (deb80f2)
duckdb: add to_pyarrow_batches documentation (ec1ffce)
embed streamlit operation matrix app to docs (469a50d)
make firefox render the proper iframe height (ff1d4dc)
publish raw data for operation matrix (62e68da)
re-order when to download test data (8ce8c16)
release: update breaking changes in the release notes for 4.0.0 (4e91401)
remove trailing parenthesis (4294397)
update ibis-version-4.0.0-release.md (f6701df)
update links to contributing guides (da615e4)

Refactors

bigquery: explicitly disallow INT64 in JS UDF (fb33bf9)
datatype: add custom sqlalchemy nested types for backend differentiation (dec70f5)
datatype: introduce to_sqla_type dispatching on dialect (a8bbc00)
datatypes: remove Geography and Geometry types in favor of GeoSpatial (d44978c)
datatype: use a mapping to store StructType fields rather than names and types tuples (ff34c7b)
dtypes: expose nbytes property for integer and floating point datatypes (ccf80fd)
duckdb: remove .raw_sql call (abc939e)
duckdb: use sqlalchemy-views to reduce string hacking (c162750)
ir: remove UnnamedMarker (dd352b1)
postgres: use a bindparam for metadata queries (b6b4669)
remove empty unused file (9d63fd6)
schema: use a mapping to store Schema fields rather than names and types tuples (318179a)
simplify _find_backend implementation (60f1a1b)
snowflake: remove unnecessary parse_json call in ops.StructField impl (9e80231)
snowflake: remove unnecessary casting (271554c)
snowflake: use unary instead of fixed_arity(..., 1) (4a1c7c9)
sqlalchemy: clean up quoting implementation (506ce01)
sqlalchemy: generalize handling of failed type inference (b0f4e4c)
sqlalchemy: move _get_schema_using_query to base class (296cd7d)
sqlalchemy: remove the need for deferred columns (e4011aa)
sqlalchemy: remove use of deprecated isnot (4ec53a4)
sqlalchemy: use exec_driver_sql everywhere (e8f96b6)
sql: finally remove _CorrelatedRefCheck (f49e429)

Deprecations

api: deprecate .to_projection in favor of .as_table (7706a86)
api: deprecate get_column/s in favor of __getitem__/__getattr__ syntax (e6372e2)
ir: schedule DatabaseTable.change_name for removal (e4bae26)
schema: schedule Schema.delete() and Schema.append() for removal (45ac9a9)

4.0.0

1 year ago

3.2.0

1 year ago

3.2.0 (2022-09-15)

Features

add api to get backend entry points (0152f5e)
api: add and_ and or_ helpers (94bd4df)
api: add argmax and argmin column methods (b52216a)
api: add distinct to Intersection and Difference operations (cd9a34c)
api: add ibis.memtable API for constructing in-memory table expressions (0cc6948)
api: add ibis.sql to easily get a formatted SQL string (d971cc3)
api: add Table.unpack() and StructValue.lift() APIs for projecting struct fields (ced5f53)
api: allow transmute-style select method (d5fc364)
api: implement all bitwise operators (7fc5073)
api: promote psql to a show_sql public API (877a05d)
clickhouse: add dataframe external table support for memtables (bc86aa7)
clickhouse: add enum, ipaddr, json, lowcardinality to type parser (8f0287f)
clickhouse: enable support for working window functions (310a5a8)
clickhouse: implement argmin and argmax (ee7c878)
clickhouse: implement bitwise operations (348cd08)
clickhouse: implement struct scalars (1f3efe9)
dask: implement StringReplace execution (1389f4b)
dask: implement ungrouped argmin and argmax (854aea7)
deps: support duckdb 0.5.0 (47165b2)
duckdb: handle query parameters in ibis.connect (fbde95d)
duckdb: implement argmin and argmax (abf03f1)
duckdb: implement bitwise xor (ca3abed)
duckdb: register tables from pandas/pyarrow objects (36e48cc)
duckdb: support unsigned integer types (2e67918)
impala: implement bitwise operations (c5302ab)
implement dropna for SQL backends (8a747fb)
log: make BaseSQLBackend._log print by default (12de5bb)
mysql: register BLOB types (1e4fb92)
pandas: implement argmin and argmax (bf9b948)
pandas: implement NotContains on grouped data (976dce7)
pandas: implement StringReplace execution (578795f)
pandas: implement Contains with a group by (c534848)
postgres: implement bitwise xor (9b1ebf5)
pyspark: add option to treat nan as null in aggregations (bf47250)
pyspark: implement ibis.connect for pyspark (a191744)
pyspark: implement Intersection and Difference (9845a3c)
pyspark: implement bitwise operators (33cadb1)
sqlalchemy: implement bitwise operator translation (bd9f64c)
sqlalchemy: make ibis.connect with sqlalchemy backends (b6cefb9)
sqlalchemy: properly implement Intersection and Difference (2bc0b69)
sql: implement StringReplace translation (29daa32)
sqlite: implement bitwise xor and bitwise not (58c42f9)
support table.sort_by(ibis.random()) (693005d)
type-system: infer pandas' string dtype (5f0eb5d)
ux: add duckdb as the default backend (8ccb81d)
ux: use rich to format Table.info() output (67234c3)
ux: use sqlglot for pretty printing SQL (a3c81c5)
variadic union, intersect, & difference functions (05aca5a)

Bug Fixes

api: make sure column names that are already inferred are not overwritten (6f1cb16)
api: support deferred objects in existing API functions (241ce6a)
backend: ensure that chained limits respect prior limits (02a04f5)
backends: ensure select after filter works (e58ca73)
backends: only recommend installing ibis-foo when foo is a known backend (ac6974a)
base-sql: fix String-generating backend string concat implementation (3cf78c1)
clickhouse: add IPv4/IPv6 literal inference (0a2f315)
clickhouse: cast repeat times argument to UInt64 (b643544)
clickhouse: fix listing tables from databases with no tables (08900c3)
compilers: make sure memtable rows have names in the SQL string compilers (18e7f95)
compiler: use repr for SQL string VALUES data (75af658)
dask: ensure predicates are computed before projections (5cd70e1)
dask: implement timestamp-date binary comparisons (48d5058)
dask: set dask upper bound due to large scale test breakage (796c645), closes #9221
decimal: add decimal type inference (3fe3fd8)
deps: update dependency duckdb-engine to >=0.1.8,<0.4.0 (113dc8f)
deps: update dependency duckdb-engine to >=0.1.8,<0.5.0 (ef97c9d)
deps: update dependency parsy to v2 (9a06131)
deps: update dependency shapely to >=1.6,<1.8.4 (0c787d2)
deps: update dependency shapely to >=1.6,<1.8.5 (d08c737)
deps: update dependency sqlglot to v5 (f210bb8)
deps: update dependency sqlglot to v6 (5ca4533)
duckdb: add missing types (59bad07)
duckdb: ensure that in-memory connections remain in their creating thread (39bc537)
duckdb: use fetch_arrow_table() to be able to handle big timestamps (85a76eb)
fix bug in pandas & dask difference implementation (88a78fa)
fix dask where implementation (49f8845)
impala: add date column dtype to impala to ibis type dict (c59e94e), closes #4449
pandas where supports scalar for left (48f6c1e)
pandas: fix anti-joins (10a659d)
pandas: implement timestamp-date binary comparisons (4fc666d)
pandas: properly handle empty groups when aggregating with GroupConcat (6545f4d)
pyspark: fix broken StringReplace implementation (22cb297)
pyspark: make sure ibis.connect works with pyspark (a7ab107)
pyspark: translate predicates before projections (b3d1c80)
sqlalchemy: fix float64 type mapping (8782773)
sqlalchemy: handle reductions with multiple arguments (5b2039b)
sqlalchemy: implement SQLQueryResult translation (786a50f)
sql: fix sql compilation after making InMemoryTable a subclass of PhysicalTable (aac9524)
squash several bugs in sort_by asc/desc handling (222b2ba)
support chained set operations in SQL backends (227aed3)
support filters on InMemoryTable exprs (abfaf1f)
typo: in BaseSQLBackend.compile docstring (0561b13)

Deprecations

right kwarg in union/intersect/difference (719a5a1)
duckdb: deprecate path argument in favor of database (fcacc20)
sqlite: deprecate path argument in favor of database (0f85919)

Performance

pandas: remove reexecution of alias children (64efa53)
pyspark: ensure that pyspark DDL doesn't use VALUES (422c98d)
sqlalchemy: register DataFrames cheaply where possible (ee9f1be)

Documentation

add to_sql (e2821a5)
add back constraints for transitive doc dependencies and fix docs (350fd43)
add coc reporting information (c2355ba)
add community guidelines documentation (fd0893f)
add HeavyAI to the readme (4c5ca80)
add how-to bfill and ffill (ff84027)
add how-to for ibis+duckdb register (73a726e)
add how-to section to docs (33c4b93)
duckdb: add installation note for duckdb >= 0.5.0 (608b1fb)
fix memtable docstrings (72bc0f5)
fix flake8 line length issues (fb7af75)
fix markdown (4ab6b95)
fix relative links in tutorial (2bd075f), closes #4064 #4201
make attribution style uniform across the blog (05561e0)
move the blog out to the top level sidebar for visibility (417ba64)
remove underspecified UDF doc page (0eb0ac0)

3.1.0

1 year ago

3.1.0 (2022-07-26)

Features

add __getattr__ support to StructValue (75bded1)
allow selection subclasses to define new node args (2a7dc41)
api: accept Schema objects in public ibis.schema (0daac6c)
api: add .tables accessor to BaseBackend (7ad27f0)
api: add e function to public API (3a07e70)
api: add ops.StructColumn operation (020bfdb)
api: add cume_dist operation (6b6b185)
api: add toplevel ibis.connect() (e13946b)
api: handle literal timestamps with timezone embedded in string (1ae976b)
api: ibis.connect() default to duckdb for parquet/csv extensions (ff2f088)
api: make struct metadata more convenient to access (3fd9bd8)
api: support tab completion for backends (eb75fc5)
api: underscore convenience api (81716da)
api: unnest (98ecb09)
backends: allow column expressions from non-foreign tables on the right side of isin/notin (e1374a4)
base-sql: implement trig and math functions (addb2c1)
clickhouse: add ability to pass arbitrary kwargs to Clickhouse do_connect (583f599)
clickhouse: implement ops.StructColumn operation (0063007)
clickhouse: implement array collect (8b2577d)
clickhouse: implement ArrayColumn (1301f18)
clickhouse: implement bit aggs (f94a5d2)
clickhouse: implement clip (12dfe50)
clickhouse: implement covariance and correlation (a37c155)
clickhouse: implement degrees (7946c0f)
clickhouse: implement proper type serialization (80f4ab9)
clickhouse: implement radians (c7b7f08)
clickhouse: implement strftime (222f2b5)
clickhouse: implement struct field access (fff69f3)
clickhouse: implement trig and math functions (c56440a)
clickhouse: support subsecond timestamp literals (e8698a6)
compiler: restore intersect_class and difference_class overrides in base SQL backend (2c46a15)
dask: implement trig functions (e4086bb)
dask: implement zeroifnull (38487db)
datafusion: implement negate (69dd64d)
datafusion: implement trig functions (16803e1)
duckdb: add register method to duckdb backend to load parquet and csv files (4ccc6fc)
duckdb: enable find_in_set test (377023d)
duckdb: enable group_concat test (4b9ad6c)
duckdb: implement ops.StructColumn operation (211bfab)
duckdb: implement approx_count_distinct (03c89ad)
duckdb: implement approx_median (894ce90)
duckdb: implement arbitrary first and last aggregation (8a500bc)
duckdb: implement NthValue (1bf2842)
duckdb: implement strftime (aebc252)
duckdb: return the ir.Table instance from DuckDB's register API (0d05d41)
mysql: implement FindInSet (e55bbbf)
mysql: implement StringToTimestamp (169250f)
pandas: implement bitwise aggregations (37ff328)
pandas: implement degrees (25b4f69)
pandas: implement radians (6816b75)
pandas: implement trig functions (1fd52d2)
pandas: implement zeroifnull (48e8ed1)
postgres/duckdb: implement covariance and correlation (464d3ef)
postgres: implement ArrayColumn (7b0a506)
pyspark: implement approx_count_distinct (1fe1d75)
pyspark: implement approx_median (07571a9)
pyspark: implement covariance and correlation (ae818fb)
pyspark: implement degrees (f478c7c)
pyspark: implement nth_value (abb559d)
pyspark: implement nullifzero (640234b)
pyspark: implement radians (18843c0)
pyspark: implement trig functions (fd7621a)
pyspark: implement Where (32b9abb)
pyspark: implement xor (550b35b)
pyspark: implement zeroifnull (db13241)
pyspark: topk support (9344591)
sqlalchemy: add degrees and radians (8b7415f)
sqlalchemy: add xor translation rule (2921664)
sqlalchemy: allow non-primitive arrays (4e02918)
sqlalchemy: implement approx_count_distinct as count distinct (4e8bcab)
sqlalchemy: implement clip (8c02639)
sqlalchemy: implement trig functions (34c1514)
sqlalchemy: implement Where (7424704)
sqlalchemy: implement zeroifnull (4735e9a)
sqlite: implement BitAnd, BitOr and BitXor (e478479)
sqlite: implement cotangent (01e7ce7)
sqlite: implement degrees and radians (2cf9c5e)

Bug Fixes

api: bring back null datatype parsing (fc131a1)
api: compute the type from both branches of Where expressions (b8f4120)
api: ensure that Deferred objects work in aggregations (bbb376c)
api: ensure that nulls can be cast to any type to allow caller promotion (fab4393)
api: make ExistsSubquery and NotExistsSubquery pure boolean operations (dd70024)
backends: make execution transactional where possible (d1ea269)
clickhouse: cast empty result dataframe (27ae68a)
clickhouse: handle empty IN and NOT IN expressions (2c892eb)
clickhouse: return null instead of empty string for group_concat when values are filtered out (b826b40)
compiler: fix bool bool comparisons (1ac9a9e)
dask/pandas: allow limit to be None (9f91d6b)
dask: aggregation with multi-key groupby fails on dask backend (4f8bc70)
datafusion: handle predicates in aggregates (4725571)
deps: update dependency datafusion to >=0.4,<0.7 (f5b244e)
deps: update dependency duckdb to >=0.3.2,<0.5.0 (57ee818)
deps: update dependency duckdb-engine to >=0.1.8,<0.3.0 (3e379a0)
deps: update dependency geoalchemy2 to >=0.6.3,<0.13 (c04a533)
deps: update dependency geopandas to >=0.6,<0.12 (b899c37)
deps: update dependency Shapely to >=1.6,<1.8.3 (87a49ad)
deps: update dependency toolz to >=0.11,<0.13 (258a641)
don't mask udf module in init.py (3e567ba)
duckdb: ensure that paths with non-extension . chars are parsed correctly (9448fd3)
duckdb: fix struct datatype parsing (5124763)
duckdb: force string_agg separator to be a constant (21cdf2f)
duckdb: handle multiple dotted extensions; quote names; consolidate implementations (1494246)
duckdb: remove timezone function invocation (33d38fc)
geospatial: ensure that later versions of numpy are compatible with geospatial code (33f0afb)
impala: explicitly declare a delimited table as stored as textfile (04086a4), closes #4260
impala: remove broken nth_value implementation (dbc9cc2)
ir: don't attempt fusion when projections aren't exactly equivalent (3482ba2)
mysql: cast mysql timestamp literals to ensure correct return type (8116e04)
mysql: implement integer to timestamp using from_unixtime (1b43004)
pandas/dask: look at pre_execute for has_operation reporting (cb44efc)
pandas: execute negate on bool as not (330ab4f)
pandas: fix struct inference from dict in the pandas backend (5886a9a)
pandas: force backend options registration on trace.enable() calls (8818fe6)
pandas: handle empty boolean column casting in Series conversion (f697e3e)
pandas: handle struct columns with NA elements (9a7c510)
pandas: handle the case of selection from a join when remapping overlapping column names (031c4c6)
pandas: perform correct equality comparison (d62e7b9)
postgres/duckdb: cast after milliseconds computation instead of after extraction (bdd1d65)
pyspark: handle predicates in Aggregation (842c307)
pyspark: prevent spark from trying to convert timezone of naive timestamps (dfb4127)
pyspark: remove xpassing test for #2453 (c051e28)
pyspark: specialize implementation of has_operation (5082346)
pyspark: use empty check for collect_list in GroupConcat rule (df66acb)
repr: allow DestructValue selections to be formatted by fmt (4b45d87)
repr: when formatting DestructValue selections, use struct field names as column names (d01fe42)
sqlalchemy: fix parsing and construction of nested array types (e20bcc0)
sqlalchemy: remove unused second argument when creating temporary views (8766b40)
sqlite: register conversion to isoformat for pandas.Timestamp (fe95dca)
sqlite: test case with whitespace at the end of the line (7623ae9)
sql: use isoformat for timestamp literals (70d0ba6)
type-system: infer null datatype for empty sequence of expressions (f67d5f9)
use bounded precision for decimal aggregations (596acfb)

Performance Improvements

analysis: add _projection as cached_property to avoid reconstruction of projections (98510c8)
lineage: ensure that expressions are not traversed multiple times in most cases (ff9708c)

Reverts

ci: install sqlite3 on ubuntu (1f2705f)

3.0.2

2 years ago

3.0.2 (2022-04-28)

Bug Fixes

docs: fix tempdir location for docs build (dcd1b22)

3.0.1

2 years ago

3.0.1 (2022-04-28)

Bug Fixes

build: replace version before exec plugin runs (573139c)

3.0.0

2 years ago

3.0.0 (2022-04-25)

⚠ BREAKING CHANGES

ir: The following are breaking changes due to simplifying expression internals
- ibis.expr.datatypes.DataType.scalar_type and DataType.column_type factory methods have been removed; the DataType.scalar and DataType.column class fields can be used to directly construct a corresponding expression instance (though prefer operation.to_expr())
- ibis.expr.types.ValueExpr._name and ValueExpr._dtype fields are no longer accessible. These were never meant to be used directly; the ValueExpr.has_name(), ValueExpr.get_name() and ValueExpr.type() methods are now the only way to retrieve the expression's name and datatype.
- ibis.expr.operations.Node.output_type is now a property, not a method; decorate those methods with @property
- ibis.expr.operations.ValueOp subclasses must define output_shape and output_dtype properties from now on (note the datatype abbreviation dtype in the property name)
- ibis.expr.rules.cast(), scalar_like() and array_like() rules have been removed
api: Replace t["a"].distinct() with t[["a"]].distinct().
deps: The sqlalchemy lower bound is now 1.4
ir: Schema.names and Schema.types attributes now have tuple type rather than list
expr: Columns that were added or used in an aggregation or mutation were previously sorted alphabetically in compiled SQL outputs. This was a vestige from when Python dicts didn't preserve insertion order. Columns now appear in the order in which they were passed to aggregate or mutate.
api: dt.float is now dt.float64; use dt.float32 for the previous behavior.
ir: Relation-based execute_node dispatch rules must now accept tuples of expressions.
ir: removed ibis.expr.lineage.{roots,find_nodes} functions
config: The graphviz repr in notebooks is now disabled by default; use ibis.options.graphviz_repr = True to enable it
hdfs: Use fsspec instead of HDFS from ibis
udf: Vectorized UDF coercion functions are no longer a public API.
The minimum supported Python version is now Python 3.8
config: register_option is no longer supported, please submit option requests upstream
backends: The hdf5 backend is removed; read tables with pandas.read_hdf and use the pandas backend
The CSV backend is removed. Use Datafusion for CSV execution.
backends: The parquet backend is removed; use the datafusion backend to read parquet files
The blanket call on expressions is removed; replace Expr() with Expr.pipe()
coercion functions previously in expr/schema.py are now in udf/vectorized.py
api: materialize is removed. Joins with overlapping columns now have suffixes.
kudu: use impala instead: https://kudu.apache.org/docs/kudu_impala_integration.html
Any code that was relying implicitly on string-y behavior from UUID datatypes will need to add an explicit cast first.
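
Two of the breaking changes above (Schema.names and Schema.types becoming tuples, and columns keeping the order in which they were passed) come down to ordinary Python behavior. The sketch below uses plain Python only and does not import ibis. The tuple stands in for Schema.names, and collect_columns is a hypothetical helper, not an ibis function. Treat it as an illustration of what may need adjusting in calling code, not as a description of ibis internals.

```python
# Plain-Python sketch; no ibis import is needed. All names are hypothetical.

# 1. Schema.names and Schema.types are tuples now, not lists.
#    A plain tuple stands in for Schema.names here.
names = ("a", "b")

try:
    names + ["c"]  # list concatenation worked while names was a list
except TypeError:
    concat_failed = True
else:
    concat_failed = False

# Two ways to extend the names without relying on list behavior
extended_tuple = (*names, "c")
extended_list = [*names, "c"]

# 2. Columns keep the order in which they were passed.
def collect_columns(**exprs):
    """Return the keyword names in the order they were passed."""
    # Keyword arguments preserve insertion order in Python 3.7+
    return list(exprs)

passed_order = collect_columns(total="...", avg="...", count="...")
# Alphabetical sorting, as described for the old compiled SQL output
alphabetical = sorted(passed_order)
```

Code that compared compiled SQL against a fixed column order, or that appended to Schema.names in place, is the kind most likely to need a small change after upgrading.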

Features

add repr_html for expressions to print as tables in ipython (cd6fa4e)
add duckdb backend (667f2d5)
allow construction of decimal literals (3d9e865)
api: add ibis.asc expression (efe177e), closes #1454
api: add has_operation API to the backend (4fab014)
api: implement type for SortExpr (ab19bd6)
clickhouse: implement string concat for clickhouse (1767205)
clickhouse: implement StrRight operation (67749a0)
clickhouse: implement table union (e0008d7)
clickhouse: implement trim, pad and string predicates (a5b7293)
datafusion: implement Count operation (4797a86)
datatypes: unbounded decimal type (f7e6f65)
date: add ibis.date(y,m,d) functionality (26892b6), closes #386
duckdb/postgres/mysql/pyspark: implement .sql on tables for mixing sql and expressions (00e8087)
duckdb: add functionality needed to pass integer to interval test (e2119e8)
duckdb: implement _get_schema_using_query (93cd730)
duckdb: implement now() function (6924f50)
duckdb: implement regexp replace and extract (18d16a7)
implement force argument in sqlalchemy backend base class (9df7f1b)
implement coalesce for the pyspark backend (8183efe)
implement semi/anti join for the pandas backend (cb36fc5)
implement semi/anti join for the pyspark backend (3e1ba9c)
implement the remaining clickhouse joins (b3aa1f0)
ir: rewrite and speed up expression repr (45ce9b2)
mysql: implement _get_schema_from_query (456cd44)
mysql: move string join impl up to alchemy for mysql (77a8eb9)
postgres: implement _get_schema_using_query (f2459eb)
pyspark: implement Distinct for pyspark (4306ad9)
pyspark: implement log base b for pyspark (527af3c)
pyspark: implement percent_rank and enable testing (c051617)
repr: add interval info to interval repr (df26231)
sqlalchemy: implement ilike (43996c0)
sqlite: implement date_truncate (3ce4f2a)
sqlite: implement ISO week of year (714ff7b)
sqlite: implement string join and concat (6f5f353)
support of arrays and tuples for clickhouse (db512a8)
ver: dynamic version identifiers (408f862)

Bug Fixes

added wheel to pyproject toml for venv users (b0b8e5c)
allow major version changes in CalVer dependencies (9c3fbe5)
annotable: allow optional arguments at any position (778995f), closes #3730
api: add ibis.map and .struct (327b342), closes #3118
api: map string multiplication with integer to repeat method (b205922)
api: thread suffixes parameter to individual join methods (31a9aff)
change TimestampType to Timestamp (e0750be)
clickhouse: disconnect from clickhouse when computing version (11cbf08)
clickhouse: use a context manager for execution (a471225)
combine windows during windowization (7fdd851)
conform epoch_seconds impls to expression return type (18a70f1)
context-adjustment: pass scope when calling adjust_context in pyspark backend (33aad7b), closes #3108
dask: fix asof joins for newer version of dask (50711cc)
dask: workaround dask bug (a0f3bd9)
deps: update dependency atpublic to v3 (3fe8f0d)
deps: update dependency datafusion to >=0.4,<0.6 (3fb2194)
deps: update dependency geoalchemy2 to >=0.6.3,<0.12 (dc3c361)
deps: update dependency graphviz to >=0.16,<0.21 (3014445)
duckdb: add casts to literals to fix binding errors (1977a55), closes #3629
duckdb: fix array column type discovery on leaf tables and add tests (15e5412)
duckdb: fix log with base b impl (4920097)
duckdb: support both 0.3.2 and 0.3.3 (a73ccce)
enforce the schema's column names in apply_to (b0f334d)
expose ops.IfNull for mysql backend (156c2bd)
expr: add more binary operators to char list and implement fallback (b88184c)
expr: fix formatting of table info using tabulate (b110636)
fix float vs real data type detection in sqlalchemy (24e6774)
fix list_schemas argument (69c1abf)
fix postgres udfs and reenable ci tests (7d480d2)
fix tablecolumn execution for filter following join (064595b)
format: remove some newlines from formatted expr repr (ed4fa78)
histogram: cross_join needs onclause=True (5d36a58), closes #622
ibis.expr.signature.Parameter is not pickleable (828fd54)
implement coalesce properly in the pandas backend (aca5312)
implement count on tables for pyspark (7fe5573), closes #2879
infer coalesce types when a non-null expression occurs after the first argument (c5f2906)
mutate: do not lift table column that results from mutate (ba4e5e5)
pandas: disable range windows with order by (e016664)
pandas: don't reassign the same column to silence SettingWithCopyWarning warning (75dc616)
pandas: implement percent_rank correctly (d8b83e7)
prevent unintentional cross joins in mutate + filter (83eef99)
pyspark: fix range windows (a6f2aa8)
regression in Selection.sort_by with resolved_keys (c7a69cd)
regression in sort_by with resolved_keys (63f1382), closes #3619
remove broken csv pre_execute (93b662a)
remove importorskip call for backend tests (2f0bcd8)
remove incorrect fix for pandas regression (339f544)
remove passing schema into register_parquet (bdcbb08)
repr: add ops.TimeAdd to repr binop lookup table (fd94275)
repr: allow ops.TableNode in fmt_value (6f57003)
reverse the predicate pushdown substitution (f3cd358)
sort_index to satisfy pandas 1.4.x (6bac0fc)
sqlalchemy: ensure correlated subqueries FROM clauses are rendered (3175321)
sqlalchemy: use corresponding_column to prevent spurious cross joins (fdada21)
sqlalchemy: use replace selectables to prevent semi/anti join cross join (e8a1a71)
sql: retain column names for named ColumnExprs (f1b4b6e), closes #3754
sql: walk right join trees and substitute joins with right-side joins with views (0231592)
store schema on the pandas backend to allow correct inference (35070be)

Performance Improvements

datatypes: speed up str and hash (262d3d7)
fast path for simple column selection (d178498)
ir: global equality cache (13c2bb2)
ir: introduce CachedEqMixin to speed up equality checks (b633925)
repr: remove full tree repr from rule validator error message (65885ab)
speed up attribute access (89d1c05)
use assign instead of concat in projections when possible (985c242)

Miscellaneous Chores

deps: increase sqlalchemy lower bound to 1.4 (560854a)
drop support for Python 3.7 (0afd138)

Code Refactoring

api: make primitive types more cohesive (71da8f7)
api: remove distinct ColumnExpr API (3f48cb8)
api: remove materialize (24285c1)
backends: remove the hdf5 backend (ff34f3e)
backends: remove the parquet backend (b510473)
config: disable graphviz-repr-in-notebook by default (214ad4e)
config: remove old config code and port to pydantic (4bb96d1)
dt.UUID inherits from DataType, not String (2ba540d)
expr: preserve column ordering in aggregations/mutations (668be0f)
hdfs: replace HDFS with fsspec (cc6eddb)
ir: make Annotable immutable (1f2b3fa)
ir: make schema annotable (b980903)
ir: remove unused lineage roots and find_nodes functions (d630a77)
ir: simplify expressions by not storing dtype and name (e929f85)
kudu: remove support for use of kudu through kudu-python (36bd97f)
move coercion functions from schema.py to udf (58eea56), closes #3033
remove blanket call for Expr (3a71116), closes #2258
remove the csv backend (0e3e02e)
udf: make coerce functions in ibis.udf.vectorized private (9ba4392)

2.1.1

2 years ago

2.1.1 (2022-01-12)

Bug Fixes

setup.py: set the correct version number for 2.1.0 (f3d267b)

2.1.0

2 years ago

2.1.0 (2022-01-12)

Bug Fixes

consider all packages' entry points (b495cf6)
datatypes: infer bytes literal as binary #2915 (#3124) (887efbd)
deps: bump minimum dask version to 2021.10.0 (e6b5c09)
deps: constrain numpy to ensure wheels are used on windows (70c308b)
deps: update dependency clickhouse-driver to ^0.1 || ^0.2.0 (#3061) (a839d54)
deps: update dependency geoalchemy2 to >=0.6,<0.11 (4cede9d)
deps: update dependency pyarrow to v6 (#3092) (61e52b5)
don't force backends to override do_connect until 3.0.0 (4b46973)
execute materialized joins in the pandas and dask backends (#3086) (9ed937a)
literal: allow creating ibis literal with uuid (#3131) (b0f4f44)
restore the ability to have more than two option levels (#3151) (fb4a944)
sqlalchemy: fix correlated subquery compilation (43b9010)
sqlite: defer db connection until needed (#3127) (5467afa), closes #64

Features

allow column_of to take a column expression (dbc34bb)
ci: More readable workflow job titles (#3111) (d8fd7d9)
datafusion: initial implementation for Arrow Datafusion backend (3a67840), closes #2627
datafusion: initial implementation for Arrow Datafusion backend (75876d9), closes #2627
make dayofweek impls conform to pandas semantics (#3161) (9297828)

Reverts

"ci: install gdal for fiona" (8503361)