dplyr: A grammar of data manipulation
join_by() now allows its helper functions to be namespaced with dplyr::, like join_by(dplyr::between(x, lower, upper)) (#6838).
left_join() and friends now return a specialized error message if they detect that your join would return more rows than dplyr can handle (#6912).
slice_*() now throw the correct error if you forget to name n while also prefixing the call with dplyr:: (#6946).
dplyr_reconstruct()'s default method has been rewritten to avoid materializing duckplyr queries too early (#6947).
Updated the storms data to include 2022 data (#6937, @steveharoz).
Updated the starwars data to use a new API, because the old one is defunct. There are very minor changes to the data itself (#6938, @steveharoz).
mutate_each() and summarise_each() now throw correct deprecation messages (#6869).
setequal() now requires the input data frames to be compatible, similar to the other set methods like setdiff() or intersect() (#6786).
count() better documents that it has a .drop argument (#6820).
Fixed tests to maintain compatibility with the next version of waldo (#6823).
Joins better handle key columns with all NAs (#6804).
Mutating joins now warn about multiple matches much less often. At a high level, a warning was previously being thrown when a one-to-many or many-to-many relationship was detected between the keys of x and y, but is now only thrown for a many-to-many relationship, which is much rarer and much more dangerous than one-to-many because it can result in a Cartesian explosion in the number of rows returned from the join (#6731, #6717).
We've accomplished this in two steps:
multiple now defaults to "all", and the options of "error" and "warning" are now deprecated in favor of using relationship (see below). We are using an accelerated deprecation process for these two options because they've only been available for a few weeks, and relationship is a clearly superior alternative.
The mutating joins gain a new relationship argument, allowing you to optionally enforce one of the following relationship constraints between the keys of x and y: "one-to-one", "one-to-many", "many-to-one", or "many-to-many".
For example, "many-to-one" enforces that each row in x can match at most 1 row in y. If a row in x matches >1 rows in y, an error is thrown. This option serves as the replacement for multiple = "error".
The default behavior of relationship doesn't assume that there is any relationship between x and y. However, for equality joins it will check for the presence of a many-to-many relationship, and will warn if it detects one.
This change unfortunately does mean that if you have set multiple = "all" to avoid a warning and you happened to be doing a many-to-many style join, then you will need to replace multiple = "all" with relationship = "many-to-many" to silence the new warning, but we believe this should be rare since many-to-many relationships are fairly uncommon.
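To illustrate the relationship argument described above, here is a minimal sketch. The orders and customers tibbles are made-up data, and the example assumes a dplyr version that already has the relationship argument.

```r
library(dplyr)

# Hypothetical data: several orders can belong to one customer.
orders <- tibble(customer_id = c(1, 2, 2), amount = c(10, 20, 30))
customers <- tibble(customer_id = c(1, 2), name = c("Ana", "Bo"))

# Each order matches at most one customer, so the constraint holds.
ok <- left_join(orders, customers, by = "customer_id", relationship = "many-to-one")

# A duplicated key in y violates "many-to-one" and raises an error.
dup_customers <- tibble(customer_id = c(1, 2, 2), name = c("Ana", "Bo", "Bea"))
failed <- tryCatch(
  {
    left_join(orders, dup_customers, by = "customer_id", relationship = "many-to-one")
    FALSE
  },
  error = function(e) TRUE
)
```

Here ok keeps the three rows of orders, while failed records that the second join was rejected instead of silently duplicating rows.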
Fixed a major performance regression in case_when(). It is still a little slower than in dplyr 1.0.10, but we plan to improve this further in the future (#6674).
Fixed a performance regression related to nth(), first(), and last() (#6682).
Fixed an issue where expressions involving infix operators had an abnormally large amount of overhead (#6681).
group_data() on ungrouped data frames is faster (#6736).
n() is a little faster when there are many groups (#6727).
pick() now returns a 1 row, 0 column tibble when ... evaluates to an empty selection. This makes it more compatible with tidyverse recycling rules in some edge cases (#6685).
if_else() and case_when() again accept logical conditions that have attributes (#6678).
arrange() can once again sort the numeric_version type from base R (#6680).
slice_sample() now works when the input has a column named replace. slice_min() and slice_max() now work when the input has columns named na_rm or with_ties (#6725).
nth() now errors informatively if n is NA (#6682).
Joins now throw a more informative error when y doesn't have the same source as x (#6798).
All major dplyr verbs now throw an informative error message if the input data frame contains a column named NA or "" (#6758).
Deprecation warnings thrown by filter() now mention the correct package where the problem originated from (#6679).
Fixed an issue where using <- within a grouped mutate() or summarise() could cross contaminate other groups (#6666).
The compatibility vignette has been replaced with a more general vignette on using dplyr in packages, vignette("in-packages") (#6702).
The developer documentation in ?dplyr_extending has been refreshed and brought up to date with all changes made in 1.1.0 (#6695).
rename_with() now includes an example of using paste0(recycle0 = TRUE) to correctly handle empty selections (#6688).
R >= 3.5.0 is now explicitly required. This is in line with the tidyverse policy of supporting the 5 most recent versions of R.
.by/by is an experimental alternative to group_by() that supports per-operation grouping for mutate(), summarise(), filter(), and the slice() family (#6528).
Rather than:
starwars %>%
  group_by(species, homeworld) %>%
  summarise(mean_height = mean(height))
You can now write:
starwars %>%
  summarise(
    mean_height = mean(height),
    .by = c(species, homeworld)
  )
The most useful reason to do this is because .by only affects a single operation. In the example above, an ungrouped data frame went into the summarise() call, so an ungrouped data frame will come out; with .by, you never need to remember to ungroup() afterwards and you never need to use the .groups argument.
Additionally, using summarise() with .by will never sort the results by the group key, unlike with group_by(). Instead, the results are returned using the existing ordering of the groups from the original data. We feel this is more predictable, better maintains any ordering you might have already applied with a previous call to arrange(), and provides a way to maintain the current ordering without having to resort to factors.
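The ordering difference can be seen in a small sketch. The tibble df is invented for illustration, with group "b" deliberately appearing before group "a".

```r
library(dplyr)

# Hypothetical data where group "b" appears before group "a".
df <- tibble(g = c("b", "a", "b"), x = c(1, 2, 3))

by_out <- df %>% summarise(total = sum(x), .by = g)
gb_out <- df %>% group_by(g) %>% summarise(total = sum(x))

by_out$g              # groups in order of first appearance
gb_out$g              # groups sorted by the group key
is_grouped_df(by_out) # .by leaves no grouping behind
```

With .by the groups come back as "b" then "a", matching the input, whereas the group_by() version returns them sorted.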
This feature was inspired by data.table, where the equivalent syntax looks like:
starwars[, .(mean_height = mean(height)), by = .(species, homeworld)]
with_groups() is superseded in favor of .by (#6582).
reframe() is a new experimental verb that creates a new data frame by applying functions to columns of an existing data frame. It is very similar to summarise(), with two big differences:
reframe() can return an arbitrary number of rows per group, while summarise() reduces each group down to a single row.
reframe() always returns an ungrouped data frame, while summarise() might return a grouped or rowwise data frame, depending on the scenario.
reframe() has been added in response to valid concern from the community that allowing summarise() to return any number of rows per group increases the chance for accidental bugs. We still feel that this is a powerful technique, and is a principled replacement for do(), so we have moved these features to reframe() (#6382).
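A minimal sketch of the two differences listed above, using made-up data and base R's range() as a function that returns two values per group:

```r
library(dplyr)

# Hypothetical data with two groups of two rows each.
df <- tibble(g = c("a", "a", "b", "b"), x = c(1, 3, 5, 7))

# range() returns two values, so each group contributes two rows.
bounds <- df %>% reframe(bound = range(x), .by = g)

# Even when the input is grouped, the result is ungrouped.
bounds_grouped <- df %>% group_by(g) %>% reframe(bound = range(x))
```

Both results have four rows, and neither carries any grouping, so no ungroup() call is needed afterwards.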
group_by() now uses a new algorithm for computing groups. It is often faster than the previous approach (especially when there are many groups), and in most cases there should be no changes. The one exception is with character vectors, see the C locale news bullet below for more details (#4406, #6297).
arrange() now uses a faster algorithm for sorting character vectors, which is heavily inspired by data.table's forder(). See the C locale news bullet below for more details (#4962).
Joins have been completely overhauled to enable more flexible join operations and provide more tools for quality control. Many of these changes are inspired by data.table's join syntax (#5914, #5661, #5413, #2240).
A join specification can now be created through join_by(). This allows you to specify both the left and right hand side of a join using unquoted column names, such as join_by(sale_date == commercial_date). Join specifications can be supplied to any *_join() function as the by argument.
Join specifications allow for new types of joins:
Equality joins: The most common join, specified by ==. For example, join_by(sale_date == commercial_date).
Inequality joins: For joining on inequalities, i.e. >=, >, <, and <=. For example, use join_by(sale_date >= commercial_date) to find every commercial that aired before a particular sale.
Rolling joins: For "rolling" the closest match forward or backwards when there isn't an exact match, specified by using the rolling helper, closest(). For example, join_by(closest(sale_date >= commercial_date)) to find only the most recent commercial that aired before a particular sale.
Overlap joins: For detecting overlaps between sets of columns, specified by using one of the overlap helpers: between(), within(), or overlaps(). For example, use join_by(between(commercial_date, sale_date_lower, sale_date)) to find commercials that aired before a particular sale, as long as they occurred after some lower bound, such as 40 days before the sale was made.
Note that you cannot use arbitrary expressions in the join conditions, like join_by(sale_date - 40 >= commercial_date). Instead, use mutate() to create a new column containing the result of sale_date - 40 and refer to that by name in join_by().
multiple is a new argument for controlling what happens when a row in x matches multiple rows in y. For equality joins and rolling joins, where this is usually surprising, this defaults to signalling a "warning", but still returns all of the matches. For inequality joins, where multiple matches are usually expected, this defaults to returning "all" of the matches. You can also return only the "first" or "last" match, "any" of the matches, or you can "error".
keep now defaults to NULL rather than FALSE. NULL implies keep = FALSE for equality conditions, but keep = TRUE for inequality conditions, since you generally want to preserve both sides of an inequality join.
unmatched is a new argument for controlling what happens when a row would be dropped because it doesn't have a match. For backwards compatibility, the default is "drop", but you can also choose to "error" if dropped rows would be surprising.
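The inequality and rolling joins described above can be sketched with a small invented dataset. The sales and commercials tibbles and their ID columns are hypothetical; only the column names sale_date and commercial_date come from the text.

```r
library(dplyr)

# Hypothetical sales and commercials.
sales <- tibble(
  sale_id = 1:2,
  sale_date = as.Date(c("2023-01-10", "2023-02-01"))
)
commercials <- tibble(
  commercial_id = 1:3,
  commercial_date = as.Date(c("2023-01-01", "2023-01-05", "2023-01-20"))
)

# Inequality join: every commercial that aired on or before each sale.
all_before <- inner_join(
  sales, commercials,
  by = join_by(sale_date >= commercial_date)
)

# Rolling join: only the most recent commercial on or before each sale.
most_recent <- inner_join(
  sales, commercials,
  by = join_by(closest(sale_date >= commercial_date))
)
```

The first sale follows two commercials and the second follows all three, so all_before has five rows, while most_recent keeps exactly one commercial per sale. Both date columns are retained, in line with the keep = NULL default for inequality conditions.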
across() gains an experimental .unpack argument to optionally unpack (as in, tidyr::unpack()) data frames returned by functions in .fns (#6360).
consecutive_id() for creating groups based on contiguous runs of the same values, like data.table::rleid() (#1534).
case_match() is a "vectorised switch" variant of case_when() that matches on values rather than logical expressions. It is like a SQL "simple" CASE WHEN statement, whereas case_when() is like a SQL "searched" CASE WHEN statement (#6328).
cross_join() is a more explicit and slightly more correct replacement for using by = character() during a join (#6604).
pick() makes it easy to access a subset of columns from the current group. pick() is intended as a replacement for across(.fns = NULL), cur_data(), and cur_data_all(). We feel that pick() is a much more evocative name when you are just trying to select a subset of columns from your data (#6204).
symdiff() computes the symmetric difference (#4811).
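As a rough sketch of three of the helpers listed above, the following uses consecutive_id(), case_match(), and pick() together in one mutate() call. The data and column names are invented for illustration.

```r
library(dplyr)

# Hypothetical data: a state that switches on and off.
df <- tibble(
  state = c("on", "on", "off", "on", "on"),
  x = 1:5,
  y = 6:10
)

out <- df %>%
  mutate(
    # New id each time the value of `state` changes.
    run = consecutive_id(state),
    # Match on values rather than logical conditions.
    label = case_match(state, "on" ~ "active", "off" ~ "idle"),
    # pick() returns the selected columns as a data frame.
    total = rowSums(pick(x, y))
  )
```

Note that the second run of "on" gets its own id (3), which is what distinguishes consecutive_id() from an ordinary group id.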
arrange() and group_by() now use the C locale, not the system locale, when ordering or grouping character vectors. This brings substantial performance improvements, increases reproducibility across R sessions, makes dplyr more consistent with data.table, and we believe it should affect little existing code. If it does affect your code, you can use options(dplyr.legacy_locale = TRUE) to quickly revert to the previous behavior. However, in general, we instead recommend that you use the new .locale argument to precisely specify the desired locale. For a full explanation please read the associated grouping and ordering tidyups.
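A small sketch of what C-locale ordering means in practice, using an invented character column. It assumes the dplyr.legacy_locale option has not been set in the session.

```r
library(dplyr)

# Hypothetical mixed-case strings.
df <- tibble(x = c("b", "A", "a", "B"))

# In the C locale, uppercase letters sort before lowercase ones.
c_sorted <- arrange(df, x)
c_sorted$x

# A locale-aware sort can be requested explicitly, for example
# arrange(df, x, .locale = "en"). To my understanding this needs the
# stringi package, so it is left as a comment here.
```

Many system locales would instead interleave the cases (for example "a", "A", "b", "B"), which is the kind of difference the .locale argument lets you control.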
bench_tbls(), compare_tbls(), compare_tbls2(), eval_tbls(), eval_tbls2(), location() and changes(), deprecated in 1.0.0, are now defunct (#6387).
frame_data(), data_frame_(), lst_() and tbl_sum() are no longer re-exported from tibble (#6276, #6277, #6278, #6284).
select_vars(), rename_vars(), select_var() and current_vars(), deprecated in 0.8.4, are now defunct (#6387).
across(), c_across(), if_any(), and if_all() now require the .cols and .fns arguments. In general, we now recommend that you use pick() instead of an empty across() call or across() with no .fns (e.g. across(c(x, y))) (#6523).
Relying on the previous default of .cols = everything() is deprecated. We have skipped the soft-deprecation stage in this case, because indirect usage of across() and friends in this way is rare.
Relying on the previous default of .fns = NULL is not yet formally soft-deprecated, because there was no good alternative until now, but it is discouraged and will be soft-deprecated in the next minor release.
Passing ... to across() is soft-deprecated because it's ambiguous when those arguments are evaluated. Now, instead of (e.g.) across(a:b, mean, na.rm = TRUE) you should write across(a:b, ~ mean(.x, na.rm = TRUE)) (#6073).
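The recommended lambda form can be sketched as follows, on a made-up tibble with missing values in both columns:

```r
library(dplyr)

# Hypothetical data with one missing value per column.
df <- tibble(a = c(1, NA, 3), b = c(4, 5, NA))

# Extra arguments go inside the lambda rather than through `...`.
means <- df %>%
  summarise(across(a:b, ~ mean(.x, na.rm = TRUE)))
```

Writing the call this way makes it explicit that na.rm = TRUE is evaluated as part of each mean() call.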
all_equal() is deprecated. We've advised against it for some time, and we explicitly recommend you use all.equal(), manually reordering the rows and columns as needed (#6324).
cur_data() and cur_data_all() are soft-deprecated in favour of pick() (#6204).
Using by = character() to perform a cross join is now soft-deprecated in favor of cross_join() (#6604).
filter()ing with a 1-column matrix is deprecated (#6091).
progress_estimate() is deprecated for all uses (#6387).
Using summarise() to produce a 0 or >1 row "summary" is deprecated in favor of the new reframe(). See the NEWS bullet about reframe() for more details (#6382).
All functions deprecated in 1.0.0 (released April 2020) and earlier now warn every time you use them (#6387). This includes combine(), src_local(), src_mysql(), src_postgres(), src_sqlite(), rename_vars_(), select_vars_(), summarise_each_(), mutate_each_(), as.tbl(), tbl_df(), and a handful of older arguments. They are likely to be made defunct in the next major version (but not before mid 2024).
slice()ing with a 1-column matrix is deprecated.
recode() is superseded in favour of case_match() (#6433).
recode_factor() is superseded. We don't have a direct replacement for it yet, but we plan to add one to forcats. In the meantime you can often use case_match(.ptype = factor(levels = )) instead (#6433).
transmute() is superseded in favour of mutate(.keep = "none") (#6414).
The .keep, .before, and .after arguments to mutate() have moved from experimental to stable.
The rows_*() family of functions have moved from experimental to stable.
Many of dplyr's vector functions have been rewritten to make use of the vctrs package, bringing greater consistency and improved performance.
between() can now work with all vector types, not just numeric and date-time. Additionally, left and right can now also be vectors (with the same length as x), and x, left, and right are cast to the common type before the comparison is made (#6183, #6260, #6478).
case_when() (#5106):
Has a new .default argument that is intended to replace usage of TRUE ~ default_value as a more explicit and readable way to specify a default value. In the future, we will deprecate the unsafe recycling of the LHS inputs that allows TRUE ~ to work, so we encourage you to switch to using .default.
No longer requires exact matching of the types of RHS values. For example, the following no longer requires you to use NA_character_.
x <- c("little", "unknown", "small", "missing", "large")
case_when(
  x %in% c("little", "small") ~ "one",
  x %in% c("big", "large") ~ "two",
  x %in% c("missing", "unknown") ~ NA
)
Supports a larger variety of RHS value types. For example, you can use a data frame to create multiple columns at once.
Has new .ptype and .size arguments which allow you to enforce a particular output type and size.
Has a better error when types or lengths were incompatible (#6261, #6206).
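A brief sketch of the .default argument combined with a bare NA on the right-hand side, using an invented numeric vector:

```r
library(dplyr)

# Hypothetical measurements, one of them missing.
x <- c(1, 5, 10, NA)

size <- case_when(
  x < 3 ~ "low",
  x < 8 ~ "mid",
  is.na(x) ~ NA,      # plain NA, no NA_character_ needed
  .default = "high"   # replaces the older TRUE ~ "high" idiom
)
```

The missing input is caught by the is.na() condition before .default applies, so it stays missing in the character result.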
coalesce() (#6265):
Discards NULL inputs up front.
No longer iterates over the columns of data frame input. Instead, a row is now only coalesced if it is entirely missing, which is consistent with vctrs::vec_detect_missing() and greatly simplifies the implementation.
Has new .ptype and .size arguments which allow you to enforce a particular output type and size.
first(), last(), and nth() (#6331):
When used on a data frame, these functions now return a single row rather than a single column. This is more consistent with the vctrs principle that a data frame is generally treated as a vector of rows.
The default is no longer "guessed", and will always automatically be set to a missing value appropriate for the type of x.
Error if n is not an integer. nth(x, n = 2) is fine, but nth(x, n = 2.5) is now an error.
Additionally, they have all gained an na_rm argument since they are summary functions (#6242, with contributions from @tnederlof).
if_else() gains most of the same benefits as case_when(). In particular, if_else() now takes the common type of true, false, and missing to determine the output type, meaning that you can now reliably use NA, rather than NA_character_ and friends (#6243).
na_if() (#6329) now casts y to the type of x before comparison, which makes it clearer that this function is type and size stable on x. In particular, this means that you can no longer do na_if(<tibble>, 0), which previously accidentally allowed you to replace any instance of 0 across every column of the tibble with NA. na_if() was never intended to work this way, and this is considered off-label usage.
You can also now replace NaN values in x with na_if(x, NaN).
lag() and lead() now cast default to the type of x, rather than taking the common type. This ensures that these functions are type stable on x (#6330).
row_number(), min_rank(), dense_rank(), ntile(), cume_dist(), and percent_rank() are faster and work for more types. You can now rank by multiple columns by supplying a data frame (#6428).
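Ranking by multiple columns can be sketched by passing a data frame built with pick() to min_rank(). The tibble below is invented; ties in a are broken by b.

```r
library(dplyr)

# Hypothetical data: the first two rows tie on `a`.
df <- tibble(a = c(1, 1, 2), b = c(2, 1, 1))

ranked <- df %>%
  mutate(rank = min_rank(pick(a, b)))
```

The row with a = 1 and b = 1 ranks first, the other a = 1 row second, and the a = 2 row third.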
with_order() now checks that the size of order_by is the same size as x, and now works correctly when order_by is a data frame (#6334).
Fixed an issue with latest rlang that caused internal tools (such as mask$eval_all_summarise()) to be mentioned in error messages (#6308).
Warnings are enriched with contextualised information in summarise() and filter() just like they have been in mutate() and arrange().
Joins now reference the correct column in y when a type error is thrown while joining on two columns with different names (#6465).
Joins on very wide tables are no longer bottlenecked by the application of suffix (#6642).
*_join() now error if you supply them with additional arguments that aren't used (#6228).
across() used without functions inside a rowwise-data frame no longer generates an invalid data frame (#6264).
Anonymous functions supplied with function() and \() are now inlined by across() if possible, which slightly improves performance and makes possible further optimisations in the future.
Functions supplied to across() are no longer masked by columns (#6545). For instance, across(1:2, mean) will now work as expected even if there is a column called mean.
across() will now error when supplied ... without a .fns argument (#6638).
arrange() now correctly ignores NULL inputs (#6193).
arrange() now works correctly when across() calls are used as the 2nd (or more) ordering expression (#6495).
arrange(df, mydesc::desc(x)) works correctly when mydesc re-exports dplyr::desc() (#6231).
c_across() now evaluates all_of() correctly and no longer allows you to accidentally select grouping variables (#6522).
c_across() now throws a more informative error if you try to rename during column selection (#6522).
dplyr no longer provides count() and tally() methods for tbl_sql. These methods have been accidentally overriding the tbl_lazy methods that dbplyr provides, which has resulted in issues with the grouping structure of the output (#6338, tidyverse/dbplyr#940).
cur_group() now works correctly with zero row grouped data frames (#6304).
desc() gives a useful error message if you give it a non-vector (#6028).
distinct() now retains attributes of bare data frames (#6318).
distinct() returns columns ordered the way you request, not the same as the input data (#6156).
Error messages in group_by(), distinct(), tally(), and count() are now more relevant (#6139).
group_by_prepare() loses the caller_env argument. It was rarely used and it is no longer needed (#6444).
group_walk() gains an explicit .keep argument (#6530).
Warnings emitted inside mutate() and variants are now collected and stashed away. Run the new last_dplyr_warnings() function to see the warnings emitted within dplyr verbs during the last top-level command.
This fixes performance issues when thousands of warnings are emitted with rowwise and grouped data frames (#6005, #6236).
mutate() behaves a little better with 0-row rowwise inputs (#6303).
A rowwise mutate() now automatically unlists list-columns containing length 1 vectors (#6302).
nest_join() has gained the na_matches argument that all other joins have.
nest_join() now preserves the type of y (#6295).
n_distinct() now errors if you don't give it any input (#6535).
nth(), first(), last(), and with_order() now sort character order_by vectors in the C locale. Using character vectors for order_by is rare, so we expect this to have little practical impact (#6451).
ntile() now requires n to be a single positive integer.
relocate() now works correctly with empty data frames and when .before or .after result in empty selections (#6167).
relocate() no longer drops attributes of bare data frames (#6341).
relocate() now retains the last name change when a single column is renamed multiple times while it is being moved. This better matches the behavior of rename() (#6209, with help from @eutwt).
rename() now contains examples of using all_of() and any_of() to rename using a named character vector (#6644).
rename_with() now disallows renaming in the .cols tidy-selection (#6561).
rename_with() now checks that the result of .fn is the right type and size (#6561).
rows_insert() now checks that y contains the by columns (#6652).
setequal() ignores differences between freely coercible types (e.g. integer and double) (#6114) and ignores duplicated rows (#6057).
slice() helpers again produce output equivalent to slice(.data, 0) when the n or prop argument is 0, fixing a bug introduced in the previous version (@eutwt, #6184).
slice() with no inputs now returns 0 rows. This is mostly for theoretical consistency (#6573).
slice() now errors if any expressions in ... are named. This helps avoid accidentally misspelling an optional argument, such as .by (#6554).
slice_*() now requires n to be an integer.
slice_*() generics now perform argument validation. This should make methods more consistent and simpler to implement (#6361).
slice_min() and slice_max() can order_by multiple variables if you supply them as a data.frame or tibble (#6176).
slice_min() and slice_max() now consistently include missing values in the result if necessary (i.e. there aren't enough non-missing values to reach the n or prop you have selected). If you don't want missing values to be included at all, set na_rm = TRUE (#6177).
slice_sample() now accepts negative n and prop values (#6402).
slice_sample() returns a data frame or group with the same number of rows as the input when replace = FALSE and n is larger than the number of rows or prop is larger than 1. This reverts a change made in 1.0.8, returning to the behavior of 1.0.7 (#6185).
slice_sample() now gives a more informative error when replace = FALSE and the number of rows requested in the sample exceeds the number of rows in the data (#6271).
storms has been updated to include 2021 data and some missing storms that were omitted due to an error (@steveharoz, #6320).
summarise() now correctly recycles named 0-column data frames (#6509).
union_all(), like union(), now requires that data frames be compatible: i.e. they have the same columns, and the columns have compatible types.
where() is re-exported from tidyselect (#6597).
Hot patch release to resolve R CMD check failures.
New rows_append() which works like rows_insert() but ignores keys and allows you to insert arbitrary rows with a guarantee that the type of x won't change (#6249, thanks to @krlmlr for the implementation and @mgirlich for the idea).
The rows_*() functions no longer require that the key values in x uniquely identify each row. Additionally, rows_insert() and rows_delete() no longer require that the key values in y uniquely identify each row. Relaxing this restriction should make these functions more practically useful for data frames, and alternative backends can enforce this in other ways as needed (i.e. through primary keys) (#5553).
rows_insert() gained a new conflict argument allowing you greater control over rows in y with keys that conflict with keys in x. A conflict arises if a key in y already exists in x. By default, a conflict results in an error, but you can now also "ignore" these y rows. This is very similar to the ON CONFLICT DO NOTHING command from SQL (#5588, with helpful additions from @mgirlich and @krlmlr).
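A minimal sketch of conflict = "ignore", with two invented tibbles that share the key id = 2:

```r
library(dplyr)

# Hypothetical existing rows and incoming rows; id 2 is in both.
x <- tibble(id = 1:2, value = c("a", "b"))
y <- tibble(id = 2:3, value = c("B", "c"))

# The conflicting y row (id 2) is skipped; only id 3 is inserted.
inserted <- rows_insert(x, y, by = "id", conflict = "ignore")

# With the default conflict = "error", the same call is rejected.
rejected <- tryCatch(
  {
    rows_insert(x, y, by = "id")
    FALSE
  },
  error = function(e) TRUE
)
```

The existing row for id 2 keeps its original value "b", mirroring the ON CONFLICT DO NOTHING behaviour mentioned above.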
rows_update(), rows_patch(), and rows_delete() gained a new unmatched argument allowing you greater control over rows in y with keys that are unmatched by the keys in x. By default, an unmatched key results in an error, but you can now also "ignore" these y rows (#5984, #5699).
rows_delete() no longer requires that the columns of y be a strict subset of x. Only the columns specified through by will be utilized from y, all others will be dropped with a message.
The rows_*() functions now always retain the column types of x. This behavior was documented, but previously wasn't being applied correctly (#6240).
The rows_*() functions now fail elegantly if y is a zero column data frame and by isn't specified (#6179).
Better display of error messages thanks to rlang 1.0.0.
mutate(.keep = "none") is no longer identical to transmute(). transmute() has not been changed, and completely ignores the column ordering of the existing data, instead relying on the ordering of expressions supplied through .... mutate(.keep = "none") has been changed to ensure that pre-existing columns are never moved, which aligns more closely with the other .keep options (#6086).
filter() forbids matrix results (#5973) and warns about data frame results, especially data frames created from across() with a hint to use if_any() or if_all().
slice() helpers (slice_head(), slice_tail(), slice_min(), slice_max()) now accept negative values for n and prop (#5961).
slice() now indicates which group produces an error (#5931).
cur_data() and cur_data_all() don't simplify list columns in rowwise data frames (#5901).
dplyr now uses rlang::check_installed() to prompt you whether to install required packages that are missing.
storms data updated to 2020 (@steveharoz, #5899).
coalesce() accepts 1-D arrays (#5557).
The deprecated trunc_mat() is no longer reexported from dplyr (#6141).
across() uses the formula environment when inlining formulas (#5886).
summarise.rowwise_df() is quiet when the result is ungrouped (#5875).
c_across() and across() key deparsing is no longer confused by long calls (#5883).
across() handles named selections (#5207).
add_count() is now generic (#5837).
if_any() and if_all() abort when a predicate is mistakenly used as .cols= (#5732).
Multiple calls to if_any() and/or if_all() in the same expression are now properly disambiguated (#5782).
filter() now inlines if_any() and if_all() expressions. This greatly improves performance with grouped data frames.
Fixed behaviour of ... in top-level across() calls (#5813, #5832).
across() now inlines lambda-formulas. This is slightly more performant and will allow more optimisations in the future.
Fixed issue in bind_rows() causing lists to be incorrectly transformed as data frames (#5417, #5749).
select() no longer creates duplicate variables when renaming a variable to the same name as a grouping variable (#5841).
dplyr_col_select() keeps attributes for bare data frames (#5294, #5831).
Fixed quosure handling in dplyr::group_by() that caused issues with extra arguments (tidyverse/lubridate#959).
Removed the name argument from the compute() generic (@ianmcook, #5783).
Row-wise data frames of 0 rows and list columns are supported again (#5804).