Tidy data structures, summaries, and visualisations for missing data
impute_fixed, impute_zero, and impute_factor. Notably, these do not implement "scoped variants", which were previously implemented (for example, impute_fixed_if etc.). This is in favour of using the new across workflow within dplyr, and it is easier to maintain. #261
digit argument to miss_var_summary to help display %missing data correctly when there is a very small fraction of missingness. #284
impute_mode - resolves #213.
geom_miss_point() works with shape argument #290
all_complete, which was implemented as !anyNA(x) but should be all(complete.cases(x)).
any_na() (and any_miss()) and any_complete(). Rework examples to demonstrate workflow for finding complete variables.
shadow_long not working when gathering variables of mixed type. Fix involves specifying a value transform, which defaults to character. #314
Date, POSIXct and POSIXlt methods for impute_below() - #158
gg_miss_fct where it used a deprecated function from forcats - #342
cli::cli_abort and cli::cli_warn instead of stop and warn (#326)
expect_snapshot instead of expect_error (#326)
shadow_shift - #193
miss_case_cumsum() and miss_var_cumsum() - #257

Version 1.0.0 of naniar signifies that this release accompanies the publication of the associated JSS paper, doi:10.18637/jss.v105.i07. A few small changes have also been implemented in this release; they are described below.
There is still a lot to do in naniar, and this release does not mean that no further changes are coming. Rather, it establishes that this is a stable release, and that any upcoming changes will go through a more formal deprecation process.
tidyr::gather with tidyr::pivot_longer - resolves #301
set_n_miss and set_prop_miss functions - resolved #298
gg_miss_var() where a warning appears due to a change in how to remove the legend #288.
vctrs and cli - which are both free dependencies, as they are already used within the tidyverse.
replace_with_na when columns provided that don't exist (see #160). Thank you to michael-dewar for their help with this.
nabular() and bind_shadow(). In doing so, removes the functions as_shadow(), is_shadow(), is_nabular(), new_nabular(), new_shadow(). These were mostly used internally, and it is not expected that users would have used these functions. If these were used, please file an issue and I can implement them again.
miss_var_prop()
complete_var_prop()
miss_var_pct()
complete_var_pct()
miss_case_prop()
complete_case_prop()
miss_case_pct()
complete_case_pct()
Instead use: prop_miss_var(), prop_complete_var(), pct_miss_var(), pct_complete_var(), prop_miss_case(), prop_complete_case(), pct_miss_case(), pct_complete_case(). (see #242)
replace_to_na() was made defunct; please use replace_with_na() instead. (see #242)
miss_var_cumsum and miss_case_cumsum are now exported
map_dfc instead of map_df
rowSums(is.na(x)), which was 3 times faster.
gg_miss_fct() where warning is given for non-explicit NA values - see #241.
tibble() not data_frame()
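The rowSums(is.na(x)) item in this list counts missing values per row in a single vectorised call. A minimal base R sketch of the idea, next to a slower row-by-row version for comparison. The helper names are hypothetical and this is not naniar's internal code; the speed-up is the changelog's claim and is not measured here.

```r
# Hypothetical helpers for illustration only; not naniar's internals.

# Vectorised: is.na() on a data frame gives a logical matrix,
# and rowSums() counts the TRUE values in each row.
n_miss_row_fast <- function(x) {
  unname(rowSums(is.na(x)))
}

# Row-by-row: visit each row and count its missing values.
n_miss_row_slow <- function(x) {
  vapply(seq_len(nrow(x)), function(i) sum(is.na(x[i, ])), integer(1))
}

df <- data.frame(a = c(1, NA, 3), b = c(NA, NA, "z"))

fast <- n_miss_row_fast(df)
slow <- n_miss_row_slow(df)
```

Both helpers give the same counts per row; the vectorised form avoids subsetting the data frame once per row.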
geom_miss_point() ggplot2 layer can now be converted into an interactive web-based version by the ggplotly() function in the plotly package. In order for this to work, naniar now exports the geom2trace.GeomMissPoint() function (users should never need to call geom2trace.GeomMissPoint() directly; ggplotly() calls it for you).
usethis::use_spell_check()
@seealso bug (#228) (@sfirke)

Thanks to a PR (#223) from @romainfrancois:
This fixes two problems that were identified as part of reverse dependency checks of dplyr 0.8.0 release candidate. https://github.com/tidyverse/dplyr/blob/revdep_dplyr_0_8_0_RC/revdep/problems.md#naniar
n() must be imported or prefixed like any other function. In the PR, I've changed 1:n() to dplyr::row_number() as naniar seems to prefix all dplyr functions.
update_shadow was only restoring the class attributes, changed so that it restores all attributes, this was causing problems when data was a grouped_df. This likely was a problem before too, but dplyr 0.8.0 is stricter about what is a grouped data frame.
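The difference between restoring only the class and restoring all attributes can be illustrated in base R. This is a toy sketch: the object, its class name, and the "groups" attribute are stand-ins invented here for a grouped data frame, and this is not the code of update_shadow.

```r
# A toy object with a class plus one extra attribute (assumed names, for illustration).
old <- structure(list(x = c(1, NA, 3)), class = "demo_tbl", groups = "x")

# lapply() returns a plain list: names are kept, other attributes are dropped.
stripped <- lapply(unclass(old), is.na)

# Restoring only the class loses the extra "groups" attribute ...
class_only <- stripped
class(class_only) <- class(old)

# ... whereas restoring every attribute brings it back.
all_attrs <- stripped
attributes(all_attrs) <- attributes(old)

is.null(attr(class_only, "groups"))
attr(all_attrs, "groups")
```

An object that carries its grouping information in attributes is therefore incomplete if only its class is put back.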
new_tibble #220 - Thanks to Kirill Müller.
rlang #218 - thanks to Lionel Henry.
Add custom label support for missings and not missings with functions add_label_missings() and add_label_shadow() and add_any_miss(). So you can now do `add_label_missings(data, missing = "custom_missing_label", complete = "custom_complete_label")`
impute_median() and scoped variants
any_shade() returns a logical TRUE or FALSE depending on whether there are any shade values
nabular(), an alias for bind_shadow(), to tie the nabular term into the work.
is_nabular() checks if input is nabular.
geom_miss_point() now gains the arguments from shadow_shift()/impute_below() for altering the amount of jitter and proportion below (prop_below).
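The roles of prop_below and jitter can be sketched in base R as follows. This is an illustration of the general idea only (place missing values a proportion of the range below the observed minimum, then add some noise); the function shift_below() and its exact formula are assumptions, not naniar's implementation.

```r
# Hypothetical sketch; the formula is an assumption, not naniar's code.
shift_below <- function(x, prop_below = 0.1, jitter = 0.05) {
  obs_range <- range(x, na.rm = TRUE)
  width <- diff(obs_range)
  n_na <- sum(is.na(x))

  # Centre the imputed values prop_below * range under the observed minimum ...
  centre <- obs_range[1] - prop_below * width
  # ... and spread them with uniform noise of at most jitter * range.
  noise <- stats::runif(n_na, min = -jitter * width, max = jitter * width)

  x[is.na(x)] <- centre + noise
  x
}

set.seed(2024)
ozone <- airquality$Ozone
shifted <- shift_below(ozone)
```

In this sketch, as long as jitter is smaller than prop_below, every imputed value stays below the smallest observed value, so missing values remain visually separate from observed ones in a scatterplot.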
Added two new vignettes, "Exploring Imputed Values", and "Special Missing Values"
miss_var_summary and miss_case_summary no longer provide the cumulative sum of missingness in the summaries - this summary can be added back to the data with the option add_cumsum = TRUE. #186
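For readers who want to see what a cumulative sum of missingness looks like, a per-variable summary can be approximated in base R. The column names below are assumptions chosen for illustration and may not match naniar's output exactly.

```r
# Count missing values per variable, largest first.
n_miss <- sort(colSums(is.na(airquality)), decreasing = TRUE)

var_summary <- data.frame(
  variable = names(n_miss),
  n_miss = unname(n_miss),
  pct_miss = unname(n_miss) / nrow(airquality) * 100
)

# The optional extra column: running total of missing values across variables.
var_summary$n_miss_cumsum <- cumsum(var_summary$n_miss)

var_summary
```

The cumulative column depends on row order, which is one reason it makes sense as an opt-in extra, not a default.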
gg_miss_upset
to replace workflow of:
data %>%
as_shadow_upset() %>%
UpSetR::upset()
recode_shadow now works! This function allows you to recode your missing values into special missing values. These special missing values are stored in the shadow part of the dataframe, which ends in _NA.
shade where appropriate throughout naniar, and also added verifiers, is_shade, are_shade, which_are_shade, and removed which_are_shadow.
as_shadow and bind_shadow now return data of class shadow. This will feed into recode_shadow methods for flexibly adding new types of missing data.
shadow might be changed to nabble or something similar.
add_label_shadow() and add_label_missings() gain arguments so you can only label according to the missingness / shadowy-ness of given variables.
which_are_shadow(), to tell you which values are shadows.
long_shadow(), which converts data in shadow/nabular form into a long format suitable for plotting. Related to #165
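The shadow/nabular form mentioned in these items pairs each variable with an _NA column that records whether the value is missing. A rough base R sketch of that layout is below; make_shadow() is a hypothetical helper, and the "!NA" / "NA" level names are an assumption about the labelling, not a copy of naniar's code.

```r
# Hypothetical sketch of a "shadow" data frame; not naniar's implementation.
make_shadow <- function(data) {
  shadow <- lapply(data, function(col) {
    factor(ifelse(is.na(col), "NA", "!NA"), levels = c("!NA", "NA"))
  })
  # Shadow columns carry the original name plus the _NA suffix.
  names(shadow) <- paste0(names(data), "_NA")
  as.data.frame(shadow)
}

# Data and shadow side by side, column-bound.
nabular_sketch <- cbind(airquality, make_shadow(airquality))

names(nabular_sketch)
```

Keeping the shadow columns as factors leaves room for extra levels, which is the kind of extension that recoding into special missing values relies on.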
miss_scan_count
gg_miss_upset gets a better default presentation by ordering by the largest intersections, and also an improved error message when data with only 1 or no variables have missing values.
shadow_shift gains a more informative error message when it doesn't know the class.
common_na_string to include escape characters for "?", "*", "." so that if they are used in replacement or searching functions they don't return the wildcard results from the characters "?", "*", and ".".
miss_case_table and miss_var_table now have final column names pct_vars and pct_cases instead of pct_miss - fixes #178.

old_names | new_names |
---|---|
miss_case_pct | pct_miss_case |
miss_case_prop | prop_miss_case |
miss_var_pct | pct_miss_var |
miss_var_prop | prop_miss_var |
complete_case_pct | pct_complete_case |
complete_case_prop | prop_complete_case |
complete_var_pct | pct_complete_var |
complete_var_prop | prop_complete_var |
These old names will be made defunct in 0.5.0, and removed completely in 0.6.0.
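As a reminder of what the renamed helpers measure, here is a small base R sketch. It assumes the prop_* functions return the proportion of variables (or cases) that contain at least one missing value, and that the pct_* versions are the same quantity times 100. The *_sketch names are hypothetical and this is not naniar's code.

```r
# Hypothetical base R equivalents, under the assumption stated above.
prop_miss_var_sketch <- function(data) mean(colSums(is.na(data)) > 0)
prop_miss_case_sketch <- function(data) mean(!complete.cases(data))

pct_miss_var_sketch <- function(data) prop_miss_var_sketch(data) * 100
pct_miss_case_sketch <- function(data) prop_miss_case_sketch(data) * 100

prop_miss_var_sketch(airquality)
prop_miss_case_sketch(airquality)
```

In airquality, two of the six variables and 42 of the 153 rows contain a missing value, which is what the two proportions reflect.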
impute_below has changed to be an alias of shadow_shift - that is, it operates on a single vector. impute_below_all operates on all columns in a dataframe (as specified in #159)
miss_scan_count actually return'd something.
gg_miss_var(airquality) now prints the ggplot - a typo meant that this did not print the plot

This release is a patch to remove a package imported but not used.
This is a patch release that removes tidyselect from the package Imports, as it is unnecessary. Fixes #174