Title: | Expand Tidy Output for Categorical Parameter Estimates |
---|---|
Description: | Create additional rows and columns on broom::tidy() output to allow for easier control of categorical parameter estimates. |
Authors: | Guy J. Abel [aut, cre] |
Maintainer: | Guy J. Abel <[email protected]> |
License: | GPL-3 |
Version: | 0.1.2 |
Built: | 2024-10-26 02:55:02 UTC |
Source: | https://github.com/guyabel/tidycat |
Primarily developed for use within tidycat::tidy_categorical()
factor_regex(m, at_start = TRUE)
m |
A model object, created using a function such as stats::lm() |
at_start |
Logical indicating whether or not to include |
A character string for use as a regular expression.
Guy J. Abel
m0 <- lm(formula = mpg ~ disp + as.factor(am)*as.factor(vs), data = mtcars)
factor_regex(m = m0)
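To illustrate why a regular expression over factor names is useful, the sketch below flags which coefficient names of m0 belong to factor terms, using base R only. It is not the package's implementation: the pattern construction (literal quoting with \Q...\E and perl = TRUE) is an assumption made for this example, and the string returned by factor_regex() may differ.

```r
# Fit the same model as in the example above
m0 <- lm(formula = mpg ~ disp + as.factor(am) * as.factor(vs), data = mtcars)

# Names of the factor terms, as stored on the fitted model
factor_names <- names(m0$xlevels)

# Hypothetical pattern (an assumption, not the output of factor_regex()):
# match any factor name, quoted literally, at the start of a term name
pattern <- paste0("^(", paste0("\\Q", factor_names, "\\E", collapse = "|"), ")")

# Flag the coefficient names that start with a factor name
term_names <- names(coef(m0))
is_factor_term <- grepl(pattern, term_names, perl = TRUE)

data.frame(term = term_names, is_factor_term = is_factor_term)
```

The leading ^ in the hypothetical pattern plays the role that at_start appears to control: it anchors the match to the start of the term name.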
Create additional columns in a tidy model output (such as broom::tidy.lm()) to allow for easier control when plotting categorical parameter estimates.
tidy_categorical(
  d = NULL,
  m = NULL,
  include_reference = TRUE,
  reference_label = "Baseline Category",
  non_reference_label = paste0("Non-", reference_label),
  exponentiate = FALSE,
  n_level = FALSE
)
d |
A data frame (tibble::tibble()) output from broom::tidy.lm(), with one row for each term in the regression, including column |
m |
A model object, created using a function such as lm() |
include_reference |
Logical indicating to include additional rows in output for reference categories, obtained from dummy.coef(). Defaults to |
reference_label |
Character string. When used will create an additional column in output with labels to indicate if terms correspond to reference categories. |
non_reference_label |
Character string. When |
exponentiate |
Logical indicating whether or not the results in broom::tidy.lm() are exponentiated. Defaults to |
n_level |
Logical indicating whether or not to include a column |
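As a rough illustration of what a per-category observation count looks like, the base R sketch below tabulates the levels of a factor in a fitted model's data. How tidy_categorical() computes its n_level column is not stated here, so this is an assumption for illustration only; the names d, m1 and n_per_level are hypothetical.

```r
# Convert am to a factor so it enters the model as a categorical variable
d <- transform(mtcars, am = factor(am))
m1 <- lm(mpg ~ disp + am, data = d)

# Count the observations in each level of am, using the data the model was fitted on
n_per_level <- table(model.frame(m1)$am)
n_per_level
```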
Expanded tibble::tibble() from the version passed to d
including additional columns:
variable |
The name of the variable that the regression term belongs to. |
level |
The level of the categorical variable that the regression term belongs to. Will be the term name for numeric variables. |
effect |
The type of term ( |
reference |
The type of term ( |
n_level |
The number of observations per category. If
In addition, extra rows will be added for the reference categories, obtained from dummy.coef(), if include_reference is set to TRUE.
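The role of dummy.coef() can be seen with a small base R sketch. With the default treatment contrasts, coef() omits the reference level of a factor, while dummy.coef() reports every level, with the reference level at zero. The model and the names d, m1 and dc are hypothetical and chosen only for this illustration.

```r
# Convert am to a factor so it enters the model as a categorical variable
d <- transform(mtcars, am = factor(am))
m1 <- lm(mpg ~ disp + am, data = d)

# coef() has a single am term: the contrast of level "1" against the reference level "0"
coef(m1)

# dummy.coef() lists both levels of am; the reference level has coefficient 0
dc <- dummy.coef(m1)
dc$am
```

These zero-valued entries are the kind of information from which rows for reference categories can be built.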
Guy J. Abel
# strip ordering in factors (currently ordered factor not supported)
library(dplyr)
library(broom)

m0 <- esoph %>%
  mutate_if(is.factor, ~factor(., ordered = FALSE)) %>%
  glm(cbind(ncases, ncontrols) ~ agegp + tobgp * alcgp, data = ., family = binomial())

# tidy
tidy(m0)

# add further columns to tidy output to help manage categorical variables
m0 %>%
  tidy() %>%
  tidy_categorical(m = m0, include_reference = FALSE)

# include reference categories and column to indicate the additional terms
m0 %>%
  tidy() %>%
  tidy_categorical(m = m0)

# coefficient plots
d0 <- m0 %>%
  tidy(conf.int = TRUE) %>%
  tidy_categorical(m = m0) %>%
  # drop the intercept term
  slice(-1)
d0

# typical coefficient plot
library(ggplot2)
library(tidyr)
ggplot(data = d0 %>% drop_na(),
       mapping = aes(x = term, y = estimate,
                     ymin = conf.low, ymax = conf.high)) +
  coord_flip() +
  geom_hline(yintercept = 0, linetype = "dashed") +
  geom_pointrange()

# enhanced coefficient plot using additional columns from tidy_categorical and ggforce::facet_row()
library(ggforce)
ggplot(data = d0,
       mapping = aes(x = level, colour = reference, y = estimate,
                     ymin = conf.low, ymax = conf.high)) +
  facet_row(facets = vars(variable), scales = "free_x", space = "free") +
  geom_hline(yintercept = 0, linetype = "dashed") +
  geom_pointrange() +
  theme(axis.text.x = element_text(angle = 45, hjust = 1))