I want to report the IQR for date columns, but when I run the code below, some cells come back as NA:
library(tibble)
library(gtsummary)
set.seed(123) # Set seed for reproducibility
date_tbl <- tibble(
start_date = sample(seq(as.Date("2023-01-01"), as.Date("2023-12-31"),
by = "day"), 100, replace = TRUE),
end_date = sample(seq(as.Date("2024-01-01"), as.Date("2024-12-31"),
by = "day"), 100, replace = TRUE),
country = sample(c("Kenya", "uganda", "Rwanda", "Burundi"), 100,
replace = TRUE)
)
date_tbl |>
tbl_summary(by = "country")
#> The following errors were returned during `tbl_summary()`:
#> ✖ For variable `end_date` (`country = "Burundi"`) and "p25" and "p75"
#> statistics: * not defined for "Date" objects
#> ✖ For variable `start_date` (`country = "Burundi"`) and "p25" and "p75"
#> statistics: * not defined for "Date" objects
#> ✖ For variable `end_date` (`country = "uganda"`) and "p25" and "p75"
#> statistics: * not defined for "Date" objects
#> ✖ For variable `start_date` (`country = "uganda"`) and "p75" statistic: * not
#> defined for "Date" objects
Created on 2025-03-24 with reprex v2.1.1
Asked Mar 24 at 8:14 by Moses
2 Answers
"I want to have IQR for dates"
1) A single numeric value (its very definition)
Set up your `IQR` function with `type=1` (see Details: Types in the help file of `quantile`). For dates, the result is a number of days.
library(gtsummary)
iqr_date = \(x) IQR(x, type=1)
date_tbl |>
tbl_summary(statistic=list(start_date ~ '{iqr_date}',
end_date ~ '{iqr_date}'),
by='country')
2) A date range
The date of q25 to the date of q75, as `character`. This might be what you want.
date_iq_range = \(x) quantile(x, probs=c(.25, .75), type=1) |>
paste0(collapse='-to-')
date_tbl |>
tbl_summary(statistic=list(start_date ~ '{date_iq_range}',
end_date ~ '{date_iq_range}'),
by='country')
You might want to use a different separator than `-to-`, for example `|> toString() |> paste0('(', ...=_, ')')` instead of `|> paste0(collapse='-to-')`.
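To see what the two helpers return outside of `tbl_summary()`, here is a minimal sketch on a small made-up vector. The vector `d` is an assumption for illustration and is not part of the question's data.

```r
# Hypothetical toy data: eight consecutive days
d <- as.Date("2023-01-01") + 0:7

# Option 1: IQR as a single number (days between the type-1 quartiles)
iqr_date <- \(x) IQR(x, type = 1)

# Option 2: the two type-1 quartiles pasted into one string
date_iq_range <- \(x) quantile(x, probs = c(.25, .75), type = 1) |>
  paste0(collapse = "-to-")

iqr_date(d)
date_iq_range(d)
```

Type 1 always returns one of the observed values and never averages two of them, which is why it avoids arithmetic on `Date` objects.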
A) Data
set.seed(123)
date_tbl = tibble::tibble(
start_date = sample(seq(as.Date("2023-01-01"), as.Date("2023-12-31"),
by = "day"), 100, replace = TRUE),
end_date = sample(seq(as.Date("2024-01-01"), as.Date("2024-12-31"),
by = "day"), 100, replace = TRUE),
country = sample(c("Kenya", "uganda", "Rwanda", "Burundi"), 100,
replace = TRUE)
)
I don't have a great answer for you, but the issue is not related to gtsummary and may perhaps be a bug in the `quantile()` function.
Running the code you provided, the quantiles can be calculated for two of the countries, while the other two result in errors. I did some poking around but didn't see a clear pattern in the data that separated an error from a returned quantile value.
library(gtsummary)
set.seed(123) # Set seed for reproducibility
date_tbl <- tibble::tibble(
start_date = sample(seq(as.Date("2023-01-01"), as.Date("2023-12-31"),
by = "day"), 100, replace = TRUE),
end_date = sample(seq(as.Date("2024-01-01"), as.Date("2024-12-31"),
by = "day"), 100, replace = TRUE),
country = sample(c("Kenya", "uganda", "Rwanda", "Burundi"), 100,
replace = TRUE)
)
date_tbl |> tbl_summary(by = country) |> as_kable()
#> The following errors were returned during `as_kable()`:
#> ✖ For variable `end_date` (`country = "Burundi"`) and "p25" and "p75"
#> statistics: * not defined for "Date" objects
#> ✖ For variable `start_date` (`country = "Burundi"`) and "p25" and "p75"
#> statistics: * not defined for "Date" objects
#> ✖ For variable `end_date` (`country = "uganda"`) and "p25" and "p75"
#> statistics: * not defined for "Date" objects
#> ✖ For variable `start_date` (`country = "uganda"`) and "p75" statistic: * not
#> defined for "Date" objects
**Characteristic** | **Burundi** N = 16 | **Kenya** N = 43 | **Rwanda** N = 17 | **uganda** N = 24 |
---|---|---|---|---|
start_date | 2023-03-30 (NA, NA) | 2023-06-07 (2023-03-13, 2023-09-13) | 2023-08-24 (2023-04-26, 2023-10-26) | 2023-07-16 (2023-05-17, NA) |
end_date | 2024-08-26 (NA, NA) | 2024-07-16 (2024-03-24, 2024-10-12) | 2024-07-18 (2024-05-04, 2024-11-05) | 2024-06-17 (NA, NA) |
# Error for Burundi, but no error for Kenya
date_tbl |>
dplyr::filter(country == "Burundi") |>
dplyr::pull(start_date) |>
quantile(probs = 0.25, type = 2)
#> Error in Ops.Date((1 - h), x[j + 2L]): * not defined for "Date" objects
date_tbl |>
dplyr::filter(country == "Kenya") |>
dplyr::pull(start_date) |>
quantile(probs = 0.25, type = 2)
#> 25%
#> "2023-03-13"
# excluding the first obs, no error
date_tbl$start_date[2:100] |>
quantile(probs = 0.25, type = 2)
#> 25%
#> "2023-03-22"
# including all obs, ERROR
date_tbl$start_date[1:100] |>
quantile(probs = 0.25, type = 2)
#> Error in Ops.Date((1 - h), x[j + 2L]): * not defined for "Date" objects
# excluding the last 5 obs, no error
date_tbl$start_date[1:95] |>
quantile(probs = 0.25, type = 2)
#> 25%
#> "2023-03-19"
<sup>Created on 2025-03-24 with [reprex v2.1.1](https://reprex.tidyverse.)</sup>
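One pattern that fits every case above, offered as a reading of how `type = 2` behaves rather than a confirmed diagnosis: type 2 averages two neighbouring order statistics whenever `n * probs` is a whole number, and that averaging multiplies a `Date` by a number, which `Ops.Date` rejects. The sizes that fail (16, 24, 100) all give a whole number for `n * 0.25`, while the ones that work (43, 17, 99, 95) do not. The one exception, p25 of `start_date` for uganda, would then be a case where the two neighbours happen to be the same date, so no averaging is attempted. The sketch below uses made-up vectors (`d4`, `d5`) and a hypothetical helper `try_q()`; whether the first call errors may depend on the R version.

```r
# Hypothetical toy data, not the question's data
d4 <- as.Date("2023-01-01") + 0:3  # n = 4: 4 * 0.25 = 1, a whole number
d5 <- as.Date("2023-01-01") + 0:4  # n = 5: 5 * 0.25 = 1.25, not whole

# Hypothetical helper: returns the formatted quantile,
# or the error message if quantile() fails
try_q <- function(x, type) {
  tryCatch(
    format(quantile(x, probs = 0.25, type = type, names = FALSE)),
    error = function(e) paste("Error:", conditionMessage(e))
  )
}

try_q(d4, type = 2)  # same situation as the failing groups above
try_q(d5, type = 2)  # no averaging needed
try_q(d4, type = 1)  # type 1 never averages

# Group sizes from the table: which give a whole number for n * 0.25?
n <- c(Burundi = 16, Kenya = 43, Rwanda = 17, uganda = 24)
(n * 0.25) %% 1 == 0
```

If this reading is right, passing `type = 1` (as in the other answer) or converting the dates to numbers before taking quantiles sidesteps the problem.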