| Title: | Provides helper functions for ASReml |
|---|---|
| Description: | This package provides helper functions for ASReml, for use with small-plot and on-farm experimentation trials. |
| Authors: | Kai Bagley [aut, cre] (ORCID: <https://orcid.org/0009-0004-6579-6959>), Jordan Brown [aut] (ORCID: <https://orcid.org/0009-0008-5014-0593>), Matthew Nguyen [aut] (ORCID: <https://orcid.org/0009-0006-6773-8263>), Braden Thorne [aut] (ORCID: <https://orcid.org/0000-0003-0497-1675>), Adam H. Sparks [aut] (ORCID: <https://orcid.org/0000-0002-0061-8359>), Grains Research and Development Corporation [fnd, cph] (Project: GRDC Project CUR2210-005OPX (AAGI-CU), ROR: 02xwr1996) |
| Maintainer: | Kai Bagley <[email protected]> |
| License: | GPL (>= 3) |
| Version: | 0.2.1 |
| Built: | 2026-05-28 05:55:11 UTC |
| Source: | https://github.com/AAGI-AUS/CBADASReml |
This function allows you to observe the ANOVA table for multiple ASReml or glmmTMB models.
anova_table(..., n_digits = 3)anova_table(..., n_digits = 3)
... |
The models to use in the ANOVA table comparison Their values may be:
|
n_digits |
|
data.frame
A dataframe containing all ANOVA tables. Can be used with xtable to
produce a LaTeX table.
library(CBADASReml) library(asreml) test_data <- oats test_data["yield2"] <- oats["yield"] * runif(nrow(oats["yield"])) mod1 <- asreml( fixed = yield ~ Variety + Nitrogen + Variety:Nitrogen, random = ~ idv(Blocks) + idv(Blocks):idv(Wplots), residual = ~ idv(units), data = test_data ) mod2 <- asreml( fixed = yield2 ~ Variety + Nitrogen + Variety:Nitrogen, random = ~ idv(Blocks) + idv(Blocks):idv(Wplots), residual = ~ idv(units), data = test_data ) anova_table(mod1, mod2) ## With glmmTMB models library(glmmTMB) Salamanders$count2 <- runif(nrow(Salamanders), 0, 10) mod3 <- glmmTMB( # Zero inflated model count ~ spp * mined + (1 | site), zi = ~ spp * mined, data = Salamanders, family = nbinom2 ) mod4 <- glmmTMB( # Hurdle model count2 ~ spp * mined + (1 | site), zi = ~ spp * mined, data = Salamanders, family = truncated_nbinom2 ) anova_table(mod3, mod4)library(CBADASReml) library(asreml) test_data <- oats test_data["yield2"] <- oats["yield"] * runif(nrow(oats["yield"])) mod1 <- asreml( fixed = yield ~ Variety + Nitrogen + Variety:Nitrogen, random = ~ idv(Blocks) + idv(Blocks):idv(Wplots), residual = ~ idv(units), data = test_data ) mod2 <- asreml( fixed = yield2 ~ Variety + Nitrogen + Variety:Nitrogen, random = ~ idv(Blocks) + idv(Blocks):idv(Wplots), residual = ~ idv(units), data = test_data ) anova_table(mod1, mod2) ## With glmmTMB models library(glmmTMB) Salamanders$count2 <- runif(nrow(Salamanders), 0, 10) mod3 <- glmmTMB( # Zero inflated model count ~ spp * mined + (1 | site), zi = ~ spp * mined, data = Salamanders, family = nbinom2 ) mod4 <- glmmTMB( # Hurdle model count2 ~ spp * mined + (1 | site), zi = ~ spp * mined, data = Salamanders, family = truncated_nbinom2 ) anova_table(mod3, mod4)
Calculates model-based efficiency for a design using the treatment information matrix induced by an AR1 x AR1 spatial dependence structure. All this function does is evaluate how much information the design contains about treatment contrasts, after adjusting for nuisance effects (e.g. blocks , etc.)
design_efficiency( design, treatment_cols = "treatment", row_col = "row", column_col = "col", block_col = NULL, rho_row = 0.1, rho_col = 0.1, alpha = 0.05, tolerance = 0.0000000001 )design_efficiency( design, treatment_cols = "treatment", row_col = "row", column_col = "col", block_col = NULL, rho_row = 0.1, rho_col = 0.1, alpha = 0.05, tolerance = 0.0000000001 )
design |
|
treatment_cols |
|
row_col |
|
column_col |
|
block_col |
|
rho_row |
|
rho_col |
|
alpha |
significance level |
tolerance |
|
The function computes the treatment information matrix
where is the treatment indicator matrix and is the
covariance-adjusted projection matrix that removes nuisance effects:
Pairwise treatment contrast variances are then calculated from ,
the Moore-Penrose pseudoinverse of the treatment information matrix.
A list containing:
The treatment information matrix, .
Positive eigenvalues of the treatment information matrix.
Numerical rank of the treatment information matrix.
Whether all treatment contrasts are estimable. For
treatment levels, this requires rank at least .
A-optimality score, calculated as the sum of inverse positive eigenvalues. Smaller values indicate better average treatment contrast precision.
D-optimality score, calculated as the negative sum of log positive eigenvalues. Smaller values indicate greater overall information volume.
Data frame containing pairwise treatment comparisons, estimability indicators, contrast variances, and iid-equivalent effective replicates.
List of spatial correlation and design assumptions used in the calculation.
A list with a lot of info in it.
## Not run: # RCBD df <- data.frame( row = rep(1:6, each = 4), col = rep(1:4, times = 6), treatment = rep(LETTERS[1:8], 3), block = rep(1:3, each = 8) ) # Optimise while respecting blocks result <- speed::speed(df, "treatment", swap_within = "block", iterations = 5000, seed = 42 ) ## test on latin square latinsquare <- data.frame( row = rep(1:4, each = 4), col = rep(1:4, times = 4), treatment = c( "A", "B", "C", "D", "B", "C", "D", "A", "C", "D", "A", "B", "D", "A", "B", "C" ) ) res_eff <- design_efficiency( design = latinsquare, treatment_cols = "treatment", row_col = "row", column_col = "col", block_col = NULL, rho_row = 0.3, rho_col = 0.3 ) ## End(Not run)## Not run: # RCBD df <- data.frame( row = rep(1:6, each = 4), col = rep(1:4, times = 6), treatment = rep(LETTERS[1:8], 3), block = rep(1:3, each = 8) ) # Optimise while respecting blocks result <- speed::speed(df, "treatment", swap_within = "block", iterations = 5000, seed = 42 ) ## test on latin square latinsquare <- data.frame( row = rep(1:4, each = 4), col = rep(1:4, times = 4), treatment = c( "A", "B", "C", "D", "B", "C", "D", "A", "C", "D", "A", "B", "D", "A", "B", "C" ) ) res_eff <- design_efficiency( design = latinsquare, treatment_cols = "treatment", row_col = "row", column_col = "col", block_col = NULL, rho_row = 0.3, rho_col = 0.3 ) ## End(Not run)
Calculate pairwise treatment-comparison power from the Fisher information of the provided experimental design using the treatment information matrix induced by an AR1 x AR1 spatial dependence structure. This function evaluates how much "power" the design has to detect a specified treatment difference after adjusting for nuisance effects (e.g. blocks, etc.)
design_power( design, treatment_cols = "treatment", row_col = "row", column_col = "col", block_col = NULL, rho_row = 0.1, rho_col = 0.1, sigma2 = 1, delta = 1, alpha = 0.05, tolerance = 0.0000000001 )design_power( design, treatment_cols = "treatment", row_col = "row", column_col = "col", block_col = NULL, rho_row = 0.1, rho_col = 0.1, sigma2 = 1, delta = 1, alpha = 0.05, tolerance = 0.0000000001 )
design |
|
treatment_cols |
|
row_col |
|
column_col |
|
block_col |
|
rho_row |
|
rho_col |
|
sigma2 |
|
delta |
|
alpha |
|
tolerance |
|
The calculation is based on the treatment information matrix
where is the treatment indicator matrix and is the
covariance-adjusted projection matrix that removes nuisance effects:
For a treatment contrast , the contrast standard error
is computed from
where is the Moore-Penrose pseudoinverse of the treatment
information matrix. The effect size delta is then converted into a
noncentrality parameter,
and power is calculated using a two-sided known-covariance Z-test approximation.
The spatial covariance parameters rho_row, rho_col, and
sigma2 are treated as assumed planning values, not estimated from the
data. The returned power is therefore conditional on the supplied covariance
structure, the chosen effect size delta, and the significance level
alpha.
Thus, you as the statistician must select four parameters based on previous trials, domain knowledge, or conservative assumptions:
AR1 row dependence parameter.
How correlated are observations that are one row apart? This is hard to
select, I would recommend testing multiple different values such as
for no dependence, for moderate, and for
strong dependence, and reporting each.
AR1 column dependence parameter.
How correlated are observations that are one column apart? This is hard
to select, I would recommend testing multiple different values such as
for no dependence, for moderate, and for
strong dependence, and reporting each.
The residual variance.
This should be based on previous similar experiments, or a conservative
estimate. Underestimating will result in overstated
power.
Detectable treatment difference.
Choose as the smallest treatment
difference that would matter scientifically, agronomically,
commercially, etc. So if yield differences of less than t/ha
are not important to the farmer, then use . Otherwise,
test multiple deltas and report the power of each of them.
A list containing:
Data frame of pairwise treatment-comparison standard errors, noncentrality parameters, power values, and effective replicates.
The minimum pairwise power across all estimable treatment comparisons.
The average pairwise power across all estimable treatment comparisons.
The treatment comparison with the lowest power.
The treatment information matrix.
The model-based covariance matrix of treatment
estimates, equal to .
Positive eigenvalues of the treatment information matrix.
Numerical rank of the treatment information matrix.
List of covariance, effect-size, and testing assumptions used.
## Not run: # RCBD df <- data.frame( row = rep(1:6, each = 4), col = rep(1:4, times = 6), treatment = rep(LETTERS[1:8], 3), block = rep(1:3, each = 8) ) # Optimise while respecting blocks result <- speed::speed(df, "treatment", swap_within = "block", iterations = 5000, seed = 42 ) ## test on latin square latinsquare <- data.frame( row = rep(1:4, each = 4), col = rep(1:4, times = 4), treatment = c( "A", "B", "C", "D", "B", "C", "D", "A", "C", "D", "A", "B", "D", "A", "B", "C" ) ) res_power <- design_power( design = latinsquare, treatment_cols = "treatment", row_col = "row", column_col = "col", block_col = NULL, rho_row = 0.3, rho_col = 0.3, sigma2 = 1, delta = 1, alpha = 0.05 ) ## End(Not run)## Not run: # RCBD df <- data.frame( row = rep(1:6, each = 4), col = rep(1:4, times = 6), treatment = rep(LETTERS[1:8], 3), block = rep(1:3, each = 8) ) # Optimise while respecting blocks result <- speed::speed(df, "treatment", swap_within = "block", iterations = 5000, seed = 42 ) ## test on latin square latinsquare <- data.frame( row = rep(1:4, each = 4), col = rep(1:4, times = 4), treatment = c( "A", "B", "C", "D", "B", "C", "D", "A", "C", "D", "A", "B", "D", "A", "B", "C" ) ) res_power <- design_power( design = latinsquare, treatment_cols = "treatment", row_col = "row", column_col = "col", block_col = NULL, rho_row = 0.3, rho_col = 0.3, sigma2 = 1, delta = 1, alpha = 0.05 ) ## End(Not run)
This function allows you to observe directional variograms in the 0 (vertical) and 90 (horizontal) directions for gridded small-plot trial data.
directional_variograms(model)directional_variograms(model)
model |
|
gstat::variogram
A Gstat variogram.
library(asreml) model <- asreml( fixed = yield ~ weed, random = ~idv(Variety), residual = ~ar1v(Row):id(Column), data = wheat ) mod <- directional_variograms(model) plot(mod, multipanel = FALSE) ## Not run: # You can also plot using ggplot if you wish ggplot( mod, aes(x = dist, y = gamma, group = dir.hor, colour = factor(dir.hor)) ) + geom_point() + geom_line() + facet_grid(~dir.hor, scales = "free_x") ## End(Not run)library(asreml) model <- asreml( fixed = yield ~ weed, random = ~idv(Variety), residual = ~ar1v(Row):id(Column), data = wheat ) mod <- directional_variograms(model) plot(mod, multipanel = FALSE) ## Not run: # You can also plot using ggplot if you wish ggplot( mod, aes(x = dist, y = gamma, group = dir.hor, colour = factor(dir.hor)) ) + geom_point() + geom_line() + facet_grid(~dir.hor, scales = "free_x") ## End(Not run)
This function outputs the summary statistics of the given data.
exploratory_table(data, response, groupby)exploratory_table(data, response, groupby)
data |
|
response |
|
groupby |
|
data.frame
Summary table
exploratory_table(mtcars, response = "mpg", groupby = c("cyl", "gear"))exploratory_table(mtcars, response = "mpg", groupby = c("cyl", "gear"))
Generates a table of least significant differences (LSDs) for a given model.
lsd_graph(model, classify, ...)lsd_graph(model, classify, ...)
model |
The model to generate LSDs from. The value may be:
|
classify |
For |
... |
Arguments to pass to |
ggplot object
Returns a ggplot2 plot of the LSDs.
library(asreml) model <- asreml( fixed = yield ~ Variety + Nitrogen + Variety:Nitrogen, random = ~idv(Blocks) + idv(Blocks):idv(Wplots), residual = ~idv(units), data = oats ) lsd_graph(model, classify = "Variety")library(asreml) model <- asreml( fixed = yield ~ Variety + Nitrogen + Variety:Nitrogen, random = ~idv(Blocks) + idv(Blocks):idv(Wplots), residual = ~idv(units), data = oats ) lsd_graph(model, classify = "Variety")
Intended as a replacement of the agricolae::orderPvalue() function, but
with (maybe) a better algorithm.
lsd_group(treatments, means, alpha, pvalues)lsd_group(treatments, means, alpha, pvalues)
treatments |
|
means |
|
alpha |
|
pvalues |
|
data.frame Dataframe of each treatment and their associated LSD group
library(asreml) model <- asreml( fixed = yield ~ Variety + Nitrogen + Variety:Nitrogen, random = ~idv(Blocks) + idv(Blocks):idv(Wplots), residual = ~idv(units), data = oats ) pred <- asremlPlus::predictPlus.asreml( model, classify = "Variety", wald.tab = as.data.frame(asreml::wald(model, denDF = "algebraic")$Wald) ) prob.matrix <- ifelse(is.na(pred$p.differences), 1, pred$p.differences) treatments <- colnames(prob.matrix) means <- pred$predictions$predicted.value alpha <- 0.05 lsd_group( treatments, means, alpha, prob.matrix )library(asreml) model <- asreml( fixed = yield ~ Variety + Nitrogen + Variety:Nitrogen, random = ~idv(Blocks) + idv(Blocks):idv(Wplots), residual = ~idv(units), data = oats ) pred <- asremlPlus::predictPlus.asreml( model, classify = "Variety", wald.tab = as.data.frame(asreml::wald(model, denDF = "algebraic")$Wald) ) prob.matrix <- ifelse(is.na(pred$p.differences), 1, pred$p.differences) treatments <- colnames(prob.matrix) means <- pred$predictions$predicted.value alpha <- 0.05 lsd_group( treatments, means, alpha, prob.matrix )
Generates a table of least significant differences (LSDs) for a given model.
lsd_table(model, classify, alpha = 0.05, ...)lsd_table(model, classify, alpha = 0.05, ...)
model |
The model object to calculate LSDs from. The value may be:
|
classify |
|
alpha |
significance level |
... |
Arguments to pass to |
A data.frame with the LSD values.
library(asreml) model <- asreml( fixed = yield ~ Variety + Nitrogen + Variety:Nitrogen, random = ~ idv(Blocks) + idv(Blocks):idv(Wplots), residual = ~ idv(units), data = oats ) lsd_table(model, classify = "Variety")library(asreml) model <- asreml( fixed = yield ~ Variety + Nitrogen + Variety:Nitrogen, random = ~ idv(Blocks) + idv(Blocks):idv(Wplots), residual = ~ idv(units), data = oats ) lsd_table(model, classify = "Variety")
Optimally rotate and grid a georeferenced dataframe
ofe_grid_data(data_in, rotation_angle, nrow, npe, ncol, trim_ends = TRUE)ofe_grid_data(data_in, rotation_angle, nrow, npe, ncol, trim_ends = TRUE)
data_in |
We assume there are only three column of actual interest in the dataframe, which are:
|
rotation_angle |
|
nrow |
|
npe |
|
ncol |
|
trim_ends |
|
gridded.ofe.
list containing the following items:
data.frame
The georeferenced gridded data.
data.frame
The input data with additional columns for visualising the rotation process.
Braden Thorne, [email protected]
library(CBADASReml) data <- agridat::lasrosas.corn |> filter(year == 2001) |> rename( Yield = yield, Treatment = nf, Rep = rep ) |> dplyr::select(Yield, Treatment, Rep, long, lat) |> st_as_sf(coords = c("long", "lat"), crs = 4326) |> st_transform(3395) ofe_grid_data(data, -60, 80, 5, 18, trim_ends=FALSE)library(CBADASReml) data <- agridat::lasrosas.corn |> filter(year == 2001) |> rename( Yield = yield, Treatment = nf, Rep = rep ) |> dplyr::select(Yield, Treatment, Rep, long, lat) |> st_as_sf(coords = c("long", "lat"), crs = 4326) |> st_transform(3395) ofe_grid_data(data, -60, 80, 5, 18, trim_ends=FALSE)
Rotate a provided georeferenced dataframe by a specified angle in degrees.
ofe_rotate_data(data, angle)ofe_rotate_data(data, angle)
data |
|
angle |
|
data.frame.
data with geometry now defined in rotated coordinates.
Contains the same columns as the input with the following additions:
The original x-coordinates before rotation.
The original y-coordinates before rotation.
Braden Thorne, [email protected]
library(CBADASReml) data <- agridat::lasrosas.corn |> filter(year == 2001) |> rename( Yield = yield, Treatment = nf, Rep = rep ) |> dplyr::select(Yield, Treatment, Rep, long, lat) |> st_as_sf(coords = c("long", "lat"), crs = 4326) |> st_transform(3395) ofe_rotate_data(data, -60)library(CBADASReml) data <- agridat::lasrosas.corn |> filter(year == 2001) |> rename( Yield = yield, Treatment = nf, Rep = rep ) |> dplyr::select(Yield, Treatment, Rep, long, lat) |> st_as_sf(coords = c("long", "lat"), crs = 4326) |> st_transform(3395) ofe_rotate_data(data, -60)
Provides a summary of the outliers present in the asreml model. Gives context to the outliers by showing the responses for the same factor combinations as the outliers
outlier_summary(model, cutoff = 3.5)outlier_summary(model, cutoff = 3.5)
model |
The model object to detect outliers in. The value may be:
|
cutoff |
|
NULL
Prints:
The number of outliers.
A table of the outliers, if any.
A table of relevant context to some factor combinations, if there are outliers.
library(asreml) model <- asreml( fixed = weight ~ littersize + Dose + Sex + Dose:Sex, random = ~ idv(Dam), residual = ~units, data = rats ) outlier_summary(model)library(asreml) model <- asreml( fixed = weight ~ littersize + Dose + Sex + Dose:Sex, random = ~ idv(Dam), residual = ~units, data = rats ) outlier_summary(model)
Returns a plot with the following attributes:
TOP LEFT. The raw data in the projected space.
TOP RIGHT Rotated original data with alternative strips highlighted. If each strip is not a single colour, the rotation is insufficient. In practice only the first strip needs to be identified correctly, however it is best to ensure all strips are identified.
BOTTOM LEFT The accurate rotated, gridded data. Strips should be vertically aligned. NAs will display in red. The number of NAs should be relatively small, concentrated around the edges where strips do not begin at the same point (if ends were trimmed these should not be present) and in holes where data is missing. Red horizontal lines indicate that too many rows have been specified. Large NA concentrations in all four corners indicates the rotation has likely failed.
BOTTOM RIGHT The gridded data projected back to the original data space. This should look similar to the original data - if it does not the process has likely failed and the other plots should be inspected to establish why.
plot_gridded_ofe(gridded_ofe)plot_gridded_ofe(gridded_ofe)
gridded_ofe |
|
Braden Thorne, [email protected]
library(CBADASReml) data <- agridat::lasrosas.corn |> filter(year == 2001) |> rename( Yield = yield, Treatment = nf, Rep = rep ) |> dplyr::select(Yield, Treatment, Rep, long, lat) |> st_as_sf(coords = c("long", "lat"), crs = 4326) |> st_transform(3395) gridded_data <- ofe_grid_data(data, -60, 80, 5, 18, trim_ends=FALSE) plot_gridded_ofe(gridded_data)library(CBADASReml) data <- agridat::lasrosas.corn |> filter(year == 2001) |> rename( Yield = yield, Treatment = nf, Rep = rep ) |> dplyr::select(Yield, Treatment, Rep, long, lat) |> st_as_sf(coords = c("long", "lat"), crs = 4326) |> st_transform(3395) gridded_data <- ofe_grid_data(data, -60, 80, 5, 18, trim_ends=FALSE) plot_gridded_ofe(gridded_data)
Generate a table of predicted values and CIs from an
asreml or glmmTMB model, for
the specified variables asreml or glmmTMB object.
pred_table( mod, classify, link_fun = "identity", tmb_component = "cond", tmb_type = "response", factor_combine = TRUE, trt_col_label = "Treatment" )pred_table( mod, classify, link_fun = "identity", tmb_component = "cond", tmb_type = "response", factor_combine = TRUE, trt_col_label = "Treatment" )
mod |
The model object to get predictions from. The value may be:
|
classify |
For |
link_fun |
The value may be:
|
tmb_component |
The value may be:
|
tmb_type |
|
factor_combine |
Logical
Whether or not to combine the factors in |
trt_col_label |
|
data.frame.
Contains the predicted means, standard errors, and confidence intervals
for the specified variables. Includes the following columns:
The combined factor levels, if factor_combine is
TRUE.
The predicted mean values.
The standard errors of the predicted means.
The lower confidence limits.
The upper confidence limits.
If factor_combine is FALSE, the factors specified in
classify remain as separate columns.
Matthew Nguyen, [email protected]
library(CBADASReml) library(asreml) library(glmmTMB) mod1 <- asreml( fixed = yield ~ Variety + Nitrogen + Variety:Nitrogen, random = ~ idv(Blocks) + idv(Blocks):idv(Wplots), residual = ~ idv(units), data = oats ) # Zero inflated model mod2 <- glmmTMB( count ~ spp * mined + (1|site), zi = ~ spp * mined, data = Salamanders, family = nbinom2 ) pred_table(mod1, classify = "Variety")library(CBADASReml) library(asreml) library(glmmTMB) mod1 <- asreml( fixed = yield ~ Variety + Nitrogen + Variety:Nitrogen, random = ~ idv(Blocks) + idv(Blocks):idv(Wplots), residual = ~ idv(units), data = oats ) # Zero inflated model mod2 <- glmmTMB( count ~ spp * mined + (1|site), zi = ~ spp * mined, data = Salamanders, family = nbinom2 ) pred_table(mod1, classify = "Variety")
This function outputs the predictions for each factor for a single model. Each factor is put on a separate Excel sheet.
report_tables(model, classify)report_tables(model, classify)
model |
The model to generate predictions for. The value may be:
|
classify |
|
list of data.frame
A list of data frames. The first data frame is the ANOVA for the model.
The remaining data frames are the prediction tables from the classify
object.
library(asreml) model <- asreml( fixed = yield ~ Variety + Nitrogen + Variety:Nitrogen, random = ~idv(Blocks) + idv(Blocks):idv(Wplots), residual = ~idv(units), data = oats ) report_tables( model, classify = "Variety:Nitrogen" ) ## Not run: # Using it to write with writexl tables <- excel_prediction_file( model, classify = "Variety:Nitrogen", ) writexl::write_xlsx(x = tables, path = "Prediction_Tables.xlsx") ## End(Not run)library(asreml) model <- asreml( fixed = yield ~ Variety + Nitrogen + Variety:Nitrogen, random = ~idv(Blocks) + idv(Blocks):idv(Wplots), residual = ~idv(units), data = oats ) report_tables( model, classify = "Variety:Nitrogen" ) ## Not run: # Using it to write with writexl tables <- excel_prediction_file( model, classify = "Variety:Nitrogen", ) writexl::write_xlsx(x = tables, path = "Prediction_Tables.xlsx") ## End(Not run)