| Title: | Automated Multi-Outcome Machine Learning Combination Models |
|---|---|
| Description: | Provides automated machine learning workflows for survival analysis, binary classification, continuous outcomes, and ordinal outcomes. The package trains and combines model variants across user-supplied multi-cohort data, evaluates survival models by leave-one-out cross-validation using Harrell's concordance index, binary models by leave-one-out cross-validation using receiver operating characteristic area under the curve, continuous models by out-of-fold root mean squared error and R-squared, and ordinal models by out-of-fold quadratic weighted kappa. It renders reproducible reports in Hypertext Markup Language (HTML) with figures and diagnostics. The survival workflow supports penalized and tree-based Cox proportional hazards models, stepwise Cox models, partial least squares regression for Cox models, supervised principal components, gradient boosting machine Cox models, survival support vector machines (survival-SVM), random survival forests, and optional 'CoxBoost'. The binary workflow supports penalized logistic regression, logistic baselines, gradient boosting machines, random forests, principal component analysis (PCA) logistic regression, and Gaussian naive Bayes variants. Continuous and ordinal workflows reuse an 18-variant regression registry with penalized, linear, boosted, forest, PCA, and baseline families. The optional 'CoxBoost' model is enabled when the suggested 'CoxBoost' package is installed; it is used conditionally and is not a strong dependency. Optional model backends are checked at run time so missing backend packages skip only the affected model variants rather than blocking installation of the whole package. Methods build on Friedman et al. (2010) <doi:10.18637/jss.v033.i01>, Bair and Tibshirani (2004) <doi:10.1371/journal.pbio.0020108>, Ishwaran et al. (2008) <doi:10.1214/08-AOAS169>, Blanche et al. (2013) <doi:10.1002/sim.5958>, and Binder and Schumacher (2008) <doi:10.1186/1471-2105-9-14>. |
| Authors: | Peng Luo [aut, cre] |
| Maintainer: | Peng Luo <[email protected]> |
| License: | MIT + file LICENSE |
| Version: | 1.0.0 |
| Built: | 2026-06-09 09:02:48 UTC |
| Source: | https://github.com/cran/AutoMLR |
Converts an object returned by prepare_binary_cohort_input() into the X,
y, and cohort-label objects used by lower-level binary functions such as
evaluate_binary_algorithm_loocv(), evaluate_binary_algorithms_loocv(),
and evaluate_binary_combinations().
automlr_input_to_binary_xy(input)
input |
An object returned by |
A list with components X (numeric feature matrix), y (0/1
outcome), stability_groups (cohort labels aligned to rows), cohort
(alias of stability_groups), and data (combined modeling data frame).
Converts an object returned by prepare_continuous_cohort_input() into the
X, y, and cohort-label objects used by lower-level continuous-outcome
functions.
automlr_input_to_continuous_xy(input)
input |
An object returned by |
A list with components X (numeric feature matrix), y (numeric
outcome), stability_groups (cohort labels aligned to rows), cohort
(alias of stability_groups), and data (combined modeling data frame).
Converts an object returned by prepare_ordinal_cohort_input() into the
X, y, and cohort-label objects used by lower-level ordinal-outcome
functions.
automlr_input_to_ordinal_xy(input)
input |
An object returned by |
A list with components X (numeric feature matrix), y (integer
ordinal outcome codes), stability_groups (cohort labels aligned to rows),
cohort (alias of stability_groups), and data (combined modeling data
frame).
Converts an object returned by prepare_cohort_input() into the X, y,
and cohort-label objects used by lower-level survival functions such as
evaluate_algorithm_loocv(), evaluate_algorithms_loocv(), and
evaluate_surv_combinations().
automlr_input_to_surv_xy(input)
input |
An object returned by |
A list with components X (numeric feature matrix restricted to shared
features), y (a survival::Surv outcome object), stability_groups (cohort
labels aligned to rows of X), cohort (alias of stability_groups), and data
(the combined modeling data frame).
Returns a flat named list of settings consumed by data preparation, feature screening, model fitting, and evaluation in the survival workflow.
automlr_parameters( seed = 123L, screen_by_univariate_cox = TRUE, univariate_cox_p_cutoff = 0.05, screen_by_variance = TRUE, variance_quantile_cutoff = 0.1, loocv = TRUE, min_cindex_accept = 0.6, auto_min_cindex = FALSE, auto_quantile = 0.5, eval_times = c(365, 1095, 1825), time_unit = c("auto", "days", "months", "years"), algorithms = NULL, surv_svm_resampling = c("kfold", "loocv"), surv_svm_k_folds = 5L, n_cores = 1L, stability_resamples = 0L, stability_fraction = 0.8, stability_weight = 0.1, verbose = TRUE )
seed |
Base random seed. |
screen_by_univariate_cox |
Logical, run univariate Cox p-value screen. |
univariate_cox_p_cutoff |
P-value cutoff for the univariate Cox screen. |
screen_by_variance |
Logical, drop low-variance features. |
variance_quantile_cutoff |
Drop features with variance below this quantile of all feature variances (e.g. 0.1 = drop bottom 10%). |
loocv |
Logical, use LOOCV during model selection (V1 evaluation protocol; paper 1 recipe). |
min_cindex_accept |
Minimum LOOCV C-index to accept a model. |
auto_min_cindex |
Logical. If |
auto_quantile |
Quantile used when an automatic threshold is requested.
|
eval_times |
Vector of time points (same unit as |
time_unit |
Time unit for survival time and |
algorithms |
Character vector of algorithm keys to run; default = all
10 in |
surv_svm_resampling |
Resampling method for survival support vector
machine candidates. The default |
surv_svm_k_folds |
Number of folds used when |
n_cores |
Integer, workers for parallel execution. |
stability_resamples |
Integer, number of repeated subsamples used for
optional stability diagnostics. |
stability_fraction |
Fraction of samples used in each stability subsample. |
stability_weight |
Non-negative penalty multiplier used only when a stability-weighted ranking or weight method is explicitly requested. |
verbose |
Logical, print progress messages. |
A named list.
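The variance screen controlled by screen_by_variance and variance_quantile_cutoff can be illustrated in base R. This is a minimal sketch, not the package's implementation: screen_by_variance_sketch() is a hypothetical helper, and keeping features that sit exactly at the cutoff is an assumption.

```r
# Hypothetical helper (not part of AutoMLR): keep features whose variance is
# at or above the requested quantile of all feature variances.
screen_by_variance_sketch <- function(X, quantile_cutoff = 0.1) {
  v <- apply(X, 2, stats::var, na.rm = TRUE)
  cutoff <- stats::quantile(v, probs = quantile_cutoff, na.rm = TRUE,
                            names = FALSE)
  # Assumption: features exactly at the cutoff are retained.
  X[, v >= cutoff, drop = FALSE]
}

set.seed(1)
X <- cbind(
  flat  = rep(1, 20),            # zero variance
  small = rnorm(20, sd = 0.01),
  mid   = rnorm(20, sd = 1),
  large = rnorm(20, sd = 10)
)
X_screened <- screen_by_variance_sketch(X, quantile_cutoff = 0.3)
colnames(X_screened)
```

In this toy matrix the constant column falls below the cutoff and is dropped, while the three varying columns are kept.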
ROC AUC for binary outcomes.
binary_auc(y, prob)
y |
Binary 0/1 outcome. |
prob |
Predicted probability for the positive class. |
Scalar ROC AUC, or NA_real_ when not estimable.
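For orientation, ROC AUC can be computed from the Mann-Whitney rank statistic. The sketch below is an illustration in base R, not the code behind binary_auc(); auc_sketch() is a hypothetical name, and treating a single-class outcome as "not estimable" is an assumption about when NA_real_ is returned.

```r
# Hypothetical helper: ROC AUC via the Mann-Whitney rank statistic.
auc_sketch <- function(y, prob) {
  ok <- !is.na(y) & !is.na(prob)
  y <- y[ok]
  prob <- prob[ok]
  n_pos <- sum(y == 1)
  n_neg <- sum(y == 0)
  # Assumption: AUC is not estimable when only one class is present.
  if (n_pos == 0 || n_neg == 0) return(NA_real_)
  r <- rank(prob)  # average ranks, so tied scores count as half
  (sum(r[y == 1]) - n_pos * (n_pos + 1) / 2) / (n_pos * n_neg)
}

y <- c(0, 0, 1, 1)
prob <- c(0.1, 0.4, 0.35, 0.8)
auc_example <- auc_sketch(y, prob)  # 3 of 4 positive-negative pairs are ordered correctly
```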
Precision-recall AUC for binary outcomes.
binary_pr_auc(y, prob)
y |
Binary 0/1 outcome. |
prob |
Predicted probability for the positive class. |
Scalar PR-AUC, or NA_real_ when not estimable.
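Several estimators go by the name PR-AUC, and this manual does not say which one binary_pr_auc() uses. The sketch below shows one common choice, average precision, in base R. pr_auc_sketch() is a hypothetical helper that ignores tied scores for simplicity, so it may not reproduce the package's values.

```r
# Hypothetical helper: average precision as one estimator of PR-AUC.
pr_auc_sketch <- function(y, prob) {
  n_pos <- sum(y == 1)
  if (n_pos == 0) return(NA_real_)  # assumption: undefined without positives
  ord <- order(prob, decreasing = TRUE)
  y_sorted <- y[ord]
  # Precision after each ranked prediction.
  precision <- cumsum(y_sorted == 1) / seq_along(y_sorted)
  # Average the precision values at the ranks where a positive occurs.
  sum(precision[y_sorted == 1]) / n_pos
}

y <- c(0, 0, 1, 1)
prob <- c(0.1, 0.4, 0.35, 0.8)
ap_example <- pr_auc_sketch(y, prob)
```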
The binary workflow mirrors the survival workflow, but uses cross-validated ROC AUC as the primary selection metric and keeps PR-AUC, threshold metrics, calibration, and cohort stability as diagnostics.
binarymlr_parameters( seed = 123L, algorithms = NULL, loocv = TRUE, resampling = NULL, k_folds = 5L, repeats = 1L, min_auc_accept = 0.6, auto_min_auc = FALSE, auto_quantile = 0.5, positive_class = 1, threshold_methods = c("youden", "fixed_0.5"), missing_fraction_cutoff = 0.2, screen_by_variance = TRUE, variance_quantile_cutoff = 0, standardize_features = FALSE, n_cores = 1L, stability_resamples = 0L, stability_fraction = 0.8, stability_weight = 0.1, verbose = TRUE )
seed |
Base random seed. |
algorithms |
Character vector of binary algorithm keys. Defaults to all
entries returned by |
loocv |
Logical, use leave-one-out cross-validation. |
resampling |
Resampling scheme: |
k_folds |
Number of folds when |
repeats |
Number of repeats for repeated k-fold CV. |
min_auc_accept |
Minimum AUC accepted by threshold-style selection. |
auto_min_auc |
Logical. If |
auto_quantile |
Quantile used when an automatic threshold is requested.
|
positive_class |
Positive-class label used during data preparation. |
threshold_methods |
Threshold summaries to export. Supported values are
|
missing_fraction_cutoff |
Drop features with missing fraction above this cutoff before modeling. |
screen_by_variance |
Logical, drop zero / low-variance features. |
variance_quantile_cutoff |
Optional lower variance quantile to drop. |
standardize_features |
Logical, center and scale features before modeling. |
n_cores |
Integer, number of fold workers. |
stability_resamples |
Number of optional stability subsamples. |
stability_fraction |
Fraction of samples in each stability subsample. |
stability_weight |
Penalty multiplier for stability-weighted ranking. |
verbose |
Logical, print progress. |
A named list.
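The "youden" entry in threshold_methods refers to Youden's J statistic (sensitivity + specificity - 1). The base R sketch below shows how such a threshold can be picked. youden_threshold_sketch() is a hypothetical helper; classifying with prob >= threshold and taking the first maximum are assumptions, not documented package behavior.

```r
# Hypothetical helper: probability threshold maximizing Youden's J.
youden_threshold_sketch <- function(y, prob) {
  candidates <- sort(unique(prob))
  j <- vapply(candidates, function(t) {
    pred <- as.integer(prob >= t)  # assumption: >= marks the positive class
    sens <- sum(pred == 1 & y == 1) / sum(y == 1)
    spec <- sum(pred == 0 & y == 0) / sum(y == 0)
    sens + spec - 1
  }, numeric(1))
  candidates[which.max(j)]  # first maximum if several thresholds tie
}

y <- c(0, 0, 0, 1, 0, 1, 1)
prob <- c(0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7)
youden_cut <- youden_threshold_sketch(y, prob)
fixed_cut <- 0.5  # the "fixed_0.5" summary uses this constant threshold
```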
AutoMLR keeps heavyweight model engines in Suggests so the package can be
installed even when some optional modelling backends are unavailable. This
helper reports which model variants and optional features are currently
available in the user's R library and which packages would be needed to
enable the skipped pieces.
check_automlr_dependencies(workflow = "all")
workflow |
Character vector. Use |
A list of class "automlr_dependency_report" with two data frames:
algorithms, containing one row per algorithm registry entry, and
optional_features, containing non-model optional capabilities such as
logging, parallel execution, and time-dependent ROC diagnostics.
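Run-time checks of suggested packages are usually built on requireNamespace(). The sketch below illustrates that idea in base R. backend_available_sketch() is a hypothetical helper and says nothing about how check_automlr_dependencies() is written internally.

```r
# Hypothetical helper: report which packages can be loaded right now.
backend_available_sketch <- function(pkgs) {
  # requireNamespace() returns FALSE instead of raising an error when a
  # package is missing, which is what allows a model variant to be skipped.
  vapply(pkgs, requireNamespace, logical(1), quietly = TRUE)
}

status <- backend_available_sketch(c("stats", "aPackageThatDoesNotExist123"))
status
```

Passing a name such as "CoxBoost" would show whether that optional backend is installed in the current library.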
Correlation between observed and predicted continuous outcomes.
continuous_cor(y, pred, method = c("pearson", "spearman"))
y |
Observed numeric outcome. |
pred |
Predicted numeric outcome. |
method |
Correlation method. |
A numeric scalar.
Mean absolute error for continuous predictions.
continuous_mae(y, pred)
y |
Observed numeric outcome. |
pred |
Predicted numeric outcome. |
A numeric scalar.
Coefficient of determination for continuous predictions.
continuous_r2(y, pred)
y |
Observed numeric outcome. |
pred |
Predicted numeric outcome. |
A numeric scalar.
Root mean squared error for continuous predictions.
continuous_rmse(y, pred)
y |
Observed numeric outcome. |
pred |
Predicted numeric outcome. |
A numeric scalar.
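The continuous metrics above follow standard textbook definitions, sketched here in base R with hypothetical helper names. The R-squared form 1 - SSE/SST is an assumption; the package might instead use a squared correlation, which can give different values.

```r
# Hypothetical helpers mirroring the usual metric definitions.
rmse_sketch <- function(y, pred) sqrt(mean((y - pred)^2))
mae_sketch  <- function(y, pred) mean(abs(y - pred))
# Assumption: R-squared defined as 1 - SSE / SST.
r2_sketch   <- function(y, pred) {
  1 - sum((y - pred)^2) / sum((y - mean(y))^2)
}

y <- c(1, 2, 3, 4)
pred <- c(1.5, 2, 2.5, 4)

rmse_value <- rmse_sketch(y, pred)  # sqrt(0.125)
mae_value  <- mae_sketch(y, pred)   # 0.25
r2_value   <- r2_sketch(y, pred)    # 1 - 0.5 / 5 = 0.9
cor_value  <- stats::cor(y, pred, method = "pearson")
```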
Default parameters for AutoMLR continuous-outcome workflows.
continuousmlr_parameters( seed = 123L, algorithms = NULL, resampling = "loocv", k_folds = 5L, repeats = 1L, min_r2_accept = 0, auto_min_r2 = FALSE, auto_quantile = 0.5, missing_fraction_cutoff = 0.2, screen_by_variance = TRUE, variance_quantile_cutoff = 0, standardize_features = FALSE, n_cores = 1L, stability_resamples = 0L, stability_fraction = 0.8, stability_weight = 0.1, verbose = TRUE )
seed |
Base random seed. |
algorithms |
Character vector of continuous algorithm keys. |
resampling |
Resampling scheme: |
k_folds |
Number of folds for k-fold CV. |
repeats |
Number of repeats for repeated k-fold CV. |
min_r2_accept |
Minimum R-squared accepted by threshold-style selection. |
auto_min_r2 |
Logical. If |
auto_quantile |
Quantile used when an automatic threshold is requested.
|
missing_fraction_cutoff |
Drop features above this missing fraction. |
screen_by_variance |
Logical, drop zero / low-variance features. |
variance_quantile_cutoff |
Optional lower variance quantile to drop. |
standardize_features |
Logical, center and scale features. |
n_cores |
Integer, number of fold workers. |
stability_resamples |
Number of optional stability subsamples. |
stability_fraction |
Fraction of samples in each stability subsample. |
stability_weight |
Penalty multiplier for stability-aware ranking. |
verbose |
Logical, print progress. |
A named list.
Count binary model combinations without fitting.
count_binary_combinations( params = binarymlr_parameters(), algorithms = params$algorithms, min_size = 1L, max_size = 2L, allow_same_algorithm = FALSE )
params |
Output of |
algorithms |
Binary algorithm keys. |
min_size |
Minimum combination size. |
max_size |
Maximum combination size. |
allow_same_algorithm |
Logical, allow same base algorithm variants in one combination. |
A list with candidate and combination counts.
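When no same-algorithm restriction applies, the number of combinations is a sum of binomial coefficients over the allowed sizes. The sketch below shows only that unrestricted arithmetic. count_subsets_sketch() is a hypothetical helper; with allow_same_algorithm = FALSE the package presumably excludes some subsets, so its counts can be smaller.

```r
# Hypothetical helper: number of subsets of n candidates with size between
# min_size and max_size, ignoring any same-algorithm restriction.
count_subsets_sketch <- function(n_candidates, min_size = 1L, max_size = 2L) {
  upper <- min(max_size, n_candidates)
  if (min_size > upper) return(0)
  sum(choose(n_candidates, seq.int(min_size, upper)))
}

n_18_variants <- count_subsets_sketch(18, min_size = 1, max_size = 2)  # 18 + 153
n_pairs_only  <- count_subsets_sketch(10, min_size = 2, max_size = 2)  # choose(10, 2)
```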
Count continuous model combinations without fitting.
count_continuous_combinations( params = continuousmlr_parameters(), algorithms = params$algorithms, min_size = 1L, max_size = 2L, allow_same_algorithm = FALSE )
params |
Output of |
algorithms |
Continuous algorithm keys. |
min_size |
Minimum combination size. |
max_size |
Maximum combination size. |
allow_same_algorithm |
Logical. |
A list with candidate and combination counts.
Count ordinal model combinations without fitting.
count_ordinal_combinations( params = ordinalmlr_parameters(), algorithms = params$algorithms, min_size = 1L, max_size = 2L, allow_same_algorithm = FALSE )
params |
Output of |
algorithms |
Ordinal algorithm keys. |
min_size |
Minimum combination size. |
max_size |
Maximum combination size. |
allow_same_algorithm |
Logical. |
A list with candidate and combination counts.
Count model combinations without fitting models.
count_surv_combinations( params = automlr_parameters(), algorithms = params$algorithms, min_size = 2L, max_size = 2L, allow_same_algorithm = FALSE )
params |
Output of |
algorithms |
Character vector of registry keys. |
min_size |
Minimum combination size. |
max_size |
Maximum combination size. |
allow_same_algorithm |
Logical, allow two variants from the same base algorithm to appear in one combination. |
A list with candidate and combination counts.
Disable AutoMLR auto logging.
disable_auto_logging()
Invisibly returns TRUE when logging was disabled and FALSE when
logging was already inactive.
Convenience wrapper: looks up algo_key in get_surv_registry(), takes the
first row of grid(params) as the hyperparameters, and calls
loocv_cindex().
evaluate_algorithm_loocv( algo_key, X, y, params = automlr_parameters(), hparam = NULL, candidate_key = NULL, verbose = FALSE )
algo_key |
Character, one of |
X |
Feature matrix. |
y |
|
params |
Output of |
hparam |
Optional named list of hyperparameters. If |
candidate_key |
Optional identifier for this algorithm + hyperparameter variant. |
verbose |
Logical. |
A list containing the output of loocv_cindex() plus the algo_key,
algo_label, candidate_key, and hparam that were used.
Evaluate multiple survival model variants by LOOCV C-index.
evaluate_algorithms_loocv( X, y, params = automlr_parameters(), algorithms = params$algorithms, stability_groups = NULL, stability_resamples = params$stability_resamples %||% 0L, stability_fraction = params$stability_fraction %||% 0.8, verbose = params$verbose )
X |
Feature matrix, or an |
y |
|
params |
Output of |
algorithms |
Character vector of registry keys to expand and evaluate. |
stability_groups |
Optional group/cohort labels used to compute per-candidate stability diagnostics from out-of-fold risks. |
stability_resamples |
Number of repeated subsamples for stability diagnostics. |
stability_fraction |
Fraction of samples in each stability subsample. |
verbose |
Logical. |
A list of class "automlr_loocv_set" with a summary table and raw
per-candidate results.
Evaluate one binary algorithm by LOOCV AUC.
evaluate_binary_algorithm_loocv( algo_key, X, y, params = binarymlr_parameters(), hparam = NULL, candidate_key = NULL, verbose = FALSE, resampling_plan = NULL )
algo_key |
Algorithm key. |
X |
Feature matrix. |
y |
Binary 0/1 outcome. |
params |
Output of |
hparam |
Optional hyperparameters. |
candidate_key |
Optional variant identifier. |
verbose |
Logical. |
resampling_plan |
Optional internal resampling plan. |
A list with LOOCV results and metadata.
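Leave-one-out evaluation of a binary model can be sketched with a plain logistic regression. The example is illustrative only: it uses stats::glm() on simulated data rather than an AutoMLR registry entry, and all object names are hypothetical.

```r
set.seed(42)
n <- 40
x <- rnorm(n)
y <- rbinom(n, 1, plogis(1.5 * x))
dat <- data.frame(y = y, x = x)

# Each sample is predicted by a model fitted to the other n - 1 samples.
oof_prob <- numeric(n)
for (i in seq_len(n)) {
  fit <- glm(y ~ x, data = dat[-i, ], family = binomial())
  oof_prob[i] <- predict(fit, newdata = dat[i, , drop = FALSE],
                         type = "response")
}

# ROC AUC of the out-of-fold probabilities via the rank formula.
n_pos <- sum(y == 1)
n_neg <- sum(y == 0)
loocv_auc <- (sum(rank(oof_prob)[y == 1]) - n_pos * (n_pos + 1) / 2) /
  (n_pos * n_neg)
```

Because every prediction comes from a model that never saw the predicted sample, the resulting AUC is less optimistic than one computed on the training data.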
Evaluate binary model variants by LOOCV AUC.
evaluate_binary_algorithms_loocv( X, y, params = binarymlr_parameters(), algorithms = params$algorithms, stability_groups = NULL, stability_resamples = params$stability_resamples %||% 0L, stability_fraction = params$stability_fraction %||% 0.8, verbose = params$verbose )
X |
Feature matrix. |
y |
Binary 0/1 outcome. |
params |
Output of |
algorithms |
Algorithm keys. |
stability_groups |
Optional cohort labels. |
stability_resamples |
Optional stability resamples. |
stability_fraction |
Stability subsample fraction. |
verbose |
Logical. |
A list of class "automlr_binary_loocv_set".
Evaluate all-subset binary probability combinations.
evaluate_binary_combinations( loocv_set, y, min_size = 1L, max_size = 2L, weight_method = c("auc", "equal", "auc_stability"), allow_same_algorithm = FALSE, max_failed_fraction = 0.2, min_prob_sd = 1e-08, stability_groups = NULL, stability_resamples = loocv_set$params$stability_resamples %||% 0L, stability_fraction = loocv_set$params$stability_fraction %||% 0.8, rank_by = c("auc", "stability_weighted"), stability_weight = loocv_set$params$stability_weight %||% 0.1, top_n = 50L )
loocv_set |
Output of |
y |
Binary 0/1 outcome. |
min_size |
Minimum member count. |
max_size |
Maximum member count. |
weight_method |
One of |
allow_same_algorithm |
Logical. |
max_failed_fraction |
Maximum LOOCV failure fraction. |
min_prob_sd |
Minimum probability standard deviation. |
stability_groups |
Optional cohort labels. |
stability_resamples |
Optional stability resamples. |
stability_fraction |
Stability subsample fraction. |
rank_by |
Ranking method. |
stability_weight |
Stability penalty multiplier. |
top_n |
Number of rows to keep. |
A list of class "automlr_binary_combination_set".
Evaluate one continuous algorithm by out-of-fold performance.
evaluate_continuous_algorithm( algo_key, X, y, params = continuousmlr_parameters(), hparam = NULL, candidate_key = NULL, verbose = FALSE, resampling_plan = NULL )
algo_key |
Algorithm key. |
X |
Feature matrix. |
y |
Numeric outcome. |
params |
Output of |
hparam |
Optional hyperparameters. |
candidate_key |
Optional variant identifier. |
verbose |
Logical. |
resampling_plan |
Optional internal resampling plan. |
A list with OOF predictions and metrics.
Evaluate continuous model variants by out-of-fold performance.
evaluate_continuous_algorithms( X, y, params = continuousmlr_parameters(), algorithms = params$algorithms, verbose = params$verbose )
X |
Feature matrix. |
y |
Numeric outcome. |
params |
Output of |
algorithms |
Algorithm keys. |
verbose |
Logical. |
A list of class "automlr_continuous_resample_set".
Evaluate all-subset continuous prediction combinations.
evaluate_continuous_combinations( resample_set, y, min_size = 1L, max_size = 2L, weight_method = c("inverse_rmse", "equal", "r2"), allow_same_algorithm = FALSE, max_failed_fraction = 0.2, min_pred_sd = 1e-08, rank_by = c("rmse", "r2"), top_n = 50L )
resample_set |
Output of |
y |
Numeric outcome. |
min_size |
Minimum member count. |
max_size |
Maximum member count. |
weight_method |
One of |
allow_same_algorithm |
Logical. |
max_failed_fraction |
Maximum sample prediction failure fraction. |
min_pred_sd |
Minimum prediction standard deviation. |
rank_by |
Ranking method. |
top_n |
Number of rows to keep. |
A list of class "automlr_continuous_combination_set".
Ordinal outcomes are encoded as ordered integer scores and fitted with the continuous-model registry; predictions are rounded back to ordered classes for QWK, accuracy, balanced accuracy, and class MAE.
evaluate_ordinal_algorithms( X, y, params = ordinalmlr_parameters(), algorithms = params$algorithms, verbose = params$verbose )
X |
Feature matrix. |
y |
Ordinal positive integer outcome codes. |
params |
Output of |
algorithms |
Algorithm keys. |
verbose |
Logical. |
A list of class "automlr_ordinal_resample_set".
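The rounding step and quadratic weighted kappa (QWK) described for the ordinal workflow can be sketched in base R. qwk_sketch() is a hypothetical helper, and clamping rounded scores to the observed class range is an assumption rather than documented package behavior.

```r
# Hypothetical helper: quadratic weighted kappa for classes coded 1..n_classes.
qwk_sketch <- function(obs, pred, n_classes) {
  lev <- seq_len(n_classes)
  observed <- table(factor(obs, levels = lev), factor(pred, levels = lev))
  observed <- matrix(as.numeric(observed), n_classes, n_classes)
  # Quadratic disagreement weights, 0 on the diagonal and 1 at the corners.
  w <- outer(lev, lev, function(i, j) (i - j)^2) / (n_classes - 1)^2
  # Expected counts under independence of the two margins.
  expected <- outer(rowSums(observed), colSums(observed)) / sum(observed)
  1 - sum(w * observed) / sum(w * expected)
}

y <- c(1, 2, 3, 3, 2, 1)                    # observed ordinal codes
score <- c(1.2, 2.4, 2.6, 3.3, 1.4, 0.7)    # continuous model output
# Assumption: round to the nearest class and clamp to the valid range.
pred_class <- pmin(pmax(round(score), 1), 3)
qwk_value <- qwk_sketch(y, pred_class, n_classes = 3)
```

Here one of six predictions is off by one class, which gives a QWK of 8/9.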
Evaluate all-subset ordinal score combinations.
evaluate_ordinal_combinations( resample_set, y, min_size = 1L, max_size = 2L, weight_method = c("qwk", "equal", "inverse_mae"), allow_same_algorithm = FALSE, rank_by = c("qwk", "class_mae"), top_n = 50L )
resample_set |
Output of |
y |
Ordinal integer outcome codes. |
min_size |
Minimum member count. |
max_size |
Maximum member count. |
weight_method |
One of |
allow_same_algorithm |
Logical. |
rank_by |
Ranking method. |
top_n |
Number of rows to keep. |
A list of class "automlr_ordinal_combination_set".
Builds combinations from the out-of-fold risk scores in
evaluate_algorithms_loocv(). For each subset, member risks are
standardized and averaged using equal weights or C-index-derived weights;
the resulting combination is scored by Harrell's C-index.
evaluate_surv_combinations( loocv_set, y, min_size = 1L, max_size = 2L, weight_method = c("cindex", "equal", "cindex_stability"), allow_same_algorithm = FALSE, max_failed_fraction = 0.2, min_risk_sd = 1e-08, stability_groups = NULL, stability_resamples = loocv_set$params$stability_resamples %||% 0L, stability_fraction = loocv_set$params$stability_fraction %||% 0.8, rank_by = c("cindex", "stability_weighted"), stability_weight = loocv_set$params$stability_weight %||% 0.1, diagnostic_times = NULL, top_n = 50L )
loocv_set |
Output of |
y |
A |
min_size |
Minimum number of member model variants in a combination. |
max_size |
Maximum number of member model variants in a combination. |
weight_method |
One of |
allow_same_algorithm |
Logical, allow two variants from the same base algorithm to appear in one combination. |
max_failed_fraction |
Maximum allowed LOOCV fold failure fraction for a candidate to enter combinations. |
min_risk_sd |
Minimum standard deviation of LOOCV risk scores for a candidate to enter combinations. |
stability_groups |
Optional group/cohort labels used for combination stability diagnostics. |
stability_resamples |
Number of repeated subsamples for stability diagnostics. |
stability_fraction |
Fraction of samples in each stability subsample. |
rank_by |
Ranking method. |
stability_weight |
Non-negative multiplier for the stability penalty
when |
diagnostic_times |
Optional time points for time-dependent AUC
diagnostics. Requires the suggested |
top_n |
Number of top rows to keep in the returned summary. |
A list of class "automlr_combination_set".
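The combination step described above (standardize member risks, average them with weights, score by Harrell's C-index) can be sketched in base R. Both helpers are hypothetical. Using weights proportional to each member's C-index is an assumption about what "C-index-derived" means, and tied event times are skipped for simplicity.

```r
# Hypothetical helper: Harrell's C-index by direct pair counting.
harrell_c_sketch <- function(time, status, risk) {
  concordant <- 0
  comparable <- 0
  n <- length(time)
  for (i in seq_len(n - 1)) {
    for (j in (i + 1):n) {
      if (time[i] == time[j]) next          # tied times skipped (simplification)
      early <- if (time[i] < time[j]) i else j
      late  <- if (early == i) j else i
      if (status[early] != 1) next          # earlier subject must have an event
      comparable <- comparable + 1
      if (risk[early] > risk[late]) {
        concordant <- concordant + 1
      } else if (risk[early] == risk[late]) {
        concordant <- concordant + 0.5
      }
    }
  }
  concordant / comparable
}

# Hypothetical helper: z-score each member's risks, then weighted average.
combine_risks_sketch <- function(risk_matrix, weights) {
  z <- scale(risk_matrix)
  as.numeric(z %*% (weights / sum(weights)))
}

time   <- c(5, 8, 12, 20, 30)
status <- c(1, 1, 0, 1, 0)
risk_a <- c(2.0, 1.5, 1.0, 0.8, 0.1)
risk_b <- c(0.9, 1.0, 0.2, 0.5, 0.1)

c_a <- harrell_c_sketch(time, status, risk_a)
c_b <- harrell_c_sketch(time, status, risk_b)
# Assumption: weights proportional to each member's C-index.
combined <- combine_risks_sketch(cbind(risk_a, risk_b), weights = c(c_a, c_b))
c_combined <- harrell_c_sketch(time, status, combined)
```

Standardizing first keeps a member with a wide risk scale from dominating the average.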
Export binary AutoMLR results.
export_binary_results( object, output_dir = "automlr_binary_results", publication = TRUE, formats = c("pdf", "png"), dpi = 300L, top_n = 20L, overwrite = TRUE, summary_language = c("bilingual", "en", "zh") )
object |
Object returned by |
output_dir |
Output directory. |
publication |
Logical, write publication-style figures. |
formats |
Figure formats: |
dpi |
PNG resolution. |
top_n |
Number of top rows. |
overwrite |
Logical. |
summary_language |
|
A list of exported paths.
Export continuous AutoMLR results.
export_continuous_results( object, output_dir = "automlr_continuous_results", publication = TRUE, formats = c("pdf", "png"), dpi = 300L, top_n = 20L, overwrite = TRUE )
object |
Object returned by |
output_dir |
Output directory. |
publication |
Logical, write publication-style figures. |
formats |
Figure formats. |
dpi |
PNG resolution. |
top_n |
Number of top rows. |
overwrite |
Logical. |
A list of exported paths.
Writes the complete apparent C-index table, apparent top-N table, seed-search
table, best rows, failure diagnostics, summary CSVs, a Markdown summary
report, and a Morandi-toned audit figure set for an object returned by
extreme_surv_screen().
export_extreme_screen_results( x, output_dir, formats = c("png", "pdf"), top_n = 10L, top_seed_rows = 30L, dpi = 300L, summary_language = c("bilingual", "en", "zh") )
x |
An object returned by |
output_dir |
Directory to write tables, figures, and the RDS object. |
formats |
Figure formats. Any subset of |
top_n |
Number of apparent top combinations to emphasize in figures. |
top_seed_rows |
Number of seed-search rows to show in ranked-row figures. |
dpi |
PNG resolution. |
summary_language |
Language used in |
A list with written table and figure paths.
Export ordinal AutoMLR results.
export_ordinal_results( object, output_dir = "automlr_ordinal_results", publication = TRUE, formats = c("pdf", "png"), dpi = 300L, top_n = 20L, overwrite = TRUE )
object |
Object returned by |
output_dir |
Output directory. |
publication |
Logical, write publication-style figures. |
formats |
Figure formats. |
dpi |
PNG resolution. |
top_n |
Number of top rows. |
overwrite |
Logical. |
A list of exported paths.
Creates a directory containing an HTML report, publication figures, diagnostic figures, CSV tables, fitted R objects, risk scores, optional timeROC diagnostics, cohort diagnostics, a deduplicated final publication figure set, and session metadata.
export_surv_results( object, output_dir = "automlr_results", publication = TRUE, formats = c("pdf", "png"), dpi = 300L, top_n = 20L, overwrite = TRUE, summary_language = c("bilingual", "en", "zh") )
object |
An object returned by |
output_dir |
Directory for the exported result bundle. |
publication |
Logical, create publication-style figures. |
formats |
Figure formats. Supported values are |
dpi |
Resolution for PNG figures. |
top_n |
Number of top models / combinations to show in figures. |
overwrite |
Logical, overwrite report files where applicable. |
summary_language |
Language used in Markdown summaries, either
|
A list with paths to exported files.
Runs an optimistic "full data as train and validation" screen first, then
searches random 70/30 train-validation splits only for the top combinations.
The first stage estimates an apparent upper bound and should not be reported
as external validation performance. Returned apparent-stage tables include
performance_scope and performance_note columns carrying this warning.
extreme_surv_screen( X, y = NULL, params = automlr_parameters(), algorithms = params$algorithms, top_n = 5L, seeds = 1:500, train_fraction = 0.7, min_models = 1L, max_models = 2L, weight_method = c("cindex", "equal"), allow_same_algorithm = FALSE, min_risk_sd = 1e-08, rank_by = c("apparent_cindex", "mean_cohort_cindex"), stability_groups = NULL, n_cores = params$n_cores %||% 1L, verbose = params$verbose )
X |
Feature matrix, or an |
y |
|
params |
Output of |
algorithms |
Character vector of registry keys. |
top_n |
Number of apparent-screen combinations carried into the seed search. |
seeds |
Integer vector of random seeds used for 70/30 split search. |
train_fraction |
Fraction of samples assigned to training in the seed search. |
min_models |
Minimum combination size in the apparent screen. |
max_models |
Maximum combination size in the apparent screen. |
weight_method |
One of |
allow_same_algorithm |
Logical, allow multiple variants from the same base algorithm in one combination. |
min_risk_sd |
Minimum apparent risk-score standard deviation required for a candidate to enter the apparent combination screen. |
rank_by |
|
stability_groups |
Optional group/cohort labels. Automatically taken
from |
n_cores |
Reserved for future seed-level parallel execution. The seed search currently runs sequentially for reproducible access to package internals. |
verbose |
Logical. |
A list of class "automlr_extreme_screen" with apparent-screen
summaries, top combinations, seed-search results, and the best seed/model
row by validation C-index. The notes component stores interpretation
text for apparent performance and seed-search performance.
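The seed search described above depends on each seed producing one reproducible train-validation split. The following is a minimal base-R sketch of that idea, not the package's internal code; `split_by_seed` is a hypothetical helper name introduced only for illustration.

```r
# Hypothetical helper: draw one reproducible train/validation split.
# Note that set.seed() changes the global RNG state.
split_by_seed <- function(n, seed, train_fraction = 0.7) {
  set.seed(seed)
  n_train <- round(train_fraction * n)
  train_idx <- sort(sample.int(n, n_train))
  list(train = train_idx, valid = setdiff(seq_len(n), train_idx))
}

# One split per seed; the same seed always yields the same split.
splits <- lapply(1:5, function(s) split_by_seed(n = 100, seed = s))
lengths(splits[[1]])
```

Because every split is tied to its seed, a well-performing seed can be re-run later to reproduce the same partition. Selecting the best seed out of many is itself optimistic, so the result should not be read as an unbiased validation estimate.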
Fit a binary probability ensemble.
fit_binary_ensemble(X, y = NULL, params = binarymlr_parameters(), algorithms = params$algorithms, min_auc = params$min_auc_accept, auto_min_auc = params$auto_min_auc %||% FALSE, auto_quantile = params$auto_quantile %||% 0.5, strategy = c("best_subset", "threshold"), min_models = 1L, max_models = 2L, weight_method = c("auc", "equal", "auc_stability"), allow_same_algorithm = FALSE, max_failed_fraction = 0.2, stability_groups = NULL, stability_resamples = params$stability_resamples %||% 0L, stability_fraction = params$stability_fraction %||% 0.8, rank_by = c("auc", "stability_weighted"), stability_weight = params$stability_weight %||% 0.1, threshold_method = "youden", verbose = params$verbose)
X |
Feature matrix, or an |
y |
Binary 0/1 outcome. Leave |
params |
Output of |
algorithms |
Binary algorithm keys. |
min_auc |
Minimum AUC for threshold strategy. |
auto_min_auc |
Logical. If |
auto_quantile |
Quantile used for automatic threshold selection. |
strategy |
|
min_models |
Minimum combination size. |
max_models |
Maximum combination size. |
weight_method |
One of |
allow_same_algorithm |
Logical. |
max_failed_fraction |
Maximum fold-failure fraction. |
stability_groups |
Optional cohort labels. |
stability_resamples |
Stability resamples. |
stability_fraction |
Stability subsample fraction. |
rank_by |
Ranking method. |
stability_weight |
Stability penalty. |
threshold_method |
Threshold used for final class labels. |
verbose |
Logical. |
An object of class "automlr_binary_ensemble".
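The default threshold_method is "youden". As a rough illustration, the sketch below picks a cut-off using the textbook definition of Youden's J (sensitivity + specificity - 1). It is an assumption that the package follows this definition; `youden_threshold` is a hypothetical helper, not a package function.

```r
# Hypothetical helper: choose the probability cut-off that maximises
# Youden's J = sensitivity + specificity - 1.
youden_threshold <- function(y, prob) {
  cuts <- sort(unique(prob))
  j <- vapply(cuts, function(cut) {
    pred <- as.integer(prob >= cut)
    sens <- sum(pred == 1 & y == 1) / sum(y == 1)
    spec <- sum(pred == 0 & y == 0) / sum(y == 0)
    sens + spec - 1
  }, numeric(1))
  cuts[which.max(j)]  # first cut-off reaching the maximum J
}

y    <- c(0, 0, 1, 1)
prob <- c(0.1, 0.2, 0.8, 0.9)
threshold <- youden_threshold(y, prob)
as.integer(prob >= threshold)  # class labels at the chosen cut-off
```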
Fit a continuous-outcome prediction ensemble.
fit_continuous_ensemble(X, y = NULL, params = continuousmlr_parameters(), algorithms = params$algorithms, min_r2 = params$min_r2_accept, auto_min_r2 = params$auto_min_r2 %||% FALSE, auto_quantile = params$auto_quantile %||% 0.5, strategy = c("best_subset", "threshold"), min_models = 1L, max_models = 2L, weight_method = c("inverse_rmse", "equal", "r2"), allow_same_algorithm = FALSE, max_failed_fraction = 0.2, rank_by = c("rmse", "r2"), verbose = params$verbose)
X |
Feature matrix, or an |
y |
Numeric outcome. Leave |
params |
Output of |
algorithms |
Continuous algorithm keys. |
min_r2 |
Minimum R-squared for threshold strategy. |
auto_min_r2 |
Logical. If |
auto_quantile |
Quantile used for automatic threshold selection. |
strategy |
|
min_models |
Minimum combination size. |
max_models |
Maximum combination size. |
weight_method |
One of |
allow_same_algorithm |
Logical. |
max_failed_fraction |
Maximum failed sample fraction. |
rank_by |
Ranking method. |
verbose |
Logical. |
An object of class "automlr_continuous_ensemble".
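To make the "inverse_rmse" weighting option concrete, here is a small sketch that assumes weights are proportional to 1 / RMSE and normalized to sum to one. That normalization is an assumption rather than something the source states; `inverse_rmse_weights` is a hypothetical name.

```r
# Hypothetical helper: weights proportional to 1 / RMSE, summing to 1.
inverse_rmse_weights <- function(rmse) {
  w <- 1 / rmse
  w / sum(w)
}

rmse  <- c(model_a = 1.0, model_b = 2.0)
w     <- inverse_rmse_weights(rmse)

# Columns are per-model predictions; rows are samples.
preds <- cbind(model_a = c(10, 12), model_b = c(13, 15))
ensemble_pred <- as.numeric(preds %*% w)
ensemble_pred
```

Under this scheme a model with half the RMSE receives twice the weight, so the combined prediction leans toward the more accurate member.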
Fit an ordinal-outcome ensemble.
fit_ordinal_ensemble(X, y = NULL, params = ordinalmlr_parameters(), algorithms = params$algorithms, min_qwk = params$min_qwk_accept, auto_min_qwk = params$auto_min_qwk %||% FALSE, auto_quantile = params$auto_quantile %||% 0.5, strategy = c("best_subset", "threshold"), min_models = 1L, max_models = 2L, weight_method = c("qwk", "equal", "inverse_mae"), allow_same_algorithm = FALSE, rank_by = c("qwk", "class_mae"), verbose = params$verbose)
X |
Feature matrix, or an |
y |
Ordinal integer outcome codes. Leave |
params |
Output of |
algorithms |
Ordinal algorithm keys. |
min_qwk |
Minimum quadratic weighted kappa for threshold strategy. |
auto_min_qwk |
Logical. If |
auto_quantile |
Quantile used for automatic threshold selection. |
strategy |
|
min_models |
Minimum combination size. |
max_models |
Maximum combination size. |
weight_method |
One of |
allow_same_algorithm |
Logical. |
rank_by |
Ranking method. |
verbose |
Logical. |
An object of class "automlr_ordinal_ensemble".
The ensemble first estimates each candidate model's LOOCV C-index. With the
default "best_subset" strategy, it enumerates model subsets up to
max_models variants and chooses the subset with the highest combined
LOOCV C-index. With "threshold", it keeps all single models meeting
min_cindex (or the best finite model if none meet it).
fit_surv_ensemble(X, y = NULL, params = automlr_parameters(), algorithms = params$algorithms, min_cindex = params$min_cindex_accept, auto_min_cindex = params$auto_min_cindex %||% FALSE, auto_quantile = params$auto_quantile %||% 0.5, strategy = c("best_subset", "threshold"), min_models = 1L, max_models = 2L, weight_method = c("cindex", "equal", "cindex_stability"), allow_same_algorithm = FALSE, max_failed_fraction = 0.2, stability_groups = NULL, stability_resamples = params$stability_resamples %||% 0L, stability_fraction = params$stability_fraction %||% 0.8, rank_by = c("cindex", "stability_weighted"), stability_weight = params$stability_weight %||% 0.1, diagnostic_times = NULL, verbose = params$verbose)
X |
Feature matrix, or an |
y |
|
params |
Output of |
algorithms |
Character vector of registry keys. |
min_cindex |
Minimum C-index for automatic inclusion. |
auto_min_cindex |
Logical. If |
auto_quantile |
Quantile used for automatic threshold selection. |
strategy |
One of |
min_models |
Minimum subset size for |
max_models |
Maximum subset size for |
weight_method |
One of |
allow_same_algorithm |
Logical, allow multiple variants from the same base algorithm in one selected combination. |
max_failed_fraction |
Maximum allowed LOOCV fold failure fraction for candidates entering combination search. |
stability_groups |
Optional group/cohort labels used for stability diagnostics. |
stability_resamples |
Number of repeated subsamples for stability diagnostics. |
stability_fraction |
Fraction of samples in each stability subsample. |
rank_by |
Ranking method passed to |
stability_weight |
Non-negative stability penalty multiplier used when
|
diagnostic_times |
Optional time points for time-dependent AUC diagnostics in the combination table. |
verbose |
Logical. |
A list of class "automlr_surv_ensemble".
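The "best_subset" strategy enumerates candidate subsets and keeps the best-scoring combination. The sketch below shows that enumeration under simplifying assumptions: equal weights, and a Spearman correlation standing in for the combined LOOCV C-index. `best_subset` and `score_fn` are hypothetical names, and the package's actual search may differ.

```r
# Hypothetical sketch: enumerate subsets of candidate risk-score columns
# and keep the subset whose equally weighted combination scores highest.
best_subset <- function(risk, score_fn, min_models = 1L, max_models = 2L) {
  best <- list(score = -Inf, members = character(0))
  for (k in seq(min_models, min(max_models, ncol(risk)))) {
    for (members in combn(colnames(risk), k, simplify = FALSE)) {
      combined <- rowMeans(risk[, members, drop = FALSE])
      score <- score_fn(combined)
      if (is.finite(score) && score > best$score) {
        best <- list(score = score, members = members)
      }
    }
  }
  best
}

# Toy data: out-of-fold scores from three candidates and a stand-in outcome.
outcome <- c(1, 2, 3, 4, 5, 6)
risk <- cbind(a = c(1, 2, 3, 4, 6, 5),
              b = c(2, 1, 3, 4, 5, 6),
              c = c(6, 5, 4, 3, 2, 1))
res <- best_subset(risk, function(r) cor(r, outcome, method = "spearman"))
res$members
```

The number of subsets grows combinatorially with max_models, which is one reason to keep the maximum subset size small.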
Return the binary-classification algorithm registry.
get_binary_registry()
A named list of algorithm specs.
Return the continuous-outcome algorithm registry.
get_continuous_registry()
A named list of algorithm specs.
Return the ordinal-outcome algorithm registry.
get_ordinal_registry()
A named list of algorithm specs.
Return the full survival-algorithm registry.
get_surv_registry()
A named list of algorithm specs.
Writes a timestamped log file to log_dir (default:
file.path(getwd(), "automlr_logs")) and keeps at most max_log_files.
initialize_auto_logging(log_dir = NULL, max_log_files = 10)
log_dir |
Directory to write logs into. |
max_log_files |
Retain only the N most recent logs. |
Invisibly returns the log4r logger when available, otherwise
NULL.
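The retention rule (keep at most max_log_files) can be pictured with the base-R sketch below. It assumes logs end in ".log" and are ranked by modification time; both are assumptions, and `prune_logs` is a hypothetical helper rather than part of the package.

```r
# Hypothetical helper: keep only the `max_log_files` newest *.log files.
prune_logs <- function(log_dir, max_log_files = 10) {
  logs <- list.files(log_dir, pattern = "\\.log$", full.names = TRUE)
  if (length(logs) <= max_log_files) return(invisible(character(0)))
  newest_first <- logs[order(file.info(logs)$mtime, decreasing = TRUE)]
  stale <- newest_first[seq_along(newest_first) > max_log_files]
  unlink(stale)
  invisible(stale)  # paths that were removed
}

# Demo in a temporary directory with three logs of increasing age stamps.
demo_dir <- tempfile("automlr_logs_demo")
dir.create(demo_dir)
demo_files <- file.path(demo_dir, sprintf("run%d.log", 1:3))
for (i in seq_along(demo_files)) {
  writeLines("log line", demo_files[i])
  Sys.setFileTime(demo_files[i], as.POSIXct("2024-01-01", tz = "UTC") + i * 60)
}
prune_logs(demo_dir, max_log_files = 2)
basename(list.files(demo_dir))
```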
List supported binary-classification algorithms.
list_binary_algorithms()
Character vector of algorithm keys.
List binary-classification model variants.
list_binary_model_variants(params = binarymlr_parameters(), algorithms = params$algorithms)
params |
Output of |
algorithms |
Binary algorithm keys. |
A data.frame of concrete model variants.
List supported continuous-outcome algorithms.
list_continuous_algorithms()
Character vector of algorithm keys.
List continuous-outcome model variants.
list_continuous_model_variants(params = continuousmlr_parameters(), algorithms = params$algorithms)
params |
Output of |
algorithms |
Continuous algorithm keys. |
A data.frame of concrete model variants.
Each row is one candidate model that can enter a model combination. Multiple rows can come from the same base algorithm when its registry grid contains multiple hyperparameter settings.
list_model_variants(params = automlr_parameters(), algorithms = params$algorithms)
params |
Output of |
algorithms |
Character vector of registry keys. |
A data.frame of candidate model variants.
List supported ordinal-outcome algorithms.
list_ordinal_algorithms()
Character vector of algorithm keys.
List ordinal-outcome model variants.
list_ordinal_model_variants(params = ordinalmlr_parameters(), algorithms = params$algorithms)
params |
Output of |
algorithms |
Ordinal algorithm keys. |
A data.frame of concrete model variants.
List the supported survival algorithms (keys).
list_surv_algorithms()
Character vector of algorithm keys.
Leave-one-out cross-validation AUC for one binary algorithm.
loocv_auc(X, y, fit_fn, predict_fn, hparam = list(), seed = NULL, verbose = FALSE, n_cores = 1L)
X |
Numeric feature matrix. |
y |
Binary 0/1 outcome. |
fit_fn |
Function |
predict_fn |
Function |
hparam |
Hyperparameter list. |
seed |
Optional seed. |
verbose |
Logical. |
n_cores |
Integer number of fold workers. |
A list with AUC, PR-AUC, Brier score, probabilities, and failures.
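For orientation, two of the returned metrics can be computed from held-out probabilities as sketched below. The AUC uses the standard rank-based (Mann-Whitney) formula and the Brier score is the mean squared error of the probabilities. `rank_auc` and `brier_score` are hypothetical helper names, and the package may compute these differently.

```r
# Hypothetical helpers operating on held-out probabilities.
rank_auc <- function(y, prob) {
  n_pos <- sum(y == 1)
  n_neg <- sum(y == 0)
  r <- rank(prob)  # average ranks, so tied probabilities count as 0.5
  (sum(r[y == 1]) - n_pos * (n_pos + 1) / 2) / (n_pos * n_neg)
}

brier_score <- function(y, prob) mean((prob - y)^2)

y    <- c(0, 0, 1, 1)
prob <- c(0.10, 0.40, 0.35, 0.80)
rank_auc(y, prob)     # 0.75
brier_score(y, prob)  # 0.158125
```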
Loops over every sample, fits the model on the remaining n-1, predicts
the held-out one, then computes Harrell's C-index on the full vector of
held-out risk scores. Any fold that errors records NA for that sample
and the error message; the C-index is computed on the remaining folds.
loocv_cindex(X, y, fit_fn, predict_fn, hparam = list(), seed = NULL, verbose = FALSE, n_cores = 1L)
X |
Numeric matrix or data.frame of features ( |
y |
A |
fit_fn |
Function |
predict_fn |
Function |
hparam |
Named list of hyperparameters passed to |
seed |
Integer seed set once before the loop for reproducibility;
|
verbose |
Logical, print a progress message every ~10%. |
n_cores |
Integer, number of workers for fold-level parallelism. |
A list with components:
Scalar Harrell's C-index on the aggregated predictions.
Numeric vector of length n with per-sample LOOCV risk
scores (NA where the fold errored).
n.
Number of folds that errored.
Character vector of error messages (may be length 0).
Wall-clock seconds.
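The loop described above can be sketched in base R as follows. This is an illustrative outline, not the package implementation: the `fit_fn(X, time, status)` and `predict_fn(model, newX)` signatures, the returned component names, and the toy linear model (which ignores censoring) are all assumptions made for the example.

```r
# Harrell's C-index: among comparable pairs (the earlier time is an observed
# event), the share where the earlier failure has the higher risk score.
harrell_c <- function(time, status, risk) {
  concordant <- 0
  comparable <- 0
  n <- length(time)
  for (i in seq_len(n)) {
    for (j in seq_len(n)) {
      if (status[i] == 1 && time[i] < time[j]) {
        comparable <- comparable + 1
        if (risk[i] > risk[j]) {
          concordant <- concordant + 1
        } else if (risk[i] == risk[j]) {
          concordant <- concordant + 0.5
        }
      }
    }
  }
  concordant / comparable
}

# Hypothetical LOOCV driver: failed folds record NA plus the error message.
loocv_sketch <- function(X, time, status, fit_fn, predict_fn) {
  n <- nrow(X)
  risk <- rep(NA_real_, n)
  errors <- character(0)
  for (i in seq_len(n)) {
    out <- tryCatch({
      model <- fit_fn(X[-i, , drop = FALSE], time[-i], status[-i])
      predict_fn(model, X[i, , drop = FALSE])
    }, error = function(e) e)
    if (inherits(out, "error")) {
      errors <- c(errors, conditionMessage(out))
    } else {
      risk[i] <- out
    }
  }
  ok <- !is.na(risk)
  list(cindex = harrell_c(time[ok], status[ok], risk[ok]),
       risk = risk, n_failed = sum(!ok), errors = errors)
}

# Toy stand-in model: linear fit on time; risk = negative predicted time.
fit_fn <- function(X, time, status) lm(time ~ ., data = data.frame(time = time, X))
predict_fn <- function(model, newX) -unname(predict(model, newdata = data.frame(newX)))

X      <- cbind(x1 = c(1, 2, 3, 4, 5, 6, 7, 8))
time   <- c(8, 7, 6.5, 5, 4, 3.2, 2, 1)
status <- c(1, 1, 0, 1, 1, 1, 0, 1)
res <- loocv_sketch(X, time, status, fit_fn, predict_fn)
res$cindex
```

Computing one C-index over the pooled held-out scores, rather than averaging per-fold values, is what makes leave-one-out workable here: a single held-out sample cannot form a comparable pair on its own.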
Accuracy for ordinal class predictions.
ordinal_accuracy(y, pred_class)
y |
Observed positive integer level codes. |
pred_class |
Predicted positive integer level codes. |
A numeric scalar.
Balanced accuracy for ordinal class predictions.
ordinal_balanced_accuracy(y, pred_class)
y |
Observed positive integer level codes. |
pred_class |
Predicted positive integer level codes. |
A numeric scalar.
Mean absolute class error for ordinal predictions.
ordinal_mae(y, pred_class)
y |
Observed positive integer level codes. |
pred_class |
Predicted positive integer level codes. |
A numeric scalar.
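The three class-level metrics above can be illustrated with the usual textbook definitions. It is an assumption that the package matches them exactly (for example, balanced accuracy is taken here as the mean per-class recall over observed levels). The `sketch_*` names are hypothetical.

```r
# Hypothetical re-implementations using standard definitions.
sketch_accuracy <- function(y, pred_class) mean(y == pred_class)

sketch_balanced_accuracy <- function(y, pred_class) {
  recalls <- vapply(sort(unique(y)),
                    function(k) mean(pred_class[y == k] == k),
                    numeric(1))
  mean(recalls)  # each observed level counts equally
}

sketch_mae <- function(y, pred_class) mean(abs(y - pred_class))

y          <- c(1, 1, 1, 1, 2, 3)
pred_class <- c(1, 1, 1, 1, 3, 3)
sketch_accuracy(y, pred_class)           # 5/6
sketch_balanced_accuracy(y, pred_class)  # 2/3
sketch_mae(y, pred_class)                # 1/6
```

On imbalanced levels, plain accuracy is dominated by the frequent class, while balanced accuracy exposes the level that is never predicted correctly.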
Quadratic weighted kappa for ordinal predictions.
ordinal_qwk(y, pred_class, n_levels = max(y, pred_class, na.rm = TRUE))
y |
Observed positive integer level codes. |
pred_class |
Predicted positive integer level codes. |
n_levels |
Number of ordered outcome levels. |
A numeric scalar.
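As a reference point, quadratic weighted kappa can be written in a few lines of base R using the standard definition, with disagreement weights (i - j)^2 / (n_levels - 1)^2. `sketch_qwk` is a hypothetical name; the package function may handle missing values or edge cases differently.

```r
# Hypothetical re-implementation of quadratic weighted kappa.
sketch_qwk <- function(y, pred_class, n_levels = max(y, pred_class)) {
  lv <- seq_len(n_levels)
  observed <- table(factor(y, levels = lv), factor(pred_class, levels = lv))
  # Expected counts if observed and predicted levels were independent.
  expected <- outer(rowSums(observed), colSums(observed)) / sum(observed)
  weights  <- outer(lv, lv, function(i, j) (i - j)^2) / (n_levels - 1)^2
  1 - sum(weights * observed) / sum(weights * expected)
}

sketch_qwk(c(1, 2, 3), c(1, 2, 3))  # perfect agreement: 1
sketch_qwk(c(1, 2, 3), c(3, 2, 1))  # fully reversed ordering: -1
```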
Default parameters for AutoMLR ordinal-outcome workflows.
ordinalmlr_parameters(seed = 123L, algorithms = NULL, resampling = "loocv", k_folds = 5L, repeats = 1L, min_qwk_accept = 0, auto_min_qwk = FALSE, auto_quantile = 0.5, missing_fraction_cutoff = 0.2, screen_by_variance = TRUE, variance_quantile_cutoff = 0, standardize_features = FALSE, n_cores = 1L, verbose = TRUE)
seed |
Base random seed. |
algorithms |
Character vector of ordinal algorithm keys. |
resampling |
Resampling scheme. |
k_folds |
Number of folds for k-fold CV. |
repeats |
Number of repeats for repeated k-fold CV. |
min_qwk_accept |
Minimum quadratic weighted kappa for threshold strategy. |
auto_min_qwk |
Logical. If |
auto_quantile |
Quantile used when an automatic threshold is requested.
|
missing_fraction_cutoff |
Drop features above this missing fraction. |
screen_by_variance |
Logical, drop zero / low-variance features. |
variance_quantile_cutoff |
Optional lower variance quantile to drop. |
standardize_features |
Logical, center and scale features. |
n_cores |
Integer, number of fold workers. |
verbose |
Logical, print progress. |
A named list.
Parallel lapply that transparently falls back to sequential.
parallel_lapply(X, FUN, ..., cores = NULL)
X |
Input list / vector. |
FUN |
Function. |
... |
Extra args to |
cores |
Optionally override the global core count. |
List of results.
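The fallback behaviour can be pictured with the sketch below. It assumes the optional future.apply package as the parallel backend, which the source does not confirm; `sketch_parallel_lapply` is a hypothetical stand-in, not the package function.

```r
# Hypothetical sketch: use future.apply when it is installed, otherwise
# fall back to base lapply(). Results are the same either way.
sketch_parallel_lapply <- function(X, FUN, ...) {
  if (requireNamespace("future.apply", quietly = TRUE)) {
    future.apply::future_lapply(X, FUN, ...)
  } else {
    lapply(X, FUN, ...)
  }
}

squares <- sketch_parallel_lapply(1:3, function(i) i^2)
unlist(squares)
```

With no future plan set, future_lapply() itself runs sequentially, so calling code does not need to know whether a backend is active.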
Predict binary ensemble probabilities or classes.
## S3 method for class 'automlr_binary_ensemble'
predict(object, newX, type = c("prob", "class"), threshold = object$threshold, ...)
object |
Object returned by |
newX |
Feature matrix. |
type |
|
threshold |
Optional class threshold. |
... |
Ignored. |
Numeric vector.
Predict continuous ensemble values.
## S3 method for class 'automlr_continuous_ensemble'
predict(object, newX, ...)
object |
Object returned by |
newX |
Feature matrix. |
... |
Ignored. |
Numeric vector.
Predict ordinal ensemble scores or classes.
## S3 method for class 'automlr_ordinal_ensemble'
predict(object, newX, type = c("score", "code", "class"), ...)
object |
Object returned by |
newX |
Feature matrix. |
type |
|
... |
Ignored. |
Numeric scores/codes or class labels.
Predict weighted ensemble risk.
## S3 method for class 'automlr_surv_ensemble'
predict(object, newX, ...)
object |
An object returned by |
newX |
Feature matrix. |
... |
Ignored. |
Numeric risk score; higher means higher predicted hazard.
Splits a long-format data frame by cohort, maps the outcome to 0/1, keeps numeric shared features across cohorts, and returns an object for binary AutoMLR workflows.
prepare_binary_cohort_input(data, cohort, outcome, id = NULL, positive_class = 1, negative_class = NULL, collapse_other = FALSE, drop_cohorts = NULL)
data |
A data.frame with one row per sample. |
cohort |
Name of the cohort / dataset column. |
outcome |
Name of the binary outcome column. |
id |
Optional sample-id column. |
positive_class |
Value in |
negative_class |
Optional value in |
collapse_other |
Logical. If |
drop_cohorts |
Optional cohorts to exclude. |
An object of class "automlr_binary_input".
Splits data by the cohort column and computes the feature
intersection across cohorts. Returns a tidy object for downstream fitting
plus diagnostic info for the user.
prepare_cohort_input(data, cohort, time, status, id = NULL, drop_cohorts = NULL)
data |
A data.frame with one row per sample. |
cohort |
Name of the column identifying cohort membership. |
time |
Name of the survival time column (numeric, > 0). |
status |
Name of the event indicator column (0/1; 1 = event). |
id |
Optional name of a sample-id column; passed through unchanged. |
drop_cohorts |
Optional character vector of cohorts to exclude. |
An S3 list of class "automlr_input" with components:
Named list of per-cohort data frames restricted to
shared_features + time + status (+ id).
Character vector of feature columns present in every cohort (the intersection).
Named list of each cohort's raw feature set.
Features present in at least one cohort but not all, therefore excluded from the intersection.
data.frame: cohort, n_samples, n_events, median_time, n_raw_features, n_shared_features.
Echo of column names used.
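The feature-intersection step can be illustrated in base R as below. This sketch assumes the data are already split into per-cohort data frames; the column names `g1` to `g4` and all variable names are made up for the example.

```r
# Toy cohorts with partly overlapping feature columns.
cohorts <- list(
  A = data.frame(time = c(5, 8), status = c(1, 0),
                 g1 = c(0.1, 0.4), g2 = c(1.2, 0.9), g3 = c(3, 2)),
  B = data.frame(time = c(3, 9), status = c(1, 1),
                 g1 = c(0.3, 0.2), g3 = c(1, 4), g4 = c(7, 8))
)
outcome_cols <- c("time", "status")

# Each cohort's raw feature set, then the intersection across cohorts.
raw_features    <- lapply(cohorts, function(d) setdiff(names(d), outcome_cols))
shared_features <- Reduce(intersect, raw_features)
dropped         <- setdiff(Reduce(union, raw_features), shared_features)

# Restrict every cohort to shared features plus the outcome columns.
restricted <- lapply(cohorts, function(d) {
  d[, c(shared_features, outcome_cols), drop = FALSE]
})
shared_features  # "g1" "g3"
dropped          # "g2" "g4"
```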
Splits a long-format data frame by cohort, keeps numeric shared features, and returns an object for continuous AutoMLR workflows.
prepare_continuous_cohort_input(data, cohort, outcome, id = NULL, drop_cohorts = NULL)
data |
A data.frame with one row per sample. |
cohort |
Name of the cohort / dataset column. |
outcome |
Name of the numeric outcome column. |
id |
Optional sample-id column. |
drop_cohorts |
Optional cohorts to exclude. |
An object of class "automlr_continuous_input".
Maps an ordered outcome to integer scores, keeps numeric shared features, and returns an object for ordinal AutoMLR workflows.
prepare_ordinal_cohort_input(data, cohort, outcome, ordered_levels = NULL, id = NULL, drop_cohorts = NULL)
data |
A data.frame with one row per sample. |
cohort |
Name of the cohort / dataset column. |
outcome |
Name of the ordinal outcome column. |
ordered_levels |
Optional ordered outcome levels from low to high. |
id |
Optional sample-id column. |
drop_cohorts |
Optional cohorts to exclude. |
An object of class "automlr_ordinal_input".
Print an AutoMLR dependency report.
## S3 method for class 'automlr_dependency_report'
print(x, ...)
x |
An object returned by |
... |
Unused. |
Invisibly returns x.
Print method for extreme survival screening
## S3 method for class 'automlr_extreme_screen'
print(x, ...)
x |
An object returned by |
... |
Ignored. |
Invisibly returns x.
Uses the finite receiver operating characteristic area under the curve (AUC)
values in a binary candidate summary and returns the requested quantile as a
threshold for "threshold" strategy model selection.
recommend_binary_auc_threshold(loocv_set, auto_quantile = 0.5, minimum = 0.5)
loocv_set |
An object returned by |
auto_quantile |
Numeric quantile in |
minimum |
Lower bound for the returned threshold. Defaults to |
A numeric scalar giving the recommended minimum AUC.
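The quantile rule can be sketched as follows, assuming the lower bound is applied with a simple max(). It works on a plain numeric vector of candidate scores rather than on the package's summary object, and `sketch_quantile_threshold` is a hypothetical name.

```r
# Hypothetical helper: quantile of the finite scores, floored at `minimum`.
sketch_quantile_threshold <- function(scores, auto_quantile = 0.5, minimum = 0.5) {
  finite_scores <- scores[is.finite(scores)]
  if (length(finite_scores) == 0) return(minimum)
  max(minimum, unname(quantile(finite_scores, probs = auto_quantile)))
}

auc <- c(0.55, 0.62, NA, 0.71, 0.80)
sketch_quantile_threshold(auc)           # median of the finite AUCs: 0.665
sketch_quantile_threshold(c(0.3, 0.4))   # floored at the minimum: 0.5
```

The same pattern applies to the R-squared, quadratic weighted kappa, and C-index variants, with the default lower bound taken from each function's usage line.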
Uses finite out-of-fold R-squared values and returns the requested quantile
as a threshold for "threshold" strategy model selection.
recommend_continuous_r2_threshold(resample_set, auto_quantile = 0.5, minimum = 0)
resample_set |
An object returned by |
auto_quantile |
Numeric quantile in |
minimum |
Lower bound for the returned threshold. Defaults to |
A numeric scalar giving the recommended minimum R-squared.
Uses finite out-of-fold quadratic weighted kappa values and returns the
requested quantile as a threshold for "threshold" strategy model selection.
recommend_ordinal_qwk_threshold(resample_set, auto_quantile = 0.5, minimum = 0)
resample_set |
An object returned by |
auto_quantile |
Numeric quantile in |
minimum |
Lower bound for the returned threshold. Defaults to |
A numeric scalar giving the recommended minimum quadratic weighted kappa.
Uses the finite leave-one-out cross-validation concordance index (C-index)
values in a survival candidate summary and returns the requested quantile as
a threshold for "threshold" strategy model selection.
recommend_surv_cindex_threshold(loocv_set, auto_quantile = 0.5, minimum = 0.5)
loocv_set |
An object returned by |
auto_quantile |
Numeric quantile in |
minimum |
Lower bound for the returned threshold. Defaults to |
A numeric scalar giving the recommended minimum C-index.
Render an HTML report for a fitted binary ensemble.
render_binary_report(object, output_dir = "automlr_binary_report", report_file = "index.html", title = "AutoMLR Binary Report", top_n = 20L, overwrite = TRUE, summary_language = c("bilingual", "en", "zh"))
object |
Object returned by |
output_dir |
Report directory. |
report_file |
HTML file name. |
title |
Report title. |
top_n |
Number of top rows. |
overwrite |
Logical. |
summary_language |
|
Invisibly returns report path.
Render an HTML report for a fitted continuous ensemble.
render_continuous_report(object, output_dir = "automlr_continuous_report", report_file = "index.html", title = "AutoMLR Continuous Report", top_n = 20L, overwrite = TRUE)
object |
Object returned by |
output_dir |
Report directory. |
report_file |
HTML file name. |
title |
Report title. |
top_n |
Number of top rows. |
overwrite |
Logical. |
Invisibly returns report path.
Render an HTML report for a fitted ordinal ensemble.
render_ordinal_report(object, output_dir = "automlr_ordinal_report", report_file = "index.html", title = "AutoMLR Ordinal Report", top_n = 20L, overwrite = TRUE)
object |
Object returned by |
output_dir |
Report directory. |
report_file |
HTML file name. |
title |
Report title. |
top_n |
Number of top rows. |
overwrite |
Logical. |
Invisibly returns report path.
Writes a self-contained HTML summary plus separate figures/ and tables/
folders. Model selection remains whatever was used by fit_surv_ensemble();
this function only reports diagnostics.
render_surv_report(object, output_dir = "automlr_report", report_file = "index.html", title = "AutoMLR Survival Report", top_n = 20L, overwrite = TRUE, summary_language = c("bilingual", "en", "zh"))
object |
An object returned by |
output_dir |
Directory where the report folder should be written. |
report_file |
HTML file name. |
title |
Report title. |
top_n |
Number of top single models / combinations to show. |
overwrite |
Logical, overwrite existing report files. |
summary_language |
Language used in |
Invisibly returns the HTML report path.
Print a binary cohort-intersection report.
report_binary_cohort_intersection(input)
input |
An object returned by |
Invisibly returns input.
Reports per-cohort sample/event counts and median follow-up, how many features each cohort has versus the intersection, and how many features were dropped because they were missing in some cohorts.
report_cohort_intersection(input)
input |
An object returned by |
Invisibly returns input.
Print a continuous cohort-intersection report.
report_continuous_cohort_intersection(input)
input |
An object returned by |
Invisibly returns input.
Print an ordinal cohort-intersection report.
report_ordinal_cohort_intersection(input)
input |
An object returned by |
Invisibly returns input.
Thin wrapper around future::plan() — see ?future::plan. If the optional
future package is unavailable, AutoMLR keeps using sequential execution.
start_parallel(cores = get_parallel_cores(), strategy = c("multisession", "multicore"))
cores |
Integer, number of workers. |
strategy |
One of |
Invisibly TRUE when the backend was started, or FALSE when the
optional future package is unavailable.
Stop the parallel backend.
stop_parallel()
Invisibly TRUE when a future backend was reset, or FALSE when
the optional future package is unavailable.
Summarize base-model screening results in Markdown
summarize_base_models(object, top_n = 5L, language = c("bilingual", "en", "zh"))
object |
An object returned by |
top_n |
Number of top model rows to show. |
language |
Summary language: |
A Markdown string.
Summarize a complete binary AutoMLR analysis in Markdown.
summarize_binary_analysis_results(object, top_n = 5L, language = c("bilingual", "en", "zh"))
object |
A fitted binary ensemble. |
top_n |
Number of rows per section. |
language |
|
Markdown string.
Summarize binary base-model screening in Markdown.
summarize_binary_base_models(object, top_n = 5L, language = c("bilingual", "en", "zh"))
object |
A fitted binary ensemble. |
top_n |
Number of rows. |
language |
|
Markdown string.
Summarize binary data preparation in Markdown.
summarize_binary_data_preparation(x, top_n = 10L, language = c("bilingual", "en", "zh"))
x |
An |
top_n |
Number of rows. |
language |
|
Markdown string.
Summarize binary ensemble selection in Markdown.
summarize_binary_ensemble_results(object, top_n = 5L, language = c("bilingual", "en", "zh"))
object |
A fitted binary ensemble. |
top_n |
Number of rows. |
language |
|
Markdown string.
Summarize binary explainability outputs in Markdown.
summarize_binary_explainability_results(object, top_n = 5L, language = c("bilingual", "en", "zh"))
object |
A fitted binary ensemble. |
top_n |
Number of rows. |
language |
|
Markdown string.
Creates a bilingual or single-language summary of cohort/sample/event counts, shared features, dropped non-shared features, and simple data-risk notes.
summarize_data_preparation(x, top_n = 10L, language = c("bilingual", "en", "zh"))
x |
An |
top_n |
Number of cohort rows to show. |
language |
Summary language: |
A Markdown string.
Summarize ensemble-selection results in Markdown
summarize_ensemble_results(object, top_n = 5L, language = c("bilingual", "en", "zh"))
object |
An object returned by |
top_n |
Number of top combination rows to show. |
language |
Summary language: |
A Markdown string.
Summarize explainability and clinical-utility outputs in Markdown
summarize_explainability_results(object, top_n = 5L, language = c("bilingual", "en", "zh"))
object |
An object returned by |
top_n |
Number of top feature/model rows to show. |
language |
Summary language: |
A Markdown string.
Creates a compact interpretation of an extreme_surv_screen() result:
apparent-screen leaders, best seed-search model, top seed rows, best rows
after seed de-duplication, combination-level stability, and failure notes.
summarize_extreme_screen_results(x, top_n = 3L, language = c("bilingual", "en", "zh"))
x |
An object returned by |
top_n |
Number of rows to include in each top-results section. |
language |
Summary language, either |
A single Markdown string.
Combines data-preparation, base-model, and ensemble-selection summaries.
This is the regular-analysis counterpart to
summarize_extreme_screen_results().
summarize_surv_analysis_results(object, top_n = 5L, language = c("bilingual", "en", "zh"))
object |
An object returned by |
top_n |
Number of top rows to show per section. |
language |
Summary language: |
A Markdown string.