| Title: | Machine Learning for Runoff Prediction |
|---|---|
| Description: | Machine learning in k-fold cross-validation. |
| Authors: | Dongdong Kong [aut, cre] (ORCID: <https://orcid.org/0000-0003-1836-8172>) |
| Maintainer: | Dongdong Kong <[email protected]> |
| License: | MIT + file LICENSE |
| Version: | 0.1.1 |
| Built: | 2026-06-03 18:26:05 UTC |
| Source: | https://github.com/rpkgs/kfold |
kfold_calib
kfold_calib(X, Y, FUN = xgboost, index = NULL, ..., ratio_valid = 0.3)
index |
index of validation set |
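The hold-out idea behind a calibration/validation split can be sketched in base R. This is only an illustration under stated assumptions: `holdout_split` is a hypothetical helper, not a function of this package, and it assumes that `ratio_valid` is the fraction of rows held out for validation and that `index` lists the validation rows. `lm()` stands in for the model function.

```r
# Hypothetical helper, not part of the kfold package.
# Assumption: ratio_valid is the fraction of rows used for validation.
holdout_split <- function(n, ratio_valid = 0.3, index = NULL) {
  # If no validation index is supplied, draw one at random.
  if (is.null(index)) {
    n_valid <- round(n * ratio_valid)
    index <- sort(sample.int(n, n_valid))
  }
  list(calib = setdiff(seq_len(n), index), valid = index)
}

set.seed(1)
X <- matrix(rnorm(200), 100, 2)
y <- rnorm(100)

s <- holdout_split(nrow(X), ratio_valid = 0.3)

# Fit on the calibration rows only, then predict the held-out rows.
d <- data.frame(X, y = y)
fit <- lm(y ~ ., data = d[s$calib, ])
pred <- predict(fit, newdata = d[s$valid, ])
length(pred)
```

Passing an explicit `index` makes the split reproducible without relying on the random seed.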
kfold machine learning
kfold_ml(X, Y, kfold = 5, FUN, ..., .progress = TRUE)
kfold_rf(X, Y, kfold = 5, FUN = ranger, ntree = 500, importance = "none", ...)
kfold_xgboost(X, Y, kfold = 5, FUN = xgboost, nrounds = 500, ...)
kfold_lm(X, Y, kfold = 5, ...)
... |
Further arguments passed to or from other methods (currently ignored). |
importance |
Variable importance mode, one of 'none', 'impurity', 'impurity_corrected', 'permutation'. The 'impurity' measure is the Gini index for classification, the variance of the responses for regression and the sum of test statistics (see |
nrounds |
Number of boosting iterations / rounds. Note that the number of default boosting rounds here is not automatically tuned, and different problems will have vastly different optimal numbers of boosting rounds. |
ranger::ranger(), xgboost::xgboost()
set.seed(1)
n <- 100 ; p <- 2
X <- matrix(rnorm(n * p), n, p) # no intercept!
y <- as.matrix(rnorm(n))

## kfold
r_lm <- kfold_lm(X, y)
r_xgb <- kfold_xgboost(X, y)
# r_rf <- kfold_rf(X, y)

## 70%-30% split
r = kfold_calib(X, y, ratio_valid = 0.7, nrounds=500, verbose=FALSE)
r$gof
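For readers unfamiliar with the procedure, k-fold cross-validation can be sketched in base R as follows. This is a minimal illustration, not the package's `kfold_ml()` implementation: `kfold_sketch` is a hypothetical name, `lm()` stands in for the model function, and the random fold assignment is an assumption about how folds might be drawn.

```r
# Hypothetical helper illustrating k-fold cross-validation with lm();
# this is not the package's kfold_ml() implementation.
kfold_sketch <- function(X, y, kfold = 5) {
  n <- nrow(X)
  d <- data.frame(X, y = as.numeric(y))
  # Randomly assign each row to one of `kfold` folds of near-equal size.
  fold <- sample(rep(seq_len(kfold), length.out = n))
  ypred <- numeric(n)
  for (k in seq_len(kfold)) {
    test <- fold == k
    # Train on all other folds, predict the held-out fold.
    fit <- lm(y ~ ., data = d[!test, ])
    ypred[test] <- predict(fit, newdata = d[test, ])
  }
  list(fold = fold, ypred = ypred)
}

set.seed(1)
n <- 100; p <- 2
X <- matrix(rnorm(n * p), n, p)
y <- as.matrix(rnorm(n))

r <- kfold_sketch(X, y, kfold = 5)
sqrt(mean((as.numeric(y) - r$ypred)^2))  # out-of-fold RMSE
```

Every observation is predicted exactly once by a model that did not see it during training, so the pooled predictions give an out-of-sample error estimate.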
Goodness of fit
NSE(yobs, ysim, w, ...)
GOF(yobs, ysim, w, include.cv = FALSE, include.r = TRUE)
yobs |
Numeric vector, observations |
ysim |
Numeric vector, corresponding simulated values |
w |
Numeric vector, weights of each point. If w is supplied, it is taken into account when calculating the mean, Bias, MAE, RMSE and NSE. |
include.cv |
If TRUE, cv will be included. |
include.r |
If TRUE, R and R2 will be included. |
RMSE root mean square error
NSE Nash-Sutcliffe model efficiency coefficient
MAE mean absolute error
AI agreement index; only good points (w == 1) take part in the
calculation. See details in Zhang et al. (2015).
Bias bias
Bias_perc bias percentage
n_sim number of valid observations
cv coefficient of variation
R2 coefficient of determination
R Pearson correlation coefficient
pvalue p-value of R
https://en.wikipedia.org/wiki/Coefficient_of_determination
https://en.wikipedia.org/wiki/Explained_sum_of_squares
https://en.wikipedia.org/wiki/Nash%E2%80%93Sutcliffe_model_efficiency_coefficient
Zhang Xiaoyang (2015), http://dx.doi.org/10.1016/j.rse.2014.10.012
yobs <- rnorm(100)
ysim <- yobs + rnorm(100) / 4
GOF(yobs, ysim)
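Several of the listed indices follow standard textbook definitions, which can be sketched in base R. This is an unweighted illustration only, not the package's `GOF()`: `gof_sketch` is a hypothetical helper, the weights `w` are ignored, and the sign convention for Bias (simulated minus observed) is an assumption.

```r
# Hypothetical helper; unweighted textbook formulas, not the package's GOF().
gof_sketch <- function(yobs, ysim) {
  ok <- is.finite(yobs) & is.finite(ysim)  # keep valid pairs only
  yobs <- yobs[ok]
  ysim <- ysim[ok]
  err <- ysim - yobs  # assumption: bias is simulated minus observed
  c(
    RMSE  = sqrt(mean(err^2)),
    # NSE: 1 minus residual sum of squares over variance of observations
    NSE   = 1 - sum(err^2) / sum((yobs - mean(yobs))^2),
    MAE   = mean(abs(err)),
    Bias  = mean(err),
    R     = cor(yobs, ysim),
    n_sim = length(yobs)
  )
}

g <- gof_sketch(c(1, 2, 3, 4), c(2, 3, 4, 5))
g
```

An NSE of 1 indicates a perfect match, while an NSE of 0 means the simulation is no better than predicting the mean of the observations.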
previous_tn
previous_tn(x, n = 7, prefix = "", ...)

## Default S3 method:
previous_tn(x, n = 7, prefix = "", ...)

## S3 method for class 'data.frame'
previous_tn(x, n = 7, ...)
set.seed(1)
x <- rnorm(10)
previous_tn(x, 7, "R1_")

# data.frame
d = data.frame(x)
previous_tn(d)