| Title: | Well Water Level Analysis Tools |
|---|---|
| Description: | Well water level analysis tools for R. The current focus is on barometric efficiency. |
| Authors: | Jonathan Kennel [aut, cre], Beth Parker [ths] |
| Maintainer: | Jonathan Kennel <[email protected]> |
| License: | GPL-3 |
| Version: | 0.0.1 |
| Built: | 2026-05-28 14:38:10 UTC |
| Source: | https://github.com/rpkgs/waterlevel |
A more detailed description of what the package does. A length of about one to five lines is recommended.
This section should provide a more detailed overview of how to use the package, including the most important functions.
Your Name, email optional.
Maintainer: Your Name <[email protected]>
This optional section can contain literature or other references for background information.
Optional links to other man pages
## Not run: ## Optional simple examples of the most important functions ## These can be in \dontrun{} and \donttest{} blocks. ## End(Not run)## Not run: ## Optional simple examples of the most important functions ## These can be in \dontrun{} and \donttest{} blocks. ## End(Not run)
add_level_shifts
add_level_shifts(x, y)add_level_shifts(x, y)
x |
datetimes |
y |
shift vector |
vector of shifts
Extends the extractAIC method from the stats package to handle
multi-predictand linear models (objects of class mlm). FROM paleocar
AIC_mlm(fit, scale = 0, k = 2)AIC_mlm(fit, scale = 0, k = 2)
fit |
An object of class mlm. |
scale |
The estimate of the error variance.
|
k |
Numeric, the penalty per parameter to be used; the default k = 2 is the classical AIC. |
A list of length 2 giving
df The 'equivalent degrees of freedom' for the fitted model fit.
AIC A vector of the (generalized) Akaike Information Criterion for the fits.
Calculate the effect of elevation on barometric pressure.
baro_elev( elevation = 0, pressure_sea = 101325, temperature = 20, gravity = 9.80665 )baro_elev( elevation = 0, pressure_sea = 101325, temperature = 20, gravity = 9.80665 )
elevation |
vector elevations (numeric) |
pressure_sea |
standard pressure (numeric) |
temperature |
temperature in celcius (numeric) |
gravity |
gravity (numeric) |
barometric pressure dependence on elevation
plot(baro_elev(0:10000), type='l', xlab = "elev (m)", ylab = 'pressure')plot(baro_elev(0:10000), type='l', xlab = "elev (m)", ylab = 'pressure')
Implementation of Acworth et. al. 2016. This is basically the transfer function value for the semi-diurnal band.
be_acworth( dat, wl = "wl", ba = "ba", et = "et", method = "spec_pgram", inverse = TRUE, ... )be_acworth( dat, wl = "wl", ba = "ba", et = "et", method = "spec_pgram", inverse = TRUE, ... )
dat |
data that has the independent and dependent variables (data.table) |
wl |
name of the water level column (character). |
ba |
name of the barometric pressure column (character). |
et |
name of the Earth tide column (character). |
method |
either spec_pgram or spec_welch |
inverse |
|
... |
arguments to pass to transfer_acworth, spec_welch, spec_pgram |
barometric efficiency
Acworth, R. I., Halloran, L. J. S., Rau, G. C., Cuthbert, M. O., & Bernardi, T. L. (2016). An objective frequency-domain method for quantifying confined aquifer compressible storage using Earth and atmospheric tides. Geophysical Research Letters, 43 (November). doi:10.1002/2016GL071328
Acworth, R. I., G. C. Rau, L. J. S. Halloran, and W. A. Timms (2017), Vertical groundwater stor1077 age properties and changes in confinement determined using hydraulic head response to atmospheric tides, Water Resources Research, 53(4), 2983–2997, doi:10.1002/2016WR020311.
Implementation of Acworth et. al. 2016 equation 4.
be_acworth_eq_4(s2_gw, s2_et, s2_at, m2_gw, m2_et, d_phase, inverse = TRUE)be_acworth_eq_4(s2_gw, s2_et, s2_at, m2_gw, m2_et, d_phase, inverse = TRUE)
s2_gw |
|
s2_et |
|
s2_at |
|
m2_gw |
|
m2_et |
|
d_phase |
|
inverse |
|
barometric efficiency
Acworth, R. I., Halloran, L. J. S., Rau, G. C., Cuthbert, M. O., & Bernardi, T. L. (2016). An objective frequency-domain method for quantifying confined aquifer compressible storage using Earth and atmospheric tides. Geophysical Research Letters, 43 (November). doi:10.1002/2016GL071328
be_acworth_eq_4(s2_at = 7.461, s2_et=224.640, s2_gw=4.086, m2_gw = 0.471, m2_et = 492.526, d_phase=-56.709) be_acworth_eq_4(s2_at = 6.164, s2_et=270.463, s2_gw=0.329, m2_gw = 0.225, m2_et = 551.572, d_phase=-71.726) be_acworth_eq_4(s2_at = 5.897, s2_et=234.478, s2_gw=5.536, m2_gw = 0.773, m2_et = 558.075, d_phase=-70.393)be_acworth_eq_4(s2_at = 7.461, s2_et=224.640, s2_gw=4.086, m2_gw = 0.471, m2_et = 492.526, d_phase=-56.709) be_acworth_eq_4(s2_at = 6.164, s2_et=270.463, s2_gw=0.329, m2_gw = 0.225, m2_et = 551.572, d_phase=-71.726) be_acworth_eq_4(s2_at = 5.897, s2_et=234.478, s2_gw=5.536, m2_gw = 0.773, m2_et = 558.075, d_phase=-70.393)
Clark 1967 solution for calculating barometric efficiency.
be_clark( dat, dep = "wl", ind = "baro", lag_space = 1, inverse = TRUE, return_model = FALSE )be_clark( dat, dep = "wl", ind = "baro", lag_space = 1, inverse = TRUE, return_model = FALSE )
dat |
data that has the independent and dependent variables (data.table) |
dep |
name of the dependent variable column (character). This is typically the name for the column holding your water level data. |
ind |
name of the independent variable column (character). This is typically the name for the column holding your barometric pressure data. |
lag_space |
space between difference calculation in number of observations |
inverse |
whether the barometric relationship is inverse (TRUE means that when the barometric pressure goes up the measured water level goes down (vented transducer, depth to water), FALSE means that when the barometric pressure goes up so does the measured pressure (non-vented transducer)) (logical). |
return_model |
whether to return the lm model or just the barometric/loading efficiency (logical).#' |
lm linear model for Clark's method. The coefficient is the BE/LE
Clark, W., 1967. Computing the Barometric Efficiency of a well. Proc. Am. Soc. Civ. Eng. J. Hydraul. Div. V. 93, 93–98.
library(data.table) datetime <- seq.POSIXt(as.POSIXct("2016-01-01 12:00:00"), as.POSIXct("2016-01-05 12:00:00"), by='sec' ) baro <- sin(seq(0, 2*pi, length.out = length(datetime))) wl <- -0.4 * baro + rnorm(length(datetime), sd = 0.02) dat <- data.table(baro, wl, datetime) be_clark(dat, dep='wl', ind='baro', lag_space=1, inverse=TRUE)library(data.table) datetime <- seq.POSIXt(as.POSIXct("2016-01-01 12:00:00"), as.POSIXct("2016-01-05 12:00:00"), by='sec' ) baro <- sin(seq(0, 2*pi, length.out = length(datetime))) wl <- -0.4 * baro + rnorm(length(datetime), sd = 0.02) dat <- data.table(baro, wl, datetime) be_clark(dat, dep='wl', ind='baro', lag_space=1, inverse=TRUE)
Adjust values based on the barometric efficiency
be_correct( dat, dep = "wl", ind = "baro", be = 0, inverse = TRUE, known_mean = NULL )be_correct( dat, dep = "wl", ind = "baro", be = 0, inverse = TRUE, known_mean = NULL )
dat |
data that has the independent and dependent variables (data.table) |
dep |
name of the dependent variable column (character). This is typically the name for the column holding your water level data. |
ind |
name of the independent variable column (character). This is typically the name for the column holding your barometric pressure data. |
be |
|
inverse |
whether the barometric relationship is inverse (TRUE means that when the barometric pressure goes up the measured water level goes down (vented transducer, depth to water), FALSE means that when the barometric pressure goes up so does the measured pressure (non-vented transducer)) (logical). |
known_mean |
|
numeric vector of corrected values
library(data.table) baro = rnorm(1000, sd = 0.01) + 9 wl <- baro * 0.4 + 18 dat <- data.table(baro, wl) dat$wl + be_correct(dat, be=0.4, inverse = FALSE) dat$wl + be_correct(dat, be=0.4, inverse = FALSE, known_mean = 9) # should return ~21.6 = 18 + 0.4 * 9library(data.table) baro = rnorm(1000, sd = 0.01) + 9 wl <- baro * 0.4 + 18 dat <- data.table(baro, wl) dat$wl + be_correct(dat, be=0.4, inverse = FALSE) dat$wl + be_correct(dat, be=0.4, inverse = FALSE, known_mean = 9) # should return ~21.6 = 18 + 0.4 * 9
Uses the maximum and minimum values for a time period to calculate be.
be_high_low( dat, dep = "wl", ind = "baro", time = "datetime", time_group = "%Y-%m-%d" )be_high_low( dat, dep = "wl", ind = "baro", time = "datetime", time_group = "%Y-%m-%d" )
dat |
data that has the independent and dependent variables (data.table) |
dep |
name of the dependent variable column (character). This is typically the name for the column holding your water level data. |
ind |
name of the independent variable column (character). This is typically the name for the column holding your barometric pressure data. |
time |
name of the column containing the time (character) |
time_group |
format for grouping time (ie: "%Y-%m-%d") (character) |
barometric efficiency based on high and low values
library(data.table) datetime <- seq.POSIXt(as.POSIXct("2016-01-01 00:00:00", tz = 'UTC'), as.POSIXct("2016-01-06 00:00:00", tz = 'UTC'), by='hour') baro <- sin(seq(0, 2*pi, length.out = length(datetime))) wl <- 0.4 * baro + rnorm(length(datetime), sd = 0.01) dat <- data.table(baro, wl, datetime) be_hl <- be_high_low(dat) be_hl <- be_high_low(dat, time_group = 86400)library(data.table) datetime <- seq.POSIXt(as.POSIXct("2016-01-01 00:00:00", tz = 'UTC'), as.POSIXct("2016-01-06 00:00:00", tz = 'UTC'), by='hour') baro <- sin(seq(0, 2*pi, length.out = length(datetime))) wl <- 0.4 * baro + rnorm(length(datetime), sd = 0.01) dat <- data.table(baro, wl, datetime) be_hl <- be_high_low(dat) be_hl <- be_high_low(dat, time_group = 86400)
Calculate the barometric efficiency by using least squares.
be_least_squares( dat, dep = "wl", ind = "baro", inverse = TRUE, return_model = FALSE )be_least_squares( dat, dep = "wl", ind = "baro", inverse = TRUE, return_model = FALSE )
dat |
data that has the independent and dependent variables (data.table) |
dep |
name of the dependent variable column (character). This is typically the name for the column holding your water level data. |
ind |
name of the independent variable column (character). This is typically the name for the column holding your barometric pressure data. |
inverse |
whether the barometric relationship is inverse (TRUE means that when the barometric pressure goes up the measured water level goes down (vented transducer, depth to water), FALSE means that when the barometric pressure goes up so does the measured pressure (non-vented transducer)) (logical). |
return_model |
whether to return the lm model or just the barometric/loading efficiency (logical).#' |
barometric efficiency calculated by least squares
library(data.table) datetime <- seq.POSIXt(as.POSIXct("2016-01-01 12:00:00"), as.POSIXct("2016-01-05 12:00:00"), by='hour' ) baro <- sin(seq(0, 2*pi, length.out = length(datetime))) wl <- -0.4 * baro dat <- data.table(baro, wl, datetime) be_least_squares(dat)library(data.table) datetime <- seq.POSIXt(as.POSIXct("2016-01-01 12:00:00"), as.POSIXct("2016-01-05 12:00:00"), by='hour' ) baro <- sin(seq(0, 2*pi, length.out = length(datetime))) wl <- -0.4 * baro dat <- data.table(baro, wl, datetime) be_least_squares(dat)
Calculate the barometric efficiency by using the least squares with differences
be_least_squares_diff( dat, dep = "wl", ind = "baro", lag_space = 1, inverse = TRUE, return_model = FALSE )be_least_squares_diff( dat, dep = "wl", ind = "baro", lag_space = 1, inverse = TRUE, return_model = FALSE )
dat |
data that has the independent and dependent variables (data.table) |
dep |
name of the dependent variable column (character). This is typically the name for the column holding your water level data. |
ind |
name of the independent variable column (character). This is typically the name for the column holding your barometric pressure data. |
lag_space |
space between difference calculation in number of observations |
inverse |
whether the barometric relationship is inverse (TRUE means that when the barometric pressure goes up the measured water level goes down (vented transducer, depth to water), FALSE means that when the barometric pressure goes up so does the measured pressure (non-vented transducer)) (logical). |
return_model |
whether to return the lm model or just the barometric/loading efficiency (logical).#' |
barometric efficiency calculated by least squares
library(data.table) datetime <- seq.POSIXt(as.POSIXct("2016-01-01 12:00:00"), as.POSIXct("2016-01-05 12:00:00"), by='hour' ) baro <- sin(seq(0, 2*pi, length.out = length(datetime))) wl <- -0.4 * baro dat <- data.table(baro, wl, datetime) be_least_squares_diff(dat, lag_space = 1)library(data.table) datetime <- seq.POSIXt(as.POSIXct("2016-01-01 12:00:00"), as.POSIXct("2016-01-05 12:00:00"), by='hour' ) baro <- sin(seq(0, 2*pi, length.out = length(datetime))) wl <- -0.4 * baro dat <- data.table(baro, wl, datetime) be_least_squares_diff(dat, lag_space = 1)
Rahi 2010 solution for calculating barometric efficiency.
be_rahi(dat, dep = "wl", ind = "baro", lag_space = 1, inverse = TRUE)be_rahi(dat, dep = "wl", ind = "baro", lag_space = 1, inverse = TRUE)
dat |
data that has the independent and dependent variables (data.table) |
dep |
name of the dependent variable column (character). This is typically the name for the column holding your water level data. |
ind |
name of the independent variable column (character). This is typically the name for the column holding your barometric pressure data. |
lag_space |
space between difference calculation in number of observations |
inverse |
whether the barometric relationship is inverse (TRUE means that when the barometric pressure goes up the measured water level goes down (vented transducer, depth to water), FALSE means that when the barometric pressure goes up so does the measured pressure (non-vented transducer)) (logical). |
barometric efficiency using Rahi's method
Rahi, K. A. (2010). Estimating the hydraulic parameters of the Arbuckle-Simpson aquifer by analysis of naturally-induced stresses (Doctoral dissertation, Oklahoma State University).
library(data.table) datetime <- seq.POSIXt(as.POSIXct("2016-01-01 12:00:00"), as.POSIXct("2016-01-05 12:00:00"), by='hour' ) baro <- sin(seq(0, 2 * pi, length.out = length(datetime))) noise <- rnorm(length(datetime), sd = 0.01) wl <- -0.4 * baro + noise dat <- data.table(baro, wl, datetime) be_rahi(dat, dep = 'wl', ind = 'baro', lag_space = 1)library(data.table) datetime <- seq.POSIXt(as.POSIXct("2016-01-01 12:00:00"), as.POSIXct("2016-01-05 12:00:00"), by='hour' ) baro <- sin(seq(0, 2 * pi, length.out = length(datetime))) noise <- rnorm(length(datetime), sd = 0.01) wl <- -0.4 * baro + noise dat <- data.table(baro, wl, datetime) be_rahi(dat, dep = 'wl', ind = 'baro', lag_space = 1)
Calculate the barometric efficiency using the mean (or other statistic) of the water level change divided by the barometric pressure change . There is the option to only include responses that are large.
be_ratio( dat, dep = "wl", ind = "baro", lag_space = 1, inverse = TRUE, quant = 0.9, stat = mean, ... )be_ratio( dat, dep = "wl", ind = "baro", lag_space = 1, inverse = TRUE, quant = 0.9, stat = mean, ... )
dat |
data that has the independent and dependent variables (data.table) |
dep |
name of the dependent variable column (character). This is typically the name for the column holding your water level data. |
ind |
name of the independent variable column (character). This is typically the name for the column holding your barometric pressure data. |
lag_space |
space between difference calculation in number of observations |
inverse |
whether the barometric relationship is inverse (TRUE means that when the barometric pressure goes up the measured water level goes down (vented transducer, depth to water), FALSE means that when the barometric pressure goes up so does the measured pressure (non-vented transducer)) (logical). |
quant |
quantile cutoff value that differences need to be this large (numeric) |
stat |
the result to return (mean, median, quantile) (function) |
... |
other arguments to pass to "stat" |
barometric efficiency calculated by using ratio method
library(data.table) datetime <- seq.POSIXt(as.POSIXct("2016-01-01 12:00:00"), as.POSIXct("2016-01-05 12:00:00"), by='hour' ) baro <- sin(seq(0, 2*pi, length.out = length(datetime))) noise <- rnorm(length(datetime), sd = 0.01) wl <- -0.4 * baro + noise dat <- data.table(baro, wl, datetime) be_ratio(dat, quant = 0.5, stat=median) be_ratio(dat, quant = 0.9, stat=mean)library(data.table) datetime <- seq.POSIXt(as.POSIXct("2016-01-01 12:00:00"), as.POSIXct("2016-01-05 12:00:00"), by='hour' ) baro <- sin(seq(0, 2*pi, length.out = length(datetime))) noise <- rnorm(length(datetime), sd = 0.01) wl <- -0.4 * baro + noise dat <- data.table(baro, wl, datetime) be_ratio(dat, quant = 0.5, stat=median) be_ratio(dat, quant = 0.9, stat=mean)
This function estimates the storage coefficient from physical properties and the barometric efficiency The equation used is from Batu 1998, eq 2-101, pg. 72
be_storage(be, n, b, gamma = 9799.74, beta = 4.786e-10)be_storage(be, n, b, gamma = 9799.74, beta = 4.786e-10)
be |
the barometric efficiency (numeric) |
n |
the porosity 0 to 1 (numeric) |
b |
aquifer thickness (m) (numeric) |
gamma |
the specific weight of water (N/m^3) (numeric) |
beta |
the compressibility of water (4.786e-10 m^2/N) (numeric) |
storage calculated from barometric efficiency
Batu, V. (1998). Aquifer hydraulics: a comprehensive guide to hydrogeologic data analysis. John Wiley & Sons.
be_storage(0.5, 0.32, 45)be_storage(0.5, 0.32, 45)
Generate dataset for comparing barometric efficiency
be_visual( dat, dep = "wl", ind = "baro", time = "datetime", be_tests = seq(0, 1, 0.1), inverse = TRUE, subsample = TRUE )be_visual( dat, dep = "wl", ind = "baro", time = "datetime", be_tests = seq(0, 1, 0.1), inverse = TRUE, subsample = TRUE )
dat |
data that has the independent and dependent variables (data.table) |
dep |
name of the dependent variable column (character). This is typically the name for the column holding your water level data. |
ind |
name of the independent variable column (character). This is typically the name for the column holding your barometric pressure data. |
time |
name of the column containing the time (character) |
be_tests |
vector of barometric efficiencies to test (between 0 and 1) (numeric) |
inverse |
whether the barometric relationship is inverse (TRUE means that when the barometric pressure goes up the measured water level goes down (vented transducer, depth to water), FALSE means that when the barometric pressure goes up so does the measured pressure (non-vented transducer)) (logical). |
subsample |
should the data be subsampled for plotting? (logical) |
data.table of barometric efficiency compensated datasets
Smith, L. A., van der Kamp, G., & Hendry, M. J. (2013). A new technique for obtaining high‐resolution pore pressure records in thick claystone aquitards and its use to determine in situ compressibility. Water Resources Research, 49(2), 732-743. doi:10.1002/wrcr.20084
library(data.table) be <- 0.43 x <- seq(0, 28*pi, pi / (12*12)) baro <- sin(x) + rnorm(length(x), sd = 0.04) wl <- -sin(x) * be + rnorm(length(x), sd = 0.04) dat <- data.table(datetime = as.POSIXct(x * 86400 / (2 * pi), origin = '1970-01-01', tz = 'UTC'), wl = wl, baro = baro) be_visual(dat)library(data.table) be <- 0.43 x <- seq(0, 28*pi, pi / (12*12)) baro <- sin(x) + rnorm(length(x), sd = 0.04) wl <- -sin(x) * be + rnorm(length(x), sd = 0.04) dat <- data.table(datetime = as.POSIXct(x * 86400 / (2 * pi), origin = '1970-01-01', tz = 'UTC'), wl = wl, baro = baro) be_visual(dat)
Generate dataset for comparing barometric efficiency
be_visual_data( dat, dep = "wl", ind = "baro", be_tests = seq(0, 1, 0.1), inverse = TRUE )be_visual_data( dat, dep = "wl", ind = "baro", be_tests = seq(0, 1, 0.1), inverse = TRUE )
dat |
data that has the independent and dependent variables (data.table) |
dep |
name of the dependent variable column (character). This is typically the name for the column holding your water level data. |
ind |
name of the independent variable column (character). This is typically the name for the column holding your barometric pressure data. |
be_tests |
vector of barometric efficiencies to test (between 0 and 1) (numeric) |
inverse |
whether the barometric relationship is inverse (TRUE means that when the barometric pressure goes up the measured water level goes down (vented transducer, depth to water), FALSE means that when the barometric pressure goes up so does the measured pressure (non-vented transducer)) (logical). |
data.table of barometric efficiency compensated datasets
library(data.table) be <- 0.43 x <- seq(0, 28*pi, pi / (12*12)) baro <- sin(x) + rnorm(length(x), sd = 0.04) wl <- -sin(x) * be + rnorm(length(x), sd = 0.04) dat <- data.table(datetime = as.POSIXct(x * 86400 / (2 * pi), origin = '1970-01-01', tz = 'UTC'), wl = wl, baro = baro) be_visual_data(dat)library(data.table) be <- 0.43 x <- seq(0, 28*pi, pi / (12*12)) baro <- sin(x) + rnorm(length(x), sd = 0.04) wl <- -sin(x) * be + rnorm(length(x), sd = 0.04) dat <- data.table(datetime = as.POSIXct(x * 86400 / (2 * pi), origin = '1970-01-01', tz = 'UTC'), wl = wl, baro = baro) be_visual_data(dat)
Plot to compare barometric efficiency. Large datasets may take a long time to plot. Subsample should be set to TRUE
be_visual_plot(dat, time = "datetime", subsample = TRUE)be_visual_plot(dat, time = "datetime", subsample = TRUE)
dat |
data that has the independent and dependent variables (data.table) |
time |
name of the column containing the time (character) |
subsample |
should the data be subsampled for plotting? (logical) |
plotly graph for barometric efficiency estimation with Smith method
library(plotly) library(data.table) be <- 0.43 x <- seq(0, 28 * pi, pi / (12 * 12)) baro <- sin(x) + rnorm(length(x), sd = 0.04) wl <- -sin(x) * be + rnorm(length(x), sd = 0.04) dat <- data.table(datetime = as.POSIXct(x * 86400 / (2 * pi), origin = '1970-01-01', tz = 'UTC'), wl = wl, baro = baro) dat_be <- be_visual_data(dat) #be_visual_plot(dat_be) #not runlibrary(plotly) library(data.table) be <- 0.43 x <- seq(0, 28 * pi, pi / (12 * 12)) baro <- sin(x) + rnorm(length(x), sd = 0.04) wl <- -sin(x) * be + rnorm(length(x), sd = 0.04) dat <- data.table(datetime = as.POSIXct(x * 86400 / (2 * pi), origin = '1970-01-01', tz = 'UTC'), wl = wl, baro = baro) dat_be <- be_visual_data(dat) #be_visual_plot(dat_be) #not run
Calculate the barometric response function from the transfer function
brf_from_frf(x)brf_from_frf(x)
x |
the complex transfer function vector |
vector of the barometric response function
calculate phase and coherency from the spectra and cross-spectra
coh_phase(pgram)coh_phase(pgram)
pgram |
|
list of coherency and phase
convolve_fft
convolve_fft(x, y)convolve_fft(x, y)
x |
numeric vector |
y |
convolution kernel |
numeric vector result of convolution
cross_basis_fft
cross_basis_fft(basisvar, basislag)cross_basis_fft(basisvar, basislag)
basisvar |
numeric vector |
basislag |
lagging matrix |
numeric vector result of convolution
Determinant for an array in parallel
det_parallel(a)det_parallel(a)
a |
|
vector of determinants
Calculate lagged differences padded with NA values
diff_shift(x, lag_space = 1)diff_shift(x, lag_space = 1)
x |
vector to difference (numeric) |
lag_space |
spacing for lags, useful for higher frequency monitoring (integer). |
lagged differences
diff_shift(1:100, 2)diff_shift(1:100, 2)
Depending on the vector to lag and the maximum knot, distributed_lag will either use an FFT (no NA and large maximum knot), or a parallel method (NA, or small maximum knot).
distributed_lag( x, knots, spline_fun = ns, lag_name = "", n_subset = 1, n_shift = 0 )distributed_lag( x, knots, spline_fun = ns, lag_name = "", n_subset = 1, n_shift = 0 )
x |
numeric vector to lag |
knots |
specific knots for the lagging process |
spline_fun |
spline function to use i.e. splines::ns, splines::bs |
lag_name |
name of variable to lag |
n_subset |
for distributed_lag parallel |
n_shift |
for distributed_lag parallel |
matrix with distributed lag terms
This method calculates the basis for a distributed lag in parallel. It is currently slow.
distributed_lag_parallel(x, bl, lag_max, n_subset = 1L, n_shift = 0L)distributed_lag_parallel(x, bl, lag_max, n_subset = 1L, n_shift = 0L)
x |
matrix value of lag |
bl |
matrix the basis lags |
lag_max |
integer maximum number of lags |
n_subset |
integer subset the data |
n_shift |
integer to shift |
distributed lag basis
energy_density_wang
energy_density_wang(magnitude, distance)energy_density_wang(magnitude, distance)
magnitude |
Richter magnitude |
distance |
epicentral distance |
value of the energy density
energy_density_wang(5, 9)energy_density_wang(5, 9)
Do convolution for a data series and matrix of equal length filters. For long data series this function will be much faster than non-fft methods.
fftw_convolve(x, y, normalize = TRUE, align = "center")fftw_convolve(x, y, normalize = TRUE, align = "center")
x |
|
y |
|
normalize |
|
align |
|
convolution of data series and filter
library(splines) knots <- round(10^seq(0.5, 4.635484, length.out = 9)) n_lags <- 86400 * 2 bla <- ns(0:(n_lags), knots = knots, Boundary.knots = c(0, n_lags) ) bv <- rnorm(86400 * 2 + 3600 * 24 * 4) system.time({ dl <- fftw_convolve(bv, bla, normalize = FALSE) })library(splines) knots <- round(10^seq(0.5, 4.635484, length.out = 9)) n_lags <- 86400 * 2 bla <- ns(0:(n_lags), knots = knots, Boundary.knots = c(0, n_lags) ) bv <- rnorm(86400 * 2 + 3600 * 24 * 4) system.time({ dl <- fftw_convolve(bv, bla, normalize = FALSE) })
find_gaps
find_gaps(x, dep_var = "val", time_var = "datetime", time_interval = 1)find_gaps(x, dep_var = "val", time_var = "datetime", time_interval = 1)
x |
the data set (data.table) |
dep_var |
the dependent variable name (character) |
time_var |
the time variable name (character) |
time_interval |
the delta t (numeric) |
data.table of gaps
find_level_shift
find_level_shift(x, dep_var = "val", time_var = "datetime", time_interval = 1)find_level_shift(x, dep_var = "val", time_var = "datetime", time_interval = 1)
x |
the data set (data.table) |
dep_var |
the dependent variable name (character) |
time_var |
the time variable name (character) |
time_interval |
the delta t (numeric) |
data.table of gaps
fit_gaps
fit_gaps(x, recipe, time_var = "datetime")fit_gaps(x, recipe, time_var = "datetime")
x |
the data set (data.table) |
recipe |
the recipe to apply |
time_var |
the time variable name (character) |
data.table of fit results
formula_from_recipe guess formula from recipe
formula_from_recipe(recipe)formula_from_recipe(recipe)
recipe |
recipe to apply |
formula
frf_predict_terms
frf_predict_terms( frf, x, time_col = "datetime", vars = c("wl", "baro"), coherency_cut = 0.3, limit = 2 )frf_predict_terms( frf, x, time_col = "datetime", vars = c("wl", "baro"), coherency_cut = 0.3, limit = 2 )
frf |
frequency response function |
x |
input variable dataset |
time_col |
datetime |
vars |
variables used in the transfer function call |
coherency_cut |
values must have at least this coherency |
limit |
gain must be lower than this value. Often useful for Earth tides. |
data.table of results
gap_fill
gap_fill( x, gaps, recipe, dep_var = "wl", time_var = "datetime", time_interval = 1L, buffer_start = 86400 * 4, buffer_end = 86400 * 4, max_interp = 86400 * 7, include_level_shift = TRUE )gap_fill( x, gaps, recipe, dep_var = "wl", time_var = "datetime", time_interval = 1L, buffer_start = 86400 * 4, buffer_end = 86400 * 4, max_interp = 86400 * 7, include_level_shift = TRUE )
x |
data.table of water levels |
gaps |
data.table of gaps |
recipe |
recipe to apply |
dep_var |
the time variable name (character) |
time_var |
the time variable name (character) |
time_interval |
the delta t (numeric) |
buffer_start |
how much buffer on each side of gap |
buffer_end |
how much buffer on each side of gap |
max_interp |
largest gap to interpolate |
include_level_shift |
include level shift in adjustment |
data.table of predictions
gap_fill2
gap_fill2(x, y)gap_fill2(x, y)
x |
gap data.table |
y |
interpolated values |
data.table with gap values adjusted for range
get_fit_summary
get_fit_summary(x, recipe, start_reg, end_reg, start_gap, end_gap)get_fit_summary(x, recipe, start_reg, end_reg, start_gap, end_gap)
x |
data.table of water levels |
recipe |
recipe to apply |
start_reg |
start subset time |
end_reg |
end subset time |
start_gap |
start subset time |
end_gap |
end subset time |
data.table summary
get_intercept_stats
get_intercept_stats(x)get_intercept_stats(x)
x |
data.table of level shifts from regression |
data.table of shifts
get_level_shift_coef
get_level_shift_coef(x)get_level_shift_coef(x)
x |
data.table of level shifts from regression |
data.table of shifts
get_shift
get_shift(x, recipe, start, end)get_shift(x, recipe, start, end)
x |
data.table of water levels |
recipe |
recipe to apply |
start |
start subset time |
end |
end subset time |
data.table of predictions
calculate sin and cos curves from POSIXct times (serial)
harmonic(times, freq)harmonic(times, freq)
times |
|
freq |
|
sin and cos curves
This method does konno ohmachi smoothing
konno_ohmachi_parallel(x, f, b = 10L)konno_ohmachi_parallel(x, f, b = 10L)
x |
vector to smooth |
f |
vector of frequencies |
b |
integer even magnitude to smooth |
konno ohmachi smoothed results
This method does konno ohmachi smoothing
konno_ohmachi_serial(x, f, b = 10L)konno_ohmachi_serial(x, f, b = 10L)
x |
|
f |
|
b |
|
vector of konno ohmachi smoothed results
Calculate differences for the dependent and independent variables and remove any NA values. This modifies the data.table in place.
lag_difference( dat, var_name = "wl", lag_space = 1, inverse = FALSE, remove_na = TRUE )lag_difference( dat, var_name = "wl", lag_space = 1, inverse = FALSE, remove_na = TRUE )
dat |
data that has the independent and dependent variables (data.table) |
var_name |
name of the column to lag (character) |
lag_space |
space between difference calculation in number of observations |
inverse |
whether the barometric relationship is inverse (TRUE means that when the barometric pressure goes up the measured water level goes down (vented transducer, depth to water), FALSE means that when the barometric pressure goes up so does the measured pressure (non-vented transducer)) (logical). |
remove_na |
remove NA values (logical) |
data.table with lagged differences
library(data.table) datetime <- seq.POSIXt(as.POSIXct("2016-01-01 12:00:00"), as.POSIXct("2016-01-05 12:00:00"), by='hour' ) baro <- sin(seq(0, 2*pi, length.out = length(datetime))) wl <- 0.4 * baro dat <- data.table(baro, wl, datetime) lag_difference(dat, lag_space = 1, remove_na = FALSE)library(data.table) datetime <- seq.POSIXt(as.POSIXct("2016-01-01 12:00:00"), as.POSIXct("2016-01-05 12:00:00"), by='hour' ) baro <- sin(seq(0, 2*pi, length.out = length(datetime))) wl <- 0.4 * baro dat <- data.table(baro, wl, datetime) lag_difference(dat, lag_space = 1, remove_na = FALSE)
lag data and subset the results
lag_matrix(x, lags, n_subset = 1L, n_shift = 0L, var_name = "lag")lag_matrix(x, lags, n_subset = 1L, n_shift = 0L, var_name = "lag")
x |
|
lags |
|
n_subset |
|
n_shift |
|
var_name |
|
matrix with lagged values
level_shift_to_datetime
level_shift_to_datetime(x)level_shift_to_datetime(x)
x |
character string that includes the date |
datetime
Generate logarithmically spacing for lags. Note: the number of lags will not exactly equal n unless max_time_lag is large or n is very small
log_lags(n, max_time_lag)log_lags(n, max_time_lag)
n |
number of lags (integer) |
max_time_lag |
maximum lag (integer) |
integer vector of lags
log_lags(12, 86401)log_lags(12, 86401)
Pad a smoothing kernel
pad_kernel(k, n)pad_kernel(k, n)
k |
|
n |
|
vector of the padded kernel
As steps are estimated by 'prep', these operations are applied to the training set. Rather than running 'bake' to duplicate this processing, this function will return variables from the processed training set.
portion(object, ...)portion(object, ...)
object |
A 'recipe' object that has been prepared with the option 'retain = TRUE'. |
... |
One or more selector functions to choose which variables will be
returned by the function. See |
When preparing a recipe, if the training data set is retained using 'retain = TRUE', there is no need to 'bake' the recipe to get the preprocessed training set.
'portion' will return the results of a recipes where _all steps_ have been applied to the data, irrespective of the value of the step's 'skip' argument.
[recipe()] [prep.recipe()] [bake.recipe()] [juice.recipe()]
predict_gaps
predict_gaps(x, fits, term_labels = NULL)predict_gaps(x, fits, term_labels = NULL)
x |
the data set (data.table) |
fits |
data.table of fits |
term_labels |
names of the group (character) |
data.table of predictions
predict by term for 'mlm' or 'lm' object
predict_terms(object, ...)predict_terms(object, ...)
object |
model fit from lm object |
... |
currently not used |
return matrix of predictions
Produce an un-normalized psd based on an fft and a vector of optimal sine tapers
resample_fft_parallel( fftz, tapers, verbose = TRUE, dbl = TRUE, tapcap = 10000L )resample_fft_parallel( fftz, tapers, verbose = TRUE, dbl = TRUE, tapcap = 10000L )
fftz |
complex; a matrix representing the dual-length |
tapers |
integer; a vector of tapers |
verbose |
logical; should messages be given? |
dbl |
logical; should the code assume |
tapcap |
integer; the maximum number of tapers which can be applied; note that the length is automatically limited by the length of the series. |
To produce a psd estimate with our adaptive spectrum estimation method, we need only make one
fft calculation initially and then
apply the weighting factors given by parabolic_weights_rcpp, which this
function does.
riedsid
fftz <- complex(real=1:8, imaginary = 1:8) taps <- 1:4 try(resample_fft_rcpp2(fftz, taps))fftz <- complex(real=1:8, imaginary = 1:8) taps <- 1:4 try(resample_fft_rcpp2(fftz, taps))
Produce an un-normalized psd based on an fft and a vector of optimal sine tapers
resample_mvfft(fftz, tapers, verbose = TRUE, dbl = TRUE, tapcap = 10000L)resample_mvfft(fftz, tapers, verbose = TRUE, dbl = TRUE, tapcap = 10000L)
fftz |
complex; a matrix representing the dual-length |
tapers |
integer; a vector of tapers |
verbose |
logical; should messages be given? |
dbl |
logical; should the code assume |
tapcap |
integer; the maximum number of tapers which can be applied; note that the length is automatically limited by the length of the series. |
To produce a psd estimate with our adaptive spectrum estimation method, we need only make one
fft calculation initially and then
apply the weighting factors given by parabolic_weights_field, which this
function does.
riedsid
fftz <- complex(real=1:8, imaginary = 1:8) taps <- 1:4 try(resample_fft_rcpp2(fftz, taps))fftz <- complex(real=1:8, imaginary = 1:8) taps <- 1:4 try(resample_fft_rcpp2(fftz, taps))
response_from_fit
response_from_fit(x, ...)response_from_fit(x, ...)
x |
model fit object, commonly lm |
... |
additional arguments |
tibble of model fit
Create a factor column with the data subseted according to break points.
set_level_shift(x, level_breaks = NULL)set_level_shift(x, level_breaks = NULL)
x |
vector of numeric |
level_breaks |
break points in the data |
vector of type factor
set_level_shift(1:100, c(2, 40, 86))set_level_shift(1:100, c(2, 40, 86))
lag data and subset the results
shift_subset(x, lag = 0L, n_subset = 1L, n_shift = 0L)shift_subset(x, lag = 0L, n_subset = 1L, n_shift = 0L)
x |
|
lag |
|
n_subset |
|
n_shift |
|
vector with lagged values
Calculate the transfer functions for an array in parallel
solve_tf_parallel(a)solve_tf_parallel(a)
a |
|
vector of transfer functions
spec_from_pgram
spec_from_pgram(pgram, method = "spec_pgram", ...)spec_from_pgram(pgram, method = "spec_pgram", ...)
pgram |
the periodogram |
method |
either spec_pgram or spec_welch |
... |
additional arguments |
spectrum from the pgram
calculate the spectra and cross-spectra
spec_pgram( x, spans = NULL, kernel = NULL, taper = 0.1, pad = 0, fast = TRUE, demean = FALSE, detrend = TRUE, plot = TRUE, na.action = na.fail, ... )spec_pgram( x, spans = NULL, kernel = NULL, taper = 0.1, pad = 0, fast = TRUE, demean = FALSE, detrend = TRUE, plot = TRUE, na.action = na.fail, ... )
x |
univariate or multivariate time series. |
spans |
vector of odd integers giving the widths of modified Daniell smoothers to be used to smooth the periodogram. |
kernel |
alternatively, a kernel smoother of class
|
taper |
specifies the proportion of data to taper. A split cosine bell taper is applied to this proportion of the data at the beginning and end of the series. |
pad |
proportion of data to pad. Zeros are added to the end of
the series to increase its length by the proportion |
fast |
logical; if |
demean |
logical. If |
detrend |
logical. If |
plot |
plot the periodogram? |
na.action |
|
... |
graphical arguments passed to |
array of spectra
calculate the spectra and cross-spectra. The scaling of individual series requires a closer look.
spec_welch( x, n_subsets = 10, overlap = 0.5, window = window_hann, demean = TRUE, detrend = TRUE, ... )spec_welch( x, n_subsets = 10, overlap = 0.5, window = window_hann, demean = TRUE, detrend = TRUE, ... )
x |
|
n_subsets |
number of subsets (integer) |
overlap |
amount of overlap 0.0 < overlap < 1.0 (numeric) |
window |
window function to apply (function). The options are window_hann, window_hamming, window_blackman, window_nuttall, window_blackman_nuttall, window_blackman_harris |
demean |
should the mean be removed from x prior to calculation (logical) |
detrend |
should the x be detrended to calculation (logical) |
... |
not used |
array of spectra
spec_welch_tf_error
spec_welch_tf_error(coh, amp, overlap = 0.5, n_subsets = 10, return = "amp")spec_welch_tf_error(coh, amp, overlap = 0.5, n_subsets = 10, return = "amp")
coh |
coherency |
amp |
amplitude |
overlap |
overlap fraction from spec_welch |
n_subsets |
number of subsets in spec welch |
return |
either 'amp', or 'phase' |
error
'step_distributed_lag' creates a *specification* of a recipe step that are distributed lag versions of a particular variable. Uses FFT for fast calculation with a large maximum lag and many observations
step_distributed_lag( recipe, ..., role = "distributed_lag", trained = FALSE, knots = 1, spline_fun = splines::ns, n_subset = 1, n_shift = 0, prefix = "distributed_lag_", default = NA, columns = NULL, skip = FALSE, id = rand_id("distributed_lag") )step_distributed_lag( recipe, ..., role = "distributed_lag", trained = FALSE, knots = 1, spline_fun = splines::ns, n_subset = 1, n_shift = 0, prefix = "distributed_lag_", default = NA, columns = NULL, skip = FALSE, id = rand_id("distributed_lag") )
recipe |
A recipe object. The step will be added to the sequence of operations for this recipe. |
... |
One or more selector functions to choose which variables are affected by the step. See [selections()] for more details. For the 'tidy' method, these are not currently used. |
role |
Defaults to "distributed_lag" |
trained |
A logical to indicate if the quantities for preprocessing have been estimated. |
knots |
specific knots for the lagging process |
spline_fun |
spline function to use i.e. splines::ns, splines::bs |
n_subset |
for distributed_lag parallel |
n_shift |
for distributed_lag parallel |
prefix |
A prefix for generated column names, default to "distributed_lag_". |
default |
Passed to |
columns |
A character string of the selected variable names. This field
is a placeholder and will be populated once |
skip |
A logical. Should the step be skipped when the recipe is baked by
|
id |
A character string that is unique to this step to identify it. |
'step_distributed_lag' calculates the earthtide response and then lags (leads) the terms.
An updated version of 'recipe' with the new step added to the sequence of existing steps (if any). For the 'tidy' method, a tibble with columns 'terms' which is the columns that will be affected and 'holiday'.
[step_lag_matrix()] [recipe()] [prep.recipe()] [bake.recipe()]
data(transducer) rec <- recipe(wl ~ ., data = transducer[1:1000, list(datetime, wl, baro)] ) with_et <- rec %>% step_distributed_lag(baro, knots = c(0, 10, 100)) %>% step_naomit(everything()) %>% prep() %>% juice()data(transducer) rec <- recipe(wl ~ ., data = transducer[1:1000, list(datetime, wl, baro)] ) with_et <- rec %>% step_distributed_lag(baro, knots = c(0, 10, 100)) %>% step_naomit(everything()) %>% prep() %>% juice()
'step_earthtide' creates a *specification* of a recipe step that are the Earth tide harmonics for a particular location. Requires the *earthtide* package
step_earthtide( recipe, ..., role = "earthtide", trained = FALSE, method = "gravity", astro_update = 1L, latitude = 0, longitude = 0, elevation = 0, azimuth = 0, gravity = 0, earth_radius = 6378136.3, earth_eccen = 0.0066943979514, cutoff = 1e-06, wave_groups = NULL, catalog = "ksm04", eop = NULL, scale = TRUE, prefix = "earthtide_", default = NA, columns = NULL, skip = FALSE, id = rand_id("earthtide") )step_earthtide( recipe, ..., role = "earthtide", trained = FALSE, method = "gravity", astro_update = 1L, latitude = 0, longitude = 0, elevation = 0, azimuth = 0, gravity = 0, earth_radius = 6378136.3, earth_eccen = 0.0066943979514, cutoff = 1e-06, wave_groups = NULL, catalog = "ksm04", eop = NULL, scale = TRUE, prefix = "earthtide_", default = NA, columns = NULL, skip = FALSE, id = rand_id("earthtide") )
recipe |
A recipe object. The step will be added to the sequence of operations for this recipe. |
... |
One or more selector functions to choose which variables are affected by the step. See [selections()] for more details. For the 'tidy' method, these are not currently used. |
role |
Defaults to "earthtide" |
trained |
A logical to indicate if the quantities for preprocessing have been estimated. |
method |
One or more of "gravity", "tidal_potential", "tidal_tilt", "vertical_displacement", "horizontal_displacement", "n_s_displacement", "e_w_displacement", "vertical_strain", "areal_strain", "volume_strain", "horizontal_strain", or "ocean_tides", "pole_tide", "lod_tide". The pole tide and lod_tide are used in predict mode even if do_predict is FALSE. More than one value can only be used if do_predict == TRUE. |
astro_update |
How often to update astro parameters in number of samples. This speeds up code but may make it slightly less accurate. |
latitude |
The station latitude (numeric) defaults to 0. |
longitude |
The station longitude (numeric) defaults to 0. |
elevation |
The station elevation (m) (numeric) defaults to 0. |
azimuth |
Earth azimuth (numeric) defaults to 0. |
gravity |
Gravity at the station (m/s^2) (numeric) 0 to estimate gravity from elevation and latitude. |
earth_radius |
Radius of earth (m) (numeric) defaults to 6378136.3 |
earth_eccen |
Eccentricity of earth (numeric) defaults to 6.69439795140e-3 |
cutoff |
Cutoff amplitude for constituents (numeric) defaults to 1e-6. |
wave_groups |
Two column data.frame having start and end of frequency groups (data.frame). This data.frame must have two columns with the names 'start', and 'end' signifying the start and end of the wave groupings. An optional third column 'multiplier' can be provided to scale the particular wave group. If column names do no match, the inferred column positions are start, end, multiplier. |
catalog |
Use the "hw95s" catalog or "ksm04" catalog (character). |
eop |
User defined Earth Orientation Parameter (EOP) data.frame with the following columns: datetime, ddt, ut1_utc, lod, x, y, dx, dy |
scale |
Scale results when do_predict is FALSE |
prefix |
A prefix for generated column names, default to "earthtide_". |
default |
Passed to |
columns |
A character string of the selected variable names. This field
is a placeholder and will be populated once |
skip |
A logical. Should the step be skipped when the recipe is baked by
|
id |
A character string that is unique to this step to identify it. |
'step_earthtide' calculates the Earth tide harmonics
An updated version of 'recipe' with the new step added to the sequence of existing steps (if any). For the 'tidy' method, a tibble with columns 'terms' which is the columns that will be affected and 'holiday'.
[step_earthtide()] [recipe()] [prep.recipe()] [bake.recipe()]
library(earthtide) data(transducer) data(eterna_wavegroups) rec <- recipe(wl ~ ., data = transducer[1:1000, list(datetime, wl)]) wg <- na.omit(eterna_wavegroups[eterna_wavegroups$time == '1 month',]) with_et <- rec %>% step_earthtide(datetime, latitude = 34, longitude = -118.5, wave_groups = wg) %>% prep() %>% juice()library(earthtide) data(transducer) data(eterna_wavegroups) rec <- recipe(wl ~ ., data = transducer[1:1000, list(datetime, wl)]) wg <- na.omit(eterna_wavegroups[eterna_wavegroups$time == '1 month',]) with_et <- rec %>% step_earthtide(datetime, latitude = 34, longitude = -118.5, wave_groups = wg) %>% prep() %>% juice()
'step_harmonic2' creates a *specification* of a recipe step for harmonics (sin and cos terms).
step_harmonic2( recipe, ..., role = "harmonic", trained = FALSE, freq = 1, prefix = "harmonic_", default = NA, columns = NULL, skip = FALSE, id = rand_id("harmonic") )step_harmonic2( recipe, ..., role = "harmonic", trained = FALSE, freq = 1, prefix = "harmonic_", default = NA, columns = NULL, skip = FALSE, id = rand_id("harmonic") )
recipe |
A recipe object. The step will be added to the sequence of operations for this recipe. |
... |
One or more selector functions to choose which variables are affected by the step. See [selections()] for more details. For the 'tidy' method, these are not currently used. |
role |
Defaults to "harmonic" |
trained |
A logical to indicate if the quantities for preprocessing have been estimated. |
freq |
|
prefix |
A prefix for generated column names, default to "harmonic_". |
default |
Passed to |
columns |
A character string of the selected variable names. This field
is a placeholder and will be populated once |
skip |
A logical. Should the step be skipped when the recipe is baked by
|
id |
A character string that is unique to this step to identify it. |
'step_harmonic2' calculates the Earth tide harmonics
An updated version of 'recipe' with the new step added to the sequence of existing steps (if any). For the 'tidy' method, a tibble with columns 'terms' which is the columns that will be affected and 'holiday'.
[step_earthtide()] [recipe()] [prep.recipe()] [bake.recipe()]
data(transducer) rec <- recipe(wl ~ ., data = transducer[1:1000, list(datetime, wl)]) with_et <- rec %>% step_harmonic2(datetime, freq = c(1, 1.93, 2)) %>% prep() %>% juice()data(transducer) rec <- recipe(wl ~ ., data = transducer[1:1000, list(datetime, wl)]) with_et <- rec %>% step_harmonic2(datetime, freq = c(1, 1.93, 2)) %>% prep() %>% juice()
'step_lag_earthtide' creates a *specification* of a recipe step that are lagged versions of the Earth tide response at a particular location. Requires the *earthtide* package
step_lag_earthtide( recipe, ..., role = "lag_earthtide", trained = FALSE, method = "gravity", lag = 0, astro_update = 1L, latitude = 0, longitude = 0, elevation = 0, azimuth = 0, gravity = 0, earth_radius = 6378136.3, earth_eccen = 0.0066943979514, cutoff = 1e-06, wave_groups = NULL, catalog = "ksm04", eop = NULL, prefix = "lag_earthtide_", default = NA, columns = NULL, skip = FALSE, id = rand_id("lag_earthtide") )step_lag_earthtide( recipe, ..., role = "lag_earthtide", trained = FALSE, method = "gravity", lag = 0, astro_update = 1L, latitude = 0, longitude = 0, elevation = 0, azimuth = 0, gravity = 0, earth_radius = 6378136.3, earth_eccen = 0.0066943979514, cutoff = 1e-06, wave_groups = NULL, catalog = "ksm04", eop = NULL, prefix = "lag_earthtide_", default = NA, columns = NULL, skip = FALSE, id = rand_id("lag_earthtide") )
recipe |
A recipe object. The step will be added to the sequence of operations for this recipe. |
... |
One or more selector functions to choose which variables are affected by the step. See [selections()] for more details. For the 'tidy' method, these are not currently used. |
role |
Defaults to "lag_earthtide" |
trained |
A logical to indicate if the quantities for preprocessing have been estimated. |
method |
One or more of "gravity", "tidal_potential", "tidal_tilt", "vertical_displacement", "horizontal_displacement", "n_s_displacement", "e_w_displacement", "vertical_strain", "areal_strain", "volume_strain", "horizontal_strain", or "ocean_tides", "pole_tide", "lod_tide". The pole tide and lod_tide are used in predict mode even if do_predict is FALSE. More than one value can only be used if do_predict == TRUE. |
lag |
A vector of positive integers. Each specified column will be lagged for each value in the vector. |
astro_update |
How often to update astro parameters in number of samples. This speeds up code but may make it slightly less accurate. |
latitude |
The station latitude (numeric) defaults to 0. |
longitude |
The station longitude (numeric) defaults to 0. |
elevation |
The station elevation (m) (numeric) defaults to 0. |
azimuth |
Earth azimuth (numeric) defaults to 0. |
gravity |
Gravity at the station (m/s^2) (numeric) 0 to estimate gravity from elevation and latitude. |
earth_radius |
Radius of earth (m) (numeric) defaults to 6378136.3 |
earth_eccen |
Eccentricity of earth (numeric) defaults to 6.69439795140e-3 |
cutoff |
Cutoff amplitude for constituents (numeric) defaults to 1e-6. |
wave_groups |
Two column data.frame having start and end of frequency groups (data.frame). This data.frame must have two columns with the names 'start', and 'end' signifying the start and end of the wave groupings. An optional third column 'multiplier' can be provided to scale the particular wave group. If column names do no match, the inferred column positions are start, end, multiplier. |
catalog |
Use the "hw95s" catalog or "ksm04" catalog (character). |
eop |
User defined Earth Orientation Parameter (EOP) data.frame with the following columns: datetime, ddt, ut1_utc, lod, x, y, dx, dy |
prefix |
A prefix for generated column names, default to "lag_earthtide_". |
default |
Passed to |
columns |
A character string of the selected variable names. This field
is a placeholder and will be populated once |
skip |
A logical. Should the step be skipped when the recipe is baked by
|
id |
A character string that is unique to this step to identify it. |
'step_lag_earthtide' calculates the earthtide response and then lags (leads) the terms.
An updated version of 'recipe' with the new step added to the sequence of existing steps (if any). For the 'tidy' method, a tibble with columns 'terms' which is the columns that will be affected and 'holiday'.
[step_lag_earthtide()] [recipe()] [prep.recipe()] [bake.recipe()]
data(transducer) rec <- recipe(wl ~ ., data = transducer[1:1000, list(datetime, wl)]) with_et <- rec %>% step_lag_earthtide(datetime, latitude = 34, longitude = -118.5, lag = -1:1) %>% prep() %>% juice()data(transducer) rec <- recipe(wl ~ ., data = transducer[1:1000, list(datetime, wl)]) with_et <- rec %>% step_lag_earthtide(datetime, latitude = 34, longitude = -118.5, lag = -1:1) %>% prep() %>% juice()
'step_lag_matrix' creates a *specification* of a recipe step that will add new columns of lagged data. Lagged data will by default include NA values where the lag was induced. These can be removed with [step_naomit()], or you may specify an alternative filler value with the 'default' argument. This method is faster than [step_lag()] and allows for negative values.
step_lag_matrix( recipe, ..., role = "lag_matrix", trained = FALSE, lag = 1, n_subset = 1, n_shift = 0, prefix = "lag_matrix_", default = NA, columns = NULL, skip = FALSE, id = rand_id("lag_matrix") )step_lag_matrix( recipe, ..., role = "lag_matrix", trained = FALSE, lag = 1, n_subset = 1, n_shift = 0, prefix = "lag_matrix_", default = NA, columns = NULL, skip = FALSE, id = rand_id("lag_matrix") )
recipe |
A recipe object. The step will be added to the sequence of operations for this recipe. |
... |
One or more selector functions to choose which variables are affected by the step. See [selections()] for more details. |
role |
Defaults to "predictor" |
trained |
A logical to indicate if the quantities for preprocessing have been estimated. |
lag |
A vector of integers. They can be positive, negative or zero. Each specified column will be lagged for each value in the vector. |
n_subset |
subset every n_subset values |
n_shift |
shift the data n_shift values |
prefix |
A prefix for generated column names, default to "lag_". |
default |
Passed to |
columns |
A character string of variable names that will be populated (eventually) by the 'terms' argument. |
skip |
A logical. Should the step be skipped when the recipe is baked by [bake.recipe()]? While all operations are baked when [prep.recipe()] is run, some operations may not be able to be conducted on new data (e.g. processing the outcome variable(s)). Care should be taken when using 'skip = TRUE' as it may affect the computations for subsequent operations |
id |
A character string that is unique to this step to identify it. |
The step assumes that the data are already _in the proper sequential order_ for lagging.
An updated version of 'recipe' with the new step added to the sequence of existing steps (if any).
[recipe()] [step_lag()] [prep.recipe()] [bake.recipe()] [step_naomit()]
data(transducer) rec <- recipe(wl ~ ., data = transducer[1:1000, list(datetime, wl, baro)]) with_et <- rec %>% step_lag_matrix(baro, lag = -1:1) %>% step_naomit(everything()) %>% prep() %>% juice()data(transducer) rec <- recipe(wl ~ ., data = transducer[1:1000, list(datetime, wl, baro)]) with_et <- rec %>% step_lag_matrix(baro, lag = -1:1) %>% step_naomit(everything()) %>% prep() %>% juice()
stretch_interp
stretch_interp(start_val = NA, end_val = NA, values)stretch_interp(start_val = NA, end_val = NA, values)
start_val |
starting value to match |
end_val |
end value to match |
values |
set of values |
shifted vector of values
summarize_coef
summarize_coef(fit)summarize_coef(fit)
fit |
model fit from lm object |
data.table with coefficient results
summarize_lm
summarize_lm(fit, ...)summarize_lm(fit, ...)
fit |
model fit from lm object |
... |
additional arguments |
data.table with fit results
This function is used for testing purposes
synthetic( sd_noise = 2e-04, sd_noise_trend = 3e-05, linear_trend = 1e-07, n = 14 * 86400, seed = NULL, scale = 0.5, baro_kernel = NULL )synthetic( sd_noise = 2e-04, sd_noise_trend = 3e-05, linear_trend = 1e-07, n = 14 * 86400, seed = NULL, scale = 0.5, baro_kernel = NULL )
sd_noise |
standard deviation of random noise to add (numeric) |
sd_noise_trend |
standard deviation of noise to add to generate a trend (numeric) |
linear_trend |
magnitude of linear trend in time (numeric) |
n |
length of time series in seconds (integer) |
seed |
random number seed for reproducibility (numeric) |
scale |
multiplier for barometric pressure (numeric) |
baro_kernel |
vector values to convolve with barometric pressure |
data.table of synthetic water levels and barometric pressure
This function is used for testing purposes
synthetic_wl( baro, datetime, sd_noise = 0, linear_trend = 0, intercept = 0, seed = NULL, kernel = NULL )synthetic_wl( baro, datetime, sd_noise = 0, linear_trend = 0, intercept = 0, seed = NULL, kernel = NULL )
baro |
barometric pressure (numeric vector) |
datetime |
POSIXct dates |
sd_noise |
standard deviation of random noise to add (numeric) |
linear_trend |
magnitude of linear trend in time (numeric) |
intercept |
|
seed |
random number seed for reproducibility (numeric) |
kernel |
data.table of synthetic water levels and barometric pressure
This data.frame contains the water levels, barometric pressure, and synthtetic earthtides from 2016-08-25 to 2016-10-15 (subsampled to every 2 minutes).
transducertransducer
A data.frame The columns are:
datetimePOSIXct date and time
wltransducer pressure of water column(dbar)
barobarometric pressure (dbar)
etsynthetic gravity
utils::data(transducer)utils::data(transducer)
transfer_fun
transfer_fun(dat, vars, time = "datetime", method = "spec_pgram", ...)transfer_fun(dat, vars, time = "datetime", method = "spec_pgram", ...)
dat |
data that has the independent and dependent variables (data.table) |
vars |
variables to include for transfer function calculation |
time |
the name of the column that contains the POSIXct date and time |
method |
either spec_pgram, spec_welch, or spec_multitaper |
... |
arguments to pass to method |
transfer function, gain and phase, coherency
Blackman window for FFT
window_blackman(n)window_blackman(n)
n |
length of signal (integer) |
window
window_blackman(100)window_blackman(100)
Blackman-Harris window for FFT
window_blackman_harris(n)window_blackman_harris(n)
n |
length of signal (integer) |
window
window_blackman_harris(100)window_blackman_harris(100)
Blackman-Nuttall window for FFT
window_blackman_nuttall(n)window_blackman_nuttall(n)
n |
length of signal (integer) |
window
window_blackman_nuttall(100)window_blackman_nuttall(100)
First derivative window for FFT
window_first_deriv(n, a0, a1, a2, a3)window_first_deriv(n, a0, a1, a2, a3)
n |
length of signal (integer) |
a0 |
|
a1 |
|
a2 |
|
a3 |
|
window
# nuttall window window_first_deriv(100, 0.355768, 0.487396, 0.144232, 0.012604)# nuttall window window_first_deriv(100, 0.355768, 0.487396, 0.144232, 0.012604)
Gaussian window for FFT https://en.wikipedia.org/wiki/Window_function
window_gaussian(n, sigma)window_gaussian(n, sigma)
n |
length of signal (integer) |
sigma |
the standard deviation in periods (numeric) |
window
window_gaussian(100, 0.3)window_gaussian(100, 0.3)
Hamming window for FFT
window_hamming(n)window_hamming(n)
n |
length of signal (integer) |
window
window_hamming(100)window_hamming(100)
Hann window for FFT
window_hann(n)window_hann(n)
n |
length of signal (integer) |
window
window_hann(100)window_hann(100)
Nuttall window for FFT
window_nuttall(n)window_nuttall(n)
n |
length of signal (integer) |
window
window_nuttall(100)window_nuttall(100)
Rectangular window for FFT
window_rectangular(n)window_rectangular(n)
n |
length of signal (integer) |
window
window_rectangular(100)window_rectangular(100)
This data.frame contains the water levels, barometric pressure, and earthtides for wipp30. This is the dataset included in BETCO.
wipp30wipp30
A data.frame The columns are:
timetime in hours
wltransducer pressure of water column
barobarometric pressure
etsynthetic gravity
utils::data(wipp30)utils::data(wipp30)