Convenience Functions, Moving Window Statistics, and Graphics

Dane Van Domelen
vandomed@gmail.com 2020-04-11

Introduction

This package contains miscellaneous functions that I think are useful for various purposes, e.g. for:

Running and summarizing statistical simulation studies (sumsim, iterate)
Visualizing data (histo, cart_app)
Calculating moving/sliding statistics (sliding_cov, sliding_cor, moving_mean)
Doing something convenient (bmi3, cleancut ral)

In this README, I’ll showcase a few functions.

sumsim

This function creates tables summarizing results of statistical simulations, providing common metrics of performance like mean bias, standard deviation, mean standard error, mean squared error, and confidence interval coverage.

To illustrate, suppose (X_1, ..., X_n \sim N(\mu, \sigma^2)), and we wish to compare two estimators for (\sigma^2): the MLE ((n) in denominator) vs. the sample variance ((n-1) in denominator).

MLE <- c()
s2 <- c()
for (ii in 1: 1000) {
   x <- rnorm(n = 25)
   MLE[ii] <- sum((x - mean(x))^2) / 25
   s2[ii] <- sum((x - mean(x))^2) / 24
 }
kable(sumsim(estimates = cbind(MLE, s2), truth = 1))

	Mean bias	SD	MSE
MLE	-0.036	0.275	0.077
s2	0.004	0.286	0.082

You can request different performance metrics through the statistics input; some of them, like confidence interval coverage, require specifying ses with standard errors.

histo

This function is similar to the base R function hist, but with two added features:

Can overlay one or more fitted probability density/mass functions (PDFs/PMFs) for any univariate distribution supported in R (see ?Distributions).
Can generate more of a barplot type histogram, where each possible value gets its own bin centered over its value (useful for discrete variables with not too many possible values).

Here are two examples:

# Histogram for 1,000 values from Bin(8, 0.25)
x <- rbinom(n = 1000, size = 5, prob = 0.25)
histo(x, dis = "binom", size = 5, colors = "blue", points_list = list(type = "b"))

# Histogram for 10,000 values from lognormal(0, 0.35) and various fitted PDFs.
x <- rlnorm(n = 10000, meanlog = 0, sdlog = 0.35)
histo(x, c("lnorm", "norm", "gamma"), main = "X ~ Lognormal(0, 0.35)")

moving_mean

The function moving_mean is one of dozens of moving average functions available in R. I’m not sure it’s the absolute fastest, but it is much faster than roll_mean in RcppRoll.

library("RcppRoll")
lengths <- c(10, 100, 1000, 10000)
multiples1 <- multiples2 <- c()
for (ii in 1: 4) {
  n <- lengths[ii]
  x <- rnorm(n)
  medians <- summary(microbenchmark(roll_mean(x, 5), moving_mean(x, 5),
                                    roll_mean(x, n / 5), moving_mean(x, n / 5),
                                    times = 50))$median
  multiples1[ii] <- medians[1] / medians[2]
  multiples2[ii] <- medians[3] / medians[4]
}
par(mfrow = c(1, 2))
plot(1: 4, multiples1, type = "b", col = "blue", main = "5-unit MA", 
     ylab = "Speed multiple", xlab = "Vector length", xaxt = "n", 
     ylim = c(0, max(multiples1) * 1.05))
axis(side = 1, at = 1: 4, labels = lengths)
abline(h = 1)

plot(1: 4, multiples2, type = "b", col = "blue", main = "length(x)/5-unit MA", 
     ylab = "Speed multiple", xlab = "Vector length", xaxt = "n", 
     ylim = c(0, max(multiples2) * 1.05))
axis(side = 1, at = 1: 4, labels = lengths)
abline(h = 1)

cleancut

Whenever I try to use cut to categorize a continuous variable, I find myself taking a suboptimal approach: (1) Call cut without specifying labels, and with arguments I think will create the groups I want (\Rightarrow) (2) Run table to see if it worked (\Rightarrow) (3) Return to (1) if necessary (Rightarrow) (4) Call cut once again with labels specified.

The idea of cleancut is to provide a simple character string-based alternative. To illustrate, here’s how you break a continuous variable into “low” (< -1), “medium” (-1 to 1, inclusive), and “high” (> 1). I’ll do it two ways, once without and once with labels:

x <- rnorm(100)
y.nolabels <- cleancut(x, "(-Inf, -1), [-1, 1], [1, Inf)")
y.labels <- cleancut(x, "(-Inf, -1), [-1, 1], [1, Inf)", labels = c("low", "medium", "high"))
table(y.nolabels, y.labels)

	low	medium	high
(-Inf, -1)	20	0	0
[-1, 1]	0	64	0
[1, Inf)	0	0	16

References

Eddelbuettel, Dirk. 2013. Seamless R and C++ Integration with Rcpp. New York: Springer. https://doi.org/10.1007/978-1-4614-6868-4.

Eddelbuettel, Dirk, and James Joseph Balamuta. 2017. “Extending extitR with extitC++: A Brief Introduction to extitRcpp.” PeerJ Preprints 5 (August): e3188v1. https://doi.org/10.7287/peerj.preprints.3188v1.

Eddelbuettel, Dirk, and Romain François. 2011. “Rcpp: Seamless R and C++ Integration.” Journal of Statistical Software 40 (8): 1–18. https://doi.org/10.18637/jss.v040.i08.

Ushey, Kevin. 2015. RcppRoll: Efficient Rolling / Windowed Operations. https://CRAN.R-project.org/package=RcppRoll.

Xie, Yihui. 2017. Printr: Automatically Print R Objects to Appropriate Formats According to the ’Knitr’ Output Format. https://CRAN.R-project.org/package=printr.

Name		Name	Last commit message	Last commit date
Latest commit History 106 Commits
R		R
man		man
src		src
vignettes		vignettes
.Rbuildignore		.Rbuildignore
.gitignore		.gitignore
.travis.yml		.travis.yml
DESCRIPTION		DESCRIPTION
NAMESPACE		NAMESPACE
README-unnamed-chunk-2-1.png		README-unnamed-chunk-2-1.png
README-unnamed-chunk-2-2.png		README-unnamed-chunk-2-2.png
README-unnamed-chunk-3-1.png		README-unnamed-chunk-3-1.png
README-unnamed-chunk-4-1.png		README-unnamed-chunk-4-1.png
README-unnamed-chunk-5-1.png		README-unnamed-chunk-5-1.png
README-unnamed-chunk-5-2.png		README-unnamed-chunk-5-2.png
README.Rmd		README.Rmd
README.md		README.md

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Repository files navigation

Convenience Functions, Moving Window Statistics, and Graphics

Introduction

sumsim

histo

moving_mean

cleancut

References

About

Releases

Packages

Languages

vandomed/dvmisc

Folders and files

Latest commit

History

Repository files navigation

Convenience Functions, Moving Window Statistics, and Graphics

Introduction

sumsim

histo

moving_mean

cleancut

References

About

Topics

Resources

Stars

Watchers

Forks

Releases

Packages 0

Languages

Packages