I don’t mind for loops in R—they’re easy to write, easy to read, and can be very fast (e.g. if the output vector or matrix is pre-allocated). But they aren’t very “elegant” (no pipes?!?!) and they require a lot of keystrokes. So I recently decided to look for a go-to alternative.

The *apply* family is great for replacing for loops, but you often end up with a bunch of *lapply*’s, *sapply*’s, *rbind*’s, and so on, which gets ugly pretty fast.

The *pmap* function in **purrr** (Henry and Wickham 2017) is almost exactly what I was looking for. You specify a function and a “list of lists” of parameter values to loop over, and it does it. But I wanted slightly different functionality, namely:

- Loop over all
*combinations*of input values (as opposed to 1st set, 2nd set, …) - Run multiple trials for each combination
- Avoid defining lists in the function call.

So I wrote a function called *iterate* and added it to my **dvmisc** package (Van Domelen 2018). You can install **dvmisc** from GitHub by loading **devtools** and running `install_github("vandomed/dvmisc")`

.

Suppose you have a function called *f.reg* that fits a (possibly log-transformed) linear regression relating user-specified \(Y\) and \(X\) variables in a particular dataset (default **mtcars**):

```
f.reg <- function(data = mtcars, y, x, log = FALSE) {
if (log) {
formula <- paste("log(", y, ") ~ ", x, sep = "")
} else {
formula <- paste(y, " ~ ", x, sep = "")
}
summary.fit <- summary(lm(formula, data = data))
return(list(r2 = summary.fit$r.squared, p = summary.fit$coef[2, 4]))
}
```

And suppose you want to regress gas mileage (`mpg`

) on 3 separate predictors (`wt`

= weight, `hp`

= horsepower, `qsec`

= 1/4 mile time), with and without log-transforming `mpg`

in each case. To fit these 6 models via *iterate*, you can run:

```
library("dvmisc")
iterate(f.reg, y = "mpg", x = c("wt", "hp", "qsec"), log = c(TRUE, FALSE))
```

```
## y x log r2 p
## 1: mpg wt TRUE 0.7976 6.31e-12
## 2: mpg wt FALSE 0.7528 1.294e-10
## 3: mpg hp TRUE 0.6233 7.853e-08
## 4: mpg hp FALSE 0.6024 1.788e-07
## 5: mpg qsec TRUE 0.1728 0.01796
## 6: mpg qsec FALSE 0.1753 0.01708
```

Or with pipes:

`f.reg %>% iterate(y = "mpg", x = c("wt", "hp", "qsec"), log = c(TRUE, FALSE))`

```
## y x log r2 p
## 1: mpg wt TRUE 0.7976 6.31e-12
## 2: mpg wt FALSE 0.7528 1.294e-10
## 3: mpg hp TRUE 0.6233 7.853e-08
## 4: mpg hp FALSE 0.6024 1.788e-07
## 5: mpg qsec TRUE 0.1728 0.01796
## 6: mpg qsec FALSE 0.1753 0.01708
```

You could get the same thing with *pmap*, but it takes a little more work, and this particular solution doesn’t make it obvious what each set of inputs is:

```
library("purrr")
do.call(rbind, pmap(.l = list(y = rep("mpg", 6),
x = rep(c("wt", "hp", "qsec"), each = 2),
log = rep(c(TRUE, FALSE), 3)),
.f = f.reg))
```

```
## r2 p
## [1,] 0.7976 6.31e-12
## [2,] 0.7528 1.294e-10
## [3,] 0.6233 7.853e-08
## [4,] 0.6024 1.788e-07
## [5,] 0.1728 0.01796
## [6,] 0.1753 0.01708
```

You can also pass non-scalar inputs that you want *iterate* to treat as fixed through the `fix`

argument. For example, to fit `Sepal.Length`

vs. `Sepal.Width`

and `Petal.Length`

in the `iris`

dataset:

```
f.reg %>% iterate(y = "Sepal.Length", x = c("Sepal.Width", "Petal.Length"),
fix = list(data = iris))
```

```
## y x r2 p
## 1: Sepal.Length Sepal.Width 0.01382 0.1519
## 2: Sepal.Length Petal.Length 0.76 1.039e-47
```

For simulations, you typically want to run multiple trials of several different scenarios. You can do that with *iterate* pretty easily. Suppose you want to calculate empirical power of the one-sample t-test in various scenarios. Your “one-trial” function might be something like:

```
f.power <- function(n = 50, mu = 0, sigsq = 1, mu0 = 0, alpha = 0.05) {
x <- rnorm(n = n, mean = mu, sd = sqrt(sigsq))
fit <- t.test(x, mu = mu0)
return(list(t = fit$statistic, p = fit$p.value))
}
```

To run 2 trials for three different true means and two different sample sizes:

`f.power %>% iterate(mu = c(0.1, 0.25, 0.5), n = c(25, 50), trials = 2)`

```
## mu n t p
## 1: 0.10 25 -1.025 0.3158
## 2: 0.10 25 0.5205 0.6075
## 3: 0.10 50 1.314 0.1951
## 4: 0.10 50 0.7288 0.4696
## 5: 0.25 25 0.6227 0.5394
## 6: 0.25 25 3.049 0.005525
## 7: 0.25 50 1.716 0.09254
## 8: 0.25 50 1.427 0.1601
## 9: 0.50 25 0.3716 0.7134
## 10: 0.50 25 2.414 0.02378
## 11: 0.50 50 3.22 0.00228
## 12: 0.50 50 4.774 1.673e-05
```

Ramping it up to 1,000 trials each, and actually calculate the empirical power afterwards:

```
f.power %>% iterate(mu = c(0.1, 0.25, 0.5), n = c(25, 50), trials = 1000) %>%
dplyr::group_by(mu, n) %>%
dplyr::summarise(empirical.power = mean(p < 0.05))
```

```
## # A tibble: 6 x 3
## # Groups: mu [?]
## mu n empirical.power
## <dbl> <dbl> <dbl>
## 1 0.100 25.0 0.0700
## 2 0.100 50.0 0.111
## 3 0.250 25.0 0.225
## 4 0.250 50.0 0.390
## 5 0.500 25.0 0.660
## 6 0.500 50.0 0.929
```

Miraculously, we find that power increases with sample size and effect size. *Biometrics* here we come!

Henry, Lionel, and Hadley Wickham. 2017. *Purrr: Functional Programming Tools*. https://CRAN.R-project.org/package=purrr.

Van Domelen, Dane R. 2018. *Dvmisc: Convenience Functions, Moving Window Statistics, and Graphics*.