Sample Size Determination for Continuous Endpoint
For a two-sided one-sample test, the null and alternative hypotheses are \[ \begin{array}{l} H_{0}: \mu=\mu_{0} \\ H_{1}: \mu \neq \mu_{0} \end{array} \] The formulas for sample size and power are, respectively: \[n=\left(\sigma \frac{z_{1-\alpha / 2}+z_{1-\beta}}{\mu-\mu_{0}}\right)^{2}\] \[ 1-\beta=\Phi\left(z-z_{1-\alpha / 2}\right)+\Phi\left(-z-z_{1-\alpha / 2}\right) \quad, \quad z=\frac{\mu-\mu_{0}}{\sigma / \sqrt{n}} \]
mu    <- 2      # true mean
mu0   <- 1.5    # mean under H0
sd    <- 1      # standard deviation
alpha <- 0.05   # two-sided type I error rate
beta  <- 0.20   # type II error rate
(n <- (sd*(qnorm(1-alpha/2)+qnorm(1-beta))/(mu-mu0))^2)
ceiling(n)      # round up to the next whole subject
z <- (mu-mu0)/(sd/sqrt(n))
(Power <- pnorm(z-qnorm(1-alpha/2))+pnorm(-z-qnorm(1-alpha/2)))
For a one-sided (superiority or non-inferiority) one-sample test with margin \(\delta\), the hypotheses are \[ \begin{array}{l} H_{0}: \mu-\mu_{0} \leq \delta \\ H_{1}: \mu-\mu_{0}>\delta \end{array} \] Sample Size \[n=\left(\sigma \frac{z_{1-\alpha}+z_{1-\beta}}{\mu-\mu_{0}-\delta}\right)^{2}\] Power \[1-\beta=\Phi\left(z-z_{1-\alpha}\right)+\Phi\left(-z-z_{1-\alpha}\right) \quad, \quad z=\frac{\mu-\mu_{0}-\delta}{\sigma / \sqrt{n}}\]
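As a quick numerical cross-check of the one-sided formulas above, here is a minimal Python sketch using only the standard library. The input values (true mean 2, null mean 1.5, margin −0.5, i.e. a non-inferiority setting) are assumed for illustration only, not taken from the text:

```python
from math import ceil, sqrt
from statistics import NormalDist

norm = NormalDist()  # standard normal: cdf() is Phi, inv_cdf() gives z-quantiles

# assumed illustration values (not from the text)
mu, mu0 = 2.0, 1.5          # true mean and null mean
delta   = -0.5              # negative margin -> non-inferiority
sd      = 1.0
alpha, beta = 0.05, 0.20

za, zb = norm.inv_cdf(1 - alpha), norm.inv_cdf(1 - beta)
n = (sd * (za + zb) / (mu - mu0 - delta)) ** 2
N = ceil(n)                 # round up to the next whole subject

# power achieved at the rounded-up sample size
z = (mu - mu0 - delta) / (sd / sqrt(N))
power = norm.cdf(z - za) + norm.cdf(-z - za)
print(N, round(power, 4))   # 7 subjects, power slightly above the 80% target
```

Because the continuity-free normal formulas are symmetric in how \(n\) enters, re-evaluating power at the rounded-up \(n\) always returns at least the target \(1-\beta\).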
For a one-sample equivalence test with margin \(\delta\), the hypotheses are \[ \begin{array}{l} H_{0}:\left|\mu-\mu_{0}\right| \geq \delta \\ H_{1}:\left|\mu-\mu_{0}\right|<\delta \end{array} \] Sample Size \[n=\left(\sigma \frac{z_{1-\alpha}+z_{1-\beta / 2}}{\delta-\left|\mu-\mu_{0}\right|}\right)^{2}\] Power \[1-\beta=2\left[\Phi\left(z-z_{1-\alpha}\right)+\Phi\left(-z-z_{1-\alpha}\right)\right]-1 \quad, \quad z=\frac{\left|\mu-\mu_{0}\right|-\delta}{\sigma / \sqrt{n}}\]
mu    <- 2      # true mean
mu0   <- 2      # reference value
delta <- 0.05   # equivalence margin
sd    <- 0.10
alpha <- 0.05
beta  <- 0.20
(n <- (sd*(qnorm(1-alpha)+qnorm(1-beta/2))/(delta-abs(mu-mu0)))^2)
ceiling(n)      # round up to the next whole subject
z <- (abs(mu-mu0)-delta)/(sd/sqrt(n))
(Power <- 2*(pnorm(z-qnorm(1-alpha))+pnorm(-z-qnorm(1-alpha)))-1)
Two Sided
\[ \begin{array}{l} H_{0}: \mu_{A}-\mu_{B}=0 \\ H_{1}: \mu_{A}-\mu_{B} \neq 0 \end{array} \] where the ratio between the sample sizes of the two groups is \[ \kappa=\frac{n_{A}}{n_{B}} \]
Sample size \[ n_{A}=\kappa n_{B} \] \[ n_{B}=\left(1+\frac{1}{\kappa}\right)\left(\sigma \frac{z_{1-\alpha / 2}+z_{1-\beta}}{\mu_{A}-\mu_{B}}\right)^{2} \] Power \[ 1-\beta=\Phi\left(z-z_{1-\alpha / 2}\right)+\Phi\left(-z-z_{1-\alpha / 2}\right) \quad, \quad z=\frac{\mu_{A}-\mu_{B}}{\sigma \sqrt{\frac{1}{n_{A}}+\frac{1}{n_{B}}}} \]
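A minimal Python sketch of the two-sided two-sample formulas above, using only the standard library; the input values (means 5 and 10, common SD 10, 1:1 allocation) are assumed for illustration:

```python
from math import ceil, sqrt
from statistics import NormalDist

norm = NormalDist()  # standard normal distribution

# assumed illustration values (not from the text)
muA, muB = 5.0, 10.0
kappa    = 1.0       # allocation ratio nA / nB
sd       = 10.0      # common standard deviation
alpha, beta = 0.05, 0.20

za = norm.inv_cdf(1 - alpha / 2)
zb = norm.inv_cdf(1 - beta)
nB = ceil((1 + 1 / kappa) * (sd * (za + zb) / (muA - muB)) ** 2)
nA = ceil(kappa * nB)

# power at the rounded-up group sizes
z = (muA - muB) / (sd * sqrt(1 / nA + 1 / nB))
power = norm.cdf(z - za) + norm.cdf(-z - za)
print(nA, nB, round(power, 4))   # 63 per group, power just above 0.80
```

Note that the power expression is symmetric in the sign of \(z\), so it gives the same answer whether \(\mu_A > \mu_B\) or \(\mu_A < \mu_B\).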
One Sided
\[ \begin{array}{l} H_{0}: \mu_{A}=\mu_{B} \\ H_{1}: \mu_{A}>\mu_{B} \end{array} \] where the ratio between the sample sizes of the two groups is \[ \kappa=\frac{n_{B}}{n_{A}} \] Sample size and power, respectively: \[ \begin{array}{c} n_{A}=\left(\sigma_{A}^{2}+\sigma_{B}^{2} / \kappa\right)\left(\frac{z_{1-\alpha}+z_{1-\beta}}{\mu_{A}-\mu_{B}}\right)^{2} \\ n_{B}=\kappa n_{A} \\ 1-\beta=\Phi\left(\frac{\left|\mu_{A}-\mu_{B}\right| \sqrt{n_{A}}}{\sqrt{\sigma_{A}^{2}+\sigma_{B}^{2} / \kappa}}-z_{1-\alpha}\right) \end{array} \]
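The one-sided unequal-variance formulas above can be cross-checked with a short Python sketch (standard library only); the group means and SDs below are assumed for illustration:

```python
from math import ceil, sqrt
from statistics import NormalDist

norm = NormalDist()

# assumed illustration values (not from the text)
muA, muB = 10.0, 5.0
sdA, sdB = 10.0, 12.0   # the two groups may have different SDs
kappa    = 1.0          # allocation ratio nB / nA
alpha, beta = 0.05, 0.20

za, zb = norm.inv_cdf(1 - alpha), norm.inv_cdf(1 - beta)
nA = ceil((sdA**2 + sdB**2 / kappa) * ((za + zb) / (muA - muB)) ** 2)
nB = ceil(kappa * nA)

# power at the rounded-up group sizes
power = norm.cdf(abs(muA - muB) * sqrt(nA) / sqrt(sdA**2 + sdB**2 / kappa) - za)
print(nA, nB, round(power, 4))
```

Setting `sdA == sdB` recovers the equal-variance case, with `sdA**2 + sdB**2 / kappa` reducing to \(\sigma^2 (1 + 1/\kappa)\).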
\[ \begin{array}{l} H_{0}: \mu_{A}-\mu_{B} \leq \delta \\ H_{1}: \mu_{A}-\mu_{B}>\delta \end{array} \] where \(\delta\) is the superiority or non-inferiority margin and the ratio between the sample sizes of the two groups is \[ \kappa=\frac{n_{A}}{n_{B}} \] Formulas for sample size and power, respectively: \[ \begin{array}{c} n_{A}=\kappa n_{B} \\ n_{B}=\left(1+\frac{1}{\kappa}\right)\left(\sigma \frac{z_{1-\alpha}+z_{1-\beta}}{\mu_{A}-\mu_{B}-\delta}\right)^{2} \\ 1-\beta=\Phi\left(z-z_{1-\alpha}\right)+\Phi\left(-z-z_{1-\alpha}\right) \quad, \quad z=\frac{\mu_{A}-\mu_{B}-\delta}{\sigma \sqrt{\frac{1}{n_{A}}+\frac{1}{n_{B}}}} \end{array} \]
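A minimal Python sketch of the two-sample non-inferiority/superiority formulas above (standard library only). The values below — equal means and a margin of −5, i.e. a non-inferiority setting — are assumed for illustration:

```python
from math import ceil, sqrt
from statistics import NormalDist

norm = NormalDist()

# assumed illustration values (not from the text)
muA, muB = 5.0, 5.0
delta    = -5.0        # negative margin -> non-inferiority
kappa    = 1.0         # allocation ratio nA / nB
sd       = 10.0
alpha, beta = 0.05, 0.20

za, zb = norm.inv_cdf(1 - alpha), norm.inv_cdf(1 - beta)
nB = ceil((1 + 1 / kappa) * (sd * (za + zb) / (muA - muB - delta)) ** 2)
nA = ceil(kappa * nB)

# power at the rounded-up group sizes
z = (muA - muB - delta) / (sd * sqrt(1 / nA + 1 / nB))
power = norm.cdf(z - za) + norm.cdf(-z - za)
print(nA, nB, round(power, 4))   # 50 per group
```

The sign convention matters: a positive \(\delta\) tests superiority by at least \(\delta\), a negative \(\delta\) tests non-inferiority within \(|\delta|\).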
\[ \begin{array}{l} H_{0}:\left|\mu_{A}-\mu_{B}\right| \geq \delta \\ H_{1}:\left|\mu_{A}-\mu_{B}\right|<\delta \end{array} \] where \(\delta\) is the equivalence margin and the ratio between the sample sizes of the two groups is \[ \kappa=\frac{n_{A}}{n_{B}} \] Formulas for sample size and power, respectively: \[ \begin{array}{c} n_{A}=\kappa n_{B} \\ n_{B}=\left(1+\frac{1}{\kappa}\right)\left(\sigma \frac{z_{1-\alpha}+z_{1-\beta / 2}}{\left|\mu_{A}-\mu_{B}\right|-\delta}\right)^{2} \\ 1-\beta=2\left[\Phi\left(z-z_{1-\alpha}\right)+\Phi\left(-z-z_{1-\alpha}\right)\right]-1 \quad, \quad z=\frac{\left|\mu_{A}-\mu_{B}\right|-\delta}{\sigma \sqrt{\frac{1}{n_{A}}+\frac{1}{n_{B}}}} \end{array} \]
muA   <- 5      # mean of group A
muB   <- 4      # mean of group B
delta <- 5      # equivalence margin
kappa <- 1      # allocation ratio nA/nB
sd    <- 10
alpha <- 0.05
beta  <- 0.20
(nB <- (1+1/kappa)*(sd*(qnorm(1-alpha)+qnorm(1-beta/2))/(abs(muA-muB)-delta))^2)
ceiling(nB)     # round up to the next whole subject
z <- (abs(muA-muB)-delta)/(sd*sqrt((1+1/kappa)/nB))
(Power <- 2*(pnorm(z-qnorm(1-alpha))+pnorm(-z-qnorm(1-alpha)))-1)
***********************************************************************
* This is a program that illustrates the use of PROC POWER to *
* calculate sample size when comparing two normal means in an *
* equivalence trial. *
***********************************************************************;
proc power;
   twosamplemeans dist=normal groupweights=(1 1) alpha=0.05 power=0.9
                  stddev=0.75 lower=-0.10 upper=0.10 meandiff=0.05
                  test=equiv_diff ntotal=.;
   plot min=0.1 max=0.9;
   title "Sample Size Calculation for Comparing Two Normal Means (1:1 Allocation) in an Equivalence Trial";
run;
ssc.twosamplemean.ztest(design = 1L, ratio = 1,
                        alpha = 0.05, power = 0.8,
                        sd = 0.1, theta = 0.2)
##
## Sample Size Calculation for Clinical Trials: Test for equality
## Alpha 0.05
## Power 80 %
## Ratio between subjects in the Treatment Group and in the Control Group: 1
## Standard deviation: 0.1
## Mean difference between two arms: 0.2
ssc.twosamplemean.ztest(design = 1L, ratio = 1,
                        alpha = 0.025, power = 0.8,
                        sd = 0.1, theta = 0.2)
##
## Sample Size Calculation for Clinical Trials: Test for equality
## Alpha 0.025
## Power 80 %
## Ratio between subjects in the Treatment Group and in the Control Group: 1
## Standard deviation: 0.1
## Mean difference between two arms: 0.2
ssc.twosamplemean.ztest(design = 2L, ratio = 1,
                        alpha = 0.05, power = 0.8,
                        sd = 0.1, theta = 0, delta = 0.05)
##
## Sample Size Calculation for Clinical Trials: Test for superiority
## Alpha 0.05
## Power 80 %
## Ratio between subjects in the Treatment Group and in the Control Group: 1
## Standard deviation: 0.1
## Mean difference between two arms: 0
## Superiority Margin 0.05
ssc.twosamplemean.ztest(design = 3L, ratio = 1,
                        alpha = 0.05, power = 0.8,
                        sd = 0.1, theta = 0, delta = -0.05)
##
## Sample Size Calculation for Clinical Trials: Test for non-inferiority
## Alpha 0.05
## Power 80 %
## Ratio between subjects in the Treatment Group and in the Control Group: 1
## Standard deviation: 0.1
## Mean difference between two arms: 0
## Non-inferiority margin -0.05
ssc.twosamplemean.ztest(design = 4L, ratio = 1,
                        alpha = 0.05, power = 0.8,
                        sd = 0.1, theta = 0, delta = 0.05)
##
## Sample Size Calculation for Clinical Trials: Test for equivalence
## Alpha 0.05
## Power 80 %
## Ratio between subjects in the Treatment Group and in the Control Group: 1
## Standard deviation: 0.1
## Mean difference between two arms: 0
## Equivalence margin 0.05
pwr.t.test
: Power calculations for t-tests of means (one sample, two samples and paired samples)

pwr.t2n.test
: Power calculations for two-sample (different sizes) t-tests of means

pwrss.t.2means
: Difference between Two Means (t or z Test for Independent or Paired Samples)

pwr.t.test(n = NULL, d = NULL, sig.level = 0.05, power = NULL,
           type = c("two.sample"),
           alternative = c("two.sided", "less", "greater"))
pwr.t2n.test(n1 = NULL, n2 = NULL, d = NULL,
             sig.level = 0.05, power = NULL,
             alternative = c("two.sided", "less", "greater"))
TTest <- function(alpha, mean, std, power, side) {
  # d: effect size (Cohen's d) -- difference between the means divided
  # by the pooled standard deviation
  d <- mean / std
  samplesize <- pwr.t.test(d = d,
                           power = power,
                           sig.level = alpha,
                           type = "two.sample",
                           alternative = side)
  CI.Left  <- mean - qnorm(1 - alpha / 2) * std / sqrt(samplesize$n)
  CI.Right <- mean + qnorm(1 - alpha / 2) * std / sqrt(samplesize$n)
  CI <- paste("[", round(CI.Left, 3), ",", round(CI.Right, 3), "]",
              sep = "", collapse = NULL)
  results <- data.frame(alpha = alpha,
                        mean  = mean,
                        sd    = std,
                        power = samplesize$power,
                        side  = samplesize$alternative,
                        n     = samplesize$n,
                        CI    = CI)
  return(results)
}
TTest(alpha = 0.05, mean = 0.42, std = 0.7, power = 0.8, side = "two.sided")
pwr.t.test(d = 0.42/0.7,
           power = 0.8,
           sig.level = 0.05,
           type = "two.sample",
           alternative = "two.sided")
##
## Two-sample t test power calculation
##
## n = 44.58577
## d = 0.6
## sig.level = 0.05
## power = 0.8
## alternative = two.sided
##
## NOTE: n is number in *each* group
## 0.7 is the pooled standard deviation
## pwrss.t.2means
ssc.twosamplemean.ttest(mu1 = 29.42, mu2 = 29, sd1 = 0.7, kappa = 1,
                        power = .80, alpha = 0.05,
                        alternative = "not equal")
## Difference between Two means
## (Independent Samples t Test)
## H0: mu1 = mu2
## HA: mu1 != mu2
## ------------------------------
## Statistical power = 0.8
## n1 = 45
## n2 = 45
## ------------------------------
## Alternative = "not equal"
## Degrees of freedom = 88
## Non-centrality parameter = 2.846
## Type I error rate = 0.05
## Type II error rate = 0.2
## 0.42/0.7 as Cohen's d
## pwrss.t.2means
ssc.twosamplemean.ttest(mu1 = 0.42/0.7, kappa = 1,
                        power = .80, alpha = 0.05,
                        alternative = "not equal")
## Difference between Two means
## (Independent Samples t Test)
## H0: mu1 = mu2
## HA: mu1 != mu2
## ------------------------------
## Statistical power = 0.8
## n1 = 45
## n2 = 45
## ------------------------------
## Alternative = "not equal"
## Degrees of freedom = 88
## Non-centrality parameter = 2.846
## Type I error rate = 0.05
## Type II error rate = 0.2
## pwrss.t.2means
ssc.twosamplemean.ttest(mu1 = 29.42, mu2 = 29, sd1 = 0.7, kappa = 1,
                        power = .80, alpha = 0.025,
                        alternative = "greater")
## Difference between Two means
## (Independent Samples t Test)
## H0: mu1 = mu2
## HA: mu1 > mu2
## ------------------------------
## Statistical power = 0.8
## n1 = 45
## n2 = 45
## ------------------------------
## Alternative = "greater"
## Degrees of freedom = 88
## Non-centrality parameter = 2.846
## Type I error rate = 0.025
## Type II error rate = 0.2
## Cohen's d = 0.10
pwrss.t.2means(mu1 = 0.25, mu2 = 0.15,
               margin = 0.05, power = 0.80,
               alternative = "superior")
## Difference between Two means
## (Independent Samples t Test)
## H0: mu1 - mu2 <= margin
## HA: mu1 - mu2 > margin
## ------------------------------
## Statistical power = 0.8
## n1 = 4947
## n2 = 4947
## ------------------------------
## Alternative = "superior"
## Degrees of freedom = 9892
## Non-centrality parameter = 2.487
## Type I error rate = 0.05
## Type II error rate = 0.2
## Cohen's d = 0.10
pwrss.t.2means(mu1 = 0.25, mu2 = 0.15,
               margin = -0.05, power = 0.80,
               alternative = "non-inferior")
## Difference between Two means
## (Independent Samples t Test)
## H0: mu1 - mu2 <= margin
## HA: mu1 - mu2 > margin
## ------------------------------
## Statistical power = 0.8
## n1 = 551
## n2 = 551
## ------------------------------
## Alternative = "non-inferior"
## Degrees of freedom = 1100
## Non-centrality parameter = 2.49
## Type I error rate = 0.05
## Type II error rate = 0.2
## Cohen's d = 0
pwrss.t.2means(mu1 = 0.25, mu2 = 0.25,
               margin = 0.05, power = 0.80,
               alternative = "equivalent")
## Difference between Two means
## (Independent Samples t Test)
## H0: |mu1 - mu2| >= margin
## HA: |mu1 - mu2| < margin
## ------------------------------
## Statistical power = 0.8
## n1 = 6852
## n2 = 6852
## ------------------------------
## Alternative = "equivalent"
## Degrees of freedom = 13702
## Non-centrality parameter = -2.927
## Type I error rate = 0.05
## Type II error rate = 0.2
pwr.t.test(n = NULL, d = NULL, sig.level = 0.05, power = NULL,
           type = c("paired"),
           alternative = c("two.sided", "less", "greater"))
Suppose you are designing a study hoping to show that a new (less expensive) manufacturing process does not produce appreciably more pollution than the current process. Quantifying “appreciably worse” as \(10 \%\), you seek to show that the mean pollutant level from the new process is less than \(110 \%\) of that from the current process. In standard hypothesis testing notation, you seek to reject \[ H_0: \frac{\mu_{\text {new }}}{\mu_{\text {current }}} \geq 1.10 \] in favor of \[ H_A: \frac{\mu_{\text {new }}}{\mu_{\text {current }}}<1.10 \]
An appropriate test for this situation is the common two-group \(t\) test on log-transformed data. The hypotheses become \[ \begin{aligned} H_0: \log \left(\mu_{\text {new }}\right)-\log \left(\mu_{\text {current }}\right) & \geq \log (1.10) \\ H_A: \log \left(\mu_{\text {new }}\right)-\log \left(\mu_{\text {current }}\right) & <\log (1.10) \end{aligned} \] Measurements of the pollutant level will be taken by using laboratory models of the two processes and will be treated as independent lognormal observations with a coefficient of variation \((\sigma / \mu)\) between 0.5 and 0.6 for both processes. You will end up with 300 measurements for the current process and 180 for the new one. It is important to avoid a Type I error here, so you set the Type I error rate to 0.01 . Your theoretical work suggests that the new process will actually reduce the pollutant by about \(10 \%\) (to \(90 \%\) of current), but you need to compute and graph the power of the study if the new levels are actually between \(70 \%\) and \(120 \%\) of current levels.
ods graphics on;
proc power;
   twosamplemeans test=ratio
      meanratio = 0.7 to 1.2 by 0.1
      nullratio = 1.10
      sides     = L
      alpha     = 0.01
      cv        = 0.5 0.6
      groupns   = (300 180)
      power     = .;
   plot x=effect step=0.05;
run;
ods graphics off;
Supported Designs of Package PowerTOST
design | name | df |
---|---|---|
parallel | 2 parallel groups | n-2 |
2x2 | 2x2 crossover | n-2 |
2x2x2 | 2x2x2 crossover | n-2 |
3x3 | 3x3 crossover | 2*n-4 |
3x6x3 | 3x6x3 crossover | 2*n-4 |
4x4 | 4x4 crossover | 3*n-6 |
2x2x3 | 2x2x3 replicate crossover | 2*n-3 |
2x2x4 | 2x2x4 replicate crossover | 3*n-4 |
2x4x4 | 2x4x4 replicate crossover | 3*n-4 |
2x3x3 | partial replicate (2x3x3) | 2*n-3 |
2x4x2 | Balaam’s (2x4x2) | n-2 |
2x2x2r | Liu’s 2x2x2 repeated x-over | 3*n-2 |
paired | paired means | n-1 |
Power can be calculated by various methods, and for all of them the sample size can be estimated as well.
Defaults
Parameter | Argument | Purpose | Default |
---|---|---|---|
\(\small{\alpha}\) | alpha | Nominal level of the test | 0.05 |
\(\small{\pi}\) | targetpower | Minimum desired power | 0.80 |
logscale | logscale | Analysis on log-transformed or original scale? | TRUE |
\(\small{\theta_0}\) | theta0 | ‘True’ or assumed deviation of T from R | see below |
\(\small{\theta_1}\) | theta1 | Lower BE limit | see below |
\(\small{\theta_2}\) | theta2 | Upper BE limit | see below |
CV | CV | CV | none |
design | design | Planned design | "2x2" |
method | method | Algorithm | "exact" |
robust | robust | ‘Robust’ evaluation (Senn’s basic estimator) | FALSE |
print | print | Show information in the console? | TRUE |
details | details | Show details of the sample size search? | FALSE |
imax | imax | Maximum number of iterations | 100 |
Defaults depending on the argument logscale:

Parameter | Argument | logscale = TRUE | logscale = FALSE |
---|---|---|---|
\(\small{\theta_0}\) | theta0 | 0.95 | +0.05 |
\(\small{\theta_1}\) | theta1 | 0.80 | −0.20 |
\(\small{\theta_2}\) | theta2 | 1.25 | +0.20 |
power.TOST(CV = 0.35, n = c(52, 49), design = "parallel")
## [1] 0.8011186
sampleN.TOST(CV = 0.30, details = FALSE, print = FALSE)[["Sample size"]]
## [1] 40
Note that sampleN.TOST() is not vectorized. If we are interested in combinations of assumed values:
sampleN.TOST.vectorized <- function(CVs, theta0s, ...) {
  n <- power <- matrix(ncol = length(CVs), nrow = length(theta0s))
  for (i in seq_along(theta0s)) {
    for (j in seq_along(CVs)) {
      tmp <- sampleN.TOST(CV = CVs[j], theta0 = theta0s[i], ...)
      n[i, j]     <- tmp[["Sample size"]]
      power[i, j] <- tmp[["Achieved power"]]
    }
  }
  # number of decimal places, for pretty row/column labels
  DecPlaces <- function(x) match(TRUE, round(x, 1:15) == x)
  fmt.col <- paste0("CV %.", max(sapply(CVs, FUN = DecPlaces),
                                 na.rm = TRUE), "f")
  fmt.row <- paste0("theta %.", max(sapply(theta0s, FUN = DecPlaces),
                                    na.rm = TRUE), "f")
  colnames(power) <- colnames(n) <- sprintf(fmt.col, CVs)
  rownames(power) <- rownames(n) <- sprintf(fmt.row, theta0s)
  res <- list(n = n, power = power)
  return(res)
}
CVs <- seq(0.20, 0.40, 0.05)
theta0s <- seq(0.90, 0.95, 0.01)
x <- sampleN.TOST.vectorized(CVs = CVs, theta0s = theta0s,
                             details = FALSE, print = FALSE)
cat("Sample size\n"); print(x$n); cat("Achieved power\n"); print(signif(x$power, digits = 5))
## Sample size
## CV 0.20 CV 0.25 CV 0.30 CV 0.35 CV 0.40
## theta 0.90 38 56 80 106 134
## theta 0.91 32 48 66 88 112
## theta 0.92 28 40 56 76 96
## theta 0.93 24 36 50 66 84
## theta 0.94 22 32 44 58 74
## theta 0.95 20 28 40 52 66
## Achieved power
## CV 0.20 CV 0.25 CV 0.30 CV 0.35 CV 0.40
## theta 0.90 0.81549 0.80358 0.80801 0.80541 0.80088
## theta 0.91 0.81537 0.81070 0.80217 0.80212 0.80016
## theta 0.92 0.82274 0.80173 0.80021 0.80678 0.80238
## theta 0.93 0.81729 0.81486 0.81102 0.80807 0.80655
## theta 0.94 0.83063 0.81796 0.81096 0.80781 0.80740
## theta 0.95 0.83468 0.80744 0.81585 0.80747 0.80525
One-Way ANOVA Pairwise, 2-Sided Equality using Bonferroni Adjustment
In more general terms, we may have \(k\) groups, meaning there are a total of \(K \equiv\left(\begin{array}{c}k \\ 2\end{array}\right)=k(k-1) / 2\) possible pairwise comparisons. When we test \(\tau \leq K\) of these pairwise comparisons, we have \(\tau\) hypotheses of the form \[ \begin{array}{l} H_{0}: \mu_{A}=\mu_{B} \\ H_{1}: \mu_{A} \neq \mu_{B} \end{array} \] where \(\mu_{A}\) and \(\mu_{B}\) represent the means of two of the \(k\) groups, groups ‘A’ and ‘B’. We compute the required sample size for each of the \(\tau\) comparisons, and the total sample size needed is the largest of these. In the formula below, \(n\) represents the sample size in any one of these \(\tau\) comparisons; that is, there are \(n / 2\) people in the ‘A’ group and \(n / 2\) people in the ‘B’ group.
Formulas to compute sample size and power, respectively: \[ \begin{array}{c} n=2\left(\sigma \frac{z_{1-\alpha /(2 \tau)}+z_{1-\beta}}{\mu_{A}-\mu_{B}}\right)^{2} \\ 1-\beta=\Phi\left(z-z_{1-\alpha /(2 \tau)}\right)+\Phi\left(-z-z_{1-\alpha /(2 \tau)}\right) \quad, \quad z=\frac{\mu_{A}-\mu_{B}}{\sigma \sqrt{\frac{2}{n}}} \end{array} \]
See more under multiple tests.
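The Bonferroni-adjusted formula above can be sketched numerically in Python (standard library only); the number of groups, means, and SD below are assumed for illustration:

```python
from math import ceil, sqrt
from statistics import NormalDist

norm = NormalDist()

# assumed illustration values (not from the text)
k   = 4                  # number of groups
tau = k * (k - 1) // 2   # test all 6 pairwise comparisons
muA, muB = 5.0, 10.0     # smallest pairwise difference worth detecting
sd  = 10.0
alpha, beta = 0.05, 0.20

za = norm.inv_cdf(1 - alpha / (2 * tau))   # Bonferroni-adjusted quantile
zb = norm.inv_cdf(1 - beta)
n = ceil(2 * (sd * (za + zb) / (muA - muB)) ** 2)   # n/2 per group per comparison

# power of one comparison at the rounded-up n
z = abs(muA - muB) / (sd * sqrt(2 / n))
power = norm.cdf(z - za) + norm.cdf(-z - za)
print(n, round(power, 4))
```

Compared with the unadjusted two-sample case, replacing \(z_{1-\alpha/2}\) by \(z_{1-\alpha/(2\tau)}\) is the only change; the adjustment roughly doubles the required \(n\) here because six comparisons share the 0.05 error budget.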
Perform a power analysis for a balanced one-way analysis of variance, where \(k\) is the number of groups and \(n\) is the sample size in each group.
As an example, for a one-way analysis of variance with five groups, a target power of 0.8, an effect size of 0.25, and a significance level of 0.05, calculate the sample size required in each group.
The effect size \(f\) is calculated as \[f=\sqrt{\frac{\sum_{i=1}^k p_i \times (\mu_i - \mu)^2}{\sigma^2}}\]
pwr.anova.test(k = 5, f = 0.25, sig.level = 0.05, power = 0.8)
##
## Balanced one-way analysis of variance power calculation
##
## k = 5
## n = 39.1534
## f = 0.25
## sig.level = 0.05
## power = 0.8
##
## NOTE: n is number in each group
# Sample sizes for detecting significant effects in a one-way ANOVA
es <- seq(0.1, 0.5, 0.01)        # range of effect sizes
samsize <- numeric(length(es))   # preallocate the result vector
for (i in seq_along(es)) {
  result <- pwr.anova.test(k = 5, f = es[i],
                           sig.level = 0.05,
                           power = 0.9)
  samsize[i] <- ceiling(result$n)
}
plot(samsize, es, type = "l", lwd = 2, col = "red",
     ylab = "Effect Size", xlab = "Sample Size (per cell)",
     main = "One-Way ANOVA with Power = .90 and Alpha = .05")