Same or Different? Comparing the Coverage Rate of Five Different Approaches for Testing the Difference of Two Groups Means


Bibliographic Details
Main Author: Bittmann, Felix
Format: Article
Language: English
Published: Université d'Ottawa 2025-02-01
Series: Tutorials in Quantitative Methods for Psychology
Online Access: https://www.tqmp.org/RegularArticles/vol21-1/p001/p001.pdf
Description
Summary: Testing for statistically significant differences between two group means is one of the most common requirements in psychological research, for example, after an experiment has been conducted. While the classical t-test is probably the most popular approach, its deficiencies under violated assumptions have been acknowledged, and various alternative tests have been developed. In this research paper, five widely available methods are compared with respect to the coverage of the 95% confidence intervals they generate. We use the coverage of confidence intervals because it corresponds to nominal type I error rates (alpha), yet is more adequate, since confidence intervals are preferred over p-values, which often encourage binary conclusions. The approaches tested are the classical t-test, Welch’s t-test, OLS regression with robust standard errors, and two flavors of bootstrapping (normal and bias-corrected). Three different outcome distributions are generated (normal, uniform, skewed), and 75,000 simulations with a wide range of sample sizes (15 to 200 per group) and standard deviations are conducted for each. The results indicate that Welch’s t-test and the regression approach perform best. The bootstrap approaches tend toward consistent undercoverage. The regular t-test produces larger deviations when its assumptions, especially the equality of variances, are violated. When distributions are skewed, all approaches result in undercoverage.
ISSN: 1913-4126
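
The coverage idea described in the summary can be illustrated with a small simulation. The sketch below is not the article's code: it covers only two of the five approaches (the classical t-test and Welch's t-test), draws normal data only, and uses hypothetical function names (`mean_diff_ci`, `coverage`) together with an assumed scenario of unequal group sizes and unequal standard deviations. Coverage is estimated as the share of simulated 95% intervals that contain the true mean difference, which is zero here.

```python
import numpy as np
from scipy import stats


def mean_diff_ci(x, y, equal_var, level=0.95):
    """Confidence interval for mean(x) - mean(y)."""
    nx, ny = len(x), len(y)
    vx, vy = np.var(x, ddof=1), np.var(y, ddof=1)
    diff = np.mean(x) - np.mean(y)
    if equal_var:
        # Classical t-test: pooled variance, nx + ny - 2 degrees of freedom
        df = nx + ny - 2
        pooled = ((nx - 1) * vx + (ny - 1) * vy) / df
        se = np.sqrt(pooled * (1 / nx + 1 / ny))
    else:
        # Welch: separate variances, Welch-Satterthwaite degrees of freedom
        se = np.sqrt(vx / nx + vy / ny)
        df = (vx / nx + vy / ny) ** 2 / (
            (vx / nx) ** 2 / (nx - 1) + (vy / ny) ** 2 / (ny - 1)
        )
    half_width = stats.t.ppf(0.5 + level / 2, df) * se
    return diff - half_width, diff + half_width


def coverage(n_x, n_y, sd_x, sd_y, equal_var, reps=5000, seed=0):
    """Share of simulated intervals that contain the true difference (0)."""
    rng = np.random.default_rng(seed)
    hits = 0
    for _ in range(reps):
        # Both groups share the same true mean, so the true difference is 0
        x = rng.normal(0.0, sd_x, n_x)
        y = rng.normal(0.0, sd_y, n_y)
        lower, upper = mean_diff_ci(x, y, equal_var)
        hits += int(lower <= 0.0 <= upper)
    return hits / reps


if __name__ == "__main__":
    # Assumed scenario: the smaller group has the larger standard deviation
    for label, equal_var in [("classical t-test", True), ("Welch's t-test", False)]:
        rate = coverage(15, 60, 3.0, 1.0, equal_var)
        print(f"{label}: estimated coverage = {rate:.3f}")
```

An estimate close to 0.95 means the interval behaves as advertised, and values clearly below it indicate undercoverage. Extending the sketch to uniform or skewed outcomes would only require replacing the two `rng.normal` draws.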