Correlation-based Multiple Testing Procedures
Priscilla Bacino's Ph.D. Thesis Proposal (Dept. of Mathematical Sciences, MSU)
08/16/2022
Abstract:
Testing many hypotheses simultaneously without adjustment increases the chance of false discoveries. Traditional methods for controlling the rate of spurious detections adjust the evidence for each hypothesis individually. This can diminish power in large-scale testing, especially when individual effects are too weak to be detected on their own.
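As a small illustration of the inflation described above, the sketch below computes the probability of at least one false rejection when m true null hypotheses are each tested at level alpha. It assumes independent tests, which is a simplification and not something the abstract states; the function names are hypothetical. A Bonferroni-style per-hypothesis adjustment is included as one example of a traditional correction.

```python
def fwer_unadjusted(m: int, alpha: float = 0.05) -> float:
    """P(at least one false rejection) for m independent true nulls, each tested at level alpha."""
    return 1.0 - (1.0 - alpha) ** m


def fwer_bonferroni(m: int, alpha: float = 0.05) -> float:
    """Same probability when each test instead uses the Bonferroni level alpha / m."""
    return 1.0 - (1.0 - alpha / m) ** m


for m in (1, 10, 100):
    # Unadjusted error grows quickly with m; the adjusted one stays at or below alpha.
    print(m, round(fwer_unadjusted(m), 4), round(fwer_bonferroni(m), 4))
```

With 10 independent tests the unadjusted probability is already about 0.40, while the per-test threshold needed to bring it back under 0.05 becomes stricter as m grows, which is the source of the power loss noted above.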
Combining signal detection among groups of hypotheses with signal identification of individual hypotheses has the potential to improve statistical power and make results easier to interpret. An efficient way to test groups of hypotheses and single hypotheses together is to begin with the global null hypothesis of no real effect in any variable, then proceed hierarchically to test smaller and smaller partitions until no further evidence is found or individual hypotheses are reached.
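The top-down scheme just described can be sketched as follows. This is a minimal illustration under stated assumptions, not the procedure studied in the thesis: the tree of nested groups and the group-level p-value function are supplied by the caller, the names hierarchical_test, children, and group_pvalue are hypothetical, and the size-weighted adjustment (group p-value times m divided by group size, made monotone along the tree) is a Bonferroni-style rule in the spirit of hierarchical testing as we understand it.

```python
from typing import Callable, Dict, List, Tuple

Group = Tuple[int, ...]  # a group is a tuple of hypothesis indices


def hierarchical_test(
    root: Group,
    children: Dict[Group, List[Group]],
    group_pvalue: Callable[[Group], float],
    alpha: float = 0.05,
) -> List[Group]:
    """Test the global null first, then descend only into rejected groups."""
    m = len(root)  # total number of individual hypotheses
    rejected: List[Group] = []
    # Each stack entry: (group, largest adjusted p-value among its ancestors).
    stack = [(root, 0.0)]
    while stack:
        group, ancestor_adj = stack.pop()
        # Assumed adjustment: weight by m / |group|, cap at 1, and never fall
        # below the adjusted p-value of any ancestor.
        adj = max(ancestor_adj, min(1.0, group_pvalue(group) * m / len(group)))
        if adj > alpha:
            continue  # no evidence in this group: do not test its subgroups
        rejected.append(group)
        for child in children.get(group, []):
            stack.append((child, adj))
    return rejected


# Toy usage with made-up p-values, for illustration only.
toy_tree = {
    (0, 1, 2, 3): [(0, 1), (2, 3)],
    (0, 1): [(0,), (1,)],
    (2, 3): [(2,), (3,)],
}
toy_p = {
    (0, 1, 2, 3): 0.001, (0, 1): 0.002, (2, 3): 0.2,
    (0,): 0.005, (1,): 0.03, (2,): 0.5, (3,): 0.6,
}
found = hierarchical_test((0, 1, 2, 3), toy_tree, toy_p.__getitem__)
print(found)  # the full group, then (0, 1), then (0,)
```

In the toy run, the group (2, 3) is not rejected, so its members are never tested individually. Hypothesis 1 is tested but fails after adjustment, so the finest rejected unit on that branch is the single hypothesis 0.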
We consider a hierarchical multiple testing method of this kind, initially proposed by Meinshausen (2008) for selecting correlated variables in linear regression. We focus on applying the method to testing for multiple inter-group differences under the two-group model. The properties of the procedure under marginal associations are studied both theoretically and with simulated data. Both the theoretical and the simulation results show that the method controls the family-wise error rate. The simulations also show improved performance over traditional alternative methods, even at the level of individual variables. We illustrate the proposed method with data from a study on the response of plasma metabolites to Type I Diabetes Mellitus disease progression in non-obese mice.
Modifications to improve the power of the method, as well as extensions to more complex model structures, are discussed as next steps.