Talk by Peng Zhang (Georgia Institute of Technology)

01/26/2021  3:00pm  WebEx Meeting

 

Abstract: One task of data science is to analyze massive data using tools such as linear equations, linear programs, and optimization. This task can be simplified if better data has been collected, for example, from carefully planned experiments. In this talk, I will discuss my work on the design of fast algorithms for these two problems: collecting better data through experimental design, and analyzing data efficiently.
In the first part of the talk, I will present an efficient algorithm that improves the design of randomized controlled trials (RCTs). RCTs are widely used to test the effectiveness of new drugs and interventions. In an RCT, we randomly assign subjects to different treatment groups while balancing covariates, which are characteristics of the subjects that we know before conducting the experiment. Randomness allows us to make valid statistical inferences and reduces the impact of unobserved biases; balancing covariates improves the precision of treatment-effect estimates when the covariates are predictive of treatment outcomes. Our algorithm guarantees both randomness and covariate balance simultaneously.
In the second part of the talk, I will discuss my work on designing fast algorithms, and understanding their limits, for solving linear equations and linear programs with additional structures that commonly arise in practice, such as geometric structures, spectral properties, and non-negativity of variables and coefficients.