Machine Learning Lab PhD Student Forum Series (Session 32): Data Fusion under the Assumption of Transportability

Abstract: 
Combining information from multiple data sources collected under heterogeneous conditions is important in the modern era of big data. The increasing diversity of data sources can help us generalize scientific findings to a broader population. Each type of data source has its own advantages and limitations; through data fusion, the sources can complement and strengthen one another. The information can take the form of individual-level data, summary-level data, or measurements of disparate covariates. To combine this information, the data sources need to share some common distributional features, an assumption often called transportability.

In this talk, we will introduce data fusion methods and illustrate the importance of the transportability assumption through a series of problems in treatment effect estimation and regression analysis.