Academic Seminar - Fasheng Sun (孙法省)

Academic Seminar


Title: Orthogonal Array based Subsampling for Big Data Linear Models


Speaker: Prof. Fasheng Sun (孙法省)  (Host: Qin Wu (吴琴))

                                    Northeast Normal University


Time: March 22, 10:30-11:15


Venue: Central Conference Room, West Building, School of Mathematical Sciences (数科院西楼中心会议室)


About the speaker:

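        Fasheng Sun is a professor and doctoral supervisor, a Young Scholar of the Ministry of Education's Changjiang Scholars Program ("长江学者奖励计划"), and an Outstanding Teacher of Jilin Province. Main research interests include computer experiments, big data subsampling, high-dimensional data analysis, and applications of statistics in machine learning and artificial intelligence.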


Abstract:

        The linear mixed model is a widely used modeling method in statistical analysis. For big data, obtaining parameter estimates in a linear mixed model is computationally difficult. Current subsampling methods mainly target independent data and do not account for correlation within the data. We provide some theoretical results on the information matrix for the linear mixed model. Based on these findings, we propose an optimal subsampling method for the linear mixed model that minimizes the determinant of the variance-covariance matrix of the subsampling estimator (the D-optimality criterion). The proposed subsampling procedure is also optimal under the A-optimality criterion, which minimizes the trace of the variance-covariance matrix of the subsampling estimator. Furthermore, asymptotic properties of the subsampling estimator are established. In addition, we apply the method to subsampling large data sets for linear models containing categorical variables and obtain corresponding theoretical results. Numerical examples based on both simulated and real data are provided to illustrate the proposed subsampling method.
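
        For readers unfamiliar with the two criteria named in the abstract, the following is a minimal sketch, not the speaker's method. It assumes an ordinary linear model with independent, equal-variance errors (not the linear mixed model of the talk), so the variance-covariance matrix of the least-squares estimator is proportional to the inverse of X^T X. The function names d_criterion and a_criterion and the two small designs are hypothetical, chosen only for illustration.

```python
import numpy as np

def d_criterion(X_sub):
    """Log-determinant of the information matrix X_sub^T X_sub.

    Larger is better: maximizing it minimizes the determinant of the
    variance-covariance matrix (assumption: ordinary linear model with
    independent, equal-variance errors).
    """
    sign, logdet = np.linalg.slogdet(X_sub.T @ X_sub)
    return logdet if sign > 0 else -np.inf

def a_criterion(X_sub):
    """Trace of the inverse information matrix. Smaller is better."""
    return np.trace(np.linalg.inv(X_sub.T @ X_sub))

# Illustrative 4-run subsample: a two-level full factorial in two
# factors with an intercept column (the columns are mutually orthogonal).
balanced = np.array([[1, -1, -1],
                     [1, -1,  1],
                     [1,  1, -1],
                     [1,  1,  1]], dtype=float)

# Illustrative 4-run subsample in which one level combination repeats,
# so the columns are no longer orthogonal.
unbalanced = np.array([[1, -1, -1],
                       [1, -1,  1],
                       [1,  1, -1],
                       [1,  1, -1]], dtype=float)

for name, design in [("balanced", balanced), ("unbalanced", unbalanced)]:
    print(name, d_criterion(design), a_criterion(design))
```

        In this toy comparison the balanced subsample scores better on both criteria, which gives some intuition for why balanced, orthogonal-array-like subsamples are attractive.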

     

        Faculty and students are welcome to attend and join the discussion!