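Ph.D. from Carnegie Mellon University (USA); tenured professor at Michigan State University (USA). Professor Rong Jin has long been devoted to research in statistical machine learning, focusing on big data analysis and its applications in areas such as Internet information retrieval and e-commerce, and has proposed a series of original algorithms and theories in stochastic optimization, online learning, kernel learning, metric learning, semi-supervised learning, active learning, and crowdsourcing. Professor Jin has published more than 200 papers in international conferences and journals, including 32 papers in top journals of the field such as JMLR, TPAMI, and PNAS, and 147 papers at top international conferences such as ICML, NIPS, and COLT; this work has been cited by others more than 7,000 times. Professor Jin has served as area chair for top international conferences such as NIPS and SIGIR, and as senior program committee member for top conferences such as KDD, AAAI, and IJCAI. Professor Jin is a recipient of the NSF CAREER Award from the U.S. National Science Foundation.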
Talk title: Making the Impossible Possible: Randomized Machine Learning Algorithms for Big Data
Abstract: We continue to encounter explosive growth in data: the number of web pages grew from 300 million in 1997 to 50 billion in 2013; about 10 billion images are indexed by Google and 6 billion videos are indexed by YouTube; Alibaba’s e-commerce platform receives billions of requests on a daily basis. This data explosion poses a great challenge for data analysis. Randomized algorithms have attracted significant interest in recent machine learning studies, mostly due to their computational efficiency. On the other hand, formal limitations of randomized algorithms have been established for various learning tasks, making them less effective in exploiting the massive amount of data that is available to computer programs. In this talk, I will discuss, based on two examples, how to overcome the limitations of randomized machine learning algorithms by exploiting either side information or prior knowledge of the data. We have shown, both theoretically and empirically, that with a slight modification, it is possible to dramatically improve the effectiveness of randomized algorithms for machine learning. I will also introduce successful cases of applying randomized algorithms at Alibaba.