ICMR 2017

Industry keynotes

"With 5G Approaching, How will Audio/Video Technology that Serves 800 Million QQ Users Bring Forth New Ideas" Xiaozheng Huang (Tencent, China)

Abstract: Back in 1999, QQ, a popular instant messenger in China that was still called OICQ at the time, released a new version that included audio calling for the first time. Video calling was enabled not long after. After 18 years of fast growth, QQ has 800 million monthly active users, who spend 1.2 billion minutes on audio and video calls every single day.

With QQ’s rapid growth, the audio and video technology behind it has also evolved tremendously. We built our own audio/video technology center, which grew into Tencent Audio/Video Lab, and developed our own SDK when OEM solutions could not meet our needs. Our new-generation audio/video communication engine, “SPEAR”, developed in-house, serves 800 million QQ users today. Our web broadcasting solution serves China’s top 10 web broadcasting platforms, with a user base of 200 million and a 70% market share in China. With 5G approaching, how will audio/video technology that serves 800 million QQ users bring forth new ideas?

In this presentation, I will first describe how audio/video technology has developed at Tencent Audio/Video Lab as the internet shifted from PC to mobile. Second, I will explain the capabilities of our technology in audio/video web communication, web broadcasting, and image/audio/video processing. Third, I will present our new research results and how they are used in our products and services. Finally, I will talk briefly about our future plans.

Bio: Xiaozheng Huang is a senior scholar and engineer at Tencent Audio/Video Lab. He obtained his bachelor's degree from Zhejiang University, China, in 2007 and his M.A.Sc. degree from Simon Fraser University, Canada, in 2011. His work focuses mainly on multimedia communication research and application development. Before joining Tencent, he was a core member of the audio/video development team in the Skype division of Microsoft in Redmond, US, where he actively participated in the development of video codec, video call, and screen sharing modules for Skype, Microsoft Office Lync, and Windows. As a tech lead at Tencent Audio/Video Lab, he works on SDKs for image/video processing and compression and for audio/video communication, which are used in shipped products and services with a user base of 800 million. His distinctions at Tencent include the “Key Tech Breakthrough Award” and the “Excellent Operation Award”. He has also filed several patents, which have been adopted by the national standard.

"Information Retrieval from Multi-Sensor Data for Enriching Location Services at HERE Technologies", Matei Stroilă (HERE, USA)

Abstract: HERE Technologies provides real-time location services that enable people, enterprises, and cities around the world to harness the power of location and create innovative solutions for safer and more efficient living. Multimedia retrieval techniques and sensor fusion approaches are essential for enriching location services and for keeping the underlying map up to date. In this talk, I will give an overview of some of the work we do in the CTO Research group to support existing location services and enable future ones. We aim to automatically extract useful information from massive collections of images, LiDAR point clouds, car sensor data, and open web data. I will present work on image recognition for map making, information retrieval for enriching points of interest, and the creation of a highly accurate map of roads and cities for future autonomous navigation services.

Bio: Matei Stroilă is a Senior Manager of Research at HERE Technologies, currently leading a team of researchers focused on developing innovative solutions for map making techniques and map experiences. He has been with HERE (formerly NAVTEQ/Nokia) for more than 10 years, in both individual contributor and managerial roles in Research and Development. During this time, he has contributed numerous issued patents and scientific papers in areas relevant to location services. Previously, he held research roles in academia, including a research professorship at the University of Virginia, where he worked on medical imaging and computer visualization techniques for brain surgery, and a postdoctoral position at the University of Illinois at Urbana-Champaign, where he worked on shape modeling and vector graphics rendering techniques. He holds a Ph.D. degree in Mathematics from the University of Southern California and an MS degree in Computer Science from the University of Illinois at Urbana-Champaign.

"Intelligently Connecting People with Information", Changhu Wang (Toutiao AI Lab, China)

Abstract: How to effectively connect people with information is a fundamental problem in human society. We are now in a mobile-first era in which everything is digitally connected. With the advent of diverse social content, information feeds have become a new way to connect people with information. This creates a good opportunity for artificial intelligence (AI) to drive innovation in this direction. AI can make the creation, moderation, dissemination, search, and consumption of information and content, as well as interaction with it, more efficient and intelligent.

As an industry leader in information feed products, platforms, and services, Toutiao takes the lead in developing and leveraging diverse machine learning techniques to efficiently process, analyze, mine, understand, and organize a large amount of multimedia data. Meanwhile, owing to Toutiao's rich application scenarios and active users all over the world, we have accumulated a huge amount of training data, which lets the machine learning system form a closed feedback loop and thus continually improve and evolve. This closed-loop system enables Toutiao to develop core AI technologies in large-scale machine learning, text analysis, natural language processing, computer vision, and data mining.

In this talk, I will share some personal opinions on the development prospects of AI in this fundamental area, including my understanding of AI, important research progress in recent years, the influence of AI on the software industry, and how to build a core competence strategy for AI in a company. I will also introduce some of the research progress at Toutiao AI Lab.

Bio: Changhu Wang is currently Technical Director of Toutiao AI Lab, Beijing, China. He received B.E. and Ph.D. degrees from the University of Science and Technology of China in 2004 and 2009, respectively. Before joining Toutiao AI Lab, he worked as a lead researcher at Microsoft Research from 2009 to 2017. In 2008, he worked as a research engineer in the Department of Electrical and Computer Engineering at the National University of Singapore.

His current research interests include computer vision, multimedia analysis, and machine learning. In particular, he is interested in applying the techniques from these areas to a broad range of multimedia and vision applications, such as image/video understanding, sketch visual analysis, multimedia search, and deep learning.

Changhu has authored or co-authored over 50 papers in high-quality journals and conferences. He has shipped multiple technologies to Microsoft products such as Bing, Office, and XiaoIce. He holds 9 granted U.S. patents. He has also served as a program committee member and reviewer for 20+ high-quality conferences and journals. He built one of the first million-level sketch-based image search systems, as well as the first billion-level system in the world. He also built one of the first general sketch recognition systems. He was awarded the Microsoft Gold Star Award in 2011, in recognition of his important contribution to Microsoft’s success. He received a Best Demo Award at the ACM International Conference on Multimedia 2010 in recognition of the MindFinder system. He was awarded a Microsoft Fellowship in 2007.

Association for Computing Machinery

University Politehnica of Bucharest

University of Trento

June 6-9, Bucharest, Romania

ACM International Conference on Multimedia Retrieval
