Location: Intel SC-12 Auditorium, 3600 Juliette Lane, Santa Clara, CA 95054 

Time: Wednesday, Sep 27, 2017, 1:30pm - 4:30pm

13th BAMMF event: Special Session on Deep Learning


C.-C. Jay Kuo, Professor, University of Southern California

Bio: Dr. C.-C. Jay Kuo received his Ph.D. degree from the Massachusetts Institute of Technology in 1987. He is now with the University of Southern California (USC) as Director of the Media Communications Laboratory and Dean’s Professor in Electrical Engineering-Systems. His research interests are in the areas of digital media processing, compression, communication and networking technologies.

Dr. Kuo was the Editor-in-Chief of the IEEE Transactions on Information Forensics and Security from 2012 to 2014 and of the Journal of Visual Communication and Image Representation from 1997 to 2011, and has served as an editor for 10 other international journals.

Dr. Kuo received the 1992 National Science Foundation Young Investigator (NYI) Award, the 1993 National Science Foundation Presidential Faculty Fellow (PFF) Award, the 2010 Electronic Imaging Scientist of the Year Award, the 2010-11 Fulbright-Nokia Distinguished Chair in Information and Communications Technologies, the 2011 Pan Wen-Yuan Outstanding Research Award, the 2014 USC Northrop Grumman Excellence in Teaching Award, the 2016 USC Associates Award for Excellence in Teaching, the 2016 IEEE Computer Society Taylor L. Booth Education Award, the 2016 IEEE Circuits and Systems Society John Choma Education Award, the 2016 IS&T Raymond C. Bowman Award, and the 2017 IEEE Leon K. Kirchmayer Graduate Teaching Award.

Dr. Kuo is a Fellow of AAAS, IEEE and SPIE. He has guided 140 students to their Ph.D. degrees and supervised 25 postdoctoral research fellows. Dr. Kuo is a co-author of about 250 journal papers, 900 conference papers and 14 books.

Title: Why Deep Learning Networks Work So Well?

Abstract: Deep learning networks, including convolutional and recurrent neural networks (CNNs and RNNs), provide a powerful tool for image, video and speech processing and understanding nowadays. However, their superior performance has not been well understood. In this talk, I will demystify the superior performance of CNNs. To begin with, I will describe the architectural evolution of networks in three generations: first, the McCulloch and Pitts (M-P) neuron model and simple networks (1940-1980); second, the artificial neural network (ANN) (1980-2000); and, third, the modern CNN (2000-present). The differences between these three generations will be clearly explained. Next, I will review the theoretical foundations of CNNs, which have been studied from the approximation, optimization and signal-representation viewpoints, and present the main results from the signal-processing viewpoint. Finally, I will explain the complicated operations of CNN systems in an intuitive way.
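For readers unfamiliar with the first-generation model the abstract mentions, the following is a minimal sketch of a McCulloch-Pitts (M-P) neuron: the output fires (1) when the weighted sum of binary inputs reaches a threshold, and stays silent (0) otherwise. The weights and threshold below are illustrative values chosen to realize a logic gate, not details from the talk.

```python
def mp_neuron(inputs, weights, threshold):
    """Return 1 if the weighted input sum meets the threshold, else 0."""
    activation = sum(w * x for w, x in zip(weights, inputs))
    return 1 if activation >= threshold else 0

# An M-P neuron computing logical AND of two binary inputs:
and_gate = lambda a, b: mp_neuron([a, b], [1, 1], threshold=2)
print([and_gate(a, b) for a in (0, 1) for b in (0, 1)])  # [0, 0, 0, 1]
```

With fixed weights and a hard threshold, such units can realize simple Boolean functions; later generations (ANNs, CNNs) replace the hard threshold with differentiable activations and learn the weights from data.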

George Toderici, Staff Software Engineer, Google Research

Bio: George Toderici received his Ph.D. in Computer Science from the University of Houston in 2007, where his research focused on 2D-to-3D face recognition, and joined Google in 2008. His current work at Google Research focuses on lossy multimedia compression using neural networks. His past projects at Google include the design of neural-network architectures and classical approaches for video classification, action recognition, YouTube channel recommendations, and video enhancement. He helped organize the THUMOS-2014 and YouTube-8M (2017) video classification challenges, and contributed to the design of the Sports-1M dataset. He has also served as an Area Chair for the ACM Multimedia Conference in 2014, and is a regular reviewer for CVPR, ICCV, and NIPS.

Title: Recent Deep Learning Advances in Video Action Recognition

Abstract: The field of action recognition has traditionally been dominated by "classical" methods, which rely on hand-crafted features extracted from video frames and then fed into classification systems (such as linear classifiers or more complicated approaches). Recently, due to the increasing interest in, and feasibility of, deep networks, much research has concentrated on how to apply deep networks to this task. In this talk I will give an overview of a "generic" framework for action recognition using deep networks, discuss various frame-level approaches that can be integrated into this framework, and finally review the most recent advances in using motion as an additional feature.
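The "generic" framework mentioned above can be sketched as: per-frame feature vectors are aggregated over time, and the pooled video-level vector is scored by a classifier. This pure-Python sketch uses average pooling and a linear scorer; all numbers and weights are made up for illustration, and a real system would use deep per-frame features rather than hand-typed vectors.

```python
def average_pool(frame_features):
    """Aggregate a list of per-frame feature vectors into one video-level vector."""
    n = len(frame_features)
    dim = len(frame_features[0])
    return [sum(f[d] for f in frame_features) / n for d in range(dim)]

def linear_score(video_feature, weights, bias):
    """Score one action class with a linear classifier."""
    return sum(w * x for w, x in zip(weights, video_feature)) + bias

frames = [[0.2, 0.8], [0.4, 0.6], [0.6, 0.4]]   # 3 frames, 2-d features each
pooled = average_pool(frames)                    # one video-level vector
score = linear_score(pooled, weights=[1.0, -1.0], bias=0.0)
print(pooled, score)
```

Average pooling is only one frame-level aggregation choice; the talk's framework leaves this slot open to alternatives such as recurrent or attention-based temporal models.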


Shao-Yi Chien, Professor, National Taiwan University

Bio: Shao-Yi Chien received the B.S. and Ph.D. degrees from the Department of Electrical Engineering, National Taiwan University (NTU), Taipei, Taiwan, in 1999 and 2003, respectively. From 2003 to 2004, he was a research staff member at the Quanta Research Institute, Tao Yuan County, Taiwan. In 2004, he joined the Graduate Institute of Electronics Engineering and Department of Electrical Engineering, National Taiwan University, as an Assistant Professor, and he has been a Professor since 2012. Dr. Chien was Associate Chair of the Department of Electrical Engineering of National Taiwan University from 2013 to 2016, and since 2017 he has been a visiting professor at Intel Labs. His research interests include video analysis, computer vision, perceptual coding technology, image processing for digital still cameras and display devices, computer graphics, and the associated VLSI and processor architectures.

Dr. Chien served as an Associate Editor for IEEE Transactions on Circuits and Systems for Video Technology, IEEE Transactions on Circuits and Systems I: Regular Papers, and Springer Circuits, Systems and Signal Processing (CSSP). He also served as a Guest Editor for Springer Journal of Signal Processing Systems in 2008. He also serves on the technical program committees of several conferences, such as ISCAS, ICME, SiPS, A-SSCC, and VLSI-DAT.

Title: Quantized Convolutional Neural Network for Efficient Hardware Realization

Abstract: Convolutional neural networks (CNNs) provide powerful discriminative capability, especially for image recognition and object detection. However, their massive computation, storage and memory-access requirements make them hard to deploy on mobile or embedded systems. In this talk, I will first review several optimization schemes for CNNs. Among them, quantization techniques are emphasized, since they can benefit many kinds of computing architectures. A dedicated hardware architecture design for face detection will also be shown as an example.


Yulia Tell, Technical Program Manager, Intel Corporation

Bio: Yulia Tell is a Technical Program Manager on the Big Data Technologies team within the Software and Services Group at Intel. She works on several open source projects and partner engagements in the big data domain. Her work focuses specifically on Apache Hadoop and Apache Spark, including big data analytics applications that use machine learning and deep learning. She has worked in several groups at Intel over the past 10 years, including work on Intel's HPC software tools and services. Yulia is also the training committee lead of Women in Big Data on the West Coast; Women in Big Data is a grass-roots community focused on strengthening diversity in big data and analytics, and it aims to champion the success of women in the big data domain. Yulia received her M.Sc. degree in Computer Science from Moscow Power Engineering Technical University and has also completed an executive education program on Market Driving Strategies at London Business School.

Title: Deep Learning at Scale with Apache Spark

Abstract: Deep learning is a fast-growing subset of machine learning. There is an emerging trend to conduct deep learning in the same cluster as existing data processing pipelines, to support feature engineering and traditional machine learning. As one of the earliest and top contributors to Apache Spark, Intel has developed and open-sourced a distributed deep learning framework called BigDL that is built organically on the big data (Apache Spark) platform. It combines the benefits of high-performance computing and big data architectures for rich deep learning support.

In this session, I will introduce BigDL, cover how our customers use BigDL to build end-to-end ML/DL applications, talk about the platforms on which BigDL is deployed, and provide an update on the latest improvements in BigDL. BigDL helps make deep learning more accessible to the big data community by allowing them to continue using familiar tools and infrastructure to build deep learning applications. With BigDL, users can write their deep learning applications as standard Spark programs, which can then run directly on top of existing Spark or Hadoop clusters. BigDL on Spark also enables customers to eliminate large volumes of unnecessary dataset transfers between separate systems, replace separate hardware clusters with a single CPU cluster, and reduce both system complexity and the latency of end-to-end learning.


What's BAMMF?

BAMMF is a Bay Area Multimedia Forum series. Experts from both academia and industry are invited to exchange ideas and information through talks, tutorials, posters, panel discussions and networking sessions. Topics of the forum include, but are not limited to, emerging areas in vision, audio, touch, speech, text, sensors, human-computer interaction, natural language processing, machine learning, media-related signal processing, communication, and cross-media analysis. Talks at the event may cover advances in algorithms and development, demonstrations of new inventions, product innovation, business opportunities, etc. If you are interested in giving a presentation at the forum, please contact us.

Our Sponsors:
PARC, a Xerox Company

Hewlett Packard