Location: FX Palo Alto Laboratory, 3174 Porter Drive, Palo Alto, CA 94304

Time: Wednesday, October 10, 2018, 3:00pm - 5:00pm


Tat-Seng Chua, KITHCT Chair Professor, School of Computing, National University of Singapore

Bio: Dr Chua is the KITHCT Chair Professor at the School of Computing, National University of Singapore. He was the Acting and Founding Dean of the School from 1998 to 2000. His main research interests include multimedia information retrieval, unstructured multimodal analytics, and emerging applications in chatbots, wellness, and FinTech. He is the Co-Director of NExT, a joint Center between NUS and Tsinghua University on Extreme Search.

Dr Chua is the recipient of the 2015 ACM SIGMM Achievements Award. He chairs the steering committees of the ACM International Conference on Multimedia Retrieval (ICMR) and the Multimedia Modeling (MMM) conference series. He was also the General Co-Chair of ACM Multimedia 2005, ACM CIVR (now ACM ICMR) 2005, ACM SIGIR 2008, and ACM Web Science 2015. He serves on the editorial boards of several international journals. Dr Chua is the Co-Founder of two technology startup companies in Singapore. He holds a PhD from the University of Leeds.

Title: Revisiting Visual Food Recognition and Beyond


This talk reviews and discusses research on visual food recognition. In particular, it describes our research towards building a robust visual food recognition app to track users' food intake along with the associated recipe and nutritional information. For the app to be usable, we examine not only the accuracy but also the robustness and scalability of food recognition technology. As part of this research, we are building a food knowledge graph that links food and ingredients to nutrition and diseases. Beyond food recognition, we look into combining food intake with other lifestyle data of users, including activities and POCT (point-of-care testing) data. This enables us to perform big lifestyle analytics, support chronic disease prediction and prevention, and carry out personalized recommendation, nudging, and influence. This work is conducted under NExT, a joint research centre between NUS and Tsinghua that carries out research on unstructured data analytics. The talk will also highlight key research activities under NExT.

Daniel DeTone, Senior Software Engineer in Deep Learning, Magic Leap, Inc.

Bio: Daniel is a Senior Research Engineer in Deep Learning at Magic Leap, currently focused on pioneering new learning-based methods for Visual SLAM, co-advised by Tomasz Malisiewicz and Andrew Rabinovich. He received his Master's and Bachelor's degrees from the University of Michigan, where he focused on machine learning, computer vision, and robotics. During this time he worked on a variety of projects on topics including person tracking, outdoor SLAM, scene text detection, 3D voxel convnets, robotic path planning, and text summarization, advised by mentors including Matthew Johnson-Roberson, Edwin Olson, Silvio Savarese, and Homer Neal. During his Master's studies he interned at Occipital, where he helped release the Structure SDK, which runs RGB-D SLAM on mobile devices.

Title: Learning Deep Convolutional Frontends for Visual SLAM


This talk presents a self-supervised framework for training interest point detectors and descriptors suitable for a large number of multiple-view geometry problems in computer vision. As opposed to patch-based neural networks, our fully-convolutional model operates on full-sized images and jointly computes pixel-level interest point locations and associated descriptors in one forward pass. We introduce Homographic Adaptation, a multi-scale, multi-homography approach for boosting interest point detection repeatability and performing cross-domain adaptation (e.g., synthetic-to-real). Our model, when trained on the MS-COCO generic image dataset using Homographic Adaptation, is able to repeatably detect a much richer set of interest points than the initial pre-adapted deep model and any other traditional corner detector. The final system achieves state-of-the-art homography estimation results on HPatches when compared to LIFT, SIFT, and ORB.
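The core loop of Homographic Adaptation can be sketched roughly as: warp the image by many random homographies, run the detector on each warped image, warp the resulting heatmaps back to the original frame, and average. The sketch below is a minimal NumPy illustration under stated assumptions, not the actual SuperPoint model: the gradient-product "detector", the homography sampling scheme, and the nearest-neighbour warp are all hypothetical stand-ins for the learned network and the structured homography sampling used in the real system.

```python
import numpy as np

def toy_detector(img):
    # Crude corner-response proxy (stand-in for a learned detector's
    # heatmap): product of absolute x- and y-gradients.
    gy, gx = np.gradient(img.astype(float))
    return np.abs(gx) * np.abs(gy)

def random_homography(rng, scale=0.05):
    # Hypothetical sampling: identity plus a small random perturbation.
    # The real method samples structured crops/rotations/perspective shifts.
    H = np.eye(3) + scale * rng.standard_normal((3, 3))
    H[2, 2] = 1.0
    return H

def warp(img, H):
    # Inverse-map warp with nearest-neighbour sampling:
    # out[y, x] = img[H^{-1}(x, y)].
    h, w = img.shape
    Hinv = np.linalg.inv(H)
    ys, xs = np.mgrid[0:h, 0:w]
    pts = np.stack([xs.ravel(), ys.ravel(), np.ones(h * w)])
    src = Hinv @ pts
    sx = np.round(src[0] / src[2]).astype(int)
    sy = np.round(src[1] / src[2]).astype(int)
    out = np.zeros((h, w), dtype=float)
    ok = (sx >= 0) & (sx < w) & (sy >= 0) & (sy < h)
    out.ravel()[ok] = img.astype(float)[sy[ok], sx[ok]]
    return out

def homographic_adaptation(img, detector, n=32, seed=0):
    # Detect in each warped copy, warp each heatmap back to the
    # original frame, and average all responses.
    rng = np.random.default_rng(seed)
    acc = detector(img)
    for _ in range(n):
        H = random_homography(rng)
        heat = detector(warp(img, H))
        acc += warp(heat, np.linalg.inv(H))
    return acc / (n + 1)

# Usage on a synthetic image: a bright square whose corners should respond.
img = np.zeros((32, 32))
img[8:24, 8:24] = 1.0
heat = homographic_adaptation(img, toy_detector, n=8)
```

Averaging over warps rewards points whose detections survive viewpoint change, which is why the aggregated heatmap serves as a better self-supervision signal than the base detector alone.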


What's BAMMF?

BAMMF is a Bay Area Multimedia Forum series. Experts from both academia and industry are invited to exchange ideas and information through talks, tutorials, posters, panel discussions, and networking sessions. Topics of the forum include, but are not limited to, emerging areas in vision, audio, touch, speech, text, sensors, human-computer interaction, natural language processing, machine learning, media-related signal processing, communication, and cross-media analysis. Talks at the event may cover advances in algorithms and development, demonstrations of new inventions, product innovation, business opportunities, etc. If you are interested in giving a presentation at the forum, please contact us.

Our Sponsors:
PARC, a Xerox Company

Hewlett Packard