7 October 2021

Introducing Assistant Professor Mike Z. Shou – New ECE Faculty and NRF Fellow (2021)



Dr. Mike Z. Shou joined the Department of Electrical and Computer Engineering (ECE) at NUS as a tenure-track Assistant Professor in May 2021, as an awardee of the National Research Foundation (NRF) Fellowship (Class of 2021).

The introductory video highlights Dr. Mike's research, his NRF Fellowship Award, and the video AI models he has built.

Tell us about yourself

I'm Mike Shou. I received my Ph.D. degree from Columbia University in the City of New York. During my Ph.D., I interned once at a start-up company in the Bay Area and once at Microsoft Research in Seattle. After graduation, I joined Facebook AI, again in the Bay Area, as a Research Scientist, developing AI platforms and models that serve all the videos you watch on Facebook and Instagram.

What are your research interests?

My research areas are computer vision and deep learning. In particular, I am interested in building AI models for understanding videos. This can power many real-world applications that assist our day-to-day life and work: for example, automatically monitoring CCTV footage, visual perception for self-driving cars, video recommendation on social media, summarizing a long news video into a short highlight, care robots for the elderly and patients, and AR glasses with a camera that sees exactly what we see and can answer questions like "where did I leave my keys?".

Tell us about your NRF Fellowship Award.

It is an initiative run by the National Research Foundation to attract early-career researchers to carry out independent research in Singapore. I am grateful to receive this prestigious award, which enables me to quickly build up my own research group, which, by the way, is called Show Lab (https://sites.google.com/view/showlab).

How do you build video AI models?

Video is not just about vision; video has multiple modalities (i.e. visual, audio, language). The alignment between these modalities provides us with free supervision for training AI models. For example, on YouTube, video transcripts often describe what happened in the video, and we can easily gather tons of such free data from the web to train very large-scale video models. This so-called "large-scale pre-training" is a promising deep learning technique nowadays. Also, we can take the best of different modalities and combine them. Take audio-visual as an example: to identify a speaker, both the voice and the face are useful. And because of the multimodal nature of video, we partner with researchers in other areas to conduct multidisciplinary research. For example, we have recently been working with colleagues specializing in speech, robotics, and distributed computing.
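To give a flavour of how cross-modal alignment can act as free supervision, the sketch below shows a contrastive (InfoNCE-style) objective on toy data: each video clip's embedding should be closest to its own transcript's embedding among all transcripts in a batch. This is only a minimal illustration of the general idea, not Dr. Shou's actual models; the embedding dimensions, the temperature value, and the toy data are all assumptions.

```python
import numpy as np

def info_nce_loss(video_emb, text_emb, temperature=0.07):
    """Contrastive loss: each video clip should match its own transcript.

    video_emb, text_emb: (batch, dim) arrays where row i of each is an
    aligned video-text pair (the "free" supervision from transcripts).
    """
    # L2-normalise so dot products become cosine similarities
    v = video_emb / np.linalg.norm(video_emb, axis=1, keepdims=True)
    t = text_emb / np.linalg.norm(text_emb, axis=1, keepdims=True)
    logits = v @ t.T / temperature  # (batch, batch) similarity matrix
    # Matching pairs lie on the diagonal: clip i <-> transcript i.
    # Cross-entropy over each row pushes the diagonal entry to be largest.
    log_probs = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))
    return -np.mean(np.diag(log_probs))

# Toy batch: 4 clips with 8-dim embeddings; each "transcript" embedding
# is a noisy copy of its clip, mimicking aligned pairs being similar.
rng = np.random.default_rng(0)
videos = rng.normal(size=(4, 8))
texts = videos + 0.1 * rng.normal(size=(4, 8))

loss_aligned = info_nce_loss(videos, texts)
loss_shuffled = info_nce_loss(videos, texts[::-1])  # deliberately mismatched
print(loss_aligned < loss_shuffled)  # True: aligned pairs give lower loss
```

Because the supervision comes from the natural pairing of modalities rather than human labels, this kind of objective scales to the web-sized data mentioned above.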

What are your hobbies?

Actually, my hobbies are quite in line with my research; they are also video-centric. I like watching movies and TV series. I also enjoy video creation, such as shooting footage and editing clips in post-production. Interestingly, I have found that there are many things in video creation that AI can help automate. So we recently kickstarted a new project to develop an AI assistant for video creation tools, which can save a lot of manual work for journalists, YouTubers, and filmmakers.

How do you feel about joining NUS ECE?

I am so glad to become a member of this big family. I’d like to take this opportunity to thank our management and faculty colleagues for their support and guidance. Also, I appreciate the great support from our administrative team for helping me settle in and set up many things. I look forward to working with our students and colleagues. Let’s stay safe and hope to meet everyone in person soon!
