I am a Distinguished Engineer at Waymo.
My work spans robotics, machine learning, and audio and visual perception.
I previously founded the robotics research team at Google and led it for 8 years. I also co-created the Conference on Robot Learning and taught Deep Learning on Udacity.
I grew up in France and now live in San Francisco.
Find me on
Medium: http://vanhoucke.medium.com
LinkedIn: http://linkedin.com/in/vanhoucke
Facebook: http://facebook.com/vanhoucke
Google AI: http://ai.google/research/people/VincentVanhoucke
Google Scholar: http://scholar.google.com/citations?user=T7uctwYAAAAJ
Press
The Power of AI Convergence for Global Impact, AI For Good, August 2024.
Will Scaling Solve Robotics?, IEEE Spectrum, May 2024.
Is robotics about to have its own ChatGPT moment?, MIT Technology Review, April 2024.
How is AI Helping Advance Robotics?, Eye on AI Podcast, March 2024.
Google DeepMind’s robotics head on general purpose robots, generative AI and office WiFi, TechCrunch, November 2023.
Vincent Vanhoucke (Google DeepMind) : "L’IA va permettre aux robots d’évoluer avec les humains", L'Express, September 2023.
Aided by A.I. Language Models, Google’s Robots Are Getting Smart, The New York Times, July 2023. (cover)
Elon’s X Machina, Crypto Orbs and a Visit to Google’s Robot Lab, The New York Times Hard Fork Podcast, July 2023.
The AI Revolution, 60 Minutes, April 2023.
How the French became the kings of AI… in Silicon Valley, California18, May 2023.
Helper Robots: Implementation and Policy, 'This Doesn't Compute' - The Center for Strategic and International Studies Podcast, March 2023.
The Open Source Robotics Transformer 1 Aims to Help Robots Learn From Other Robots, hackster.io, December 2022.
Video: Google Brain Director on Creating Robots for a Messy World (Like Kitchens!), The Spoon, April 2019.
Food robotics pioneers take orders for growing industry appetite, The Robot Report, April 2019.
Google robotics could focus on navigation, machines moving from place to place, CNET, April 2019.
Inside Google’s Rebooted Robotics Program, The New York Times, March 2019.
Google Brain’s Vanhoucke on Robots, AI and Programming vs. Learning, The Spoon, February 2019.
Google introduces AI for drug discovery protein recognition, VentureBeat, July 2018.
Inside Waymo's Strategy to Grow the Best Brains for Self-Driving Cars, The Verge, May 2018.
Les 100 Français qui Comptent dans l'Intelligence Artificielle, L'Usine Nouvelle, February 2018.
Profession: Éducateur de Robots, 01Net, September 2017.
Une Nouvelle Intelligence est Née, Science & Vie, July 2017.
How Computer Vision Is Finally Taking Off, After 50 Years, Nat & Friends, 2017.
Google boosting search to the speed of sound, San Francisco Chronicle, August 2013.
How Google Retooled Android With Help From Your Brain, Wired, February 2013.
Google Now: behind the predictive future of search, The Verge, October 2012.
Google Puts Its Virtual Brain Technology to Work, MIT Technology Review, October 2012.
Google's Voice and Speech Recognition Software is Paving the Road to Unlimited Mobility, TMC News, August 2011.
How Google Is Leading the Way to a Voice-Activated Future, Mashable, July 2011.
Can Google Get Web Users Talking?, MIT Technology Review, June 2011.
Essays
My channel on Medium (30+ essays), 2018-Present.
Intelligence Artificielle (IA) et Grand Public, Centraliens Magazine, March 2008.
L'Intelligence Humaine au Service de la Machine, Centraliens Magazine, July/August 2007.
Lectures
Deep Learning. Udacity online lecture (blog posts), 2016.
Large Scale Deep Learning. Lecture at the Kyoto Machine Learning Summer School, 2015.
Boards
Robot Learning Foundation. President, 2022-Present.
Association for Advancing Automation (A3) Artificial Intelligence Tech Strategy Board. 2021-22.
Conferences and Workshops
4th Robot Learning Workshop: Self-Supervised and Lifelong Learning, Alex Bewley, Igor Gilitschenski, Masha Itkina, Hamidreza Kasaei, Jens Kober, Nathan Lambert, Julien Perez, Ransalu Senanayake, Vincent Vanhoucke, Markus Wulfmeier, at NeurIPS'21, 2021.
3rd Robot Learning Workshop: Grounding Machine Learning Development in the Real World, Masha Itkina, Alex Bewley, Igor Gilitschenski, Julien Perez, Ransalu Senanayake, Markus Wulfmeier, Roberto Calandra, Vincent Vanhoucke, at NeurIPS'20, 2020.
Workshop on Exploration in Reinforcement Learning, Benjamin Eysenbach, Surya Bhupatiraju, Shane Gu, Junhyuk Oh, Vincent Vanhoucke, Oriol Vinyals, Doina Precup, at ICML'18, 2018.
1st Conference on Robot Learning (CoRL 2017), Sergey Levine, Vincent Vanhoucke, Ken Goldberg, creators and co-chairs. In Proceedings of Machine Learning Research vol. 78, 2017.
Talks
Common Sense, Asimov and the Semantics of Safety in the era of Embodied Foundation Models. Invited talk at the Princeton Symposium on Safe Deployment of Foundation Models in Robotics, 2024.
Robotics Keynote at Wing, 2024.
Is AI Truly Redefining Robotics? Debate at ICRA@40, 2024.
Constitutional Embodied AI. Invited keynote at ICRA@40, 2024.
HRI in the Era of Robot Foundation Models. Plenary talk at RO-MAN'24, 2024.
How to leverage AI in the UN in support of safe, responsible and equitable AI? UN AI For Good Global Summit, 2024.
Robotics in the Age of Generative AI. Nvidia GTC (video, companion post), 2024.
What Should We Work On Next? Invited talk at the Workshop What tasks should robotics researchers focus on? at CoRL'23, 2023.
Embodied Foundation Models. Invited talk at the Workshop Pre-Training for Robot Learning at CoRL'23, 2023.
Embodied Foundation Models. Keynote at ICIP'23, 2023.
Invited panelist at the UC Berkeley–NASA Inaugural Symposium on The Future of Skills in the AI Era, 2023.
Towards Embodied Foundation Models. BAIR Robotics and Systems Workshop, 2023.
The Future of Robots for Good: The Quest for Embodied AI. Keynote at AI For Good, 2023.
Scaling up Robot Learning at Google. Invited talk at CoRL (video), 2022.
Humor, Dialog, Affordances, and the Path to Grounded Intelligence. BAIR Robotics and Systems Workshop, 2022.
Research and Engineering Careers for PhDs in Industry. Google PhD Fellowship Forum, 2020.
MIT Tech Review EmTech Digital, 2020.
Closing the Perception-Actuation Loop using Machine Learning: New Perspectives and Strategies:
Invited talk at the CITRIS Research Exchange (video), 2020.
Invited keynote at the Collaborative Robots, Advanced Vision & A.I. Conference, 2019.
Invited talk at the MIT Industrial AI Showcase, 2019.
Robot Understanding and Acting in a Messy, Human World. Invited talk at Articulate, The Food Robotics Summit (video), 2019.
Machine Learning going Meta. Invited keynote at LinkedIn AI Summit, 2019.
Robot Perception. Invited talk at the RE-WORK Toronto Deep Learning Summit, 2018.
Deep Learning: Bringing Machine Intelligence to the Human World. Invited talk at the Silicon Valley Japan Platform Benkyokai, 2018.
Learning Robots. Google I/O (video), 2018.
Self-Supervision for Robotic Learning. Invited talk at the Stanford Robotics Seminar (video), 2018.
Machine Perception: Thinking, Deciding and Acting Like a Human. Invited talk at the Silicon Valley Innovation and Entrepreneurship Forum AI+Talk, 2018.
Simulated Perception. Invited talk at the Bay Area Robotics Symposium, 2017.
Invited speaker at Rutberg WI, 2017.
Learning to Co-Design. Invited talk at the ICSA Workshop on Trends in Machine Learning (video), 2017.
Generative Adversarial Robotics. Keynote at the Symposium on Robot Learning, 2017.
Intelligence Artificielle et l'Innovation de Rupture. Invited talk at École Polytechnique, 2017.
"OK Google, fold my laundry s'il te plaît". Invited talk at AAAI'17, 2017.
Visual Representations at Scale. Invited talk at ICLR'14 (video), 2014.
Acoustic Modeling and Deep Learning. Keynote at ICML'13, 2013.
Quantum Computing and Speech Recognition. Google Open House at the 1st NASA Quantum Future Technologies Conference, 2012.
Toward Superhuman Speech Recognition. AAAI Open House (video) and Frontiers Of Engineering Symposium (video), 2011.
Speech and Search: Bridging the Gap. Keynote at ISCSLP'08, Kunming, China, December 2008.
Scientific Publications
Achieving Human Level Competitive Robot Table Tennis, David B. D'Ambrosio, Saminda Abeyruwan, Laura Graesser, Atil Iscen, Heni Ben Amor, Alex Bewley, Barney J. Reed, Krista Reymann, Leila Takayama, Yuval Tassa, Krzysztof Choromanski, Erwin Coumans, Deepali Jain, Navdeep Jaitly, Natasha Jaques, Satoshi Kataoka, Yuheng Kuang, Nevena Lazic, Reza Mahjourian, Sherry Moore, Kenneth Oslund, Anish Shankar, Vikas Sindhwani, Vincent Vanhoucke, Grace Vesom, Peng Xu, Pannag R. Sanketi, 2024.
The Design of the Barkour Benchmark for Robot Agility, Wenhao Yu, Ken Caluwaerts, Atil Iscen, J. Chase Kew, Tingnan Zhang, Daniel Freeman, Lisa Lee, Stefano Saliceti, Vincent Zhuang, Nathan Batchelor, Steven Bohez, Federico Casarini, Jose Enrique Chen, Erwin Coumans, Adil Dostmohamed, Gabriel Dulac-Arnold, Alejandro Escontrela, Erik Frey, Roland Hafner, Deepali Jain, Bauyrjan Jyenis, Yuheng Kuang, Edward Lee, Ofir Nachum, Kenneth Oslund, Francesco Romano, Fereshteh Sadeghi, Baruch Tabanpour, Daniel Zheng, Michael Neunert, Raia Hadsell, Nicolas Heess, Francesco Nori, Jeff Seto, Carolina Parada, Vikas Sindhwani, Vincent Vanhoucke, Jie Tan, Kuang-Huei Lee. Accepted at IROS 2024.
Learning to Learn Faster from Human Feedback with Language Model Predictive Control, Jacky Liang, Fei Xia, Wenhao Yu, Andy Zeng, Montserrat Gonzalez Arenas, Maria Attarian, Maria Bauza, Matthew Bennice, Alex Bewley, Adil Dostmohamed, Chuyuan Kelly Fu, Nimrod Gileadi, Marissa Giustina, Keerthana Gopalakrishnan, Leonard Hasenclever, Jan Humplik, Jasmine Hsu, Nikhil Joshi, Ben Jyenis, Chase Kew, Sean Kirmani, Tsang-Wei Edward Lee, Kuang-Huei Lee, Assaf Hurwitz Michaely, Joss Moore, Ken Oslund, Dushyant Rao, Allen Ren, Baruch Tabanpour, Quan Vuong, Ayzaan Wahid, Ted Xiao, Ying Xu, Vincent Zhuang, Peng Xu, Erik Frey, Ken Caluwaerts, Tingnan Zhang, Brian Ichter, Jonathan Tompson, Leila Takayama, Vincent Vanhoucke, Izhak Shafran, Maja Mataric, Dorsa Sadigh, Nicolas Heess, Kanishka Rao, Nik Stewart, Jie Tan, Carolina Parada. 2024.
Open X-Embodiment: Robotic Learning Datasets and RT-X Models, Open X-Embodiment Collaboration, Abhishek Padalkar, Acorn Pooley, Ajinkya Jain, Alex Bewley, Alex Herzog, Alex Irpan, Alexander Khazatsky, Anant Rai, Anikait Singh, Anthony Brohan, Antonin Raffin, Ayzaan Wahid, Ben Burgess-Limerick, Beomjoon Kim, Bernhard Schölkopf, Brian Ichter, Cewu Lu, Charles Xu, Chelsea Finn, Chenfeng Xu, Cheng Chi, Chenguang Huang, Christine Chan, Chuer Pan, Chuyuan Fu, Coline Devin, Danny Driess, Deepak Pathak, Dhruv Shah, Dieter Büchler, Dmitry Kalashnikov, Dorsa Sadigh, Edward Johns, Federico Ceola, Fei Xia, Freek Stulp, Gaoyue Zhou, Gaurav S. Sukhatme, Gautam Salhotra, Ge Yan, Giulio Schiavi, Gregory Kahn, Hao Su, Hao-Shu Fang, Haochen Shi, Heni Ben Amor, Henrik I Christensen, Hiroki Furuta, Homer Walke, Hongjie Fang, Igor Mordatch, Ilija Radosavovic, Isabel Leal, Jacky Liang, Jad Abou-Chakra, Jaehyung Kim, Jan Peters, Jan Schneider, Jasmine Hsu, Jeannette Bohg, Jeffrey Bingham, Jiajun Wu, Jialin Wu, Jianlan Luo, Jiayuan Gu, Jie Tan, Jihoon Oh, Jitendra Malik, Jonathan Tompson, Jonathan Yang, Joseph J. Lim, João Silvério, Junhyek Han, Kanishka Rao, Karl Pertsch, Karol Hausman, Keegan Go, Keerthana Gopalakrishnan, Ken Goldberg, Kendra Byrne, Kenneth Oslund, Kento Kawaharazuka, Kevin Zhang, Krishan Rana, Krishnan Srinivasan, Lawrence Yunliang Chen, Lerrel Pinto, Liam Tan, Lionel Ott, Lisa Lee, Masayoshi Tomizuka, Maximilian Du, Michael Ahn, Mingtong Zhang, Mingyu Ding, Mohan Kumar Srirama, Mohit Sharma, Moo Jin Kim, Naoaki Kanazawa, Nicklas Hansen, Nicolas Heess, Nikhil J Joshi, Niko Suenderhauf, Norman Di Palo, Nur Muhammad Mahi Shafiullah, Oier Mees, Oliver Kroemer, Pannag R Sanketi, Paul Wohlhart, Peng Xu, Pierre Sermanet, Priya Sundaresan, Quan Vuong, Rafael Rafailov, Ran Tian, Ria Doshi, Roberto Martín-Martín, Russell Mendonca, Rutav Shah, Ryan Hoque, Ryan Julian, Samuel Bustamante, Sean Kirmani, Sergey Levine, Sherry Moore, Shikhar Bahl, Shivin Dass, Shubham Sonawani, Shuran Song, Sichun Xu, Siddhant Haldar, Simeon Adebola, Simon Guist, Soroush Nasiriany, Stefan Schaal, Stefan Welker, Stephen Tian, Sudeep Dasari, Suneel Belkhale, Takayuki Osa, Tatsuya Harada, Tatsuya Matsushima, Ted Xiao, Tianhe Yu, Tianli Ding, Todor Davchev, Tony Z. Zhao, Travis Armstrong, Trevor Darrell, Vidhi Jain, Vincent Vanhoucke, Wei Zhan, Wenxuan Zhou, Wolfram Burgard, Xi Chen, Xiaolong Wang, Xinghao Zhu, Xuanlin Li, Yao Lu, Yevgen Chebotar, Yifan Zhou, Yifeng Zhu, Ying Xu, Yixuan Wang, Yonatan Bisk, Yoonyoung Cho, Youngwoon Lee, Yuchen Cui, Yueh-Hua Wu, Yujin Tang, Yuke Zhu, Yunzhu Li, Yusuke Iwasawa, Yutaka Matsuo, Zhuo Xu, Zichen Jeff Cui. ICRA 2024. Best Paper Award.
Robotic Table Tennis: A Case Study into a High Speed Learning System, David B D'Ambrosio, Jonathan Abelian, Saminda Abeyruwan, Michael Ahn, Alex Bewley, Justin Boyd, Krzysztof Choromanski, Omar Cortes, Erwin Coumans, Tianli Ding, Wenbo Gao, Laura Graesser, Atil Iscen, Navdeep Jaitly, Deepali Jain, Juhana Kangaspunta, Satoshi Kataoka, Gus Kouretas, Yuheng Kuang, Nevena Lazic, Corey Lynch, Reza Mahjourian, Sherry Q Moore, Thinh Nguyen, Ken Oslund, Barney J Reed, Krista Reymann, Pannag R Sanketi, Anish Shankar, Pierre Sermanet, Vikas Sindhwani, Avi Singh, Vincent Vanhoucke, Grace Vesom, Peng Xu, 2023.
RT-2: Vision-Language-Action Models Transfer Web Knowledge to Robotic Control, Anthony Brohan, Noah Brown, Justice Carbajal, Yevgen Chebotar, Xi Chen, Krzysztof Choromanski, Tianli Ding, Danny Driess, Avinava Dubey, Chelsea Finn, Pete Florence, Chuyuan Fu, Montse Gonzalez Arenas, Keerthana Gopalakrishnan, Kehang Han, Karol Hausman, Alex Herzog, Jasmine Hsu, Brian Ichter, Alex Irpan, Nikhil Joshi, Ryan Julian, Dmitry Kalashnikov, Yuheng Kuang, Isabel Leal, Lisa Lee, Tsang-Wei Edward Lee, Sergey Levine, Yao Lu, Henryk Michalewski, Igor Mordatch, Karl Pertsch, Kanishka Rao, Krista Reymann, Michael Ryoo, Grecia Salazar, Pannag Sanketi, Pierre Sermanet, Jaspiar Singh, Anikait Singh, Radu Soricut, Huong Tran, Vincent Vanhoucke, Quan Vuong, Ayzaan Wahid, Stefan Welker, Paul Wohlhart, Jialin Wu, Fei Xia, Ted Xiao, Peng Xu, Sichun Xu, Tianhe Yu, Brianna Zitkovich, 2023.
Barkour: Benchmarking animal-level agility with quadruped robots, Ken Caluwaerts, Atil Iscen, J Chase Kew, Wenhao Yu, Tingnan Zhang, Daniel Freeman, Kuang-Huei Lee, Lisa Lee, Stefano Saliceti, Vincent Zhuang, Nathan Batchelor, Steven Bohez, Federico Casarini, Jose Enrique Chen, Omar Cortes, Erwin Coumans, Adil Dostmohamed, Gabriel Dulac-Arnold, Alejandro Escontrela, Erik Frey, Roland Hafner, Deepali Jain, Bauyrjan Jyenis, Yuheng Kuang, Edward Lee, Linda Luu, Ofir Nachum, Ken Oslund, Jason Powell, Diego Reyes, Francesco Romano, Fereshteh Sadeghi, Ron Sloat, Baruch Tabanpour, Daniel Zheng, Michael Neunert, Raia Hadsell, Nicolas Heess, Francesco Nori, Jeff Seto, Carolina Parada, Vikas Sindhwani, Vincent Vanhoucke, Jie Tan, 2023.
PaLM-E: An Embodied Multimodal Language Model, Danny Driess, Fei Xia, Mehdi SM Sajjadi, Corey Lynch, Aakanksha Chowdhery, Brian Ichter, Ayzaan Wahid, Jonathan Tompson, Quan Vuong, Tianhe Yu, Wenlong Huang, Yevgen Chebotar, Pierre Sermanet, Daniel Duckworth, Sergey Levine, Vincent Vanhoucke, Karol Hausman, Marc Toussaint, Klaus Greff, Andy Zeng, Igor Mordatch, Pete Florence, 2023.
RT-1: Robotics Transformer for Real-World Control at Scale, Anthony Brohan, Noah Brown, Justice Carbajal, Yevgen Chebotar, Joseph Dabis, Chelsea Finn, Keerthana Gopalakrishnan, Karol Hausman, Alex Herzog, Jasmine Hsu, Julian Ibarz, Brian Ichter, Alex Irpan, Tomas Jackson, Sally Jesmonth, Nikhil J Joshi, Ryan Julian, Dmitry Kalashnikov, Yuheng Kuang, Isabel Leal, Kuang-Huei Lee, Sergey Levine, Yao Lu, Utsav Malla, Deeksha Manjunath, Igor Mordatch, Ofir Nachum, Carolina Parada, Jodilyn Peralta, Emily Perez, Karl Pertsch, Jornell Quiambao, Kanishka Rao, Michael Ryoo, Grecia Salazar, Pannag Sanketi, Kevin Sayed, Jaspiar Singh, Sumedh Sontakke, Austin Stone, Clayton Tan, Huong Tran, Vincent Vanhoucke, Steve Vega, Quan Vuong, Fei Xia, Ted Xiao, Peng Xu, Sichun Xu, Tianhe Yu, Brianna Zitkovich, 2022.
Learning to Fold Real Garments with One Arm: A Case Study in Cloud-Based Robotics Research, Ryan Hoque, Kaushik Shivakumar, Shrey Aeron, Gabriel Deza, Aditya Ganapathi, Adrian Wong, Johnny Lee, Andy Zeng, Vincent Vanhoucke, Ken Goldberg, 2022.
Do As I Can, Not As I Say: Grounding Language in Robotic Affordances, Michael Ahn, Anthony Brohan, Noah Brown, Yevgen Chebotar, Omar Cortes, Byron David, Chelsea Finn, Keerthana Gopalakrishnan, Karol Hausman, Alex Herzog, Daniel Ho, Jasmine Hsu, Julian Ibarz, Brian Ichter, Alex Irpan, Eric Jang, Rosario Jauregui Ruano, Kyle Jeffrey, Sally Jesmonth, Nikhil J Joshi, Ryan Julian, Dmitry Kalashnikov, Yuheng Kuang, Kuang-Huei Lee, Sergey Levine, Yao Lu, Linda Luu, Carolina Parada, Peter Pastor, Jornell Quiambao, Kanishka Rao, Jarek Rettinghouse, Diego Reyes, Pierre Sermanet, Nicolas Sievers, Clayton Tan, Alexander Toshev, Vincent Vanhoucke, Fei Xia, Ted Xiao, Peng Xu, Sichun Xu, Mengyuan Yan, CoRL 2022. Special Innovation Award.
Socratic Models: Composing Zero-Shot Multimodal Reasoning with Language, Andy Zeng, Adrian Wong, Stefan Welker, Krzysztof Choromanski, Federico Tombari, Aveek Purohit, Michael Ryoo, Vikas Sindhwani, Johnny Lee, Vincent Vanhoucke, Pete Florence, 2022.
Google Scanned Objects: A High-Quality Dataset of 3D Scanned Household Items (dataset), Laura Downs, Anthony Francis, Nate Koenig, Brandon Kinman, Ryan Hickman, Krista Reymann, Thomas B. McHugh, Pascal Muetschard, Vincent Vanhoucke, ICRA 2022.
Mechanical Search on Shelves using LAX-RAY: Lateral Access X-RAY, Huang Huang, Marcus Dominguez-Kuhne, Jeffrey Ichnowski, Vishal Satish, Michael Danielczuk, Kate Sanders, Andrew Lee, Anelia Angelova, Vincent Vanhoucke, Ken Goldberg. IROS 2021.
Differentiable Mapping Networks: Learning Structured Map Representations for Sparse Visual Localization, Peter Karkus, Anelia Angelova, Vincent Vanhoucke, Rico Jonschkowski. ICRA 2020.
X-Ray: Mechanical Search for an Occluded Object by Minimizing Support of Learned Occupancy Distributions, Michael Danielczuk, Anelia Angelova, Vincent Vanhoucke, Ken Goldberg. IROS 2020.
Policies Modulating Trajectory Generators, Atil Iscen, Jie Tan, Ken Caluwaerts, Vikas Sindhwani, Tingnan Zhang, Vincent Vanhoucke. CoRL 2018.
Grasp2Vec: Learning Object Representations from Self-Supervised Grasping, Coline Devin, Eric Jang, Sergey Levine, Vincent Vanhoucke. CoRL 2018.
QT-Opt: Scalable Deep Reinforcement Learning for Vision-Based Robotic Manipulation, Dmitry Kalashnikov, Alex Irpan, Peter Pastor, Julian Ibarz, Alexander Herzog, Eric Jang, Deirdre Quillen, Ethan Holly, Mrinal Kalakrishnan, Vincent Vanhoucke, Sergey Levine, CoRL 2018. Best Systems Paper Award.
Sim-to-Real: Learning Agile Locomotion For Quadruped Robots, Jie Tan, Tingnan Zhang, Erwin Coumans, Atil Iscen, Yunfei Bai, Danijar Hafner, Steven Bohez and Vincent Vanhoucke. RSS 2018.
Classification of Crystallization Outcomes using Deep Convolutional Neural Networks, Andrew E. Bruno, Patrick Charbonneau, Janet Newman, Edward H. Snell, David R. So, Vincent Vanhoucke, Shawn Williams and Julie Wilson. PLOS ONE (ArXiv version, blog post), 2018.
Using Simulation and Domain Adaptation to Improve Efficiency of Deep Robotic Grasping, Konstantinos Bousmalis, Alex Irpan, Paul Wohlhart, Yunfei Bai, Matthew Kelcey, Mrinal Kalakrishnan, Laura Downs, Julian Ibarz, Peter Pastor, Kurt Konolige, Sergey Levine and Vincent Vanhoucke. ICRA 2018.
YouTube-BoundingBoxes: A Large High-Precision Human-Annotated Data Set for Object Detection in Video, Esteban Real, Jonathon Shlens, Stefano Mazzocchi, Xin Pan and Vincent Vanhoucke. CVPR 2017.
Inception-v4, Inception-ResNet and the Impact of Residual Connections on Learning, Christian Szegedy, Sergey Ioffe and Vincent Vanhoucke. ICLR 2016 (workshop) and AAAI 2017 (poster).
Rethinking the Inception Architecture for Computer Vision, Christian Szegedy, Vincent Vanhoucke, Sergey Ioffe, Jonathon Shlens, Zbigniew Wojna. CVPR 2016.
Real-Time Pedestrian Detection With Deep Network Cascades, Anelia Angelova, Alex Krizhevsky, Vincent Vanhoucke, Abhijit Ogale, Dave Ferguson. BMVC 2015.
Pedestrian Detection with a Large-Field-Of-View Deep Network, Anelia Angelova, Alex Krizhevsky, Vincent Vanhoucke. ICRA 2015.
Going Deeper with Convolutions, Christian Szegedy, Wei Liu, Yangqing Jia, Pierre Sermanet, Scott Reed, Dragomir Anguelov, Dumitru Erhan, Vincent Vanhoucke, Andrew Rabinovich. CVPR 2015 (oral). Winner, ImageNet Large Scale Visual Recognition Challenge 2014.
Autoregressive Product of Multi-frame Predictions Can Improve the Accuracy of Hybrid Models, Navdeep Jaitly, Vincent Vanhoucke, Geoffrey Hinton. Interspeech 2014.
Asynchronous Stochastic Optimization for Sequence Training of Deep Neural Networks, Georg Heigold, Erik McDermott, Vincent Vanhoucke, Andrew Senior, Michiel Bacchiani. ICASSP 2014.
Multiframe Deep Neural Networks for Acoustic Modeling, Vincent Vanhoucke, Matthieu Devin, Georg Heigold. ICASSP 2013.
Multilingual Acoustic Models using Distributed Deep Neural Networks, Georg Heigold, Vincent Vanhoucke, Andrew Senior, Patrick Nguyen, Marc'Aurelio Ranzato, Matthieu Devin and Jeff Dean. ICASSP 2013.
On Rectified Linear Units for Speech Processing, Matthew D. Zeiler, Marc'Aurelio Ranzato, Rajat Monga, Mark Mao, Ke Yang, Quoc V. Le, Patrick Nguyen, Andrew Senior, Vincent Vanhoucke, Jeff Dean and Geoffrey E. Hinton. ICASSP 2013.
Deep Neural Networks for Acoustic Modeling in Speech Recognition: The Shared Views of Four Research Groups, Geoffrey Hinton, Li Deng, Dong Yu, George Dahl, Abdel-rahman Mohamed, Navdeep Jaitly, Andrew Senior, Vincent Vanhoucke, Patrick Nguyen, Tara Sainath, Brian Kingsbury, IEEE Signal Processing Magazine, Vol. 29, No. 6, November 2012. 2022 IEEE Signal Processing Magazine Best Paper Award.
Application Of Pretrained Deep Neural Networks To Large Vocabulary Speech Recognition (poster), Navdeep Jaitly, Patrick Nguyen, Andrew Senior, Vincent Vanhoucke. Interspeech 2012.
Investigations on Exemplar-based Features for Speech Recognition - Towards Thousands of Hours of Unsupervised, Noisy Data, Georg Heigold, Patrick Nguyen, Mitchel Weintraub and Vincent Vanhoucke. ICASSP 2012.
Improving the Speed of Neural Networks on CPUs (poster), Vincent Vanhoucke, Andrew Senior and Mark Z. Mao. Deep Learning and Unsupervised Feature Learning Workshop, NIPS 2011.
Unsupervised Discovery and Training of Maximally Dissimilar Cluster Models, Francoise Beaufays, Vincent Vanhoucke and Brian Strope. Interspeech 2010.
Reading Text in Consumer Digital Photographs, Vincent Vanhoucke and S. Burak Gokturk. SPIE DRR XIV, 2007.
Confidence Scoring and Rejection using Multi-Pass Speech Recognition, Vincent Vanhoucke. Interspeech 2005.
Automatic Training Set Segmentation For Multi-Pass Speech Recognition, Mark Z. Mao, Vincent Vanhoucke and Brian Strope. ICASSP 2005.
Design of Compact Acoustic Models through Clustering of Tied-Covariance Gaussians, Mark Z. Mao and Vincent Vanhoucke. ICSLP 2004.
Mixtures of Inverse Covariances: Covariance Modeling for Gaussian Mixtures with Applications to Automatic Speech Recognition, Vincent Vanhoucke, Department of Electrical Engineering, Stanford University.
Mixtures of Inverse Covariances, Vincent Vanhoucke and Ananth Sankar. IEEE Transactions on Speech and Audio Processing, Vol. 13, No. 3, pp. 250-264, May 2004.
Variable Length Mixtures of Inverse Covariances, Vincent Vanhoucke and Ananth Sankar. In Proceedings of Eurospeech 2003.
Mixtures of Inverse Covariances, Vincent Vanhoucke and Ananth Sankar. ICASSP 2003 and NNSP 2003.
Speaker-Trained Recognition using Allophonic Enrollment Models, Vincent Vanhoucke, Michael M. Hochberg and Christopher J. Leggetter. ASRU 2001.
Interpretability in Multidimensional Classification, Vincent Vanhoucke and Rosaria Silipo, in Interpretability Issues in Fuzzy Modeling, J. Casillas, O. Cordon, F. Herrera, L. Magdalena, editors, Studies in Fuzziness and Soft Computing, Springer-Verlag.
Effects of Prompt Style when Navigating through Structured Data, Vincent Vanhoucke, W. Lawrence Neeley, Maria Mortati, Michael J. Sloan and Clifford Nass. INTERACT 2001.
Misc/Reports
The Opportunities of Foundation Models for Robotics, Submission to the US National Robotics Roadmap update, 2022.
TF-Agents: A library for Reinforcement Learning in TensorFlow, Sergio Guadarrama, Anoop Korattikara, Oscar Ramirez, Pablo Castro, Ethan Holly, Sam Fishman, Ke Wang, Ekaterina Gonina, Chris Harris, Vincent Vanhoucke, and Eugene Brevdo. 2018.
TensorFlow Agents: Efficient Batched Reinforcement Learning in TensorFlow, Danijar Hafner, James Davidson and Vincent Vanhoucke. 2017.
TensorFlow: Large-Scale Machine Learning on Heterogeneous Distributed Systems, Martín Abadi, Ashish Agarwal, Paul Barham, Eugene Brevdo, Zhifeng Chen, Craig Citro, Greg S. Corrado, Andy Davis, Jeffrey Dean, Matthieu Devin, Sanjay Ghemawat, Ian Goodfellow, Andrew Harp, Geoffrey Irving, Michael Isard, Yangqing Jia, Rafal Jozefowicz, Lukasz Kaiser, Manjunath Kudlur, Josh Levenberg, Dan Mane, Rajat Monga, Sherry Moore, Derek Murray, Chris Olah, Mike Schuster, Jonathon Shlens, Benoit Steiner, Ilya Sutskever, Kunal Talwar, Paul Tucker, Vincent Vanhoucke, Vijay Vasudevan, Fernanda Viegas, Oriol Vinyals, Pete Warden, Martin Wattenberg, Martin Wicke, Yuan Yu, and Xiaoqiang Zheng, arXiv:1603.04467, 2015.
Winning entry in the LSUN'15 Scene Classification Challenge, Christian Szegedy, Julian Ibarz and Vincent Vanhoucke. 2015.
Block Artifact Cancellation in DCT Based Image Compression, Vincent Vanhoucke. 2001. (Unpublished, but cited by a couple of papers.)
Speech Detection in Adverse Conditions using Genetic Programming, Vincent Vanhoucke. In Genetic Algorithms and Genetic Programming at Stanford 2000, John R. Koza, Editor, pp. 415-424, Stanford Bookstore.
Patents
Using Simulation and Domain Adaptation for Robotic Control. Konstantinos Bousmalis, Alexander Irpan, Paul Wohlhart, Yunfei Bai, Mrinal Kalakrishnan, Julian Ibarz, Sergey V. Levine, Kurt Konolige, Vincent O. Vanhoucke, Matthew L. Kelcey. WO Patent #2019/060626.
Image Classification Neural Networks. Vincent O. Vanhoucke, Christian Szegedy, Sergey Ioffe. US Patent #10460211.
Processing Structured Documents using Convolutional Neural Networks. Vincent O. Vanhoucke. US Patents #10387531, #11550871.
Optimized Matrix Multiplication using Vector Multiplication of Interleaved Matrix Values, Nishant Patil, Matthew Sarett, Rama Krishna Govindaraju, Benoit Steiner, Vincent O. Vanhoucke. US Patents #9645974, #9830303, #10073817.
Processing Images using Deep Neural Networks, Christian Szegedy, Vincent O. Vanhoucke. US Patents #9715642, #9904875, #9911069, #10977529, #11462035, #11809955.
Frame-level combination of deep neural network and Gaussian mixture models, Hui Lin, Xin Lei and Vincent Vanhoucke. US Patents #9240184, #10650289.
Speech Recognition Process, Georg Heigold, Patrick An Phu Nguyen, Mitchel Weintraub, Vincent O. Vanhoucke. US Patent #8775177.
Asynchronous Optimization for Sequence Training of Neural Networks, Georg Heigold, Erik McDermott, Vincent O. Vanhoucke, Andrew W. Senior, Michiel A.U. Bacchiani. US Patents #10019985, #10482873, #10916238, #11227582, #11854534.
Keyword detection without decoding, Vincent O. Vanhoucke, Oriol Vinyals, Patrick An Phu Nguyen, Maria Carolina Parada San Martin, Johan Schalkwyk, US Patent #9378733.
Adaptive auto-encoders, Vincent Vanhoucke, US Patent #8484022.
Multi-frame prediction for hybrid neural network/hidden Markov models, Vincent Vanhoucke, US Patent #8442821.
Activating content distribution, Vincent Vanhoucke, Michael H. Cohen, Manish G. Patel and Gudmundur Hafsteinsson, US Appl. 2009, WO Patent #2010/056874.
System and method for using image analysis and search in E-commerce, Salih Burak Gokturk, Baris Sumengen, Diem Vu, Navneet Dalal, Danny Yang, Xiaofan Lin, Azhar Khan, Munjal Shah, Dragomir Anguelov, Lorenzo Torresani, Vincent Vanhoucke, US Patent #8732030.
System and method for search portions of objects in images and features thereof, Salih Burak Gokturk, Baris Sumengen, Diem Vu, Navneet Dalal, Danny Yang, Xiaofan Lin, Azhar Khan, Munjal Shah, Dragomir Anguelov, Lorenzo Torresani and Vincent Vanhoucke, US Patents #7657126, #8345982, #9008435, WO Patent #2008/060919, EP Patent #2092444.
System and method for enabling image searching using manual enrichment, classification and/or segmentation, Salih Burak Gokturk, Baris Sumengen, Diem Vu, Navneet Dalal, Danny Yang, Xiaofan Lin, Azhar Khan, Munjal Shah, Dragomir Anguelov, Lorenzo Torresani and Vincent Vanhoucke, US Patents #7660468, #9082162.
System and method for enabling image recognition and searching of images, Salih Burak Gokturk, Baris Sumengen, Diem Vu, Navneet Dalal, Danny Yang, Xiaofan Lin, Azhar Khan, Munjal Shah, Dragomir Anguelov, Lorenzo Torresani and Vincent Vanhoucke, US Patent #7657100.
Computer-Implemented Method for Performing Similarity Searches, Vincent Vanhoucke, Salih Burak Gokturk, Dragomir Anguelov, Kuang-chih Lee, Munjal Shah, and Ashwin Tengli. US Patents #7760917, #8311289, #8989451, #9542419.
System and method for enabling the use of captured images through recognition, Salih Burak Gokturk, Dragomir Anguelov, Vincent Vanhoucke, Kuang-chih Lee, Diem Vu, Danny Yang, Munjal Shah, Azhar Khan, US Patents #7519200, #8649572, #8897505, WO Patent #2006/122164, EP Patent #1889207.
System and method for providing objectified image renderings using recognition information from images, Salih Burak Gokturk, Dragomir Anguelov, Vincent Vanhoucke, Kuang-chih Lee, Diem Vu, Danny Yang, Munjal Shah, Azhar Khan, US Patents #7783135, #8139900, #9171013.
System and method for recognizing objects from images and identifying relevancy amongst images and information, Salih Burak Gokturk, Dragomir Anguelov, Vincent Vanhoucke, Kuang-chih Lee, Diem Vu, Danny Yang, Munjal Shah, Azhar Khan, US Patent #7809192.
System and method for enabling search and retrieval from image files based on recognized information, Salih Burak Gokturk, Dragomir Anguelov, Vincent Vanhoucke, Kuang-chih Lee, Diem Vu, Danny Yang, Munjal Shah, Azhar Khan. US Patent #7809722.
System and method for displaying contextual supplemental content based on image content, Vincent Vanhoucke, Salih B. Gokturk, Munjal Shah, Julie Baumgartner, US Patents #8416981, #9047654, #9324006.
Techniques for enabling or establishing the use of face recognition algorithms, Salih Burak Gokturk, Dragomir Anguelov, Lorenzo Torresani, Vincent Vanhoucke, Munjal Shah, Diem Vu, Kuang-Chih Lee, US Patents #8385633, #8571272, #8630493, #9690979.
System and method for enabling image recognition and searching of remote content on display, Salih Burak Gokturk, Dan Chiao, Jacquie Phillips, Mark Moran, Vincent Vanhoucke, Azhar Khan, Xiaofan Lin, Munjal Shah, Andrew Miller, Navneet Dalal and Diem Vu. US Patents #8712862, #8732025.