Computer Vision-Powered Solutions to Create Breakthrough User Experiences
Samir Kumar is a Managing Director at M12 – Microsoft's Venture Fund, where he leads investment activities in emerging tech areas including quantum computing and autonomous vehicles. Prior to joining M12, Samir worked in Qualcomm's corporate R&D division, leading early-stage product validation, partnerships, acquisition, and strategy for embedded and on-device deep learning. Samir started his career at Microsoft, leading product management and planning efforts for enterprise mobility, before heading to Palm and Samsung.
Samir will share his insights on trends and investment opportunities in visual technologies as a panelist at our 7th Annual LDV Vision Summit.
Evan Nisselson, General Partner at LDV Capital, asked Samir some questions about his experience investing in visual tech.
Evan: You have a degree in mechanical engineering and have led product management/product planning efforts for enterprise mobility at Microsoft as well as at Palm and Samsung. Before joining M12, you were a senior director of product management in Qualcomm's R&D division. Which aspects of your expertise do you believe help you empower entrepreneurs to succeed, and why?
Samir: When working with entrepreneurs in M12’s portfolio, I draw on my own product management and product planning lessons learned—especially when it comes to competitive strategy, customer focus, and roadmap definition. My experience leading Business Development at Qualcomm provided me with a robust understanding of how startups can partner with big corporations. I’ve witnessed firsthand that these partnerships can accelerate a startup’s trajectory or devastate a startup’s resources. At M12, we’ve developed our value-add platform to help startups take advantage of Microsoft’s scale, and avoid those pitfalls whenever possible.
And of course, my mechanical engineering degree instilled a strong capacity for analytical problem solving; that skill set and vocabulary has enabled me to connect with many brilliant engineers and researchers as both an operator and a VC.
Evan: M12 has invested in many visual technology businesses. Could you give a couple of examples of visual tech companies you have invested in and how they are uniquely creating and/or analyzing visual data?
Samir: Netradyne offers deep learning-powered computer vision at the edge to generate analytics and promote safer driving behavior. They’ve built the equivalent of a cloud-connected flight data recorder for roadgoing vehicles. They’ve already analyzed more than 3B minutes of driving data and 1B miles of roads, all with deep learning-based computer vision running entirely at the edge.
TwentyBN is using video instead of still images to create more natural and intelligent human-AI interactive experiences. They’ve developed a novel approach to generating real-world labeled training data—called CrowdActing—for deep learning models acting on video data. This has enabled them to create a breakthrough experience for a fitness workout use case via an AI-powered personal trainer that runs entirely on the edge and on your Android or iOS device.
Wandelbots is democratizing industrial robot programming with a no-code/low-code approach and incorporates vision-based sensors that let robots make online adjustments to the precision with which they perform a task.
Evan: In the next 10 years - which business sectors will be the most disrupted by computer vision and why?
Samir: Transportation, healthcare, manufacturing, and process automation will be hugely impacted by computer vision, across different verticals. The centrality of cameras for automotive driver assist or more advanced levels of autonomy is well understood at this point, with the average number of cameras per vehicle trending around 6-8. But autonomous or semi-autonomous operations in other transportation modalities like aviation or maritime are also benefiting from computer vision.
Medical imaging is front and center for computer vision applications in healthcare, but I’m also excited about vision as a catalyst for remote diagnostics and patient monitoring—especially during COVID when social distancing measures are in place.
We are still in the early days of process optimization scenarios in manufacturing—including processes performed by humans as well as machines—but the potential is huge. The idea of generating analytics and ultimately predictive analytics from real-world physical data using vision will really start to pan out in this decade.
A nascent area with incredible business value opportunity is “in-sensor processing,” where vision + AI-generated analytics take place right at the point of capture, in the visual sensor hardware itself. I think this will have a transformative effect on the pervasiveness of intelligence derived from computer vision. Technologies like in-sensor processing may also give a boost to meeting privacy challenges associated with computer vision, since you can gather analytics without using personally identifiable information.
Evan: LDV Capital started in 2012 with the thesis of investing in people building businesses powered by visual technologies and some said it was “cute, niche and science fiction.” How would you characterize visual technologies today and tomorrow?
Samir: Visual technology is certainly no longer niche, and even the cute cases—like filters in our social media apps—have a tremendous amount of technology behind them to make them work so well for consumers. Computer vision-powered solutions are already core to many safety and security scenarios and have created breakthrough new user experiences, like frictionless checkout in retail environments.
Deep fakes—an area once thought to be exclusively in the domain of science fiction and in the hands of high-end special effects creators—are being democratized, for better or worse. The intersection of generative AI incorporating multiple modalities of vision, language, and audio…let’s just say we ain’t seen nothing yet!
Of course, something like the Holodeck from Star Trek remains in the science fiction realm, but it likely serves as a real guidepost for those working to advance the state of the art in AR/VR experiences. Well before the Holodeck, it’s easy to imagine how much smarter the AI assistants on our phones and smart speakers could be if they understood visual context.
Evan: What are you most looking forward to at our 7th annual LDV Vision Summit?
Samir: LDV Vision Summit always brings together best-in-class entrepreneurs, researchers, investors, and product leaders from big and small companies. I always enjoy reconnecting and extending my network at the event. And of course, I’m excited to see new applications of visual tech from startups and new capabilities being developed by larger companies.