Building a Global Marketplace to Commoditize AI, Synthetic Data is Key
Yashar Behzadi is the CEO of Neuromation. He is an experienced entrepreneur who has built transformative businesses in the AI, medical technology, and IoT spaces. At our 2018 LDV Vision Summit, Yashar spoke about how Neuromation aims to democratize artificial intelligence through the use of synthetic data and distributed computing to dramatically reduce the cost of development. Through its token-based global marketplace, Neuromation connects AI talent, data providers, and customers to enable the development of novel AI solutions. He shared how their marketplace and token function, and how they, along with synthetic data, will impact our industry today and in the future.
It's great to be here. As mentioned, I'm Yashar Behzadi, the CEO of Neuromation. This should be a familiar image for everybody. We know that major platform companies maintain their competitive advantage by building walls around their ecosystems and data moats to protect themselves, especially with data-hungry deep learning applications. Neuromation's vision is to marketize AI, level the playing field, and enable everybody to contribute to and benefit from AI.
So how do we do that? Well, we've raised about $50 million to bring our vision to the world, and it starts with first building a global marketplace. Some of the trends that drive us to think about a marketplace for AI are the commoditization of a lot of these algorithms and applications, and the freeing of data, as many people are starting to want to share their data and bring it to bear or monetize their data assets. I think the most powerful trend here is that software engineers are becoming AI practitioners. Right? Now you're looking at a pool of people that is orders of magnitude bigger than the current AI practitioner space.
What happens when millions of people from across the world - in centers of excellence that we're finding in Eastern Europe, in China, in all these different places - are developing these skills and now have access to data as well as access to global enterprise demand? That's the nexus we're building with this overall marketplace, to connect the dots.
Part of this will be the data side of things, and we've developed a number of tools to make it simpler to build AI applications, just out of pragmatism. It's hard to get data sometimes for some of these applications. I'll talk about this in a little more detail. So whether there's open source data, proprietary data, individuals' data, or synthetic data, we allow that to be in our platform, integrated in a simple way, and connected with the overall model development system and the AI developers to streamline the overall development.
All of this is enacted through a token exchange, so we did an ICO, as well, and built a virtual economy around this. This allows for a number of key advantages. One is very granular control of individuals' data and the ability to share and withdraw access. And global microtransactions are a very big part of this. You can have a model. You can be a grad student. You can put your AI model on our platform. Someone can call it from the other side of the world, and you get paid on that one transaction, very seamlessly, very easily, and then you can use that to buy data assets or other things on the platform, so it creates a very simple and easy method for transactions.
Synthetic data, as I mentioned, is one of the key technologies we're building to enable this. We have some other things in the lab, as well. And synthetic data is a very interesting place to start because it breaks down the barriers of these data moats and allows small companies to get the data they need for specific applications to then compete and win against the larger platform companies. And it allows for 100% pixel-perfect labels. It's automated. It's cheap, and probably the most powerful element of synthetic data, which we heard about yesterday as well, is that the combinatoric power adds robustness to any application.
By taking an object and multiplying the number of objects by the number of environments, by the number of different camera attributes, and by the number of landscapes and other settings you may want to embed that object in, you can combinatorially create billions of images that supplement your data and make your applications even more robust, or address issues of bias and other things we've talked about earlier today.
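As a rough illustration of that multiplication (not Neuromation's actual pipeline), the sketch below counts and enumerates scene combinations in Python. The asset names, the list sizes, and the helper functions `scene_count` and `scene_specs` are all hypothetical, chosen only to show how quickly the product of a few factors grows.

```python
from itertools import islice, product
from math import prod

# Hypothetical asset lists; names and sizes are illustrative assumptions only.
objects = [f"sku_{i}" for i in range(200)]
environments = ["shelf", "checkout", "warehouse", "fridge"]
cameras = ["wide", "narrow", "overhead"]
lightings = ["daylight", "fluorescent", "dim"]


def scene_count(*factors):
    """Number of distinct scenes: the product of the sizes of all factors."""
    return prod(len(f) for f in factors)


def scene_specs(*factors):
    """Lazily yield one tuple per combination, so nothing is held in memory."""
    return product(*factors)


total = scene_count(objects, environments, cameras, lightings)
first_three = list(islice(scene_specs(objects, environments, cameras, lightings), 3))

print(total)        # 200 * 4 * 3 * 3 = 7200
for spec in first_three:
    print(spec)
```

Even these small lists yield 7,200 distinct scene specifications; each additional factor, or a longer list for any one factor, multiplies the total again, which is how the count can reach the billions mentioned in the talk.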
We find that there's kind of a natural trajectory for using synthetic data. The first place to use it is where the models are very simple and the object models are very simple. So in a retail application, we're working with a large retailer. They have 200,000 SKUs that are rapidly changing, and they have various different shelf arrangements ... It's kind of impractical and very costly to generate the data necessary to train the traditional deep-learning models.
But with synthetic data, we can generate all the SKUs because it's easy to create a realistic model of a particular consumer good. It's well-described. And we can build this application and have it perform on par with using traditional data methods. The next evolution of synthetic data will be around simulation environments. I think the progression is toward more generalized models, in which you look at how well your model does in classifying specific objects, and then have the synthetic data system automatically generate new data to make the model more robust against those particular edge cases.
There's a broad set of applications, as I mentioned, across a variety of fields, and a lot of these we're developing in-house currently or together with partners. The first category is enabling applications to be built faster and cheaper than with traditional methods. The second is applications in which the event that you're trying to estimate is actually very rare, so it's hard and impractical to get the amount of data that you need, and there are a number of things in that category. The third is any application that can benefit from the variety added by the combinatoric power of synthetic data, which hardens it.
I'm very happy to be here, and this is a great conference. And if you guys want to talk more about contributing to our marketplace or benefiting from our marketplace, please talk to me. Thank you.