Prove that you are computer vision experts!
Face off against your fellow Computer Vision and Machine Learning enthusiasts for a chance to present your solutions to top industry Venture Capitalists, Angel Investors, startups and major companies that are recruiting talented people. Last year several competitors were hired and/or met others with whom they are now building companies.
We call these Entrepreneurial Computer Vision challenges because they are not judged strictly on their quantitative results. You can be as creative as you wish. You can focus solely on your own algorithms to solve the challenges, or you can combine your code with third-party APIs or SDKs to improve your solution and impress your colleagues, investors and the judges.
You will have 2 months to work on these challenges.
These Entrepreneurial Computer Vision Challenges have been created by:
Computer Vision Program: Serge Belongie, Cornell NYC Tech
Computer Vision Advisors: Jan Erik Solem, Mapillary; Silvio Savarese, Stanford University; Peter Welinder, Dropbox; Kristen Grauman, University of Texas at Austin; Ira Kemelmacher, U. of Washington; and coordinator Genevieve Patterson, Graduate Student, Brown University
Judges:
Peter Welinder - Dropbox, Engineer. Sold Anchovi Labs > Dropbox
Patrick Eggen - Qualcomm Ventures
Pete Warden - Google, Engineer. Sold Jetpac > Google
Geoff Judge - iNovia Capital, Partner
Serge Belongie - Professor, Computer Vision
Jan Erik Solem - Mapillary, CEO. Sold Polar Rose > Apple
Moshe Bercovich - Shutterfly, GM. Sold Photoccino > Shutterfly
Kristen Grauman - U. Texas at Austin, Assoc. Professor
Mor Naaman - Cornell Tech, Seen.co, Co-Founder
Andrea Frome - Google Brain, Computer Vision
Ramesh Raskar - MIT Media Lab, Assoc. Professor
Silvio Savarese - Stanford U., Assist. Professor
Avi Muchnick - Adobe, Director of Products. Sold Aviary > Adobe
Ira Kemelmacher - U. of Washington, Assist. Professor
Samson Timoner - Scalable Display Tech., CTO
Sean Ammirati - Birchmere Ventures, Partner
Evan Nisselson - LDV Capital, Partner
A selection of sub-finalists will receive remote mentoring from Evan Nisselson and will be invited to in-person coaching during the final selection process on May 18. Finalists will be invited to present during Day 1 of the Summit in front of ~200 attendees and judges. We will then select a smaller set of finalists who will present on Day 2 in front of ~400 attendees. The winner will be announced at the end of the Summit.
Competitors are invited to submit solutions to any of the challenges below. Submissions will be scored on creativity, marketability, and general awesomeness alongside conventional measures of accuracy. Competitors must pay their own travel expenses. Each finalist team will receive two free tickets to attend the Summit.
World-renowned entrepreneurs, computer vision experts and investors will be judges.
Challenge 1:
Estimate how often a photo will be re-shared
Input: Images submitted to the social platform of your choice. Image metadata, such as the user who posted the image or the associated GPS coordinates, can be used to determine the re-share score.
Output: Predicted number of re-shares. For example, “Test Image 1: ~1000 re-shares.”
Training data: Receive access to datasets, APIs & SDKs after completing your application.
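As a starting point, here is a minimal baseline sketch that uses only the posting user's history, one of the metadata signals mentioned above. It is an illustration, not a recommended solution: the `(user_id, reshare_count)` history format and the function names `fit_user_baseline` and `predict_reshares` are assumptions made for this example, and a real entry would add image features on top.

```python
from collections import defaultdict
from statistics import mean


def fit_user_baseline(history):
    """Compute per-user and global mean re-share counts.

    history: list of (user_id, reshare_count) pairs (hypothetical format).
    """
    per_user = defaultdict(list)
    for user_id, reshares in history:
        per_user[user_id].append(reshares)
    user_means = {user_id: mean(counts) for user_id, counts in per_user.items()}
    global_mean = mean(reshares for _, reshares in history)
    return user_means, global_mean


def predict_reshares(user_id, user_means, global_mean):
    """Predict from the poster's own history; fall back to the global mean."""
    return user_means.get(user_id, global_mean)


# Toy data, invented for illustration only.
history = [("alice", 1000), ("alice", 800), ("bob", 10)]
user_means, global_mean = fit_user_baseline(history)
prediction = predict_reshares("alice", user_means, global_mean)
print(f"Test Image 1: ~{prediction:.0f} re-shares")
```

A baseline like this is useful mainly as a yardstick: an image-based model should beat it before its predictions are worth presenting.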
Challenge 2:
Estimate overall appeal of an image.
This challenge is recommended in conjunction with Challenge 1.
Input: User-submitted images to social platform <Name Here>.
Output: A rating for how appealing or interesting the input image is. Competitors can also estimate the genre of the image. For example, “Test Image 1: 4 out of 5 stars. Romantic. Beach Scene.” or “Test Image 2: 2 out of 5 stars. Blurry, poorly-lit selfie.”
Training data: Receive access to datasets, APIs & SDKs after completing your application.
Challenge 3:
Estimate the price of a home or property.
Input: A single image or a small set of images of a for-sale property, plus metadata such as neighborhood and street address.
Output: Estimated Sale Price
Training data: Receive access to datasets, APIs & SDKs after completing your application.
Example Pipeline: Image + Neighborhood Data = Price Estimate
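One way to read the example pipeline above is as a neighborhood baseline price adjusted by an image-derived score. The sketch below assumes that reading. The `Listing` fields, the `image_quality` score in [0, 1] (which would come from your own vision model), and the `max_adjustment` parameter are all hypothetical and chosen only to show how the two inputs might combine.

```python
from dataclasses import dataclass
from statistics import median
from typing import Optional


@dataclass
class Listing:
    neighborhood: str
    image_quality: float  # hypothetical score in [0, 1] from a vision model
    sale_price: Optional[float] = None  # known only for already-sold homes


def neighborhood_medians(sold_listings):
    """Median sale price per neighborhood, computed from sold listings."""
    prices = {}
    for listing in sold_listings:
        prices.setdefault(listing.neighborhood, []).append(listing.sale_price)
    return {hood: median(values) for hood, values in prices.items()}


def estimate_price(listing, medians, max_adjustment=0.2):
    """Scale the neighborhood median by the image score.

    A score of 0.5 is neutral; 1.0 adds max_adjustment, 0.0 subtracts it.
    """
    base = medians[listing.neighborhood]
    factor = 1.0 + max_adjustment * (2.0 * listing.image_quality - 1.0)
    return base * factor


# Toy data, invented for illustration only.
sold = [
    Listing("Chelsea", 0.5, 100_000.0),
    Listing("Chelsea", 0.5, 200_000.0),
    Listing("Chelsea", 0.5, 300_000.0),
]
medians = neighborhood_medians(sold)
estimate = estimate_price(Listing("Chelsea", 1.0), medians)
print(f"Estimated Sale Price: ${estimate:,.0f}")
```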
Challenge 4:
Show compelling ways to combine photos with maps
Users the world over are embracing location-based services and mobile applications that take advantage of geotagged images. This challenge is an opportunity to think of a clever, appealing application that involves geolocation and user-created images.
Input: Geotagged photos from photo services, whatever metadata you find relevant (extracted from the photos or elsewhere).
Output: A unique or cool visualization combining photos and map data (Web or mobile).
Training data: Competitors are encouraged to explore any social media API, academic research dataset, or open API, such as Flickr, Mapillary, or Instagram for photos and Mapbox or Google for map data. Receive access to datasets, APIs & SDKs after completing your application.
Example Pipeline: sm.rutgers.edu/thebeat/
Challenge 5:
Create a personalized video search engine.
Make a tool that enables consumers to search their personal videos with a text query. We also recommend that competitors consider an interactive UI where users can begin a search with a text string and then interact directly with video or still frames to find what they are looking for.
Input: Text Query (ex. ‘dog with a tiny hat’) and personal collection of videos
Output: Locations in video or short video sequences likely to contain the query item(s)
Training data: User-provided or from past TRECVID or MMM VSS competitions. Receive access to datasets, APIs & SDKs after completing your application.
Example Pipeline: Input Text Query -> Output: List of frames in video likely to contain the query
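The query-to-frames step of this pipeline can be sketched as a simple label match. This assumes each video frame has already been tagged with text labels by some visual recognition step, which is not shown here. The function name `search_frames` and the label lists are hypothetical and serve only to illustrate the ranking.

```python
def search_frames(query, frame_labels, top_k=5):
    """Rank frames by how many query words appear among their labels.

    frame_labels: one list of label strings per frame, in frame order
    (assumed to come from a separate recognition step).
    Returns up to top_k frame indices, best match first.
    """
    terms = set(query.lower().split())
    scored = []
    for index, labels in enumerate(frame_labels):
        hits = len(terms & {label.lower() for label in labels})
        if hits:
            scored.append((hits, index))
    # Most matching terms first; earlier frames win ties.
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [index for _, index in scored[:top_k]]


# Toy labels, invented for illustration only.
frame_labels = [["cat"], ["dog", "hat"], ["dog"], ["tree"]]
matches = search_frames("dog with a tiny hat", frame_labels)
print("Frames likely to contain the query:", matches)
```

Exact word matching is the weakest part of this sketch. Stop words such as "with" and "a" are harmless here only because no frame carries them as labels, and a stronger entry would likely compare query and frame content in a shared embedding space instead.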
Challenge 6:
#LDVvisionHack - create a unique new prototype for any business
Input: Leverage one or more of the APIs and SDKs listed on this page or any others.
Output: Working Prototype
Training data & Tools: Receive access to datasets, APIs & SDKs after completing your application.
Good luck!
Thanks,
Serge, Jan Erik, Evan, Genevieve, Silvio, Peter, and Raquel
APIs and SDKs:
Apply to gain access to these:
YouTube: If you want to enhance your app with video, a rich set of YouTube APIs can bring your products to life.
Newaer: Proximity Platform SDK. A lightweight way to make your app context-aware that works across all types of devices and platforms.
NYC Open Data: Makes the wealth of public data generated by various New York City agencies and other City organizations available for public use.
Mapillary: Crowdsourced street-level photos. Discover places, get inspired, and capture the world around you.
Instagram: You can surface the amazing content Instagram users share every second, in fun and innovative ways.
Twilio: Powering Modern Communications. Build the next generation of Voice and SMS applications.
Zillow: API Network turns member sites into mini real estate portals by offering fresh and provocative real estate content to keep people coming back.
Trulia: A real estate search engine that provides buyers with information about homes for sale, real estate trends and local market information.
Clarifai: Bring the future into focus with our world class visual recognition system. Make sense of your data.
Pond5: The Marketplace for Creativity. Royalty-free footage, audio, images, and visual effects.
Vimeo: Many ways you can interact with Vimeo programmatically.
Getty: Seamlessly integrate Getty Images' expansive digital imagery, powerful search technology and rich metadata into your publishing tools, products and services.
Shutterstock: Over 40 Million Stock Photos, Vectors, Videos, and Music Tracks. Find everything you need for your creative projects.
Mapbox: Whether you're a developer or designer, realtor or runner, the Mapbox stack is equipped with tools for quickly sharing custom maps.
Twitter Fabric: Combines all seven Twitter SDKs under one roof and organizes them into three Kits.
Google Maps: A wide array of APIs that let you embed the robust functionality and everyday usefulness of Google Maps into your own website or mobile app.
Facebook: Grow your app and get more installs with mobile app install ads and engagement ads.
Flickr: The API provides the ability to view, manipulate, and search photo tags, display photos from a specific user or group, and retrieve tags to construct URLs to particular photos or photo groups.
Pinterest: A visual discovery tool that you can use to find ideas for all your projects and interests.
We are happy to review other APIs and SDKs to be added to this list. Contact us.