Open Food Facts Images on AWS Open Dataset: The Ultimate Food Image Dataset
Hello everyone! We’re very excited to announce that Open Food Facts Images is now available on AWS Open Dataset. This opens up many possibilities for machine learning enthusiasts and researchers.
Open Food Facts is a non-profit organization on a mission to give people the power to understand and change the impact of food on our health and the planet. In a world where transparency and conscious eating matter more than ever, Open Food Facts believes that knowing what’s in our food should be a right for everyone.
Open Food Facts is also the biggest open food database, with over 2.9 million products worldwide. All data, including images, comes from individual contributors using our app, from partner apps, or directly from producers who send us accurate and up-to-date data through our pro platform. Start-ups and apps reuse this data freely to inform consumers, while researchers use it to carry out studies that advance scientific research.
We have made our massive collection of over 6.7 million food packaging images and extracted text easily accessible to everyone 🤩.
Let’s dive in and look at some possible uses of this dataset.
We have compiled the first-ever open dataset of food packaging images and made it freely available on AWS. Using Google Cloud Vision OCR, we extracted the text from all of these images. We believe this dataset will mostly interest researchers, developers and machine learning practitioners with a keen interest in food 🥫.
🗄 To learn more about how to access this dataset on AWS, check out the documentation here.
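As a rough illustration of what accessing an image could look like, here is a minimal sketch that builds the URL of a product image from its barcode. Nothing in it comes from this post: the bucket URL, the `data` prefix, the `3/3/3/rest` folder layout for long barcodes and the `.jpg` naming are all assumptions, and `split_barcode` and `image_url` are hypothetical helper names. Please check the dataset documentation for the actual layout before relying on it.

```python
# Minimal sketch: build an image URL from a barcode.
# ASSUMPTION: bucket URL and "data" prefix are not stated in this post.
ASSUMED_BASE_URL = "https://openfoodfacts-images.s3.eu-west-3.amazonaws.com/data"


def split_barcode(barcode: str) -> str:
    """Turn a barcode into a folder path.

    ASSUMPTION: codes longer than 8 digits are stored as 3/3/3/rest,
    shorter codes are used as a single folder name.
    """
    if not barcode.isdigit():
        raise ValueError(f"barcode must contain only digits: {barcode!r}")
    if len(barcode) <= 8:
        return barcode
    parts = [barcode[0:3], barcode[3:6], barcode[6:9], barcode[9:]]
    # Drop an empty trailing part (only happens for 9-digit codes).
    return "/".join(part for part in parts if part)


def image_url(barcode: str, image_id: int, base_url: str = ASSUMED_BASE_URL) -> str:
    """Build the URL of one raw image (ASSUMPTION: images are named <id>.jpg)."""
    return f"{base_url}/{split_barcode(barcode)}/{image_id}.jpg"


if __name__ == "__main__":
    # Example with an arbitrary 13-digit barcode and image number 1.
    print(image_url("4012359114303", 1))
```

The helpers only build strings and never touch the network, so you can pass the resulting URL to whichever HTTP or S3 client you prefer.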
At Open Food Facts, we leverage machine learning to enrich the product database at scale, and we used this dataset for various tasks, such as:
- automatic categorisation of products: we automatically categorise new products using various fields as inputs, including the product images
- automatic detection of brand and label logos: we automatically detect any logo present on packaging images, and Open Food Facts volunteers help us by validating the model predictions using our in-house annotation tool, Hunger Games
- detection of nutrition tables using an object detection model
There are many remaining challenges for us to tackle in computer vision, such as the automatic extraction of nutritional information or the detection of duplicate images. If you’re passionate about machine learning (computer vision, NLP, …) and want to join our collaborative effort for more food transparency, check out our AI wiki page and join our Slack, where most of our discussions take place.