On taxonomies

This blog entry was written by Yukti, who joins us through the Outreachy program. She kindly gave us permission to share it here, as we think it explains the topic very well. Thank you very much, Yukti! (Link to the original article “Everybody Struggles”).


We are now in Week 3 of the Outreachy internship! A lot of the points I’ll touch on in this blog have been inspired by writing prompts from the Outreachy organizers. In this post, I will be writing about an open source vocabulary term that was new to me during the Outreachy application period.

There were a ton of words and concepts that confused me during the Outreachy application period. The word that sticks out to me the most right now is Taxonomy (plural: Taxonomies). As a newcomer to the Open Food Facts community, I noticed this term being talked about a lot and mentioned in issues and pull requests. It didn’t take long for me to realize that this term must be an important one in the context of the project.

[Image: definition of “taxonomy”. Source: Google]

This definition seems to make it even more confusing, right? (After all, it’s not Open Animal Facts, it’s Open Food Facts.) At that point, I could vaguely make out what this term might mean outside of a biological context, but I still wasn’t able to relate it to the project.

Moving forward

Admittedly, I was hesitant to ask a community member to just “define” it for me. The term felt so central to this project that it only felt fair to give myself a few days to explore the codebase and see how things work. Very naturally, this word (and several other concepts) started making sense as I became more familiar with the repository’s structure over the next few days.

So… what are taxonomies? (Link to the wiki page too!)

Taxonomies are at the heart of the data structure in Open Food Facts. They are raw text files (.txt) that mainly contain translations, classifications, labels, ingredient lists and hierarchies.

To simplify a bit, their function can be broadly broken down into two parts:

  • Classification: We can use taxonomies to establish hierarchies for consumables. This is best explained by an example: let’s take a look at the “Nutella Biscuits” product page linked here.
Note: These categories are listed in hierarchical order. This product is a “Snack” first and, within snacks, a “Sweet snack”. Further, it belongs to the “Biscuits” subclass and, going down the “Biscuits” classification, it is a “Filled Biscuit”.
  • Translation: With taxonomies we can maintain a list of translations for ingredients, certain phrases, measuring units, nutrients and countries. If you are bilingual like me, you may already know that Google Translate just doesn’t cut it sometimes. There can be local variations of names, alternate spellings, uncommon ingredients, synonyms (alternate names) and many more cases where a translation tool wouldn’t solve our problem.
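To make the two roles a little more concrete, here is a minimal Python sketch. It is not the project's actual parser, and the text format below is a simplified, hypothetical one that I made up for illustration: blank lines separate entries, a line starting with "<" names the parent, and lines starting with a language code list names and synonyms. The French names are illustrative too.

```python
# Hypothetical, simplified taxonomy text (an assumption for this sketch):
# blank lines separate entries, "<" lines name the parent entry,
# "xx:" lines give names in language xx (comma-separated synonyms).
SAMPLE = """\
en:Snacks
fr:Snacks

<en:Snacks
en:Sweet snacks
fr:Snacks sucrés

<en:Sweet snacks
en:Biscuits, Cookies
fr:Biscuits

<en:Biscuits
en:Filled biscuits
fr:Biscuits fourrés
"""


def parse_taxonomy(text):
    """Return (parents, names), both keyed by the first English name."""
    parents = {}
    names = {}
    for block in text.strip().split("\n\n"):
        parent = None
        entry_names = {}
        for line in block.splitlines():
            line = line.strip()
            if line.startswith("<"):
                # "<en:Snacks" -> parent is "Snacks"
                parent = line[1:].split(":", 1)[1].strip()
            elif ":" in line:
                lang, values = line.split(":", 1)
                entry_names[lang] = [v.strip() for v in values.split(",")]
        canonical = entry_names["en"][0]
        parents[canonical] = parent
        names[canonical] = entry_names
    return parents, names


def ancestors(parents, name):
    """List the categories from the most general one down to `name`."""
    chain = []
    while name is not None:
        chain.append(name)
        name = parents[name]
    return list(reversed(chain))


parents, names = parse_taxonomy(SAMPLE)
print(" > ".join(ancestors(parents, "Filled biscuits")))
print(names["Filled biscuits"]["fr"][0])
```

Running this prints the category chain "Snacks > Sweet snacks > Biscuits > Filled biscuits" followed by the French name. The same small text file carries both the hierarchy (classification) and the per-language names and synonyms (translation).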

Something cool about Taxonomies

You can contribute to them regardless of your familiarity with coding. Yes, you heard that right! If you find something you wish to correct or add to the taxonomies, all you need to do is edit or add to the text files linked here and propose your change through a pull request. (linked to a descriptive guide)

Parting words

I will soon publish a blog post that’s specific to contributing to the Open Food Facts taxonomies for anyone unfamiliar with coding and/or Git and GitHub.

Stay tuned for that! 🙂

Signing off, Yukti

Comments: 3

  1. […] The Open Food Facts community via Slack, which actively took part in updating products and improving the taxonomies   […]

  2. […] The Open Food Facts community via Slack, which actively participated in adding / updating products and improving the taxonomies   […]

  3. […] Remark: when we refer to an ingredient (or a label, or a category) and its translation in all languages, we talk about “taxonomy” (see the blog entry of Yukti: https://blog.openfoodfacts.org/en/news/on-taxonomies) […]
