Hi Friends
Today we are going to see an interesting paper that came out of Facebook AI Research.
Link to the paper and code can be found at the description section.
The paper which we are going to discuss is "Inverse Cooking: Recipe Generation from Food Images".
We have divided the whole paper into 5 segments. Those are Abstract, Introduction, Approaches, Experiments and Conclusion.
Let's start with the Abstract.
The paper introduces an inverse cooking system that takes a food image and recreates the cooking recipe, consisting of a title, ingredients and a sequence of cooking instructions.
The system predicts ingredients as sets, modeling their dependencies without imposing any order, and then generates cooking instructions conditioned on both the image and its inferred ingredients simultaneously.
Outcome
Improved performance over previous baselines for ingredient prediction.
Obtains high-quality recipes by leveraging both the image and the ingredients.
Produces more compelling recipes than image-to-recipe retrieval approaches, according to human judgment.
Summary
The whole system is evaluated on the large-scale Recipe1M dataset.
The full code and models are publicly available on GitHub. The link is provided.
Introduction
Argument
Food is fundamental to human existence.
Food culture has been spreading more than ever in the current digital era,
with many people sharing pictures of food they are eating across social media.
Instagram queries for #food lead to at least 300M posts; similarly, searching for #foodie results in at least 100M posts,
highlighting the unquestionable value that food has in our society.
In the past, food was mostly prepared at home, but nowadays we frequently consume food prepared by third parties (e.g. takeaways, catering and restaurants).
Thus, access to detailed information about prepared food is limited and, as a consequence, it is hard to know precisely what we eat.
Therefore, the paper argues that there is a need for inverse cooking systems, which are able to infer ingredients and cooking instructions from a prepared meal.
Challenges
In the last few years, we have seen outstanding improvements in visual recognition tasks such as natural image classification, object detection and semantic segmentation.
However, compared to natural image understanding, food recognition poses additional challenges,
since food and its components have high intra-class variability and undergo heavy deformations during the cooking process.
Ingredients are frequently occluded in a cooked dish and come in a variety of colors, forms and textures.
Further, it requires prior knowledge: e.g. a cake will likely contain sugar rather than salt.
Previous efforts
Traditionally, the image-to-recipe problem has been formulated as a retrieval task, where a recipe is retrieved from a fixed dataset based on an image similarity score in an embedding space.
The performance of such systems depends heavily on the dataset size and diversity, as well as on the quality of the learned embedding.
These systems fail when no matching recipe for the image query exists in the static dataset.
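To give an intuition for this retrieval formulation, here is a minimal sketch in plain Python. The embeddings and the recipe database are hypothetical toy values; a real system learns a joint image-recipe embedding, but the nearest-neighbor lookup works the same way:

```python
import math

def cosine_similarity(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)

def retrieve_recipe(image_embedding, recipe_db):
    """Return the (name, embedding) pair most similar to the image embedding.

    recipe_db is the fixed dataset of pre-embedded recipes. If no close
    match exists, the best hit can still be a poor recipe -- the failure
    mode of retrieval systems described above.
    """
    return max(recipe_db, key=lambda r: cosine_similarity(image_embedding, r[1]))

# Toy example with hypothetical 3-d embeddings:
db = [("pizza", [0.9, 0.1, 0.0]), ("salad", [0.0, 0.8, 0.2]), ("cake", [0.1, 0.1, 0.9])]
query = [0.85, 0.15, 0.05]            # embedding of a pizza-like photo
print(retrieve_recipe(query, db)[0])  # prints "pizza"
```

This is why dataset size and diversity matter so much for retrieval: the system can only ever return a recipe that already sits in `recipe_db`.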
How to overcome these issues:
An alternative way to overcome the dataset constraints of retrieval systems is to formulate the image-to-recipe problem as a conditional generation one.
Therefore, the paper presents a system that generates a cooking recipe containing a title, ingredients and cooking instructions directly from an image.
It poses the instruction generation problem as a sequence generation one conditioned on two modalities simultaneously, namely an image and its predicted ingredients.
It formulates the ingredient prediction problem as set prediction, exploiting the ingredients' underlying structure: it
models ingredient dependencies while not penalizing prediction order, thus revisiting the question of whether order matters.
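To see why treating ingredients as a set rather than a list matters, here is a minimal sketch contrasting the two. Both loss functions are hypothetical simplifications for illustration, not the paper's actual architecture: an order-sensitive list loss versus an order-invariant binary cross-entropy over a small ingredient vocabulary:

```python
import math

VOCAB = ["flour", "sugar", "egg", "salt", "butter"]

def list_loss(predicted, target):
    """Order-sensitive: counts correct ingredients placed in the wrong position."""
    return sum(p != t for p, t in zip(predicted, target))

def set_loss(probs, target_set):
    """Order-invariant binary cross-entropy over the ingredient vocabulary.

    probs holds one presence probability per VOCAB entry. The loss only
    cares *which* ingredients are present, never their order -- the spirit
    of the paper's set formulation.
    """
    loss = 0.0
    for ing, p in zip(VOCAB, probs):
        y = 1.0 if ing in target_set else 0.0
        loss += -(y * math.log(p) + (1 - y) * math.log(1 - p))
    return loss

target = ["sugar", "flour", "egg"]
# Same ingredients, different order: the list loss penalizes, the set loss does not.
print(list_loss(["flour", "sugar", "egg"], target))   # prints 2 (two swapped positions)
probs = [0.9, 0.9, 0.9, 0.05, 0.05]  # confident about flour, sugar, egg only
print(round(set_loss(probs, set(target)), 3))         # prints 0.419 (low: correct set)
```

The list loss punishes a perfectly correct ingredient set just because the order differs, which is exactly the penalty the set formulation removes.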
Summary
The paper presents an inverse cooking system, which generates cooking instructions conditioned on an image and its ingredients, exploring different attention strategies.
It exhaustively studies ingredients as both a list and a set, and
proposes a new architecture for ingredient prediction that exploits co-dependencies among ingredients without imposing order.
It also demonstrates the superiority of the proposed system over image-to-recipe retrieval approaches.
Thanks for watching this video. 
The link to the paper and code can be found at the description section.
Don't forget to subscribe to this channel to see new videos.
Bye.
