Code for the ICRA 2024 Cook2LTL paper on translating free-form cooking recipes to Linear Temporal Logic (LTL) formulae for robot task planning.


Cook2LTL: Translating Cooking Recipes to LTL Formulae using Large Language Models

This repository contains code and technical details for the paper:

Cook2LTL: Translating Cooking Recipes to LTL Formulae using Large Language Models (website)

Presented at ICRA 2024 in Yokohama, Japan (05/13/2024-05/17/2024).

Cook2LTL Cover

Authors: Angelos Mavrogiannis, Christoforos Mavrogiannis, Yiannis Aloimonos

Please cite our work if you find it useful:

@INPROCEEDINGS{10611086,
  author={Mavrogiannis, Angelos and Mavrogiannis, Christoforos and Aloimonos, Yiannis},
  booktitle={2024 IEEE International Conference on Robotics and Automation (ICRA)}, 
  title={Cook2LTL: Translating Cooking Recipes to LTL Formulae using Large Language Models}, 
  year={2024},
  volume={},
  number={},
  pages={17679-17686},
  keywords={Runtime;Grounding;Large language models;Formal languages;Linguistics;Feature extraction;Libraries},
  doi={10.1109/ICRA57147.2024.10611086}}

Overview

Cook2LTL is a system that receives a cooking recipe in natural language form, reduces high-level cooking actions to robot-executable primitive actions through the use of LLMs, and produces unambiguous task specifications written in the form of LTL formulae.

Cook2LTL Pipeline

System Architecture

The input instruction $r_i$ is first preprocessed and then passed to the semantic parser, which extracts meaningful chunks corresponding to the categories $\mathcal{C}$ and constructs a function representation $\mathtt{a}$ for each detected action. If $\mathtt{a}$ is part of the action library $\mathbb{A}$, then the LTL translator infers the final LTL formula $\phi$. Otherwise, the action is reduced to a lower level of admissible actions $\{a_1,a_2,\dots,a_k\}$ from $\mathcal{A}$, and the reduction policy is saved to $\mathbb{A}$ for future use. The LTL translator then yields the final LTL formulae from the derived actions.
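The control flow above can be sketched as a small dispatch routine. This is an illustrative outline, not the repository's implementation: the action/tuple encoding and the `reduce_fn` callback are our assumptions, standing in for the LLM-based reduction described below.

```python
def process_action(a, primitive_actions, action_library, reduce_fn):
    """Route a parsed action a = (verb, *params) through the architecture:
    pass it through if primitive, reuse a cached reduction policy, or
    reduce it (here via the reduce_fn stub) and cache the result."""
    verb = a[0]
    if verb in primitive_actions:       # a is already robot-executable
        return [a]
    if verb in action_library:          # reuse a previously cached policy
        return action_library[verb]
    sub_actions = reduce_fn(a)          # LLM-based reduction (stubbed)
    action_library[verb] = sub_actions  # save the policy for future use
    return sub_actions
```

Either way, the resulting primitive actions are what the LTL translator turns into the final formula.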

Named Entity Recognition

The first step of Cook2LTL is a semantic parser that extracts salient categories towards building a function representation $\mathtt{a}=\mathtt{Verb(What?,Where?,How?,Time,Temperature)}$ for every detected action. To this end, we annotate a subset (100 recipes) of the Recipe1M+ dataset and fine-tune a pre-trained BERT-based spaCy NER model to predict these categories in unseen recipes at inference time. To annotate the recipes we use the Brat annotation tool (installation instructions). To fill in context-implicit parts of speech (POS), we use ChatGPT or manually inject them into the sentence. The resulting edited recipes can be found in the edited_recipes.csv file. The code to train the NER model is in train_ner.py.
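Assembling the function representation from predicted spans might look like the sketch below. The label names and the `build_action` helper are illustrative assumptions; in the actual pipeline the `(text, label)` pairs come from the fine-tuned spaCy NER model rather than being supplied by hand.

```python
# Category slots of the function representation a = Verb(What?, Where?, How?, Time, Temperature).
CATEGORIES = ("Verb", "What", "Where", "How", "Time", "Temperature")

def build_action(spans):
    """Fill the representation's slots from NER-predicted (text, label) pairs,
    concatenating multiple spans that share a label."""
    fields = {c: None for c in CATEGORIES}
    for text, label in spans:
        if label in fields:
            fields[label] = text if fields[label] is None else fields[label] + " " + text
    return fields

spans = [("Saute", "Verb"), ("the onions", "What"),
         ("in a pan", "Where"), ("for 5 minutes", "Time")]
action = build_action(spans)
# action["Verb"] == "Saute", action["Time"] == "for 5 minutes"
```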

Annotated recipe steps

Reduction to Primitive Actions

LLM action reduction example

Action Reduction

If $\mathtt{a}\notin\mathcal{A}$, we prompt an LLM with a pythonic import of the set of admissible primitive actions $\mathcal{A}$ in the environment of our use case, two example function implementations using actions from this set, and the function representation $\mathtt{a}$ expressed as a pythonic function with its parameters. This is implemented in the action_reduction function, which calls gpt-3.5-turbo through the OpenAI API. The new action is added to the pythonic import, enabling the system to invoke it in future action reductions.

Note: An OpenAI API key is required. You can create an account and set up API access here.
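A minimal sketch of this prompt format is below. The module name, primitive names, and example implementation are hypothetical placeholders; the exact prompt assembled by the action_reduction function may differ.

```python
def build_reduction_prompt(primitives, example_defs, action_signature):
    """Concatenate the pythonic import of primitive actions, the example
    implementations, and the unseen action's function header into one prompt."""
    imports = "from primitives import " + ", ".join(primitives)
    return "\n\n".join([imports, *example_defs, f"def {action_signature}:"])

prompt = build_reduction_prompt(
    ["pick", "place", "pour"],
    ["def serve(what):\n    pick(what)\n    place(what, 'plate')"],
    "saute(what, where)",
)
# The prompt is then sent to the chat completions endpoint, e.g.:
# import openai
# client = openai.OpenAI()
# reply = client.chat.completions.create(
#     model="gpt-3.5-turbo",
#     messages=[{"role": "user", "content": prompt}],
# )
```

The LLM's completion of the trailing function header is the sub-action policy that gets cached in the action library.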

Action Library

Every time we query the LLM for action reduction, we cache $\mathtt{a}$ and its action reduction policy (in the cache_action function) to an action library $\mathbb{A}$ for future reuse via dictionary lookup. At runtime, a new action $\mathtt{a}$ is checked against both $\mathcal{A}$ and $\mathbb{A}$. If $\mathtt{a}\in\mathbb{A}$, i.e., the $\mathtt{Verb}$ of $\mathtt{a}$ matches the $\mathtt{Verb}$ of an action $\mathtt{a}_\mathbb{A}$ in $\mathbb{A}$ and the parameter types match (e.g. both actions have $\mathtt{What?}$ and $\mathtt{Where?}$ as parameters), then $\mathtt{a}$ is replaced by $\mathtt{a}_\mathbb{A}$ and its LLM-derived sub-action policy (reuse_cached_action function), and passed to the subsequent LTL translation step.

Note: A new action library is initialized at runtime but the user can also load and build upon an existing action library (load_action_library function).
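The verb-plus-parameter-type matching can be sketched as a keyed dictionary lookup. This is an assumption-laden outline, not the repository's cache_action / reuse_cached_action code: it assumes an action is a dict with a "Verb" entry plus parameter slots, and keys each entry by the verb and the set of filled parameter types.

```python
def _key(a):
    """Library key: the verb paired with the sorted names of filled parameters,
    so saute(What?, Where?) only matches a cached saute with the same slots."""
    params = tuple(sorted(k for k, v in a.items() if k != "Verb" and v is not None))
    return (a["Verb"], params)

def cache_action(a, policy, library):
    """Store the LLM-derived sub-action policy for future reuse."""
    library[_key(a)] = policy

def reuse_cached_action(a, library):
    """Return the cached policy on a verb + parameter-type match, else None."""
    return library.get(_key(a))
```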

LTL Translation

We assume the following specification pattern for executing a set of implicitly sequential actions $\{\mathtt{a}_1,\mathtt{a}_2,\dots,\mathtt{a}_n\}$ found in a recipe instruction step: $$\mathcal{F}(\mathtt{a}_1\wedge \mathcal{F}(\mathtt{a}_2\wedge\dots\mathcal{F}\mathtt{a}_n))$$ The parse_chunks function scans every chunk extracted from the NER module for instances of conjunction, disjunction, or negation, and produces an initial LTL formula $\phi$ using the above pattern and the operators $\mathcal{F},\wedge,\vee,\lnot$. Following the action reduction module, the decode_LTL function substitutes any reduced actions into the initial formula $\phi$. For example, if $\mathtt{a}\rightarrow\{a_1,a_2,\dots,a_k\}$, then $\mathcal{F}\mathtt{a}$ is converted to the formula $\mathcal{F}(a_1\wedge \mathcal{F}(a_2\wedge\dots\mathcal{F}a_k))$. In cases where the $\mathtt{Time}$ parameter includes explicit sequencing language, the LLM is prompted to return a $\mathtt{wait}$ function, which is then parsed into the until operator ($\mathcal{U}$) and substituted into $\phi$.
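The nested $\mathcal{F}(\mathtt{a}_1\wedge \mathcal{F}(\mathtt{a}_2\wedge\dots\mathcal{F}\mathtt{a}_n))$ pattern above can be built recursively; the sketch below renders it as a plain string, with operator spellings (`F`, `&`) and the function name chosen by us rather than taken from the repository.

```python
def to_ltl(actions):
    """Chain implicitly sequential actions under the eventually operator F:
    F(a1 & F(a2 & ... F an))."""
    if len(actions) == 1:
        return f"F {actions[0]}"
    return f"F ({actions[0]} & {to_ltl(actions[1:])})"

to_ltl(["chop(onion)", "heat(pan)", "stir(onion)"])
# → 'F (chop(onion) & F (heat(pan) & F stir(onion)))'
```

Substituting a reduced action then amounts to replacing one atom $\mathcal{F}\mathtt{a}$ with the same pattern built over its sub-actions $a_1,\dots,a_k$.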
