KAIKAKU.AI’s Epicure compresses 4.14M recipes into 2MB map

London startup KAIKAKU.AI released Epicure, three ingredient AI models trained on 4.14 million recipes that encode 1,790 ingredients as a 2MB, 300-dimension vector map.

London startup KAIKAKU.AI published Epicure, a set of three ingredient AI models trained on 4.14 million multilingual recipes that represent 1,790 ingredients as 300-number vectors in a roughly 2-megabyte table.

The models were trained on recipes drawn from 11 datasets across seven languages. Each ingredient is stored as a 300-dimension vector; the storage calculation is 1,790 ingredients × 300 numbers × 4 bytes per number, or 2,148,000 bytes. That is about 2.05 mebibytes, or roughly 2.15 megabytes in decimal units. The project is documented in a paper on arXiv by co-founder and CEO Josef Chen and researcher Jakub Radzikowski.
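The storage arithmetic can be checked in a few lines of Python. This is only a sketch of the calculation quoted in the article; the 4 bytes per number is taken from the article and is assumed here to mean 32-bit floats. The result lands at about 2.05 when counted in binary mebibytes and about 2.15 in decimal megabytes.

```python
# Back-of-the-envelope size of the ingredient vector table.
NUM_INGREDIENTS = 1_790   # vocabulary size reported in the article
DIMENSIONS = 300          # numbers per ingredient vector
BYTES_PER_NUMBER = 4      # assumption: each number is a 32-bit float

total_bytes = NUM_INGREDIENTS * DIMENSIONS * BYTES_PER_NUMBER
size_mb = total_bytes / 1_000_000      # decimal megabytes
size_mib = total_bytes / (1024 ** 2)   # binary mebibytes

print(f"{total_bytes:,} bytes = {size_mb:.2f} MB = {size_mib:.2f} MiB")
```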

Epicure stores learned relationships between ingredients rather than full recipe text. The vectors encode which ingredients appear together in dishes, which share flavor compounds and which occur in similar culinary traditions. The team describes the result as a coordinate map that can be manipulated with mathematical operations instead of a repository of written recipes.

The paper describes a steering operator called SLERP (spherical linear interpolation) rotation that moves an ingredient vector toward a culinary direction. For example, rotating the vector for chicken toward an East Asian axis surfaces soy sauce, ginger and sesame oil; rotating it in a direction associated with Tex-Mex surfaces tortillas, salsa and Monterey Jack.
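To give a sense of what such a rotation involves, here is a minimal sketch of the textbook spherical linear interpolation formula in plain Python. It is not taken from the Epicure paper or code: the function names, the three-dimensional toy vectors and the step size `t` are all illustrative assumptions, and the paper's operator may differ in detail.

```python
import math

def normalize(v):
    """Scale a vector to unit length."""
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v]

def slerp(start, target, t):
    """Move a fraction t of the way from start to target along the unit sphere."""
    a, b = normalize(start), normalize(target)
    # Clamp the dot product so rounding error cannot push acos out of range.
    dot = max(-1.0, min(1.0, sum(x * y for x, y in zip(a, b))))
    omega = math.acos(dot)          # angle between the two unit vectors
    if omega < 1e-9:                # vectors already coincide
        return a
    s = math.sin(omega)
    if abs(s) < 1e-9:               # opposite vectors: the path is not unique
        raise ValueError("slerp is undefined for antipodal vectors")
    wa = math.sin((1 - t) * omega) / s
    wb = math.sin(t * omega) / s
    return [wa * x + wb * y for x, y in zip(a, b)]

# Toy 3-dimensional stand-ins (invented values, not real Epicure vectors).
chicken = [1.0, 0.0, 0.0]
east_asian_axis = [0.0, 1.0, 0.0]

steered = slerp(chicken, east_asian_axis, 0.5)
print([round(x, 4) for x in steered])
```

Unlike a straight-line blend of the two vectors, the interpolated vector stays at unit length, so the result can be compared with other ingredient vectors by angle alone.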

Epicure is available in three variants that answer different questions. Cooc is trained on recipe co-occurrence and reflects what actually appears together in dishes. Chem is trained on flavor-chemistry data, including FlavorDB, and finds ingredients that share aroma compounds. Core blends the two approaches. The models return ingredient coordinates that can be queried for pairings or substitutions; asking Cooc for chocolate pairings yields pantry companions such as vanilla and almond, while Chem yields chemically related items such as toffee and ganache.
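A pairing query of this kind is commonly implemented as a nearest-neighbor search by cosine similarity. The sketch below assumes that approach; the article does not say how Epicure ranks results. The `TOY_TABLE` values and the `nearest` helper are invented for illustration and are not the published vectors or API.

```python
import math

def cosine(a, b):
    """Cosine similarity between two vectors of equal length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def nearest(query, table, k=2):
    """Return the k ingredient names whose vectors are closest to the query's."""
    scores = [
        (name, cosine(table[query], vec))
        for name, vec in table.items()
        if name != query
    ]
    scores.sort(key=lambda pair: pair[1], reverse=True)
    return [name for name, _ in scores[:k]]

# Invented 3-dimensional vectors; the real models use 300 dimensions.
TOY_TABLE = {
    "chocolate": [0.9, 0.1, 0.0],
    "vanilla":   [0.8, 0.2, 0.1],
    "almond":    [0.7, 0.3, 0.0],
    "soy sauce": [0.0, 0.1, 0.9],
}

print(nearest("chocolate", TOY_TABLE))
```

Running the same query against a table from a different variant would return a different ranking, which is how one ingredient can yield pantry companions from one model and chemically related items from another.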

The models have a fixed vocabulary of 1,790 ingredients and do not generate freeform recipe text or invent new ingredients. The paper notes that this constraint limits output to known items and can reduce the risk of implausible or unsafe suggestions compared with larger generalist language models.

KAIKAKU.AI published the trained models on Hugging Face and launched an interactive ingredient map at epicure.kaikaku.ai. The team also released a message-passing client for agents. The full training code has not been made public.

The paper compares Epicure to earlier work such as FlavorGraph (2021), noting that Epicure draws on a multilingual corpus larger than prior datasets and on a cleaned ingredient vocabulary designed for efficient embedding. Possible uses identified by the authors include substitution tools for chefs, food product development that requires finding alternative ingredients, and recipe apps that offer consistent pantry swaps.

On X, Josef Chen wrote: “4.1M recipes. 7 languages. 1,790 ingredients. 300 dimensions. All of human cooking compressed into 2 megabytes.” The researchers present Epicure as a research release and invite use of the public models and interactive map for further exploration.
