FluentAI: Learn languages in a flash

Caution

This project is currently under development. Please see the issues for everything that still needs to be done before it is ready to use.

Introduction

FluentAI is inspired by the method detailed in the paper SmartPhone: Exploring Keyword Mnemonic with Auto-generated Verbal and Visual Cues by Jaewook Lee and Andrew Lan. The aim is to recreate their approach using accessible, open-source models. The pipeline they propose, shown below, serves as the blueprint for our project. It automates the keyword mnemonic method, combining AI techniques with proven language learning methodology. For an architectural overview, see our Figma board.

You can find the list of supported languages here.

Mnemonic Word Generation 🏭

The image below shows in more detail how the mnemonic word, the core of the project, is derived. The mnemonic word is a word that is easy to remember and that is associated with the word you want to learn. It is obtained by using a pre-trained model to generate a sentence, which is then used to generate the mnemonic word. In the image above this step is labeled "TransPhoner", the work from which the image below is taken.

Imageability

The imageability of a word is a measure of how easily a word can be visualized. This is important for the mnemonic word, as it should be easy to visualize. To determine the imageability of a word, we train a model on this dataset. It includes the embeddings for each word and their imageability score. The embeddings are generated by the FastText model and these embeddings can be used to predict the imageability of words that are not in the dataset.
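The idea of predicting imageability from embeddings can be sketched in a few lines. The README does not say which model is trained, so this sketch assumes a plain linear regression fitted with gradient descent. The three-dimensional vectors and the scores are invented stand-ins for FastText embeddings and dataset entries, and the function names are hypothetical.

```python
# Toy stand-ins for FastText embeddings (real vectors are much longer),
# paired with invented imageability scores scaled to [0, 1].
TRAINING_DATA = [
    ([0.9, 0.1, 0.0], 0.95),  # "apple"
    ([0.8, 0.2, 0.1], 0.92),  # "dog"
    ([0.1, 0.9, 0.2], 0.30),  # "justice"
    ([0.0, 0.8, 0.3], 0.25),  # "idea"
]


def predict(weights, bias, embedding):
    """Linear model: weighted sum of the embedding dimensions plus a bias."""
    return bias + sum(w * x for w, x in zip(weights, embedding))


def train(data, learning_rate=0.1, epochs=2000):
    """Fit the linear model with full-batch gradient descent on squared error."""
    dims = len(data[0][0])
    weights, bias = [0.0] * dims, 0.0
    for _ in range(epochs):
        grad_w, grad_b = [0.0] * dims, 0.0
        for embedding, target in data:
            error = predict(weights, bias, embedding) - target
            grad_b += error
            for i, x in enumerate(embedding):
                grad_w[i] += error * x
        bias -= learning_rate * grad_b / len(data)
        weights = [w - learning_rate * g / len(data) for w, g in zip(weights, grad_w)]
    return weights, bias


weights, bias = train(TRAINING_DATA)
# Embeddings of words that were not part of the training data.
concrete_like = predict(weights, bias, [0.85, 0.15, 0.05])
abstract_like = predict(weights, bias, [0.05, 0.85, 0.25])
print(concrete_like > abstract_like)
```

Because the model only needs an embedding as input, it can score any word FastText can embed, which is what makes prediction for out-of-dataset words possible.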

Phonetic Similarity

Phonetic similarity is a measure of how similar the pronunciation of two words is. This is important because the mnemonic word should sound like the word you want to learn. We therefore use it to determine which English words should be considered for the mnemonic word. We use CLTS and PanPhon to generate feature vectors from the IPA representation of the words. These feature vectors are then used to calculate the phonetic similarity between the words. We use faiss to speed up the search for the most similar words.
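The feature-vector comparison can be illustrated without any dependencies. This is a minimal sketch under stated assumptions: the feature table is hand-made and far coarser than what PanPhon provides, the transcriptions are toy strings with one character per segment, and a brute-force loop stands in for the faiss index. The ranking uses squared L2 distance, which is what an exact faiss index such as IndexFlatL2 reports; whether the project uses that index type is not stated.

```python
# Hypothetical feature table: (syllabic, voiced, nasal, labial, coronal),
# +1 if the segment has the feature and -1 if it does not.
FEATURES = {
    "k": [-1, -1, -1, -1, -1],
    "g": [-1, 1, -1, -1, -1],
    "t": [-1, -1, -1, -1, 1],
    "d": [-1, 1, -1, -1, 1],
    "p": [-1, -1, -1, 1, -1],
    "m": [-1, 1, 1, 1, -1],
    "n": [-1, 1, 1, -1, 1],
    "a": [1, 1, -1, -1, -1],
    "o": [1, 1, -1, 1, -1],
}
PAD = [0, 0, 0, 0, 0]  # filler for words shorter than the longest one


def word_vector(segments, length):
    """Concatenate per-segment feature vectors, padded to a fixed length."""
    vector = []
    for i in range(length):
        vector.extend(FEATURES[segments[i]] if i < len(segments) else PAD)
    return vector


def squared_l2(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))


def nearest_words(query, candidates):
    """Rank candidates by distance to the query; smaller means more similar."""
    length = max(len(word) for word in [query, *candidates])
    query_vector = word_vector(query, length)
    ranked = [(c, squared_l2(query_vector, word_vector(c, length))) for c in candidates]
    return sorted(ranked, key=lambda pair: pair[1])


print(nearest_words("gad", ["pot", "man", "kat"]))
```

With a large English vocabulary this exhaustive loop becomes slow, which is the problem an index library such as faiss addresses.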

Orthographic Similarity

Orthographic similarity is a measure of how similar the spelling of two words is. It is simple to compute, and the user can select which methods to use.
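As an illustration of selectable spelling-similarity methods, the sketch below offers two common choices: a normalized Levenshtein distance and the ratio from Python's difflib. The README does not list which methods FluentAI actually provides, so both the selection and the function names here are assumptions.

```python
from difflib import SequenceMatcher


def levenshtein(a, b):
    """Minimum number of single-character edits that turn a into b."""
    previous = list(range(len(b) + 1))
    for i, char_a in enumerate(a, 1):
        current = [i]
        for j, char_b in enumerate(b, 1):
            current.append(min(
                previous[j] + 1,                       # deletion
                current[j - 1] + 1,                    # insertion
                previous[j - 1] + (char_a != char_b),  # substitution
            ))
        previous = current
    return previous[-1]


def levenshtein_similarity(a, b):
    """Edit distance rescaled so that 1.0 means identical spelling."""
    longest = max(len(a), len(b))
    return 1.0 if longest == 0 else 1 - levenshtein(a, b) / longest


def ratio_similarity(a, b):
    """Similarity based on matching blocks, as computed by difflib."""
    return SequenceMatcher(None, a, b).ratio()


METHODS = {"levenshtein": levenshtein_similarity, "ratio": ratio_similarity}


def orthographic_similarity(a, b, methods=("levenshtein", "ratio")):
    """Average the scores of the methods the user selected."""
    scores = [METHODS[name](a.lower(), b.lower()) for name in methods]
    return sum(scores) / len(scores)


print(orthographic_similarity("kat", "cat", methods=("levenshtein",)))
```

Both methods return values between 0 and 1, so their average stays in the same range.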

Semantic Similarity

Semantic similarity is a measure of how similar the meaning of two words is. The FastText model is used to generate embeddings of the words, and these embeddings are used to calculate the semantic similarity between them.
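A common way to compare two embeddings is cosine similarity, sketched below. The README does not name the metric, so cosine similarity is an assumption, as is the rescaling from [-1, 1] to [0, 1]. The vectors are invented stand-ins for FastText embeddings.

```python
import math

# Toy embeddings; real FastText vectors have hundreds of dimensions.
EMBEDDINGS = {
    "cat": [0.9, 0.1, 0.2],
    "kitten": [0.85, 0.15, 0.25],
    "bridge": [-0.2, 0.9, 0.1],
}


def cosine_similarity(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def semantic_similarity(word_a, word_b):
    """Cosine similarity of the two embeddings, rescaled to [0, 1]."""
    cosine = cosine_similarity(EMBEDDINGS[word_a], EMBEDDINGS[word_b])
    cosine = max(-1.0, min(1.0, cosine))  # guard against rounding error
    return (cosine + 1) / 2


print(semantic_similarity("cat", "kitten") > semantic_similarity("cat", "bridge"))
```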

Best Mnemonic Word

To determine the best mnemonic word, we use the methods described above. The results of each method are given as a score (between 0 and 1) and these scores are combined to determine the best mnemonic word. The user can select the weights of each method to determine how important each method is.
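The weighted combination can be sketched as a weighted average. Dividing by the total weight is an assumption made here so that the combined score also stays between 0 and 1. The candidate words, scores, and function names are invented for illustration.

```python
def combined_score(scores, weights):
    """Weighted average of the per-method scores for one candidate."""
    total_weight = sum(weights[method] for method in scores)
    if total_weight <= 0:
        raise ValueError("at least one method needs a positive weight")
    return sum(scores[method] * weights[method] for method in scores) / total_weight


def best_mnemonic(candidates, weights):
    """Return the candidate word with the highest combined score."""
    return max(candidates, key=lambda word: combined_score(candidates[word], weights))


# Invented scores in [0, 1] for two candidate mnemonic words.
candidates = {
    "cat": {"phonetic": 0.9, "orthographic": 0.7, "semantic": 0.2, "imageability": 0.95},
    "cut": {"phonetic": 0.8, "orthographic": 0.6, "semantic": 0.1, "imageability": 0.4},
}
equal_weights = {"phonetic": 1, "orthographic": 1, "semantic": 1, "imageability": 1}

print(best_mnemonic(candidates, equal_weights))
```

Setting a weight to 0 removes that method from the decision, while a larger weight makes it count for more.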

Mnemonic Image Generation

TODO

Prerequisites 📋

Before starting, make sure you have the following requirements:

Anki installed on your device.
Anki-Connect: this add-on allows you to add cards to Anki from the command line.
Add the deck in /deck/FluentAI.apkg to your Anki application. You can do this by dragging and dropping the file into the Anki application.
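To give an idea of what Anki-Connect makes possible, the sketch below builds an addNote request and posts it to the add-on's default local address. It is not FluentAI's own code. The deck name "FluentAI", the stock "Basic" note type, and its "Front" and "Back" fields are assumptions; the note type in the provided deck may use different names. Anki must be running with the add-on installed for the request to succeed.

```python
import json
import urllib.request

ANKI_CONNECT_URL = "http://127.0.0.1:8765"  # Anki-Connect's default address


def build_add_note_request(front, back, deck="FluentAI", model="Basic"):
    """Build the JSON payload for Anki-Connect's addNote action."""
    return {
        "action": "addNote",
        "version": 6,
        "params": {
            "note": {
                "deckName": deck,    # assumed deck name
                "modelName": model,  # assumed note type
                "fields": {"Front": front, "Back": back},
                "tags": ["fluentai"],
            }
        },
    }


def send(request):
    """POST the request to Anki-Connect and return the decoded reply."""
    data = json.dumps(request).encode("utf-8")
    http_request = urllib.request.Request(
        ANKI_CONNECT_URL, data=data, headers={"Content-Type": "application/json"}
    )
    with urllib.request.urlopen(http_request) as response:
        return json.load(response)


request = build_add_note_request("gato", "cat")
print(json.dumps(request, indent=2))
# With Anki open, send(request) returns a dict with "result" and "error" keys.
```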

Installation ⚙️

The packages required to run this code are listed in the requirements.txt file. To install them, run the following command after cloning the repository:

pip install -r requirements.txt

or

pip install git+https://github.com/StephanAkkerman/FluentAI.git

GPU Support

If you would like to use a GPU to run the code, you can install the torch package with CUDA support. You can find the installation instructions here.
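After installing, a quick check such as the following can confirm whether PyTorch sees the GPU. The helper name pick_device is hypothetical and not part of FluentAI; it falls back to the CPU when torch is missing or no CUDA device is available.

```python
def pick_device():
    """Return "cuda" when PyTorch can use a GPU, otherwise "cpu"."""
    try:
        import torch
    except ImportError:
        return "cpu"  # torch is not installed in this environment
    return "cuda" if torch.cuda.is_available() else "cpu"


print(pick_device())
```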

Usage ⌨️

TODO

Citation ✍️

If you use this project in your research, please cite as follows:

@misc{FluentAI,
  author  = {Stephan Akkerman and Winston Lam},
  title   = {FluentAI},
  year    = {2024},
  publisher = {GitHub},
  journal = {GitHub repository},
  howpublished = {\url{https://github.com/StephanAkkerman/FluentAI}}
}

Contributing 🛠

Contributions are welcome! If you have a feature request, bug report, or proposal for code refactoring, please feel free to open an issue on GitHub. We appreciate your help in improving this project. If you would like to contribute code yourself, please read CONTRIBUTING.md.

License 📜

This project is licensed under the MIT License. See the LICENSE file for details.
