Automating language learning with the power of Artificial Intelligence. This repository presents FluentAI, a tool that combines Fluent Forever techniques with AI-driven automation. It streamlines the process of creating Anki flashcards, making language acquisition faster and more efficient.

StephanAkkerman/FluentAI

FluentAI: Learn languages in a flash

FluentAI Banner

Caution

This project is currently under development. Please see the issues for everything that still needs to be done before it is ready to use.

Introduction

FluentAI is inspired by the method detailed in the paper SmartPhone: Exploring Keyword Mnemonic with Auto-generated Verbal and Visual Cues by Jaewook Lee and Andrew Lan. The aim is to recreate their approach using accessible, open-source models. The pipeline they propose, shown below, serves as the blueprint for our project. It illustrates how language learning can be automated by combining AI techniques with a proven language learning methodology. For an architectural overview, see our Figma board.

You can find the list of supported languages here.

image

Table of Contents 🗂

Mnemonic Word Generation 🏭

The image below shows in more detail how the mnemonic word is derived, which is the core of the project. The mnemonic word is an easy-to-remember word that is associated with the word you want to learn. It is obtained by using a pre-trained model to generate a sentence, which is then used to generate the mnemonic word. In the pipeline image above, this step is labelled "TransPhoner", the work from which the image below is derived.

image

Imageability

Imageability measures how easily a word can be visualized. This matters because the mnemonic word should be easy to picture. To estimate it, we train a model on this dataset, which pairs the embedding of each word with its imageability score. The embeddings are generated by the FastText model, so the trained model can also predict the imageability of words that are not in the dataset.
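The idea of learning imageability from embeddings can be sketched with a tiny linear regressor. This is only an illustration under stated assumptions: the 3-dimensional vectors and scores below are invented (they are not FastText output or dataset values), and a plain linear model stands in for whatever model the project actually trains.

```python
# Toy training data: word -> (made-up embedding, made-up imageability score).
# Real FastText embeddings have hundreds of dimensions; these are illustrative only.
TRAIN = {
    "apple":   ([0.9, 0.1, 0.0], 0.95),
    "dog":     ([0.8, 0.2, 0.1], 0.90),
    "justice": ([0.1, 0.9, 0.2], 0.30),
    "idea":    ([0.0, 0.8, 0.3], 0.25),
}


def predict(weights, bias, vector):
    """Linear model: weighted sum of the embedding dimensions plus a bias."""
    return sum(w * x for w, x in zip(weights, vector)) + bias


def fit(samples, epochs=2000, lr=0.1):
    """Fit the linear model with stochastic gradient descent on squared error."""
    dim = len(samples[0][0])
    weights, bias = [0.0] * dim, 0.0
    for _ in range(epochs):
        for vector, target in samples:
            error = predict(weights, bias, vector) - target
            weights = [w - lr * error * x for w, x in zip(weights, vector)]
            bias -= lr * error
    return weights, bias


weights, bias = fit(list(TRAIN.values()))

# Hypothetical embeddings for two words that are NOT in the training data.
concrete_score = predict(weights, bias, [0.85, 0.15, 0.05])
abstract_score = predict(weights, bias, [0.05, 0.85, 0.25])
print(f"concrete-like word: {concrete_score:.2f}, abstract-like word: {abstract_score:.2f}")
```

Because the model only sees embeddings, any word with an embedding can be scored, which is what makes prediction for out-of-dataset words possible.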

Phonetic Similarity

Phonetic similarity measures how similar the pronunciations of two words are. This matters because a mnemonic word that sounds like the word being learned is easier to recall, so we use this measure to decide which English words are considered as candidates. We use CLTS and PanPhon to generate feature vectors from the IPA representation of each word, and these feature vectors are then used to calculate the phonetic similarity between words. We use faiss to speed up the search for the most similar words.
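A nearest-neighbour search over feature vectors might look roughly like the sketch below. The vectors are hand-made placeholders rather than real CLTS or PanPhon features, the function name is hypothetical, and the brute-force loop only stands in for the indexed search that faiss provides.

```python
import math

# Hypothetical fixed-length feature vectors for a few English candidate words.
# In the project these would be derived from each word's IPA transcription.
ENGLISH_WORDS = {
    "cat":  [1.0, 0.0, 1.0, 0.0],
    "cut":  [1.0, 0.0, 0.8, 0.1],
    "dog":  [0.0, 1.0, 0.2, 0.9],
    "moon": [0.2, 0.9, 0.0, 1.0],
}


def nearest_words(query, vocabulary, k=2):
    """Return the k words whose vectors are closest to `query` (Euclidean distance)."""
    ranked = sorted(vocabulary, key=lambda word: math.dist(query, vocabulary[word]))
    return ranked[:k]


# Made-up feature vector for the foreign word being learned.
query = [1.0, 0.1, 0.9, 0.0]
print(nearest_words(query, ENGLISH_WORDS, k=2))
```

Scanning every word is fine for a toy vocabulary. For a full English word list, an index such as faiss's `IndexFlatL2` performs the same kind of exact L2 search much faster.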

Orthographic Similarity

Orthographic similarity measures how similar the spellings of two words are. It is simple to compute, and the user can choose which of several methods to use.
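As an illustration of selectable methods, the sketch below offers two common string measures and averages whichever ones are chosen. The method names, the `METHODS` registry, and the averaging are assumptions made for this example; the README does not say which methods the project exposes.

```python
from difflib import SequenceMatcher


def ratio_similarity(a, b):
    """Similarity in [0, 1] based on matching subsequences (standard library difflib)."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()


def levenshtein_similarity(a, b):
    """1 minus the edit distance divided by the length of the longer word."""
    a, b = a.lower(), b.lower()
    if not a and not b:
        return 1.0
    previous = list(range(len(b) + 1))
    for i, char_a in enumerate(a, start=1):
        current = [i]
        for j, char_b in enumerate(b, start=1):
            cost = 0 if char_a == char_b else 1
            current.append(min(previous[j] + 1,          # deletion
                               current[j - 1] + 1,       # insertion
                               previous[j - 1] + cost))  # substitution
        previous = current
    return 1.0 - previous[-1] / max(len(a), len(b))


# Hypothetical registry so a user can pick methods by name.
METHODS = {"ratio": ratio_similarity, "levenshtein": levenshtein_similarity}


def orthographic_similarity(a, b, methods=("ratio", "levenshtein")):
    """Average the scores of the selected methods."""
    scores = [METHODS[name](a, b) for name in methods]
    return sum(scores) / len(scores)


print(orthographic_similarity("night", "nacht"))
```

Both measures return values between 0 and 1, so their average can be combined directly with the other scores.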

Semantic Similarity

Semantic similarity measures how similar the meanings of two words are. The FastText model generates an embedding for each word, and the semantic similarity between two words is calculated from their embeddings.
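A common way to compare two embeddings is cosine similarity, which is assumed here since the README does not name the measure. The vectors below are invented placeholders, not FastText output.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two vectors; returns 0.0 if either vector is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


# Made-up 3-dimensional embeddings for illustration only.
EMBEDDINGS = {
    "king":   [0.8, 0.6, 0.1],
    "queen":  [0.7, 0.7, 0.1],
    "banana": [0.0, 0.1, 0.9],
}

related = cosine_similarity(EMBEDDINGS["king"], EMBEDDINGS["queen"])
unrelated = cosine_similarity(EMBEDDINGS["king"], EMBEDDINGS["banana"])
print(f"king/queen: {related:.2f}, king/banana: {unrelated:.2f}")
```

Cosine similarity can be negative for real embeddings, so if scores must lie between 0 and 1, negative values would need to be clamped or rescaled.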

Best Mnemonic Word

To determine the best mnemonic word, we combine the methods described above. Each method returns a score between 0 and 1, and these scores are combined to rank the candidate words. The user can set a weight for each method to control how much it contributes to the result.
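One plausible way to combine the scores is a weighted average, sketched below. The candidate words, their scores, and the normalisation by the weight sum are assumptions for illustration; the project may combine scores differently.

```python
def combined_score(scores, weights):
    """Weighted average of per-method scores; the result stays between 0 and 1."""
    total_weight = sum(weights[name] for name in scores)
    if total_weight == 0:
        raise ValueError("at least one weight must be non-zero")
    return sum(scores[name] * weights[name] for name in scores) / total_weight


def best_mnemonic(candidates, weights):
    """Return the candidate word with the highest combined score."""
    return max(candidates, key=lambda word: combined_score(candidates[word], weights))


# Invented scores for two hypothetical candidate words.
CANDIDATES = {
    "cat": {"imageability": 0.9, "phonetic": 0.4, "orthographic": 0.5, "semantic": 0.2},
    "cot": {"imageability": 0.6, "phonetic": 0.8, "orthographic": 0.7, "semantic": 0.1},
}

equal_weights = {"imageability": 1, "phonetic": 1, "orthographic": 1, "semantic": 1}
image_heavy = {"imageability": 3, "phonetic": 1, "orthographic": 1, "semantic": 1}

print(best_mnemonic(CANDIDATES, equal_weights))
print(best_mnemonic(CANDIDATES, image_heavy))
```

With equal weights the phonetically closer word wins, while weighting imageability more heavily favours the word that is easier to picture.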

Mnemonic Image Generation

TODO

Prerequisites 📋

Before starting, make sure you have the following requirements:

  • Anki installed on your device.
  • Anki-Connect: this add-on allows you to add cards to Anki from the command line.
  • Add the deck in /deck/FluentAI.apkg to your Anki application. You can do this by dragging and dropping the file into the Anki application.
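To give an idea of how cards reach Anki once these prerequisites are in place, here is a minimal sketch of an Anki-Connect `addNote` request using only the standard library. The deck name is assumed from the file name above, and the note type `Basic` with `Front` and `Back` fields is a placeholder; the note type shipped in FluentAI.apkg may use different names.

```python
import json
import urllib.request

# Anki-Connect listens on this address by default while Anki is running.
ANKI_CONNECT_URL = "http://localhost:8765"


def build_add_note_request(deck, model, fields):
    """Build the JSON payload for Anki-Connect's addNote action (API version 6)."""
    return {
        "action": "addNote",
        "version": 6,
        "params": {"note": {"deckName": deck, "modelName": model, "fields": fields}},
    }


def send(payload, url=ANKI_CONNECT_URL):
    """POST the payload to Anki-Connect and return the result, raising on an error reply."""
    data = json.dumps(payload).encode("utf-8")
    request = urllib.request.Request(
        url, data=data, headers={"Content-Type": "application/json"}
    )
    with urllib.request.urlopen(request) as response:
        reply = json.load(response)
    if reply.get("error"):
        raise RuntimeError(reply["error"])
    return reply["result"]


# Placeholder deck, note type, and field names (assumptions, see above).
payload = build_add_note_request(
    "FluentAI", "Basic", {"Front": "gato", "Back": "cat"}
)
print(json.dumps(payload, indent=2))
# With Anki open and Anki-Connect installed: note_id = send(payload)
```

The `send` call is left commented out because it only works while Anki is running with the add-on enabled.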

Installation ⚙️

The packages required to run this code are listed in the requirements.txt file. To install them, run the following command after cloning the repository:

pip install -r requirements.txt

Alternatively, install the package directly from GitHub:

pip install git+https://github.com/StephanAkkerman/FluentAI.git

GPU Support

If you would like to run the code on a GPU, install the torch package with CUDA support. You can find the installation instructions here.
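After installing, a quick check like the following can confirm whether PyTorch sees the GPU. The helper name `pick_device` is hypothetical, and the fallback to the CPU when torch is missing is a choice made for this sketch.

```python
def pick_device():
    """Return "cuda" if PyTorch is installed and can see a GPU, otherwise "cpu"."""
    try:
        import torch
    except ImportError:
        # torch is not installed, so only the CPU is available.
        return "cpu"
    return "cuda" if torch.cuda.is_available() else "cpu"


print(pick_device())
```

If this prints `cpu` on a machine with an NVIDIA GPU, the installed torch build most likely lacks CUDA support.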

Usage ⌨️

TODO

Citation ✍️

If you use this project in your research, please cite as follows:

@misc{FluentAI,
  author  = {Stephan Akkerman and Winston Lam},
  title   = {FluentAI},
  year    = {2024},
  publisher = {GitHub},
  journal = {GitHub repository},
  howpublished = {\url{https://github.com/StephanAkkerman/FluentAI}}
}

Contributing 🛠

Contributions are welcome! If you have a feature request, bug report, or proposal for code refactoring, please feel free to open an issue on GitHub. We appreciate your help in improving this project. If you would like to make code contributions yourself, please read CONTRIBUTING.MD.
https://github.com/StephanAkkerman/FluentAI/graphs/contributors

License 📜

This project is licensed under the MIT License. See the LICENSE file for details.
