Crash Course on TINTOlib: Tabular Data to Synthetic Images for Vision-Based Machine Learning

Description

This repository provides a comprehensive crash course on using TINTOlib, a Python library designed to transform tabular data into synthetic images for machine learning tasks. It includes slides and Jupyter notebooks that demonstrate how to apply state-of-the-art vision models like Vision Transformers (ViTs) and Convolutional Neural Networks (CNNs) to problems such as regression and classification, using TINTOlib for data transformation.

The repository also features Hybrid Neural Networks (HyNNs), in which one branch is an MLP that processes the tabular data while another branch (either a CNN or a ViT) processes the synthetic images. This architecture combines the strengths of both data formats to improve performance on complex machine learning tasks. The course is aimed at anyone looking to integrate image-based deep learning techniques into tabular data problems.
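The two-branch idea can be sketched in PyTorch as follows. This is a minimal illustration, not the architecture used in the course notebooks: the class name `HybridNet`, the layer sizes, the 20x20 image size and the single regression output are all assumptions chosen for brevity.

```python
import torch
from torch import nn


class HybridNet(nn.Module):
    """Hypothetical hybrid model: MLP branch for tabular rows, CNN branch for images."""

    def __init__(self, n_features: int, image_channels: int = 3):
        super().__init__()
        # Branch 1: MLP over the raw tabular features.
        self.mlp = nn.Sequential(
            nn.Linear(n_features, 32), nn.ReLU(),
            nn.Linear(32, 16), nn.ReLU(),
        )
        # Branch 2: small CNN over the synthetic image of the same sample.
        self.cnn = nn.Sequential(
            nn.Conv2d(image_channels, 8, kernel_size=3, padding=1), nn.ReLU(),
            nn.MaxPool2d(2),
            nn.Conv2d(8, 16, kernel_size=3, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1),  # -> (batch, 16, 1, 1)
            nn.Flatten(),             # -> (batch, 16)
        )
        # Fusion head: concatenate both embeddings and regress one value.
        self.head = nn.Linear(16 + 16, 1)

    def forward(self, tabular: torch.Tensor, image: torch.Tensor) -> torch.Tensor:
        fused = torch.cat([self.mlp(tabular), self.cnn(image)], dim=1)
        return self.head(fused)


model = HybridNet(n_features=13)
tabular = torch.randn(4, 13)       # batch of 4 rows, 13 features (assumed)
image = torch.randn(4, 3, 20, 20)  # matching batch of 4 RGB images (assumed size)
prediction = model(tabular, image)
print(prediction.shape)
```

Swapping the CNN branch for a ViT encoder would give the ViT-based variant; the concatenation and the head stay the same.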

Features

Input data formats (2 options):
- Pandas DataFrame
- Files with the following format:
  - Tabular files: the input data must be a CSV file in Tidy Data format.
  - Tidy Data: the target (the variable to be predicted) must be the last column of the dataset, so all preceding columns are features.
  - All data must be numerical.

Runs on Linux, Windows and macOS.
Compatible with Python 3.7 or higher.
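The input requirements above (target in the last column, numerical values only) can be checked with plain pandas before any transformation. The sketch below is only an illustration: the helper name `to_tinto_ready` and the toy column names are hypothetical and not part of TINTOlib.

```python
import pandas as pd


def to_tinto_ready(df: pd.DataFrame, target: str) -> pd.DataFrame:
    """Return a copy with the target moved to the last column; reject non-numeric data."""
    if target not in df.columns:
        raise KeyError(f"target column {target!r} not found")
    features = [c for c in df.columns if c != target]
    out = df[features + [target]].copy()
    non_numeric = out.select_dtypes(exclude="number").columns.tolist()
    if non_numeric:
        raise TypeError(f"non-numeric columns must be encoded first: {non_numeric}")
    return out


# Toy data with the target in the first column (hypothetical values).
raw = pd.DataFrame({
    "price": [24.0, 21.6, 34.7],
    "rooms": [6.5, 6.4, 7.1],
    "age": [65.2, 78.9, 61.1],
})
ready = to_tinto_ready(raw, target="price")
print(list(ready.columns))  # ['rooms', 'age', 'price']
```

The resulting DataFrame can be used directly, or written out with `ready.to_csv(path, index=False)` when the CSV input option is preferred.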

Materials

This TINTOlib crash course contains the following materials in different folders:

Datasets: Different supervised learning datasets (regression and classification) for training with TINTOlib.
Presentations: Contains specific presentations on TINTOlib and the deep learning architectures that can be built.
Notebooks: Includes different folders with practical examples and recipes for using TINTOlib for classification and regression tasks. These are:
- LazyPredict: How to get baseline results with classic models on Tidy Data.
- PyTorch: Recipes for using TINTOlib with PyTorch.
- TensorFlow: Recipes for using TINTOlib with TensorFlow/Keras.

Practical Session

Work in groups to try and surpass the baseline set by classical models on the Boston housing dataset.

LazyPredict baseline: refer to the notebook Notebooks/Lazypredict/LazyPredict_Regression.ipynb
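To make concrete what "surpassing a baseline" means, the sketch below fits one classical model (ordinary least squares) and reports its RMSE on held-out rows. It does not use LazyPredict, and it runs on synthetic data as a stand-in for the Boston housing file; the data, the 75/25 split and the variable names are all assumptions.

```python
import numpy as np

# Synthetic regression data standing in for a real tidy dataset (assumption).
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 5))
true_w = np.array([1.5, -2.0, 0.0, 0.7, 3.0])
y = X @ true_w + rng.normal(scale=0.1, size=200)

# Simple hold-out split: first 150 rows for training, last 50 for testing.
split = 150
X_train, X_test = X[:split], X[split:]
y_train, y_test = y[:split], y[split:]


def add_bias(a: np.ndarray) -> np.ndarray:
    """Append a column of ones so the model learns an intercept."""
    return np.hstack([a, np.ones((a.shape[0], 1))])


# Ordinary least squares fit, then RMSE on the held-out rows.
coef, *_ = np.linalg.lstsq(add_bias(X_train), y_train, rcond=None)
pred = add_bias(X_test) @ coef
rmse = float(np.sqrt(np.mean((y_test - pred) ** 2)))
print(f"baseline RMSE: {rmse:.3f}")
```

A vision or hybrid model beats the baseline when its RMSE on the same held-out rows is lower.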

Using synthetic images, experiment with vision models such as CNNs or ViTs, and explore hybrid models. The architectures below will be presented during the session, and they are the ones you will modify and use:

Synthetic images using CNN
Synthetic images using Hybrid Neural Network with ViT (HyViT)

Notebooks - Open in Colab

Here are the notebooks you can directly open and run in Google Colab:

Note: Before running the notebooks, you will need to download the required dataset. For the practical session, we will use a small dataset, specifically the Boston housing dataset, which is located in Data/Regression/boston.csv.

The notebooks listed below are designed for regression tasks:

TensorFlow - CNN:
TensorFlow - CNN + MLP Hybrid:
TensorFlow - Vision Transformer (ViT):
TensorFlow - ViT + MLP Hybrid:

Methods for Tabular-to-Image Transformation

In this tutorial, we will explore various methods to transform tabular data into images to take advantage of deep learning models such as Convolutional Neural Networks (CNNs) and Vision Transformers (ViTs).

TINTOlib is a state-of-the-art library that wraps the most important techniques for the construction of Synthetic Images from Sorted Data (also known as Tabular Data).

Citing TINTO: If you used TINTO in your work, please cite the SoftwareX paper:

@article{softwarex_TINTO,
    title = {TINTO: Converting Tidy Data into Image for Classification with 2-Dimensional Convolutional Neural Networks},
    journal = {SoftwareX},
    author = {Manuel Castillo-Cara and Reewos Talla-Chumpitaz and Raúl García-Castro and Luis Orozco-Barbosa},
    volume={22},
    pages={101391},
    year = {2023},
    issn = {2352-7110},
    doi = {https://doi.org/10.1016/j.softx.2023.101391}
}

And the use case developed in the Information Fusion (INFFUS) paper:

@article{inffus_TINTO,
    title = {A novel deep learning approach using blurring image techniques for Bluetooth-based indoor localisation},
    journal = {Information Fusion},
    author = {Reewos Talla-Chumpitaz and Manuel Castillo-Cara and Luis Orozco-Barbosa and Raúl García-Castro},
    volume = {91},
    pages = {173-186},
    year = {2023},
    issn = {1566-2535},
    doi = {https://doi.org/10.1016/j.inffus.2022.10.011}
}

All the methods presented can be called using the TINTOlib library. The methods presented include:

| Model | Class | Features | Hyperparameters |
|---|---|---|---|
| TINTO | `TINTO()` | `blur` | `problem` `algorithm` `pixels` `submatrix` `blur` `amplification` `distance` `steps` `option` `random_seed` `times` `verbose` |
| IGTD | `IGTD()` | | `problem` `scale` `fea_dist_method` `image_dist_method` `max_step` `val_step` `error` `switch_t` `min_gain` `zoom` `random_seed` `verbose` |
| REFINED | `REFINED()` | | `problem` `n_processors` `hcIterations` `zoom` `random_seed` `verbose` |
| BarGraph | `BarGraph()` | | `problem` `pixel_width` `gap` `zoom` `verbose` |
| DistanceMatrix | `DistanceMatrix()` | | `problem` `zoom` `verbose` |
| Combination | `Combination()` | | `problem` `zoom` `verbose` |
| SuperTML | `SuperTML()` | | `problem` `columns` `font_size` `image_size` `verbose` |
| FeatureWrap | `FeatureWrap()` | | `problem` `size` `bins` `zoom` `verbose` |
| BIE | `BIE()` | | `problem` `precision` `zoom` `verbose` |
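To give a feel for what a tabular-to-image method does, the sketch below draws one row as a bar chart, one bar per feature, using only NumPy. It is an independent illustration and not TINTOlib's `BarGraph` implementation: the function name `bar_graph_image` is hypothetical, and the parameter names `pixel_width` and `gap` are borrowed from the table with their meaning assumed (bar width and spacing in pixels).

```python
import numpy as np


def bar_graph_image(row, pixel_width: int = 2, gap: int = 1) -> np.ndarray:
    """Render one sample as a square grayscale image with one bar per feature.

    `row` must already be scaled to [0, 1]; each bar's height is proportional
    to the feature value.
    """
    row = np.asarray(row, dtype=float)
    if row.min() < 0.0 or row.max() > 1.0:
        raise ValueError("scale features to [0, 1] first")
    side = gap + row.size * (pixel_width + gap)  # square canvas
    image = np.zeros((side, side), dtype=np.uint8)
    for i, value in enumerate(row):
        bar_height = int(round(value * side))
        left = gap + i * (pixel_width + gap)
        # Bars grow upward from the bottom edge of the image.
        image[side - bar_height:, left:left + pixel_width] = 255
    return image


# Toy table: 3 samples, 2 features, min-max scaled per column.
features = np.array([[2.0, 10.0], [4.0, 30.0], [6.0, 20.0]])
mins, maxs = features.min(axis=0), features.max(axis=0)
scaled = (features - mins) / (maxs - mins)
images = [bar_graph_image(r) for r in scaled]
print(images[0].shape)  # (7, 7)
```

Each sample becomes one image, so a dataset of N rows yields N images that can be fed to a CNN or ViT together with the original target values.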
