This repository was created for the final contest of the Artificial Vision course at the University of Salerno. The aim of the project is to design a DCNN (as a regressor or classifier) for age estimation on the VggFace2 dataset, labeled with ages by MiviaLab.
- Salvatore Ventre
- Vincenzo Russomanno
- Giovanni Puzo
- Vittorio Fina
To work with Google Colab, we use a special format for compact data storage: TFRecord.
In order to create these files, you first have to choose the datasets (training, validation and test sets) to be used for training and evaluating the model. Due to the limits imposed by Colab's GPU usage, our implementation uses a subpart of the original VggFace2 training set, composed of at most 150 images per identity, which is then divided into 70% for training, 20% for validation and 10% for test.
To choose this subpart from the entire dataset, we divide the age range of each identity into 4 groups and randomly take a fixed number of images from each group; this number is 30, except for the 3rd group, from which we take 60 elements in order to respect the original data distribution and avoid spikes of samples at a particular age. If an identity has fewer than 150 images, it is taken entirely, without age grouping; moreover if, after the age grouping, a group has fewer than 30 elements (40 for the 3rd group), it is taken entirely and the remaining images needed to reach the threshold of 150 are randomly chosen from the images of the other groups not already taken. A minimal sketch of this sampling policy is shown below.
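The following sketch illustrates the per-identity sampling just described; sample_identity, MAX_PER_IDENTITY and PER_GROUP are hypothetical names used only for illustration, while the actual logic lives in process_csv.py.
import random

MAX_PER_IDENTITY = 150
PER_GROUP = [30, 30, 60, 30]  # the 3rd group is doubled to follow the original age distribution

def sample_identity(images_by_group):
    # images_by_group: list of 4 lists of image paths, one per age group of this identity
    all_images = [img for group in images_by_group for img in group]
    # small identities are taken entirely, without age grouping
    if len(all_images) < MAX_PER_IDENTITY:
        return all_images
    chosen, leftovers = [], []
    for group, quota in zip(images_by_group, PER_GROUP):
        picked = random.sample(group, min(quota, len(group)))
        chosen += picked
        leftovers += [img for img in group if img not in picked]
    # top up from the other groups until the 150-image threshold is reached
    missing = MAX_PER_IDENTITY - len(chosen)
    if missing > 0:
        chosen += random.sample(leftovers, min(missing, len(leftovers)))
    return chosen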
In order to reproduce our experiment, you first have to launch the script process_csv.py within its directory with the command
python3 process_csv.py
This script:
- recovers the ages from the CSV file containing the annotations (which has to be placed here)
ages = read_csv(PATH_TO_CSV_FILE, test=test)
- groups images into 4 age groups, according to the age range of the specific identity
grouped_ages, final_dict = group_ages(ages)
- randomly takes the desired number of images from each group of each identity
final_dict = recover_identities(grouped_ages,final_dict)
- splits the previously chosen images into training, validation and test sets
splitted_dict_samples, splitted_dict_labels = train_test_val_split(ages, final_dict)
- starting from the previous splits (which work on image paths), recovers the corresponding image files, saving them to a specific folder
extract_jpgs(splitted_dict_samples, set_type)
At this point, you have the 3 needed sets and you can move on to the face extraction phase, which can be done with the script get_bboxes_from_csv.py, launched within its directory with
python3 get_bboxes_from_csv.py
It recovers the face bounding box information from the CSV annotation file (to be placed here):
split = row[0].split(",")
path = split[2]
dir_path, file_path = path.split("/")[0], path.split("/")[1]
id_folder = os.path.join(PATH_TO_CROPPED_TS, dir_path)
x, y = int(split[4]), int(split[5])
# if the top-left point of the bbox has a negative coordinate, set it to 0
# because it probably means that the face is outside the limits of the image
if x < 0:
    x = 0
if y < 0:
    y = 0
width = int(split[6])
height = int(split[7])
and then crop the face:
crop_img = img[y:y+height, x:x+width]
if crop_img.size != 0:
    cv.imwrite(os.path.join(PATH_TO_CROPPED_TS, path), crop_img)
To save time, we use the face annotations provided by MiviaLab, which you can download here; nevertheless, our system also includes a detector for face extraction. In particular, we chose the MTCNN detector, whose implementation can be found here and can be installed with:
pip install mtcnn
Face extraction is done with the function extract_face defined in the script extract_face.py: this function recovers all the faces present in the image
results = detector.detect_faces(img)
then it finds the bounding box with the maximum area: since an image may contain multiple faces, this ideally selects only the one in close-up. Finally, it crops the image using the coordinates of that largest box.
If no face is detected or the bounding box is too small (which probably indicates a bad detection), the function returns the original image.
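A minimal sketch of this selection logic, assuming the mtcnn package and an image array in RGB order; the area threshold is illustrative, not the value used in extract_face.py.
from mtcnn import MTCNN

detector = MTCNN()
MIN_BOX_AREA = 100  # hypothetical threshold for "too small" detections

def extract_face(img):
    results = detector.detect_faces(img)  # each result has a 'box' = [x, y, width, height]
    if not results:
        return img  # no face detected: keep the original image
    # keep only the largest bounding box, ideally the face in close-up
    x, y, w, h = max((r['box'] for r in results), key=lambda b: b[2] * b[3])
    if w * h < MIN_BOX_AREA:
        return img  # probably a bad detection
    x, y = max(x, 0), max(y, 0)
    return img[y:y + h, x:x + w]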
Once all sets have been cropped, you can generate the TFRecord files using the script create_tfrecord.py, which distinguishes (through the parameter test to be set at line 145) between test records of the type:
example = tf.train.Example(features=tf.train.Features(feature={
    'path': _bytes_feature((path).encode('utf-8')),
    'image_raw': _bytes_feature(image_string)
}))
and validation/training record of the type:
example = tf.train.Example(features=tf.train.Features(feature={
    'path': _bytes_feature((path).encode('utf-8')),
    'width': _int64_feature(image_shape[1]),
    'height': _int64_feature(image_shape[0]),
    'label': _int64_feature(int(age)),
    'image_raw': _bytes_feature(image_string)
}))
Moreover, the test TFRecord creation function also writes a CSV file with the age label of each test image (used on Colab to evaluate the model):
# write path-age to gt csv
path = jpg_dir.split(os.sep)[-2]+"/"+jpg_dir.split(os.sep)[-1]
age = ages[d]["/"+f]
csv_writer.writerow([path, age])
and preprocesses the images before writing them to the record, normalizing and resizing them:
img = vggface2_preprocessing(img)
img = custom_resize(img, img.shape[0], img.shape[1], TARGET_SHAPE)
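The two helpers above are defined in the repository; as a rough idea of what they do, here is a hedged sketch assuming VGGFace2-style channel-mean subtraction and a plain square resize (the actual values and resizing strategy may differ).
import cv2 as cv
import numpy as np

TARGET_SHAPE = 224  # assumed network input size

def vggface2_preprocessing(img):
    # subtract per-channel means commonly used with ResNet50 trained on VGGFace2 (BGR order);
    # the repository's helper may normalize differently
    return img.astype(np.float32) - np.array([91.4953, 103.8827, 131.0912], dtype=np.float32)

def custom_resize(img, height, width, target_shape):
    # the original height/width arguments could drive an aspect-preserving resize;
    # here we simply resize to a square target
    return cv.resize(img, (target_shape, target_shape))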
This script has to be launched within its directory with the command
python3 create_tfrecord.py
Well done! TFRecords are created successfully!
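If you want to read the records back (for example on Colab), a minimal parsing sketch for the training/validation schema shown above could look like this; the feature names come from the snippet, while the file name is illustrative.
import tensorflow as tf

feature_description = {
    'path': tf.io.FixedLenFeature([], tf.string),
    'width': tf.io.FixedLenFeature([], tf.int64),
    'height': tf.io.FixedLenFeature([], tf.int64),
    'label': tf.io.FixedLenFeature([], tf.int64),
    'image_raw': tf.io.FixedLenFeature([], tf.string),
}

def parse_example(serialized):
    return tf.io.parse_single_example(serialized, feature_description)

dataset = tf.data.TFRecordDataset("training.tfrecord").map(parse_example)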
We decided to build a classifier able to recognize 101 classes (ages from 0 to 100); in particular, we chose the ResNet50 model pre-trained on VGGFace2. To this implementation we added a Dense layer of 101 neurons with a softmax activation function, adapting the pre-trained network to our classification problem.
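A minimal sketch of this architecture, assuming the keras_vggface package provides the ResNet50 backbone pre-trained on VGGFace2 (its Keras objects must be compatible with your TensorFlow/Keras installation, and the repository may build the model differently):
from keras_vggface.vggface import VGGFace
from tensorflow.keras.layers import Dense
from tensorflow.keras.models import Model

# ResNet50 pre-trained on VGGFace2, without the original classification head
base = VGGFace(model='resnet50', include_top=False, input_shape=(224, 224, 3), pooling='avg')
# 101 output neurons, one per age from 0 to 100
out = Dense(101, activation='softmax', name='age_softmax')(base.output)
model = Model(inputs=base.input, outputs=out)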
The training procedure can be found here; it was carried out for 25 epochs (18 training only the last 11 layers and 7 training all the layers) with a batch size of 128, using SGD with momentum as the optimizer. Moreover, we used:
- Categorical Crossentropy as loss function
- Categorical Accuracy and MAE as metrics
The learning rate starts at 0.005 and is reduced by a factor of 0.2 after 20 epochs. To avoid overfitting, we use the EarlyStopping callback, which stops the training if val_loss does not improve for 5 epochs, and, as provided by the original implementation of the chosen CNN, a weight decay of 1e-4.
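Putting the pieces together, here is a hedged sketch of this training setup, continuing from the model built above; the momentum value and the exact callback arguments are assumptions, and the repository's notebook may differ.
import tensorflow as tf
from tensorflow.keras.callbacks import EarlyStopping, LearningRateScheduler

def schedule(epoch, lr):
    # start at 0.005, reduced by a factor of 0.2 after 20 epochs
    return 0.005 if epoch < 20 else 0.005 * 0.2

model.compile(optimizer=tf.keras.optimizers.SGD(learning_rate=0.005, momentum=0.9),  # momentum value assumed
              loss='categorical_crossentropy',
              metrics=['categorical_accuracy', 'mae'])

callbacks = [
    EarlyStopping(monitor='val_loss', patience=5),
    LearningRateScheduler(schedule),
]
# model.fit(train_dataset, validation_data=val_dataset, epochs=25, callbacks=callbacks)  # batches of 128 built into the datasets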
Finally, to improve the representativeness of the available dataset, we use a data augmentation pipeline (a minimal sketch follows the list) composed of:
- random variation in brightness and contrast
- random conversion of the image from RGB to black and white (grayscale)
- random flip along the horizontal or vertical axis
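A hedged sketch of such a pipeline using tf.image; the deltas, ranges and probabilities are illustrative, not the repository's values.
import tensorflow as tf

def augment(img):
    # random variation in brightness and contrast
    img = tf.image.random_brightness(img, max_delta=0.2)
    img = tf.image.random_contrast(img, lower=0.8, upper=1.2)
    # randomly convert the image from RGB to grayscale (kept as 3 channels)
    img = tf.cond(tf.random.uniform([]) < 0.5,
                  lambda: tf.image.grayscale_to_rgb(tf.image.rgb_to_grayscale(img)),
                  lambda: img)
    # random flips along the horizontal and vertical axes
    img = tf.image.random_flip_left_right(img)
    img = tf.image.random_flip_up_down(img)
    return img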
To test the model, we use this notebook, in which we read each sample contained in the chosen test set and write the prediction made by the model to a CSV file; then we compare the labels contained in the ground-truth and prediction CSVs to compute the MAE, the metric used to assess the model's performance.
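A minimal sketch of this final MAE computation, assuming headerless path,age rows in both CSV files (the ground-truth CSV is written by create_tfrecord.py as shown above; the prediction file name and layout are assumptions):
import pandas as pd

gt = pd.read_csv('ground_truth.csv', names=['path', 'age'])
pred = pd.read_csv('predictions.csv', names=['path', 'age'])
merged = gt.merge(pred, on='path', suffixes=('_true', '_pred'))
mae = (merged['age_true'] - merged['age_pred']).abs().mean()
print(f"MAE: {mae:.3f}")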
Here you can find useful links to our project work "Age Estimation":
Google Drive Folder link: https://drive.google.com/drive/folders/1SbOUhx0yRu1UCUhiGzUYI-AZdGONcr25?usp=sharing
GitHub link: https://github.com/dev-guys-unisa/ArtificialVision_FinalContest2020
Model link: https://drive.google.com/file/d/1-Z2UaCQOQXYznpQRfip9PhYVAEjy3naO/view?usp=sharing
Last Checkpoint's Model link: https://drive.google.com/file/d/1-iDy6Wk8QnC83-tqktCBH0T3LsRAUJP9/view?usp=sharing