How attractive are you? Analysing a person’s photo using the OpenAI CLIP model

Oleg Khomenko
3 min read · May 3, 2022
The proposed approach uses only MediaPipe for face detection and OpenAI CLIP for attractiveness estimation. Both models work well on a CPU and require minimal resources.

Overview

In this tutorial, we will learn how to use the OpenAI CLIP model to predict the attractiveness of a person in a photo.

The OpenAI CLIP model is a neural network trained to measure how close the features of a text are to the features of a target image. It is widely used for zero-shot classification, that is, classification over a set of labels that were not used during training.
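To make the idea concrete, here is a minimal sketch of zero-shot classification on toy vectors. It does not call CLIP itself: the three-dimensional "embeddings", the label strings, and the helper names (`cosine`, `softmax`, `zero_shot`) are all made up for illustration, and the scale of 100 is an assumption chosen to resemble the temperature CLIP applies before the softmax.

```python
import math

def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def softmax(values):
    """Numerically stable softmax over a list of floats."""
    peak = max(values)
    exps = [math.exp(v - peak) for v in values]
    total = sum(exps)
    return [e / total for e in exps]

def zero_shot(image_features, label_features, scale=100.0):
    """Return one probability per label for a single image vector."""
    sims = [cosine(image_features, feats) for feats in label_features.values()]
    probs = softmax([scale * s for s in sims])  # assumed temperature
    return dict(zip(label_features, probs))

# Toy vectors; real CLIP features have hundreds of dimensions.
image = [0.9, 0.1, 0.2]
labels = {
    "a photo of a dog": [1.0, 0.0, 0.1],
    "a photo of a cat": [0.0, 1.0, 0.1],
}
probs = zero_shot(image, labels)
best = max(probs, key=probs.get)
print(best, probs)
```

No training happens here: changing the set of labels only changes which text vectors the image is compared against, which is why the same model can be reused for gender and for attractiveness captions.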

In this post I will show how to build such a classifier in a few minutes without any model training.

As a bonus, all tasks run efficiently on a CPU, so you don’t need to spend money on GPU resources.

Everything will be done in the following way:

  • The fast MediaPipe BlazeFace detector finds the face in the image
  • Then we pad and crop it to keep some background around the face
  • After that, the CLIP model predicts gender, which selects the correct set of captions
  • Finally, the same CLIP model predicts how attractive the person in the photo is
Ryan Gosling seems to be a very attractive man.

Requirements

To install the required packages, run the following in a terminal:

pip3 install deepface mediapipe
pip3 install git+https://github.com/openai/CLIP.git

Main Class

We are going to implement a single class that does all the work through its public methods. Let’s write the constructor.

PredictorCLIP constructor

Face Detection

The easiest way to import a face detector into your project is to use the DeepFace package. It wraps the OpenCV, SSD, Dlib, MTCNN, RetinaFace and MediaPipe detectors.

We are going to use the BlazeFace detector (used in the MediaPipe framework). BlazeFace is a fast, lightweight face detector from Google Research. It weighs less than 1 MB and yield…

The detector model was already initialized in the __init__(...) method. Let’s declare the detect_face_with_padding function.

Now we can use the BlazeFace detector to return a cropped and padded face image. If something goes wrong, we simply return (None, None, None).
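The padding step on its own can be sketched without any detector. The helper below is hypothetical: the name `pad_box`, the `(x, y, w, h)` box layout and the default padding of 25% are assumptions, not values taken from the original code. It grows the detected box and clamps it to the image borders, returning None for an empty box, much as the full method falls back to (None, None, None).

```python
def pad_box(x, y, w, h, img_w, img_h, padding=0.25):
    """Expand a face box by `padding` (a fraction of its size) and clamp it.

    Returns (left, top, right, bottom) in pixels, or None for an empty box.
    """
    if w <= 0 or h <= 0:
        return None
    pad_x, pad_y = int(w * padding), int(h * padding)
    left = max(0, x - pad_x)
    top = max(0, y - pad_y)
    right = min(img_w, x + w + pad_x)
    bottom = min(img_h, y + h + pad_y)
    return left, top, right, bottom

# A 200x200 face found at (100, 80) inside a 640x480 image.
box = pad_box(x=100, y=80, w=200, h=200, img_w=640, img_h=480)
print(box)

# With a NumPy image array, the crop would then be:
#   left, top, right, bottom = box
#   face = image[top:bottom, left:right]
```

Clamping matters for faces near the edge of the frame, where the padded box would otherwise extend outside the image.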

Partially implemented predict(…) method

Gender classification & Attractiveness prediction

We are going to extend predict(…) with gender classification. Let’s define the predict_clip(image: Image.Image, text: List[str]) function.

The predict_clip method takes an image and a list of captions and returns the normalized distance between the image and each caption.

Now we have everything we need to implement predict(…).
We simply pass ["man", "woman"] as captions and take the argmax of the prediction. Then we use a different set of captions depending on the gender index.

Fully implemented `predict` method

As you may have noticed, we use a different set of captions for male and female faces:

captions = [
    ["handsome", "ugly"],   # this one is for male
    ["beautiful", "ugly"],  # this one is for female
]

The final version of the predict method is below:

The method returns the cropped image and the predicted score in two formats: a raw number and text.
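The control flow described above (gender argmax, caption selection, score in two formats) can be sketched as follows. This is an illustration under assumptions rather than the original method: `rate` and `score_fn` are hypothetical names, `score_fn` stands in for predict_clip and is expected to return one normalized score per caption, and the text format is invented for the example. Face cropping is left out.

```python
from typing import Callable, List, Sequence, Tuple

GENDERS = ["man", "woman"]
CAPTIONS = [
    ["handsome", "ugly"],   # used when "man" scores highest
    ["beautiful", "ugly"],  # used when "woman" scores highest
]

def argmax(values: Sequence[float]) -> int:
    """Index of the largest value."""
    return max(range(len(values)), key=lambda i: values[i])

def rate(image, score_fn: Callable[[object, List[str]], Sequence[float]]) -> Tuple[float, str]:
    """Pick captions by predicted gender, then score the positive caption."""
    gender_idx = argmax(score_fn(image, GENDERS))
    captions = CAPTIONS[gender_idx]
    scores = score_fn(image, captions)
    raw = float(scores[0])               # score of the positive caption
    text = f"{captions[0]}: {raw:.0%}"   # assumed text format
    return raw, text

# Stand-in scorer with fixed outputs, only to exercise the control flow.
def fake_score(image, text):
    return [0.2, 0.8] if text == GENDERS else [0.7, 0.3]

raw, text = rate(image=None, score_fn=fake_score)
print(raw, text)
```

Passing the scorer in as an argument keeps the selection logic testable without loading the CLIP weights.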

Finally, wrap the code into a file

We want to run our code via the python3 AttractivenessMeter.py --image_path <path-to-image> command. Let’s add some code to make everything executable.

AttractivenessMeter.py
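A minimal sketch of the command-line wrapper, assuming the standard argparse module. Only the --image_path flag comes from the command above; `build_parser` and `main` are hypothetical names, and the predictor call is left as a comment.

```python
import argparse

def build_parser() -> argparse.ArgumentParser:
    """Create the parser for the single --image_path option."""
    parser = argparse.ArgumentParser(description="Estimate attractiveness from a photo.")
    parser.add_argument("--image_path", required=True, help="Path to the input image.")
    return parser

def main(argv=None) -> str:
    """Parse arguments; passing argv=None makes argparse read sys.argv."""
    args = build_parser().parse_args(argv)
    # A real script would load the image here and call the predictor.
    return args.image_path

# Explicit argument list so the example runs without a real command line.
image_path = main(["--image_path", "photo.jpg"])
print(image_path)
```

In the actual script, main() would be called without arguments under an `if __name__ == "__main__":` guard so that the path is taken from the command line.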

🤖 That is it. You can test the final result via Telegram bot

Example:

🌱 Code is available at GitHub

P.S.

As soon as this post reaches 30 claps, I will write a post about how to wrap all this code as an asynchronous Telegram bot.
