Pepper Tutorial <7>: Image recognition

In this tutorial, we will explain the specifications and behaviour of the image recognition system implemented in Pepper by creating some sample applications in Choregraphe.

For this tutorial, a physical Pepper robot is required to test the applications, as the image recognition system cannot be simulated on the virtual robot.

Sensor specifications

1. Two 2D cameras (forehead [A] and mouth [B])

2. One 3D camera (infrared emitter [C] and infrared detector [D])

Pepper uses these cameras to recognise humans and objects.
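
These cameras can also be accessed from a script via the NAOqi Python SDK. The sketch below is a minimal example, not part of this tutorial’s Choregraphe applications, that grabs one frame from the forehead camera through the ALVideoDevice module; the IP address and subscriber name are placeholders you would replace with your own.

```python
# -*- coding: utf-8 -*-
# Minimal sketch: grab one frame from Pepper's forehead camera with ALVideoDevice.
# PEPPER_IP is a placeholder; replace it with your robot's address.
from naoqi import ALProxy

PEPPER_IP = "192.168.1.10"   # assumed address of your Pepper
PORT = 9559                  # default NAOqi port

video = ALProxy("ALVideoDevice", PEPPER_IP, PORT)

# Camera indexes: 0 = top 2D camera (forehead), 1 = bottom 2D camera (mouth),
# 2 = 3D (depth) camera.
TOP_CAMERA = 0
RESOLUTION = 1      # 1 = QVGA (320x240)
COLOR_SPACE = 11    # 11 = RGB
FPS = 5

handle = video.subscribeCamera("camera_check", TOP_CAMERA, RESOLUTION, COLOR_SPACE, FPS)
try:
    frame = video.getImageRemote(handle)   # [width, height, ..., raw bytes, ...]
    width, height, raw = frame[0], frame[1], frame[6]
    print("Got a %dx%d RGB frame (%d bytes)" % (width, height, len(raw)))
finally:
    video.unsubscribe(handle)
```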

Using Choregraphe to Check the Image Input

The image input from Pepper’s cameras can be checked in Choregraphe.

Video monitor pane:

The Video monitor pane is used to view and work with the image input.

Go to the [View] menu and select [Video monitor] to open the pane.

It allows you to check the image input as well as to manage the Vision Recognition Database.

  1. Camera image: the image input from Pepper’s camera

  2. Play/Pause button: on Play, shows the real-time input from the camera

  3. Learn button: switches to image learning mode

  4. Import button: imports a Vision Recognition Database from a local file into Choregraphe

  5. Export button: exports the Vision Recognition Database from Choregraphe to a local file

  6. New button: creates a new Vision Recognition Database

  7. Send button: sends the current Vision Recognition Database to Pepper
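
Besides the Video monitor pane, you can also save a snapshot directly on the robot from a script. The following is a short sketch assuming the ALPhotoCapture module of the NAOqi Python SDK; the IP address and output folder are placeholders.

```python
# -*- coding: utf-8 -*-
# Sketch only: save one snapshot from Pepper's camera to the robot's disk.
# PEPPER_IP and the output folder are assumptions; adjust them for your setup.
from naoqi import ALProxy

PEPPER_IP = "192.168.1.10"   # assumed address of your Pepper
PORT = 9559                  # default NAOqi port

photo = ALProxy("ALPhotoCapture", PEPPER_IP, PORT)
photo.setResolution(2)            # 2 = VGA (640x480)
photo.setPictureFormat("jpg")
# Writes /home/nao/recordings/cameras/check.jpg on the robot.
photo.takePicture("/home/nao/recordings/cameras/", "check")
```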

Monitor Application:

You may also check the image input with the Monitor application, which is installed along with Choregraphe.

1. Open the Monitor desktop application

2. Select [Camera] from the menu

3. Select your Pepper or enter its IP address to connect the application to the robot

4. Click on the [Play] button

5. Now you can check the camera input from Pepper in the Monitor window

You may click the [Pause] button to stop showing the image

6. Information related to image recognition can also be checked in Monitor. For example, you can check Pepper’s face recognition processing state [B] by ticking the box next to face detection [A].

7. To check the input of the 3D camera, go to the [Load Plugin] menu and select [3dsensormonitor].

8. The Depth Map image is now shown in the Monitor window.

The Monitor application allows you to check and manipulate Pepper’s image recognition input in detail.
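
The depth map shown by the 3dsensormonitor plugin can also be fetched from a script through ALVideoDevice by subscribing to the 3D camera. The constants in the sketch below (camera index 2 and depth colourspace 17) are assumptions based on common NAOqi releases, so verify them against your version before relying on it.

```python
# -*- coding: utf-8 -*-
# Sketch: subscribe to the 3D camera and fetch one depth frame.
# Camera index 2 and colourspace 17 (depth) are assumed NAOqi constants
# (kDepthCamera / kDepthColorSpace); check them for your NAOqi version.
from naoqi import ALProxy

PEPPER_IP = "192.168.1.10"   # assumed address of your Pepper
video = ALProxy("ALVideoDevice", PEPPER_IP, 9559)

DEPTH_CAMERA = 2        # 3D camera
RESOLUTION = 1          # QVGA (320x240)
DEPTH_COLOR_SPACE = 17  # depth map, 2 bytes per pixel
FPS = 5

handle = video.subscribeCamera("depth_check", DEPTH_CAMERA, RESOLUTION,
                               DEPTH_COLOR_SPACE, FPS)
try:
    frame = video.getImageRemote(handle)
    width, height = frame[0], frame[1]
    print("Depth frame: %dx%d, %d bytes" % (width, height, len(frame[6])))
finally:
    video.unsubscribe(handle)
```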

Face Recognition

By using the Face Detection box provided in the default box libraries in Choregraphe, you can obtain the current number of faces detected by Pepper. In this tutorial, we will use the Say Text box to make Pepper say the number of faces it is detecting.

1. Prepare the boxes.

  • Sensing > Vision > Human Detection > Face Detection

  • Speech > Creation > Say Text

2. Connect the boxes.

By connecting the “numberOfFaces” output of the Face Detection box to the “onStart” input of the Say Text box, Pepper can say the number of faces it is detecting.

The application is now ready to run. To check the operation, connect to Pepper and run the application. When Pepper recognises human faces, it will say the number of faces it is detecting: “One”, “Two”, and so on.

You may also check the positions of the faces Pepper is detecting in the Robot View pane.

While Pepper is detecting human faces, a yellow face icon representing each face’s position appears in the Robot View.
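
For reference, a similar behaviour can be scripted outside Choregraphe by reading the “FaceDetected” key in ALMemory. The sketch below is only an illustration using the NAOqi Python SDK (the IP address and subscriber name are placeholders); it is not the internal implementation of the Face Detection box.

```python
# -*- coding: utf-8 -*-
# Sketch: start face detection, read "FaceDetected" from ALMemory, and have
# Pepper say how many faces it currently sees. PEPPER_IP is a placeholder.
import time
from naoqi import ALProxy

PEPPER_IP = "192.168.1.10"
PORT = 9559

face = ALProxy("ALFaceDetection", PEPPER_IP, PORT)
memory = ALProxy("ALMemory", PEPPER_IP, PORT)
tts = ALProxy("ALTextToSpeech", PEPPER_IP, PORT)

face.subscribe("face_count_check", 500, 0.0)   # run detection every 500 ms
try:
    time.sleep(1.0)                            # give the extractor time to run
    data = memory.getData("FaceDetected")
    # When faces are visible, data[1] holds one entry per face plus one
    # trailing recognition-info entry; the key is empty otherwise.
    count = len(data[1]) - 1 if data else 0
    tts.say(str(count))
finally:
    face.unsubscribe("face_count_check")
```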

Face Tracking

We can also make Pepper track a face it is detecting. In this tutorial, we will use the Face Tracker box to make Pepper move towards the face.

1. Prepare the boxes.

Sensing > Vision > Human Detection > Face Tracker

2. Connect the box.

3. Set parameters of the Face Tracker box.

Change the “Mode” parameter to [Move] and click OK.

The application is now ready to run. To check the operation, connect to Pepper and run the application. When Pepper detects someone nearby, it tracks the face by moving only its head; if you keep looking at Pepper and slowly move away from it, it should move towards you.
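
For reference, a comparable behaviour can also be scripted with the ALTracker module, which likewise offers a “Move” mode. The sketch below is illustrative only; the IP address, tracking duration and face-width parameter are assumptions.

```python
# -*- coding: utf-8 -*-
# Sketch: track a face in "Move" mode with ALTracker, roughly mirroring the
# Face Tracker box set to [Move]. PEPPER_IP is a placeholder.
import time
from naoqi import ALProxy

PEPPER_IP = "192.168.1.10"
PORT = 9559

motion = ALProxy("ALMotion", PEPPER_IP, PORT)
tracker = ALProxy("ALTracker", PEPPER_IP, PORT)

motion.wakeUp()                      # stiffness on so Pepper can move its base
tracker.registerTarget("Face", 0.1)  # 0.1 m is an assumed approximate face width
tracker.setMode("Move")              # move the base, not just the head
tracker.track("Face")

try:
    time.sleep(30)                   # track for 30 seconds
finally:
    tracker.stopTracker()
    tracker.unregisterAllTargets()
```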

Learning Faces

We will now use the Learn Face box to make Pepper learn and remember a face. In this tutorial, we will make Pepper remember the face it sees as “John” 5 seconds after the application starts running.

1. Prepare the boxes.

Programming > Data Edition > Text Edit

Sensing > Vision > Human Detection > Learn Face

2. Connect the boxes.

3. Insert “John” into the Text Edit box.
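
For reference, the same idea can be sketched in a script with ALFaceDetection’s learnFace method: wait five seconds, then learn the face in front of the camera as “John”. This is an illustration only, not the implementation of the Learn Face box, and the IP address is a placeholder.

```python
# -*- coding: utf-8 -*-
# Sketch: after 5 seconds, learn the face currently in view as "John".
# PEPPER_IP is a placeholder; replace it with your robot's address.
import time
from naoqi import ALProxy

PEPPER_IP = "192.168.1.10"
PORT = 9559

face = ALProxy("ALFaceDetection", PEPPER_IP, PORT)
tts = ALProxy("ALTextToSpeech", PEPPER_IP, PORT)

time.sleep(5)                        # give yourself time to face the camera
if face.learnFace("John"):           # returns True when learning succeeds
    tts.say("I will remember you, John")
else:
    tts.say("I could not learn your face, please try again")
```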