“His dress told her nothing, but his face told her things which she was glad to know.”
A. A. Milne
This blog has been a labor of love for nearly ten years. I haven’t earned a dime from it, but money was never my intention. Instead, it has been a way for me to share what I am passionate about. It has also been a teaching tool for me. I can’t tell you how many times I’ve wondered about something, spent time figuring it out, and then written about it so I have a place to turn to after it has been long forgotten. The older I get, the more I need to remind myself of who I am (or was).
One of the greatest joys of writing these articles is when someone takes the time to suggest a topic. That happened this week, and the suggestion led me to some very interesting discoveries. A coworker read Image Analysis from AWS Rekognition and Avaya OneCloud CPaaS and asked me to expand my research into the world of facial recognition. While I had already played with counting faces in an image, I had never taken the time to figure out who a face belongs to. Now, thanks to that gentle push, I have immersed myself in facial recognition concepts and practices and am ready to share this newfound passion.
AWS Rekognition
I already told you how impressed I am with Rekognition’s image categorization, and the same is proving true for its facial recognition accuracy. Not only does it recognize a face when it’s the only one in the image, I’ve found it very effective at picking a face out of a crowd. Accuracy also remains high when it encounters faces photographed at different angles. Additionally, facial recognition software has historically had problems with darker skin tones, but that does not seem to be an issue for Rekognition.
I learn best by putting a new concept into practice, or in this case, code. After exploring Rekognition’s APIs, I started writing a suite of applications that implemented what I surmised to be the most effective workflow for facial recognition. This involves:
- Step 1. Upload images to an AWS S3 Bucket. These images will be analyzed by Rekognition in the third step.
- Step 2. Create a Collection. A Collection is used to store facial information.
- Step 3. Create Index of Faces. This operation analyzes the uploaded S3 images and adds facial information for each image to the Collection. As part of the analysis, each detected face is assigned a unique Id. The user can also assign a name to the image. For example, an image of Abraham Lincoln can be assigned the name “Abraham Lincoln.”
- Step 4. Search Faces by Index. This operation accepts an image and queries the Collection for a match. If a match is found, the unique Id, user assigned name, and confidence level are returned. This is the real-time step of facial recognition.
The recognition engine can be continually enhanced by repeating Steps 1 and 3. Not only can you add new faces, you can also add different views of existing faces. Remaining static is not a viable option when it comes to training AI engines.
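The retraining loop itself can be scripted. Here is a minimal sketch using the same aws-sdk v2 client the rest of this post uses: list every image already in the bucket and index each one into the Collection. The naming rule (the filename minus its extension becomes the assigned name) is my own assumption, not anything Rekognition mandates.

```javascript
// Build the IndexFaces request for one S3 object. Pure helper, no AWS calls.
function buildIndexParams(collectionId, bucket, key) {
    return {
        CollectionId: collectionId,
        // Assumed convention: "george_2.jpg" is indexed under the name "george_2"
        ExternalImageId: key.replace(/\.[^.]+$/, ''),
        Image: { S3Object: { Bucket: bucket, Name: key } }
    };
}

// Index every image currently in the bucket. aws-sdk is loaded lazily so the
// helper above stands alone; credentials are assumed to be in the environment.
function indexAllFaces(collectionId, bucket) {
    const AWS = require('aws-sdk');
    const s3 = new AWS.S3();
    const rekognition = new AWS.Rekognition();
    s3.listObjectsV2({ Bucket: bucket }, function(err, data) {
        if (err) return console.log(err);
        data.Contents.forEach(function(obj) {
            rekognition.indexFaces(buildIndexParams(collectionId, bucket, obj.Key),
                function(err, result) {
                    if (err) console.log(obj.Key, err);
                    else console.log('Indexed', obj.Key);
                });
        });
    });
}
```

Re-running this after dropping new photos into the bucket gives the engine fresh views of existing faces as well as entirely new ones.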
At this point, the use cases are endless. For starters, imagine a camera outside a door that a person must look into in order to verify that he or she is authorized to enter the room. The faces of the authorized personnel would be pre-populated in the Collection, and Search Faces by Index would execute the verification/authorization step.
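To make the door example concrete, here is a sketch of the two pieces such a system might wrap around the search step. The collection name, confidence bar, and helper names are my own assumptions for illustration; the actual `searchFacesByImage` call is elided so the helpers stand alone.

```javascript
// Build the search request for the door camera's snapshot. The collection
// "authorized-personnel" is an assumed name, pre-populated per Step 3.
function buildSearchParams(bucket, filename) {
    return {
        CollectionId: 'authorized-personnel',
        FaceMatchThreshold: 95,  // discard weak matches outright
        MaxFaces: 1,             // only the best match matters here
        Image: { S3Object: { Bucket: bucket, Name: filename } }
    };
}

// Decide whether the best match clears the bar required to unlock the door.
// Takes the response object returned by Rekognition's searchFacesByImage.
function isAuthorized(searchResponse, minConfidence) {
    const matches = searchResponse.FaceMatches || [];
    return matches.length > 0 && matches[0].Face.Confidence >= minConfidence;
}
```

The response from `searchFacesByImage(buildSearchParams(...), callback)` would be fed to `isAuthorized(response, 99)` before triggering the lock.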
Make it So
For fun, I chose to do facial recognition on four well-known celebrities: George Clooney, Scarlett Johansson, Princess Diana, and Jimi Hendrix. This way I could train the AI engine with easily obtained images and then test the facial recognition with different images downloaded from the Internet.
In order to prove out my workflow, I wrote code for each step. For demonstration purposes, the code for Steps 2 and 3 is very basic. I create a Collection with a hardcoded name in Step 2, and I index faces one at a time in Step 3. In a perfect world, I would make everything database driven, but that would have added complexity beyond what I want to demonstrate.
As for Step 1, I simply went to my AWS S3 console and used the web GUI to upload my training images. I used multiple images of each celebrity; however, every image of a particular celebrity was assigned the same name in Step 3. In other words, all images of George Clooney were named “George,” all images of Princess Diana were named “Diana,” etc. This name is returned when a match is encountered.
Step 4 is where the real work happens, and I wanted something that would allow me to dynamically test the software without a lot of fuss. I thought about using an IP-connected camera, but that wouldn’t allow others to play with the software without setting up a similar camera. In the end, I opted to use text messaging as the ingress and egress mechanism. So, while texting photos is not the ideal use case, it gets the job done in an efficient manner that doesn’t tie my software to a particular device or geography.
The Nitty-Gritty
Before you deploy the application, you need an Avaya OneCloud CPaaS account. From that account you need your:
CPaaS Account SID
CPaaS Account Token
To use any AWS API or SDK, you need an AWS account and the credentials that come with it. Since Rekognition works closely with AWS S3, S3 Bucket information is required, too. For this application, you need:
AWS Account Key
AWS Secret Key
AWS S3 Bucket Name
AWS S3 Region
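Rather than hardcoding these values as the snippets below do, you might load them from the environment and fail fast when something is missing. The variable names here are my own convention, not anything CPaaS or AWS mandates (though `AWS_ACCESS_KEY_ID` and `AWS_SECRET_ACCESS_KEY` happen to match the standard AWS environment variables).

```javascript
// Read all required credentials from an environment object and throw a
// descriptive error if any are absent.
function loadConfig(env) {
    const required = [
        'CPAAS_ACCOUNT_SID',
        'CPAAS_AUTH_TOKEN',
        'AWS_ACCESS_KEY_ID',
        'AWS_SECRET_ACCESS_KEY',
        'AWS_S3_BUCKET',
        'AWS_REGION'
    ];
    const missing = required.filter(name => !env[name]);
    if (missing.length > 0) {
        throw new Error('Missing configuration: ' + missing.join(', '));
    }
    return {
        cpaasUser: env.CPAAS_ACCOUNT_SID,
        cpaasToken: env.CPAAS_AUTH_TOKEN,
        awsKey: env.AWS_ACCESS_KEY_ID,
        awsSecret: env.AWS_SECRET_ACCESS_KEY,
        bucket: env.AWS_S3_BUCKET,
        region: env.AWS_REGION
    };
}
```

Calling `loadConfig(process.env)` at startup keeps secrets out of source control and surfaces misconfiguration immediately.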
Step 2: Create a Rekognition Collection
```javascript
const AWS = require('aws-sdk');

const AWS_ACCESS_KEY_ID = "AWS Account Key";
const AWS_SECRET_ACCESS_KEY = "AWS Secret Key";
const AWS_REGION = "AWS S3 Region -- e.g. us-east-2";

// Initialize AWS objects
const imageClient = new AWS.Rekognition({
    accessKeyId: AWS_ACCESS_KEY_ID,
    secretAccessKey: AWS_SECRET_ACCESS_KEY,
    region: AWS_REGION
});

const params = {
    CollectionId: "myphotos"
};

imageClient.createCollection(params, function(err, data) {
    if (err) {
        console.log(err, err.stack);
    } else {
        console.log(data);
    }
});
```
Step 3: Create Index of Faces
```javascript
const AWS = require('aws-sdk');

const AWS_ACCESS_KEY_ID = "AWS Account Key";
const AWS_SECRET_ACCESS_KEY = "AWS Secret Key";
const AWS_REGION = "AWS S3 Region -- e.g. us-east-2";
const AWS_BUCKET = "AWS S3 Bucket Name -- e.g. my-cpaas-bucket";

// Initialize AWS objects
const imageClient = new AWS.Rekognition({
    accessKeyId: AWS_ACCESS_KEY_ID,
    secretAccessKey: AWS_SECRET_ACCESS_KEY,
    region: AWS_REGION
});

var params = {
    CollectionId: "myphotos",
    DetectionAttributes: [],
    ExternalImageId: "Assign name to this index",
    Image: {
        S3Object: {
            Bucket: AWS_BUCKET,
            Name: "filename of S3 file to be indexed"
        }
    }
};

imageClient.indexFaces(params, function(err, data) {
    if (err) {
        console.log(err, err.stack);
    } else {
        console.dir(data, { depth: null });
    }
});
```
Step 4: Search Faces by Index
This is the code for my Faces Bot. It accepts an inbound MMS image from Avaya CPaaS, uploads the image to an S3 Bucket, applies Rekognition facial recognition, deletes the image from the S3 Bucket, and texts the results back to the sender.
```javascript
/*
    This Avaya CPaaS application accepts incoming MMS images and performs facial recognition
    1. Receive MMS message from Avaya CPaaS
    2. Upload image to AWS S3 Bucket
    3. Process image with AWS Image Rekognition
    4. Delete image from AWS S3 Bucket
    5. Text AWS searchFacesByImage response to sender
*/
const express = require('express');
const request = require('request-promise');
const bodyParser = require('body-parser');
const cpaas = require('@avaya/cpaas');
var enums = cpaas.enums;
var ix = cpaas.inboundXml;
const AWS = require('aws-sdk');
const https = require('https');

// Change the following constants to match your environment
const CPAAS_URL = "https://api-us.cpaas.avayacloud.com/v2/Accounts/";
const CPAAS_USER = "CPaaS Account SID";
const CPAAS_TOKEN = "CPaaS Auth Token";
const AWS_ACCESS_KEY_ID = "AWS Account Key";
const AWS_SECRET_ACCESS_KEY = "AWS Secret Key";
const AWS_REGION = "AWS S3 Region -- e.g. us-east-2";
const AWS_BUCKET = "AWS S3 Bucket Name -- e.g. my-cpaas-bucket";
const URL_PORT = 5035;
const CPAAS_SEND_SMS = "/SMS/Messages.json";

const basicAuth = "Basic " + Buffer.from(`${CPAAS_USER}:${CPAAS_TOKEN}`, "utf-8").toString("base64");

var app = express();

// Middleware to parse JSON
app.use(bodyParser.urlencoded({ extended: true }));
app.use(bodyParser.json());

// Tell server to listen on port
var server = app.listen(URL_PORT, function() {
    var host = server.address().address;
    var port = server.address().port;
    console.log("AWS Face Bot is listening on port %s", port);
});

// Initialize AWS objects
const imageClient = new AWS.Rekognition({
    accessKeyId: AWS_ACCESS_KEY_ID,
    secretAccessKey: AWS_SECRET_ACCESS_KEY,
    region: AWS_REGION
});

const s3 = new AWS.S3({
    accessKeyId: AWS_ACCESS_KEY_ID,
    secretAccessKey: AWS_SECRET_ACCESS_KEY
});

// Entry point for a GET from a web browser
app.get('/', function(req, res) {
    res.send("AWS Face Bot is running.");
});

// Entry point for mms text
app.post('/cpaas-mms/', function(req, res) {
    processImage(req.body.From, req.body.To, req.body.Body, req.body.MediaUrl, res);
});

// Entry point for sms text
app.post('/cpaas-sms/', function(req, res) {
    processText(req.body.From, req.body.To, req.body.Body, res);
});

async function processImage(from, to, body, imageUrl, res) {
    await downloadFile(imageUrl, from, to);
    res.type('application/json');
    res.send(`{"return":"ok"}`);
}

async function downloadFile(imageUrl, from, to) {
    const filename = imageUrl.slice(imageUrl.lastIndexOf('/') + 1, imageUrl.indexOf('?'));
    const chunks = [];
    var buffer;
    https.get(imageUrl, function(response) {
        response.on('data', chunk => chunks.push(Buffer.from(chunk)))
            .on('end', () => {
                // Convert chunks to Buffer object
                buffer = Buffer.concat(chunks);
                awsTransfer(filename, buffer, from, to);
            });
    });
}

function awsTransfer(filename, buffer, from, to) {
    var message = "I see: ";
    const mmsMetadata = {
        "type": "CPaaS MMS File",
        "from": from,
        "to": to
    };

    // Upload file to S3 Bucket
    const params = {
        Bucket: AWS_BUCKET,
        Key: filename,
        Body: buffer,
        Metadata: mmsMetadata
    };
    s3.upload(params, function(err, data) {
        if (err) {
            throw err;
        }

        // Send to AWS Image Rekognition for face detection
        const imageParams = {
            CollectionId: "myphotos",
            FaceMatchThreshold: 95,
            Image: {
                S3Object: {
                    Bucket: AWS_BUCKET,
                    Name: filename
                }
            },
            MaxFaces: 1
        };
        imageClient.searchFacesByImage(imageParams, function(err, response) {
            if (err) {
                console.log(err, err.stack);
            } else {
                if (response.FaceMatches.length > 0) {
                    // For demo purposes, work with the first element and ignore all others
                    message = `I am ${String(response.FaceMatches[0].Face.Confidence)} percent confident I see ${response.FaceMatches[0].Face.ExternalImageId}`;
                } else {
                    message = "No registered face found.";
                }

                // Delete file object from S3 Bucket -- Comment out if you wish to preserve file
                const deleteParams = {
                    Bucket: AWS_BUCKET,
                    Key: filename
                };
                s3.deleteObject(deleteParams, function(err, data) {
                    if (err) {
                        console.log(err);
                    }
                });

                // Text image labels to "from" number
                const options = {
                    url: `${CPAAS_URL}${CPAAS_USER}${CPAAS_SEND_SMS}`,
                    body: `From=${to}&To=${from}&Body=${message}`,
                    headers: {
                        'Content-Type': 'text/plain',
                        'Accept': 'application/json',
                        'Authorization': basicAuth
                    },
                    method: 'POST'
                };
                var smsResponse = request.post(options, function(e, r, body) {});
            }
        });
    });
}

async function processText(from, to, body, res) {
    returnTextResponse(`Please text an image of a face.`, from, to, res);
}

async function returnTextResponse(prompt, from, to, res) {
    var xmlDefinition = generateXMLText(from, to, prompt);
    var serverResponse = await buildCPaaSResponse(xmlDefinition);
    console.log(serverResponse);
    res.type('application/xml');
    res.send(serverResponse);
}

function generateXMLText(customer, cpaas, body) {
    var sms = ix.sms({
        text: body,
        to: customer,
        from: cpaas
    });
    var xml_content = [];
    xml_content.push(sms);
    var xmlDefinition = ix.response({ content: xml_content });
    return xmlDefinition;
}

async function buildCPaaSResponse(xmlDefinition) {
    var result = await ix.build(xmlDefinition).then(function(xml) {
        return xml;
    }).catch(function(err) {
        console.log('The generated XML is not valid!', err);
    });
    return result;
}
```
Seeing is Believing
I hosted the application on my Linux server and texted in various photographs of George Clooney, Princess Diana, Jimi Hendrix, and Scarlett Johansson.

After a bit of training (i.e., repeating Steps 1 and 3), the facial recognition was quite remarkable. Note how it found Jimi despite all the other faces behind him. It also behaved correctly when I texted in photos of people the AI engine was not trained to recognize, reporting that no match was found. Failing to find an unknown face is just as important as identifying known ones.
Mischief Managed
This is just a start, but it’s a good one. Sometime soon I will turn my attention to facial recognition in video clips and, ultimately, streaming video. Stay tuned for more fun and games.
I solemnly swear that you are up to something good here. Already thinking of multiple use cases!
Ha! I most definitely am! Thanks for reading and commenting.