Computer vision
The science behind visual ID
A new opt-in feature for Echo Show and Astro provides more-personalized content and experiences for customers who choose to enroll.
By The Amazon visual ID teams
September 28, 2021
With every feature and device we build, we challenge ourselves to think about how we can create an immersive, personalized, and proactive experience for our customers. Our devices are often shared by several people in a household, yet there are times when you want an experience that is personal to you. That was the inspiration for visual ID.
On the all-new Echo Show 15, Echo Show 8, and Echo Show 10, you and other members of your household will soon be able to enroll in visual ID, so that at a glance you can see personalized content such as calendars and reminders, recently played music, and notes for you.
And with Astro, a new kind of household robot, enrolling in visual ID enables Astro to do things like find you to deliver something, such as a reminder or an item in Astro’s cargo bin.
Creating your visual ID
Visual ID is opt-in, so you must first enroll in the feature, much as you can enroll in voice ID (formerly Alexa voice profile) today. During enrollment, you will use the camera on your supported Echo Show device or Astro to take a series of headshots at different angles. For visual ID to accurately recognize you, we require five different angles of your face.
During the enrollment process, the device runs algorithms to ensure that each of the images is of high enough quality. For example, if the room is too dark, you will see on-screen instructions to adjust the lighting and try again. You will also see on-screen notifications as an image of each pose is successfully captured.
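To make the idea of a quality check concrete, here is a minimal sketch of a brightness gate on a single frame. It is an illustration only, not the algorithm the devices run: the threshold value, the function names, and the representation of a frame as rows of 0-255 grayscale values are all assumptions.

```python
from statistics import mean

# Hypothetical threshold: mean luminance (0-255) below which a frame counts as too dark.
MIN_MEAN_BRIGHTNESS = 60.0


def mean_brightness(gray_pixels):
    """Average intensity of a grayscale frame given as rows of 0-255 values."""
    return mean(p for row in gray_pixels for p in row)


def check_frame(gray_pixels, min_brightness=MIN_MEAN_BRIGHTNESS):
    """Return (ok, message) for one candidate enrollment frame."""
    if mean_brightness(gray_pixels) < min_brightness:
        return False, "The room is too dark. Adjust the lighting and try again."
    return True, "Pose captured."


dark_frame = [[10, 20], [30, 40]]
bright_frame = [[200, 180], [190, 210]]
print(check_frame(dark_frame))
print(check_frame(bright_frame))
```

A real quality check would likely look at more than brightness (blur, pose, occlusion), but the shape is the same: reject the frame, tell the user why, and let them try again.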
The images are used to create numeric representations of your facial characteristics. Called vectors (one for each angle of your face), these numeric representations are just that: a string of numbers. The images are also used to revise the vectors in the event of periodic updates to the visual ID model — meaning customers are not required to re-enroll in visual ID every time there is a model update. These images and vectors are securely stored on-device, not in Amazon’s cloud.
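The sketch below shows why keeping the enrollment images makes re-enrollment unnecessary: when the model changes, the vectors can simply be recomputed from the stored images. The `Enrollment` class, the pose labels, the version strings, and the toy "models" are hypothetical stand-ins, not the on-device data layout.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List

Image = List[float]   # stand-in for pixel data
Vector = List[float]  # numeric representation produced by a model


@dataclass
class Enrollment:
    """One person's enrollment: the captured images plus one vector per pose."""
    images: Dict[str, Image]
    vectors: Dict[str, Vector] = field(default_factory=dict)
    model_version: str = ""


def embed_all(enrollment: Enrollment, model: Callable[[Image], Vector], version: str) -> None:
    """Recompute every vector from the stored images using the given model."""
    enrollment.vectors = {pose: model(img) for pose, img in enrollment.images.items()}
    enrollment.model_version = version


def needs_refresh(enrollment: Enrollment, current_version: str) -> bool:
    """True when the stored vectors were produced by a different model version."""
    return enrollment.model_version != current_version


# Toy models: any function from image to vector works for the illustration.
model_v1 = lambda img: [sum(img)]
model_v2 = lambda img: [sum(img), max(img)]

person = Enrollment(images={"pose_1": [1.0, 2.0], "pose_2": [3.0, 4.0]})
embed_all(person, model_v1, "v1")
if needs_refresh(person, "v2"):
    embed_all(person, model_v2, "v2")  # no new photos needed
print(person.vectors)
```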
Up to 10 members of a household per account can enroll on each compatible Echo Show or Astro to enjoy more-personalized experiences for themselves. Customers with more than one visual-ID-compatible device will need to enroll on each device individually.
A screenshot of the enrollment process, during which the device’s camera takes a series of headshots at different angles.
Identifying an enrolled individual
Once you’ve enrolled in visual ID, your device attempts to match people who walk into the camera’s field of view with the visual IDs of enrolled household members. There are two steps to this process, facial detection and facial recognition, and both are done through local processing using machine learning models called convolutional neural networks.
To recognize a person, the device first uses a convolutional neural network to detect when a face appears in the camera’s field of view. In the recognition step, the device compares the detected face with the stored vectors of enrolled household members. If the person is not enrolled in visual ID, the device determines that there are no matches to the stored vectors. The device does not retain images or vectors from unenrolled individuals after processing. All of this happens in fractions of a second and is done securely on-device.
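One common way to match a vector against stored vectors is nearest-neighbor search with a similarity threshold. The sketch below uses cosine similarity and a threshold of 0.8; both are assumptions chosen for illustration, as are the function names and the tiny three-dimensional vectors. The article does not say which similarity measure or threshold visual ID uses.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def identify(query, enrolled, threshold=0.8):
    """Return the enrolled name whose stored vector best matches the query.

    `enrolled` maps a name to that person's stored vectors (one per pose).
    Returns None when no stored vector reaches the threshold, which is the
    "no matches" outcome for someone who is not enrolled.
    """
    best_name, best_score = None, threshold
    for name, vectors in enrolled.items():
        for vector in vectors:
            score = cosine_similarity(query, vector)
            if score >= best_score:
                best_name, best_score = name, score
    return best_name


enrolled = {
    "bob": [[1.0, 0.0, 0.0], [0.9, 0.1, 0.0]],
    "alice": [[0.0, 1.0, 0.0]],
}
print(identify([0.95, 0.05, 0.0], enrolled))  # close to bob's stored vectors
print(identify([0.0, 0.0, 1.0], enrolled))    # matches nobody
```

In this sketch the query vector is a local variable that is never stored, which mirrors the statement that nothing is retained for unenrolled individuals after processing.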
When your supported Echo Show device recognizes you, your avatar and a personalized greeting will appear in the upper right of the screen.
An example of what Echo Show 15 might show on its screen once an enrolled individual is recognized.
What shows on Astro’s screen will depend on what Astro is doing. For example, if you’ve enrolled in visual ID, and Astro is trying to find you, Astro will display text on its screen — “Looking for [Bob]”, followed by “Found [Bob]” — to acknowledge that it’s recognized you.
Enhancing fairness
We set a high bar for equity when it came to designing visual ID. To clear that bar, our scientists and engineers built and refined our visual ID models using millions of images — collected in studies with participants’ consent — explicitly representing a diversity of gender, ethnicity, skin tone, age, ability, and other factors. We then set performance targets to ensure the visual ID feature performed well across groups.
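As a rough illustration of what checking a performance target across groups can look like, the sketch below computes recognition accuracy per group and lists any group that falls short of a target. The group labels, the target value, and the choice of plain accuracy as the metric are assumptions; the article does not specify the metrics or targets used for visual ID.

```python
from collections import defaultdict


def accuracy_by_group(results):
    """Compute accuracy per group from (group, correct) pairs."""
    totals = defaultdict(int)
    hits = defaultdict(int)
    for group, correct in results:
        totals[group] += 1
        hits[group] += int(correct)
    return {group: hits[group] / totals[group] for group in totals}


def groups_below_target(results, target):
    """Return, in sorted order, the groups whose accuracy is under the target."""
    scores = accuracy_by_group(results)
    return sorted(group for group, acc in scores.items() if acc < target)


# Hypothetical evaluation results: (group label, was the recognition correct?)
results = [
    ("group_a", True), ("group_a", True),
    ("group_b", True), ("group_b", False),
]
print(accuracy_by_group(results))
print(groups_below_target(results, target=0.9))
```

Reporting the metric per group, rather than one overall number, is what lets a shortfall in any single group show up instead of being averaged away.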
In addition to consulting with several Amazon Scholars who specialize in computer vision, we also consulted with an external expert in algorithmic bias, Ayanna Howard, dean of the Ohio State University College of Engineering, to review the steps we took to enhance the fairness of the feature. We’ve implemented feedback from our Scholars and Dr. Howard, and we will continue to solicit customer feedback and use it to improve the feature on behalf of our customers.
Privacy by design
As with all of our products and services, privacy was foundational to how we built and designed visual ID. As mentioned above, the visual IDs of enrolled household members are securely stored on-device, and both Astro and Echo Show devices use local processing to recognize enrolled customers. You can delete your visual ID from any device on which you’ve enrolled, either through on-device settings or, for Echo Show, through the Alexa app. This will delete the stored enrollment images and associated vectors from your device. We will also automatically delete your visual ID from an individual device if your face is not recognized by that device for 18 months.
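The 18-month rule can be pictured as a per-device check on when each enrolled person was last recognized. The sketch below is one way to express that; the function names, the use of whole calendar months, and the sample dates are assumptions, not a description of the device software.

```python
from datetime import date

RETENTION_MONTHS = 18  # from the article: deletion after 18 months without recognition


def months_between(earlier, later):
    """Whole calendar months elapsed from `earlier` to `later`."""
    months = (later.year - earlier.year) * 12 + (later.month - earlier.month)
    if later.day < earlier.day:
        months -= 1  # the final month is not yet complete
    return months


def expired_ids(last_recognized, today, retention_months=RETENTION_MONTHS):
    """Return, in sorted order, the names whose visual ID is due for deletion.

    `last_recognized` maps a name to the date this device last recognized them.
    """
    return sorted(
        name
        for name, seen in last_recognized.items()
        if months_between(seen, today) >= retention_months
    )


last_recognized = {
    "bob": date(2021, 9, 28),
    "alice": date(2022, 12, 1),
}
print(expired_ids(last_recognized, today=date(2023, 3, 28)))
```

Because the check runs per device, a person who uses one device regularly but not another would keep their visual ID on the first and lose it on the second.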
It’s still day one for visual ID, Echo Show, and Astro. We look forward to hearing how our customers use visual ID to personalize their experiences with our devices.