Recently released Sony Patent .
I’ve copied some points of interest below ,there’s many more if you care to click on the link .
Sony Patent | Deployment of dynamic vision sensor hybrid element in method for tracking a controller and simultaneous body tracking, slam or safety shutter.
patent.nweon.com
Eye Tracking
Aspects of the present disclosure may be applied to eye tracking. Generally, eye tracking image analysis takes advantage of characteristics distinctive to how light is reflected off of the eyes to determine eye gaze direction from the image. For example, the image may be analyzed to identify eye location based on corneal reflections in the image data, and the image may be further analyzed to determine gaze direction based on a relative location of the pupils in the image.
Two common gaze tracking techniques for determining eye gaze direction based on pupil location are known as Bright Pupil tracking and Dark Pupil tracking. Bright Pupil tracking involves illumination of the eyes with a light source that is substantially in line with the optical axis of the DVS, causing the emitted light to be reflected off of the retina and back to the DVS through the pupil. The pupil presents in the image as an identifiable bright spot at the location of the pupil, similar to the red eye effect which occurs in images during conventional flash photography. In this method of gaze tracking, the bright reflection from pupil itself helps the system locate the pupil if contrast between pupil and iris is not enough.
Dark Pupil tracking involves illumination with a light source that is substantially offline from the optical axis of the DVS, causing light directed through the pupil to be reflected away from the optical axis of the DVS, resulting in an identifiable dark spot in the Event at the location of the pupil. In alternative Dark Pupil tracking systems, an infrared light source and cameras directed at eyes can look at corneal reflections. Such DVS based systems track the location of the pupil and corneal reflections which provides parallax due to different depths of reflections gives additional accuracy.
FIG.
18A depicts an example of a dark pupil gaze tracking system
1800 that may be used in the context of the present disclosure. The gaze tracking system tracks the orientation of a user’s eye E relative to a display screen
1801 on which visible images are presented. While a display screen is utilized in the example system of FIG.
18A, certain alternative embodiments may utilize an image projection system capable of projecting images directly into the eyes of a user. In these embodiments, the user’s eye E would be tracked relative to the images projected into the user’s eyes. In the example of FIG.
18A, the eye E gathers light from the screen
1801 through a variable iris I and a lens L projects an image on the retina R. The opening in the iris is known as the pupil. Muscles control rotation of the eye E in response to nerve impulses from the brain. Upper and lower eyelid muscles ULM, LLM respectively control upper and lower eyelids UL LL in response to other nerve impulses.
Light sensitive cells on the retina R generate electrical impulses that are sent to the user’s brain (not shown) via the optic nerve ON. The visual cortex of the brain interprets the impulses. Not all portions of the retina R are equally sensitive to light. Specifically, light-sensitive cells are concentrated in an area known as the fovea.
The illustrated image tracking system includes one or more infrared light sources
1802, e.g., light emitting diodes (LEDs) that direct non-visible light (e.g., infrared light) toward the eye E. Part of the non-visible light reflects from the cornea C of the eye and part reflects from the iris. The reflected non-visible light is directed toward a DVS
1804sensitive to infrared light by a wavelength-selective mirror
1806. The mirror transmits visible light from the screen
1801 but reflects the non-visible light reflected from the eye.
The DVS
1804 produces an event of the eye E which may be analyzed to determine a gaze direction GD from the relative position of the pupil. This event may be produced with a processor
1805. The DVS
1804 is advantageous in this implementation as the extremely fast update rate for events provides near real time information on changes in the user’s gaze.
As seen in FIG.
18B, the event
1811 showing a user’s head H may be analyzed to determine a gaze direction GD from the relative position of the pupil. For example, analysis may determine a 2-dimensional offset of the pupil P from a center of the eye E in the image. The location of the pupil relative to the center may be converted to a gaze direction relative to the screen
1801, by a straightforward geometric computation of a three-dimensional vector based on the known size and shape of the eyeball. The determined gaze direction GD is capable of showing the rotation and acceleration of the eye E as it moves relative to the screen
1801.
As also seen in FIG.
18B, the event may also include reflections
1807 and
1808 of the non-visible light from the cornea C and the lens L, respectively. Since the cornea and lens are at different depths, the parallax and refractive index between the reflections may be used to provide additional accuracy in determining the gaze direction GD. An example of this type of eye tracking system is a dual Purkinje tracker, wherein the corneal reflection is the 1st Purkinje Image, and the lens reflection is the 4th Purkinje Image. There may also be reflections
1808 from a user’s eyeglasses
1809, if these are worn a user.
Performance of eye tracking systems depend on a multitude of factors, including the placement of light sources (IR, visible, etc.) and DVS, whether user is wearing glasses or contacts, Headset optics, tracking system latency, rate of eye movement, shape of eye (which changes during the course of the day or can change as a result of movement), eye conditions, e.g., lazy eye, gaze stability, fixation on moving objects, scene being presented to user, and user head motion. The DVS provides an extremely fast update rate for events with reduced extraneous information output to the processor. This allows for quicker processing and faster gaze tracking state and error parameter determination.
Error parameters that may be determined from gaze tracking data may include, but are not limited to, rotation velocity and prediction error, error in fixation, confidence interval regarding the current and/or future gaze position, and errors in smooth pursuit. State information regarding a user’s gaze involves the discrete state of the user’s eyes and/or gaze. Accordingly, example state parameters that may be determined from gaze tracking data may include, but are not limited to, blink metrics, saccade metrics, depth of field response, color blindness, gaze stability, and eye movement as a precursor to head movement.
In certain implementations, the gaze tracking error parameters can include a confidence interval regarding the current gaze position. The confidence interval can be determined by examining the rotational velocity and acceleration of a user’s eye for change from last position. In alternative embodiments, the gaze tracking error and/or state parameters can include a prediction of future gaze position. The future gaze position can be determined by examining the rotational velocity and acceleration of eye and extrapolating the possible future positions of the user’s eye. In general terms, the DVS update rate of the gaze tracking system may lead to a small error between the determined future position and the actual future position for a user with larger values of rotational velocity and acceleration because the updated rate of the DVS is so high this small error may be significantly less than existing camera based systems.
Finger Position Tracking
SNN
In an alternative implementation finger tracking may be performed without the use of one or more light sources. A machine learning model may be trained with a machine learning algorithm to detect finger position from events generated from ambient light changes due to finger movement. The machine learning model may be a general machine learning model such as a CNN, RNN or DNN as discussed above. In some implementations specialized machine learning model such as for example and without limitation a spiking (or sparking) neural network (SNN) may be trained with a specialized machine learning algorithm. An SNN mimics biological NNs by having an activation threshold and a weight that is adjusted according to a relative spike time within an interval, also known as Spike-timing-dependent-plasticity (STDP). When the activation threshold is achieved the SNN is said to spike and transmit its weight to the next layer. An SNN may be trained via STDP and supervised or unsupervised learning techniques. and More information about SNNs can be found in Tavanaei, Amirhossein et al. “Deep Learning in Spiking Neural Networks” Neural Networks (2018) arXiv:1804.08150, the contents of which are incorporated herein by reference for all purposes.
Alternatively, a high dynamic range (HDR) image may be constructed using aggregated events from ambient data. A machine learning model trained to recognize hand position or controller position and orientation from HDR images. The trained machine learning model may be applied to HDR images generated from the events to determine the hand/finger position or controller position and orientation. The machine learning model may be a general machine learning model trained with supervised learning techniques as discussed in the general neural network training section.