Patent specifically referencing spiking neural networks:
https://patents.google.com/patent/US20200273180A1/en
Deformable object tracking
Abstract
Various implementations disclosed herein include devices, systems, and methods that use event camera data to track deformable objects such as faces, hands, and other body parts. One exemplary implementation involves receiving a stream of pixel events output by an event camera. The device tracks the deformable object using this data. Various implementations do so by generating a dynamic representation of the object and modifying the dynamic representation of the object in response to obtaining additional pixel events output by the event camera. In some implementations, generating the dynamic representation of the object involves identifying features disposed on the deformable surface of the object using the stream of pixel events. The features are determined by identifying patterns of pixel events. As new event stream data is received, the patterns of pixel events are recognized in the new data and used to modify the dynamic representation of the object.
[0066]
In some implementations, the tracking algorithm performs machine-learning-based tracking. The event stream(s) of the event camera(s) are fed to a machine-learning algorithm. The algorithm either processes each event in turn, processes in batches of events, or events are accumulated spatially or temporally before they are fed to the machine learning algorithm, or a combination thereof. The machine learning algorithm can additionally take as input a set of values from a latent space, which potentially encodes information about the object being tracked and its previous states. In some implementations, the machine learning algorithm is trained to regress directly to a dynamic object representation, or to an intermediate representation that is later converted to the dynamic object representation. Optionally, the machine-learning algorithm can regress to an updated set of values in the latent space, that are then used to process future events. In some implementations, a machine learning algorithm that performs the tracking is configured as a convolutional neural network (CNN), a recurrent network such as a long short-term memory (LSTM) neural network, a spiking neural network (SNN), or a combination of these networks or using any other neural network architecture. FIG. 8 provides an example of a CNN configuration.
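To make paragraph [0066] more concrete, here is a minimal sketch of one way such a pipeline could look: events are accumulated spatially into a two-channel frame and fed, together with a latent state carrying information about previous states, to a tracking model. This is an illustration under stated assumptions, not Apple's implementation; all names (accumulate_events, TrackerModel, the sensor resolution) are hypothetical, and the model is a placeholder for the CNN/LSTM/SNN the patent mentions.

```python
# Hypothetical sketch of event accumulation + latent-state tracking (not the patented design).
import numpy as np

H, W = 260, 346                                   # assumed event-sensor resolution

def accumulate_events(events, h=H, w=W):
    """events: iterable of (x, y, timestamp, polarity) tuples.
    Returns a (2, h, w) frame counting positive/negative events per pixel."""
    frame = np.zeros((2, h, w), dtype=np.float32)
    for x, y, t, polarity in events:
        channel = 0 if polarity > 0 else 1
        frame[channel, y, x] += 1.0
    return frame

class TrackerModel:
    """Stand-in for the CNN/LSTM/SNN: maps an event frame plus a latent state
    to (object representation, updated latent state)."""
    def __init__(self, latent_dim=32):
        self.latent_dim = latent_dim

    def __call__(self, frame, latent):
        # Placeholder regression: a real system would run a trained network here.
        pooled = frame.mean(axis=(1, 2))              # crude global features
        new_latent = 0.9 * latent + 0.1 * np.resize(pooled, latent.shape)
        object_repr = new_latent[:6]                  # e.g. pose parameters
        return object_repr, new_latent

latent = np.zeros(32, dtype=np.float32)
model = TrackerModel()
for event_batch in []:                                # stream of event batches
    frame = accumulate_events(event_batch)
    object_repr, latent = model(frame, latent)        # latent carries history forward
```

The key point, per the abstract, is that the dynamic object representation is updated incrementally as new pixel events arrive rather than recomputed from full frames.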
Patents specifically mentioning neuromorphic processors:
https://patents.google.com/patent/US10282623B1/en
Depth perception sensor data processing
Abstract
Some embodiments provide a sensor data-processing system which generates a depth data representation of an environment based on sensor data representations which are generated by passive sensor devices. The sensor data-processing system generates the depth data representation via applying an algorithm which includes a model architecture which determines depths of various portions of the represented environment based on detecting features that correspond to depth information. The model architecture is established via training an algorithm to generate depth data which corresponds to a sample set of depth data representations of environments, given a corresponding set of image data representations of the environments. As a result, the sensor data-processing system enables depth perception of portions of an environment independently of receiving depth data representations of the environment which are generated by an active sensor device.
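A hedged sketch of the training setup the abstract describes: a model learns to predict depth from passive images, supervised by depth maps an active sensor produced during data collection, and at run time only the passive camera is needed. The toy model and L1 objective below are illustrative stand-ins, not the claimed architecture.

```python
# Toy depth-from-passive-images training loop (illustrative only).
import numpy as np

def l1_depth_loss(predicted_depth, reference_depth):
    """Per-pixel L1 error against the active-sensor depth used as supervision."""
    return np.abs(predicted_depth - reference_depth).mean()

class TinyDepthModel:
    """Stand-in for the learned model architecture: predicts depth as an affine
    function of image brightness. A real system would use a deep network."""
    def __init__(self):
        self.scale, self.bias = 1.0, 0.0

    def __call__(self, image):
        brightness = image.mean(axis=-1) if image.ndim == 3 else image
        return self.scale * brightness + self.bias

    def update(self, image, reference_depth, lr=1e-3):
        predicted = self(image)
        error = np.sign(predicted - reference_depth)        # gradient of |x|
        brightness = image.mean(axis=-1) if image.ndim == 3 else image
        self.scale -= lr * (error * brightness).mean()
        self.bias -= lr * error.mean()

def train(model, image_depth_pairs, epochs=10):
    """image_depth_pairs: (rgb_image, depth_from_active_sensor) training samples."""
    for _ in range(epochs):
        for image, depth in image_depth_pairs:
            model.update(image, depth)
    return model

# After training, depth comes from the passive camera alone:
# depth_map = model(rgb_image)
```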
https://patents.google.com/patent/US10762440B1/en
Sensor fusion and deep learning
Abstract
Some embodiments provide a sensor data-processing system which detects and classifies objects detected in an environment via fusion of sensor data representations generated by multiple separate sensors. The sensor data-processing system can fuse sensor data representations generated by multiple sensor devices into a fused sensor data representation and can further detect and classify features in the fused sensor data representation. Feature detection can be implemented based at least in part upon utilizing a feature-detection model generated via one or more of deep learning and traditional machine learning. The sensor data-processing system can adjust sensor data processing of representations generated by sensor devices based on external factors including indications of sensor health and environmental conditions. The sensor data-processing system can be implemented in a vehicle and provide output data associated with the detected objects to a navigation system which navigates the vehicle according to the output data.
Once an algorithm is trained, it is installed into a sensor data-processing system located in a vehicle. A sensor data-processing system which implements a deep learning algorithm can require less time and power to process sensor data, relative to traditional sensor data-processing systems. In some embodiments, a sensor data-processing system implementing a deep learning algorithm implements general computing hardware configurations, including one or more of general CPU and GPU configurations. In some embodiments, a sensor data-processing system implementing a deep learning algorithm implements one or more particular computing hardware configurations, including one or more of Field-programmable gate array (FPGA) processing circuitry, neuromorphic processing circuitry, etc. Particular computing hardware configurations can provide augmented computing performance with reduced power consumption, relative to conventional hardware configurations, which can be beneficial when a sensor data-processing system implements one or more deep learning algorithms, which can be relatively computationally expensive relative to traditional data-processing algorithms.
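As a rough illustration of the fusion-then-classify flow the abstract outlines, the sketch below weights each sensor's feature representation by a health/environment score before fusing and classifying. The weighting rule and function names are assumptions for illustration, not the patented method.

```python
# Hypothetical sensor-fusion sketch: health-weighted average of per-sensor feature maps.
import numpy as np

def fuse_representations(representations, health_scores):
    """representations: list of (H, W, C) arrays on a common grid.
    health_scores: per-sensor weights in [0, 1] (e.g. a glare-degraded camera -> 0.2)."""
    weights = np.asarray(health_scores, dtype=np.float32)
    weights = weights / max(weights.sum(), 1e-6)
    return sum(w * r for w, r in zip(weights, representations))

def detect_objects(fused, classifier):
    """classifier: any trained feature-detection model, deep or classical."""
    return classifier(fused)

# Example: trust radar more than a degraded camera, then classify on the fused map.
# fused = fuse_representations([camera_feats, radar_feats], [0.2, 1.0])
# detections = detect_objects(fused, trained_model)
```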
Patents related to Dynamic Vision Sensors:
https://patents.google.com/patent/US20200278539A1/en
Method and device for eye tracking using event camera data
Abstract
In one implementation, a method includes emitting light with modulating intensity from a plurality of light sources towards an eye of a user. The method includes receiving light intensity data indicative of an intensity of the emitted light reflected by the eye of the user in the form of a plurality of glints and determining an eye tracking characteristic of the user based on the light intensity data. In one implementation, a method includes generating, using an event camera comprising a plurality of light sensors at a plurality of respective locations, a plurality of event messages, each of the plurality of event messages being generated in response to a particular light sensor detecting a change in intensity of light and indicating a particular location of the particular light sensor. The method includes determining an eye tracking characteristic of a user based on the plurality of event messages.
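To picture what an "event message" might carry and how modulated light sources could be isolated, here is a minimal sketch: each message holds a pixel location and polarity, and pixels whose event rate matches a source's known modulation frequency are treated as glint candidates. Field names and the matching heuristic are assumptions, not the claimed method.

```python
# Hypothetical event-message structure and glint isolation by modulation frequency.
from dataclasses import dataclass
from collections import defaultdict

@dataclass
class EventMessage:
    x: int            # pixel column of the light sensor that fired
    y: int            # pixel row
    timestamp: float  # seconds
    polarity: int     # +1 brighter, -1 darker

def events_matching_modulation(events, frequency_hz, tolerance=0.2):
    """Keep pixels whose event rate is close to 2x the source's modulation
    frequency (one brightening and one darkening event per cycle)."""
    per_pixel = defaultdict(list)
    for e in events:
        per_pixel[(e.x, e.y)].append(e.timestamp)
    glint_pixels = []
    for pixel, times in per_pixel.items():
        if len(times) < 2:
            continue
        rate = (len(times) - 1) / (max(times) - min(times) + 1e-9)
        if abs(rate - 2 * frequency_hz) / (2 * frequency_hz) < tolerance:
            glint_pixels.append(pixel)
    return glint_pixels
```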
https://patents.google.com/patent/US20210068652A1/en
Glint-Based Gaze Tracking Using Directional Light Sources
Abstract
Various implementations determine gaze direction based on a cornea center and (a) a pupil center or (b) an eyeball center. The cornea center is determined using a directional light source to produce one or more glints reflected from the surface of the eye and captured by a sensor. The angle (e.g., direction) of the light from the directional light source may be known, for example, using an encoder that records the orientation of the light source. The known direction of the light source facilitates determining the distance of the glint on the cornea and enables the cornea position to be determined, for example, based on a single glint. The cornea center can be determined (e.g., using an average cornea radius, or a previously measured cornea radius or using information from a second glint). The cornea center and a pupil center or eyeball center may be used to determine gaze direction.
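The final step the abstract mentions reduces to simple geometry: once a cornea center and a pupil (or eyeball) center are estimated in 3-D, the optical axis is the unit vector from the cornea center through the pupil center. The sketch below shows only that last step, with calibration from optical to visual axis omitted.

```python
# Gaze direction as the line through cornea center and pupil center (optical axis only).
import numpy as np

def gaze_direction(cornea_center, pupil_center):
    """Both inputs are 3-D points in the same (e.g. camera) coordinate frame."""
    axis = np.asarray(pupil_center, dtype=float) - np.asarray(cornea_center, dtype=float)
    return axis / np.linalg.norm(axis)

# Example: cornea center estimated from a glint, pupil center from the image.
# direction = gaze_direction([0.0, 0.0, 30.0], [0.5, 1.0, 25.0])
```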
https://patents.google.com/patent/US20200348755A1/en
Event camera-based gaze tracking using neural networks
Abstract
One implementation involves a device receiving a stream of pixel events output by an event camera. The device derives an input image by accumulating pixel events for multiple event camera pixels. The device generates a gaze characteristic using the derived input image as input to a neural network trained to determine the gaze characteristic. The neural network is configured in multiple stages. The first stage of the neural network is configured to determine an initial gaze characteristic, e.g., an initial pupil center, using reduced resolution input(s). The second stage of the neural network is configured to determine adjustments to the initial gaze characteristic using location-focused input(s), e.g., using only a small input image centered around the initial pupil center. The determinations at each stage are thus efficiently made using relatively compact neural network configurations. The device tracks a gaze of the eye based on the gaze characteristic.
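A hedged sketch of the two-stage idea in this abstract: a coarse stage runs on a downsampled event image to get an initial pupil center, then a second stage refines it using only a small crop around that estimate. Both "networks" below are placeholders for trained CNNs; crop size, scale factor, and names are assumptions.

```python
# Coarse-to-fine pupil localization sketch (placeholder networks, assumed parameters).
import numpy as np

def coarse_stage(event_image, coarse_net, scale=4):
    small = event_image[::scale, ::scale]          # reduced-resolution input
    cx, cy = coarse_net(small)                     # normalized coords in [0, 1]
    h, w = event_image.shape[:2]
    return int(cx * w), int(cy * h)                # initial pupil center in pixels

def refine_stage(event_image, initial_center, refine_net, crop=64):
    x0, y0 = initial_center
    half = crop // 2
    patch = event_image[max(0, y0 - half):y0 + half,
                        max(0, x0 - half):x0 + half]   # location-focused input
    dx, dy = refine_net(patch)                     # small correction in pixels
    return x0 + dx, y0 + dy

# pupil = refine_stage(img, coarse_stage(img, coarse_net), refine_net)
```

Running the refinement on a small crop keeps both networks compact, which matches the efficiency argument in the abstract.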
https://patents.google.com/patent/US10845601B1/en
AR/VR controller with event camera
Abstract
In one implementation, a method involves obtaining light intensity data from a stream of pixel events output by an event camera of a head-mounted device (“HMD”). Each pixel event is generated in response to a pixel sensor of the event camera detecting a change in light intensity that exceeds a comparator threshold. A set of optical sources disposed on a secondary device that are visible to the event camera are identified by recognizing defined illumination parameters associated with the optical sources using the light intensity data. Location data is generated for the optical sources in an HMD reference frame using the light intensity data. A correspondence between the secondary device and the HMD is determined by mapping the location data in the HMD reference frame to respective known locations of the optical sources relative to the secondary device reference frame.
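One plausible reading of the final mapping step: each LED on the controller blinks with a distinct, known pattern, so detected 2-D blobs in the event data can be tagged with LED ids; matching those detections to the LEDs' known 3-D positions on the controller then allows a standard perspective-n-point (PnP) solve for the controller pose in the HMD frame. The patent does not name a solver; OpenCV's solvePnP is swapped in here purely as a sketch, and it needs at least four matched LEDs.

```python
# Hypothetical controller-pose sketch using identified LEDs and a PnP solve.
import numpy as np
import cv2

def controller_pose(detections, led_model_points, camera_matrix):
    """detections: dict {led_id: (u, v)} image coords identified by blink pattern.
    led_model_points: dict {led_id: (x, y, z)} positions in the controller frame."""
    ids = [i for i in detections if i in led_model_points]
    image_pts = np.array([detections[i] for i in ids], dtype=np.float64)
    object_pts = np.array([led_model_points[i] for i in ids], dtype=np.float64)
    ok, rvec, tvec = cv2.solvePnP(object_pts, image_pts, camera_matrix, None)
    return (rvec, tvec) if ok else None   # controller rotation + translation in HMD frame
```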
https://patents.google.com/patent/US20200258278A1/en
Detecting physical boundaries
Abstract
Techniques for alerting a user, who is immersed in a virtual reality environment, to physical obstacles in their physical environment are disclosed.
https://patents.google.com/patent/US20210051406A1/en
Method and device for sound processing for a synthesized reality setting
Abstract
In one implementation, a method of transforming a sound into a virtual sound for a synthesized reality (SR) setting is performed by a head-mounted device (HMD) including one or more processors, non-transitory memory, a microphone, a speaker, and a display. The method includes displaying, on the display, an image representation of a synthesized reality (SR) setting including a plurality of surfaces associated with an acoustic reverberation property of the SR setting. The method includes recording, via the microphone, a real sound produced in a physical setting. The method further includes generating, using the one or more processors, a virtual sound by transforming the real sound based on the acoustic reverberation property of the SR setting. The method further includes playing, via the speaker, the virtual sound.
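A minimal sketch, assuming the SR setting's acoustic reverberation property can be summarized as a room impulse response (RIR): the recorded real sound is convolved with that RIR to produce the virtual sound that is played back. The patent states the goal, not this specific method, and the example RIR below is invented.

```python
# Convolution-reverb sketch: real sound -> virtual sound via an assumed RIR.
import numpy as np

def apply_virtual_reverb(recorded_audio, room_impulse_response):
    """recorded_audio, room_impulse_response: 1-D float arrays at the same sample rate."""
    virtual = np.convolve(recorded_audio, room_impulse_response)
    peak = np.max(np.abs(virtual)) + 1e-9
    return virtual / peak            # normalize to avoid clipping on playback

# Example: a larger virtual room would use a longer, more slowly decaying RIR.
# rir = np.exp(-np.linspace(0, 6, 48000)) * np.random.randn(48000) * 0.1
# out = apply_virtual_reverb(mic_samples, rir)
```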
https://patents.google.com/patent/WO2020159784A1/en
Biofeedback method of modulating digital content to invoke greater pupil radius response
Abstract
One exemplary implementation displays a visual characteristic associated with an object on a display of a device and utilizes a sensor of the device to obtain physiological data associated with a pupillary response of a user to the visual characteristic. The device adjusts the visual characteristic based on the obtained physiological data to enhance pupillary responses of the user to the object and displays the adjusted visual characteristic to the user. For example, the adjusted visual characteristic may be selected based on previously identified pupillary responses of the user to particular visual characteristics.
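The abstract implies a closed loop: show a visual characteristic, measure the pupillary response, and keep the variant that produced the stronger response. The sketch below is only one way such a loop could be wired; the selection rule and callables are invented for illustration.

```python
# Hypothetical biofeedback loop: pick the visual variant with the largest pupil response.
def adapt_visual_characteristic(variants, measure_pupil_response, display):
    """variants: candidate values of the characteristic (e.g. brightness levels).
    measure_pupil_response: callable returning the pupil-radius change after display.
    display: callable that renders a variant on the device."""
    best_variant, best_response = None, float("-inf")
    for variant in variants:
        display(variant)
        response = measure_pupil_response()
        if response > best_response:
            best_variant, best_response = variant, response
    display(best_variant)            # keep the variant with the strongest response
    return best_variant
```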
https://patentscope.wipo.int/search/en/detail.jsf?docId=US339762766
US20210334992 - Sensor-based depth estimation
Abstract
Various implementations disclosed herein include techniques for estimating depth using sensor data indicative of changes in light intensity. In one implementation a method includes acquiring pixel events output by an event sensor that correspond to a scene disposed within a field of view of the event sensor. Each respective pixel event is generated in response to a specific pixel sensor within a pixel array of the event sensor detecting a change in light intensity that exceeds a comparator threshold. Mapping data is generated by correlating the pixel events with multiple illumination patterns projected by an optical system towards the scene. Depth data is determined for the scene relative to a reference position based on the mapping data.
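Once the mapping data ties each event-sensor pixel to the projector pattern (e.g. the projector column) that triggered it, depth follows from standard stereo-style triangulation between camera and projector. The sketch below assumes rectified camera/projector geometry; the focal length, baseline, and mapping format are illustrative.

```python
# Structured-light triangulation sketch over event-derived mapping data (assumed geometry).
import numpy as np

def triangulate_depth(camera_col, projector_col, focal_px, baseline_m):
    """Depth from the column disparity between a camera pixel and the projector
    column that illuminated it, assuming rectified geometry."""
    disparity = camera_col - projector_col
    if disparity == 0:
        return np.inf
    return focal_px * baseline_m / disparity

def depth_map_from_mapping(mapping, focal_px=600.0, baseline_m=0.05):
    """mapping: dict {(cam_x, cam_y): projector_col} built by correlating pixel
    events with the times each projector pattern was active."""
    return {pixel: triangulate_depth(pixel[0], proj_col, focal_px, baseline_m)
            for pixel, proj_col in mapping.items()}
```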
Related News reports:
https://www.gadgetsnow.com/tech-new...clude-15-cameras/amp_articleshow/82068339.cms
Apple AR headset may include 15 cameras
The Apple AR headset will reportedly use Sony’s micro-OLED displays. “Although Apple has been focusing on AR, we think the hardware specifications of this product can provide an immersive experience that is significantly better than existing VR products.”
https://www.msn.com/en-us/news/tech...eak-might-be-the-craziest-one-yet/ar-BB1fF1ak
The latest Apple AR headset leak might be the craziest one yet
The headset that Apple has reportedly been working on for years could feature as many as 15 cameras when it launches next year. In a previous report, analyst Ming-Chi Kuo predicted that Apple’s headset with augmented reality and virtual reality capabilities could be available as soon as 2022 and would cost around $1,000.
As for why Apple would need to put so many cameras on its headset, 9to5Mac notes that in addition to collecting data from the outside world, Apple is reportedly interested in tracking the eye movements of the user while they’re wearing the headset. In yet another report, Kuo said that the headset will be able to tell where the user is looking, determine if they’re blinking, and use iris recognition to identify users — basically Face ID for headsets.
https://appleinsider.com/articles/2...ently-identify-and-annotate-items-of-interest
Apple AR will intelligently identify and annotate items of interest
https://www.macrumors.com/roundup/apple-glasses/
Apple Glasses
https://www.engadget.com/apples-mixed-reality-headset-standalone-ming-chi-kuo-082117045.html
Apple's mixed reality headset may be a standalone device
Apple's long-rumored mixed reality headset will be powered by two processors, according to renowned analyst Ming-Chi Kuo. In Kuo's latest research report seen by MacRumors and 9to5Mac, the analyst said that the device will have a main processor with the same computing power as the M1 chip and a secondary processor to handle all sensor-related computing. With both processors in place, the headset won't need to be tethered to an iPhone or a Mac.
https://www.computerworld.com/article/3642649/analyst-apples-ar-glasses-will-run-mac-chips.html
Analyst: Apple's AR glasses will run Mac chips
What’s really critical is the expected division of labor; Kuo says the headset will have one processor with the “same computing power level as the Mac,” while another chip will handle “sensor-related computing.”
The additional chip is required because the sensors gather so much information that needs to be managed in real time. Kuo says the headset holds “at least” six to eight optical modules to provide “continuous video see-through AR services.”