Fullmoonfever
Top 20
I see Olivier has recently provided some comments on a post by a freelancer named Artiom, as per the link.
Olivier talks about some tech details I found interesting (my bold). Also, when you go to Olivier's LinkedIn, it says he's working for BRN? I thought he had moved on, or am I mistaken about something?
www.linkedin.com
Olivier Coenen
From theoretical models to chips: I design SOTA spatiotemporal NNs (PLEIADES, TENN (Akida 2+), SCORPIUS) that train like CNNs and run efficiently at the edge like RNNs.
6d
The Austrian Institute of Technology (AIT), where Christoph Posch developed the ATIS sensor, built a linear event-based visual array about 3(?) pixels wide to scan objects moving really fast on a conveyor belt. We used the technique presented here to categorize moving grocery items and read their barcodes as they were put in a shopping basket. The difficulty was the effective temporal resolution, only about 0.1 ms at best, far from the ns resolution of the timestamps. So we needed better techniques, dynamical NNs, to extract all the information and compensate for the time-dependent response of a pixel: if it fired recently, it is less likely to fire again for the same contrast. We didn't have that NN then, and I think I have it now with PLEIADES and its successors.
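Olivier doesn't spell out how that time-dependent pixel response works, but the "if it fired recently, less likely to fire again for the same contrast" behaviour can be sketched with a toy model like the one below. This is purely my own illustration; the threshold, recovery time constant and boost factor are invented numbers, not values from any real sensor.

```python
import numpy as np

# Toy model of an event pixel's time-dependent response (illustrative only).
# Assumption: right after firing, the effective contrast threshold is raised
# and relaxes back with time constant TAU_REFRACTORY. All constants invented.
THETA = 0.15           # nominal log-intensity contrast threshold
TAU_REFRACTORY = 1e-3  # seconds; recovery time constant after an event
BOOST = 2.0            # factor by which the threshold is raised right after an event

def simulate_pixel(times, log_intensity):
    """Emit (timestamp, polarity) events for one pixel given a log-intensity signal."""
    events = []
    ref = log_intensity[0]      # reference level at the last event
    last_event_t = -np.inf
    for t, x in zip(times, log_intensity):
        # Effective threshold decays back to THETA after the last event.
        theta_eff = THETA * (1.0 + (BOOST - 1.0) * np.exp(-(t - last_event_t) / TAU_REFRACTORY))
        if abs(x - ref) >= theta_eff:
            events.append((t, np.sign(x - ref)))
            ref = x
            last_event_t = t
    return events

# Example: a fast sinusoidal brightness change, as a conveyor-belt pixel might see.
t = np.linspace(0.0, 0.02, 20_000)   # 20 ms at 1 µs steps
events = simulate_pixel(t, 0.5 * np.sin(2 * np.pi * 300 * t))
print(f"{len(events)} events; first few: {events[:3]}")
```

A dynamical NN compensating for this effect would, in essence, have to learn to invert that recovery curve from the event stream itself.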
Olivier Coenen
From theoretical models to chips: I design SOTA spatiotemporal NNs (PLEIADES, TENN (Akida 2+), SCORPIUS) that train like CNNs and run efficiently at the edge like RNNs.
6d
The increased resolution brings out another main advantage of event-based vision sensors: one can eventually see/resolve objects that one couldn't at the same frame-based resolution. We could still track drones flying in the sky that were 1/10 the size of a single pixel, for example. Try that with a frame-based camera. We could generate gigapixel-resolution images of our surroundings with a DAVIS240 (240 x 180) while driving a vehicle off-road on a bumpy ride.
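He doesn't describe the reconstruction pipeline, but the usual trick behind resolving sub-pixel detail with a moving event camera is motion-compensated accumulation: re-project each event using the camera motion at its own timestamp and splat it onto a finer grid. A minimal sketch, assuming the motion is already known or estimated elsewhere (the `poses` callable below is hypothetical, not anything from his setup):

```python
import numpy as np

def accumulate_superres(events, poses, upscale=8, width=240, height=180):
    """
    Splat events onto a finer grid after undoing the known camera motion.

    events: iterable of (t, x, y, polarity) with x, y in sensor pixels
    poses:  callable t -> (dx, dy), the estimated sensor translation in pixels
            at time t relative to the reference frame. How that motion is
            estimated (IMU, feature tracking, ...) is outside this sketch.
    """
    canvas = np.zeros((height * upscale, width * upscale))
    for t, x, y, p in events:
        dx, dy = poses(t)
        # Undo the motion, then place the event at sub-pixel precision.
        u = int(round((x - dx) * upscale))
        v = int(round((y - dy) * upscale))
        if 0 <= u < canvas.shape[1] and 0 <= v < canvas.shape[0]:
            canvas[v, u] += p
    return canvas

# Hypothetical usage: a constant sideways drift of 120 px/s.
# events = load_events(...)                                   # (t, x, y, ±1) tuples
# canvas = accumulate_superres(events, poses=lambda t: (120.0 * t, 0.0))
```

The point is only that microsecond-scale timestamps let each event be re-projected individually, which a frame-based camera integrating over milliseconds cannot do.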
Vincenzo Polizzi
PhD Candidate at University of Toronto | SLAM, Machine Learning, Robotics
3d
This is a great application, very nicely explained! One interesting aspect visible in your results is that edges parallel to the motion direction are much less represented, simply because they don't generate many events. A way to address this is to excite pixels from multiple directions, ensuring that all edges trigger events. This is exactly what we explored in a class project that eventually evolved into our paper VibES: we mounted the event camera on a small mechanical platform that oscillates in both X and Y, essentially a tiny "washing-machine-like" motion. By injecting a known motion pattern (and estimating the sinusoidal components online), we obtained dense event streams and significantly sharper reconstructions, even for edges that would otherwise remain silent. Glad to hear your thoughts!
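The VibES paper will have the actual estimator; as a rough illustration of what "estimating the sinusoidal components online" can mean, here is a least-squares fit of the amplitude and phase at a known oscillation frequency, followed by per-event compensation. The function names, the assumption that the frequency is known, and the whole simplified pipeline are my own, not the paper's:

```python
import numpy as np

def fit_sinusoid(t, s, freq):
    """Least-squares fit of s(t) ≈ a*sin(2πf t) + b*cos(2πf t) + c
    for a known oscillation frequency f (Hz)."""
    A = np.column_stack([np.sin(2 * np.pi * freq * t),
                         np.cos(2 * np.pi * freq * t),
                         np.ones_like(t)])
    (a, b, c), *_ = np.linalg.lstsq(A, s, rcond=None)
    return a, b, c

def compensate_events(events, freq, ax, bx, ay, by):
    """Shift each event back by the estimated platform displacement at its timestamp."""
    out = []
    for t, x, y, p in events:
        dx = ax * np.sin(2 * np.pi * freq * t) + bx * np.cos(2 * np.pi * freq * t)
        dy = ay * np.sin(2 * np.pi * freq * t) + by * np.cos(2 * np.pi * freq * t)
        out.append((t, x - dx, y - dy, p))
    return out

# Hypothetical usage: the platform oscillates at a nominal 50 Hz; sx, sy are
# measured (or tracked) displacements sampled at times t.
# ax, bx, _ = fit_sinusoid(t, sx, freq=50.0)
# ay, by, _ = fit_sinusoid(t, sy, freq=50.0)
# sharp_events = compensate_events(events, 50.0, ax, bx, ay, by)
```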
Olivier Coenen
From theoretical models to chips: I design SOTA spatiotemporal NNs (PLEIADES, TENN (Akida 2+), SCORPIUS) that train like CNNs and run efficiently at the edge like RNNs.
6d
This brings out another point: a single cone/rod in our retinas doesn't have the angular resolution to clearly see stars in the night sky; our eyes just don't have the necessary resolution, yet we can clearly see stars. How is that possible? Eye movement and spike timing. Within a certain spatial resolution, spikes fire at the same spatial location as our eyes move. The temporal resolution of the spikes translates into the spatial resolution we can achieve once the spikes accumulate around the same spatial location where the star is. Be drunk and the stars should appear thicker, or will appear to move, because the temporal spike alignment is off. Thus, what we see is clearly a result of the neural processing, not just the inputs: the neural networks processing the spike firing. That's the neural processing we need to take full advantage of event-based (vision) sensors, without having to resort to the periodic sampling of frames that the engineering world is stuck in today.
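To put rough numbers on the eye-movement-plus-spike-timing argument (my own back-of-the-envelope figures, not anything from the comment):

```python
# Back-of-the-envelope version of the "eye movement + spike timing" argument.
# All numbers are order-of-magnitude assumptions, not measurements.
cone_spacing_arcmin = 0.5       # roughly foveal cone spacing
drift_speed_arcmin_per_s = 50   # typical slow ocular drift speed
spike_precision_s = 1e-3        # assume ~1 ms spike-timing precision

# If the eye sweeps the image across the retina at the drift speed, a spike
# timed to within spike_precision_s pins the source position to within:
position_precision_arcmin = drift_speed_arcmin_per_s * spike_precision_s
print(f"positional precision ≈ {position_precision_arcmin:.2f} arcmin "
      f"vs. cone spacing ≈ {cone_spacing_arcmin} arcmin")
# ≈ 0.05 arcmin: accumulating motion-aligned spikes can localize a point
# source well below the photoreceptor spacing, the same trick as the
# motion-compensated event accumulation above.
```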
How to use an event-based camera in an industrial context? Many industrial setups fall under a common configuration: a static camera observing moving parts on a conveyor. In such cases, event-based cameras are very appealing: since the background is static, object extraction comes "for… | Artiom Tchouprina