Hi All
Here is another paper that I cannot provide a reference for, and it might end up being spread over a couple of posts. Relax Pom, I won't be asking questions. My opinion only, DYOR. Fact Finder:
EDGX-1: A New Frontier in Onboard AI Computing with a Heterogeneous and Neuromorphic Design

Nick Destrycker (EDGX, Belgium), Wouter Benoot (EDGX, Belgium), João Mattias (EDGX, Belgium), Ivan Rodriguez (BSC, Spain), David Steenari (ESA, The Netherlands)
Abstract—In recent years, the demand for onboard artificial intelligence (AI) computing in satellites has increased dramatically, as it can enhance the autonomy, efficiency, and capabilities of space missions. However, current data processing units (DPUs) face significant challenges in handling the complex AI algorithms and large datasets required for onboard processing. In this paper, we propose a novel heterogeneous DPU architecture that combines several different processing units into a single architecture. The proposed DPU leverages the strengths of each processing unit to achieve high performance, flexibility, and energy efficiency, while addressing the power, storage, bandwidth and environmental constraints of space. We present the design methodology for the DPU, including the hardware architecture and AI capabilities. The product represents a potentially significant advancement in the field of onboard AI computing in satellites, with applications in a wide range of space missions.
Index Terms—Onboard AI Processing, GPU, Neuromorphic, Heterogeneous computing, in-orbit retraining, AI Acceleration, low power, high performance, continuous online learning
I. INTRODUCTION
The demand for artificial intelligence (AI) in space applications has been growing rapidly in recent years. However, the limited computational resources available onboard spacecraft have posed a significant challenge for AI implementation [1] [2]. To address this challenge, data processing units (DPUs) that leverage heterogeneous computing platforms have emerged as a promising solution for high-performance AI computing [3] [4] [5]. These platforms can leverage the strengths of each processing unit to accelerate AI computations and achieve higher energy efficiency compared to traditional single-processor systems.
In this paper, we introduce EDGX-1, a novel DPU that combines classic computing units and neuromorphic processing capabilities for high-performance AI computing in space applications designed for class V type missions. EDGX-1 is a heterogeneous computing platform that integrates a CPU, GPU, TPU, FPGA and NPU. It allows the creation of flexible, dynamic AI pipelines to infer and even retrain AI/ML algorithms onboard. Through the NPU, the EDGX-1 naturally supports continual learning in the ML models.
The neuromorphic computing capability of the EDGX-1 DPU is a significant innovation. It is designed to mimic the neural structure of the brain [6], enabling it to perform certain computations much faster and more efficiently than traditional computing platforms. The integration of a neuromorphic processing unit in the EDGX-1 DPU can enhance the system's ability to handle AI applications that require real-time processing [7], low power consumption and high data bandwidth.
The EDGX-1 is targeted towards the satcom market as well as the earth observation market and is well-suited for a variety of space applications, including optical imaging, cognitive radio, and cognitive SAR.
For optical imaging, the heterogeneous computing architecture and neuromorphic computing capability can enhance the system's ability to process high-resolution images and data efficiently. In cognitive radio applications such as dynamic spectrum allocation and interference management, the EDGX-1 can be used for low-power spectrum monitoring, analysis and real-time signal classification and characterization. In cognitive SAR applications, the EDGX-1 can leverage its unique processing architecture to provide real-time SAR instrument zooming, object classification and accelerated data processing to support environmental monitoring and rapid disaster response. Overall, the EDGX-1 offers a powerful and versatile platform for a wide range of space applications. The flexible architecture also allows for swift adaptation to serve other spacecraft payloads and needs.
The remainder of this paper is structured as follows. Section II addresses the current challenges faced by data processing units and potential solutions to the problem. Section III provides an overview of the EDGX-1 DPU's heterogeneous computing architecture and neuromorphic computing capability. Section IV describes the AI operational and retraining environment capabilities onboard the EDGX-1. Section V concludes the paper and discusses future work.
II. ONBOARD AI CHALLENGES AND SOLUTIONS
A. Meeting Power Budget of Microsatellites and Cubesats
As the power budget of microsatellites and CubeSats is often minimal, the EDGX-1 provides various ways to configure its power consumption, so that system engineers can completely control and limit the EDGX-1's power draw. For the FPGA, they can decide upon the total power consumption during the bitstream generation process. Similarly, they can turn off some of the subsystems in the boot configuration for the SoC. Some SoC subsystems can even be changed on the fly, tailoring to a dynamic power budget on the satellite in orbit. Drastically changing power modes raises the question of whether it would affect the radiation characteristics of the device. However, preliminary tests from Rodriguez et al. [8] indicate that the radiation characteristics remain consistent between power modes.
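As an illustration of what such on-the-fly SoC power reconfiguration could look like in practice, the sketch below switches NVIDIA Jetson power modes through the stock nvpmodel utility. The mode IDs and wattages are device-specific assumptions and are not taken from the paper.

```python
# Illustrative only: adjust a Jetson SoC power mode at runtime in response to the
# satellite's available power budget. Mode IDs and wattages are placeholders;
# consult the nvpmodel configuration of the specific Jetson module in use.
# Running nvpmodel typically requires root privileges.
import subprocess

# Hypothetical mapping from available bus power to nvpmodel mode IDs.
POWER_MODES = {
    "low":  1,   # e.g. a 10 W budget
    "high": 0,   # e.g. the module's maximum-performance mode
}

def set_power_mode(budget: str) -> None:
    """Select an nvpmodel power mode based on the current power budget."""
    mode = POWER_MODES[budget]
    subprocess.run(["nvpmodel", "-m", str(mode)], check=True)

def query_power_mode() -> str:
    """Return the currently active nvpmodel power mode."""
    result = subprocess.run(["nvpmodel", "-q"],
                            capture_output=True, text=True, check=True)
    return result.stdout.strip()

if __name__ == "__main__":
    set_power_mode("low")       # e.g. eclipse phase: throttle the SoC
    print(query_power_mode())
```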
Since the demand for more data acquisition keeps rising, future data processing systems will need to process increasingly more data within the same power budget. The EDGX-1 addresses this problem by implementing the Akida processor from BrainChip. This digital neuromorphic chip mimics the human brain to work at high energy efficiency. It calculates in an event-driven fashion, using power only when needed, reducing overall power consumption. Integrating the Akida is ideal for satellites with a limited power budget.
Finally, the design of the EDGX-1 allows not only the use of the NVIDIA Orin embedded SoC; it also supports multiple previous generations of NVIDIA embedded devices, such as the TX2 or the Xavier family, which lowers power consumption further at the cost of peak performance. Even within the same NVIDIA generation, the EDGX-1 can accommodate the Orin Nano, delivering lower power consumption at reasonable performance. All these SoCs share the same hardware interface and software stack. This way, the EDGX-1 can meet any power requirement and adapt to a large spectrum of smallsat, microsat and cubesat mission scenarios.
B. Reliability of COTS System-on-Chip in Space Environment
Radiation can impact semiconductors [9] and affect device reliability [10]. High-energy particles can disrupt charges in a transistor's layers, causing unexpected behaviour known as Single-Event Effects (SEEs) [11]. SEEs can be classified as Single-Event Upsets (SEUs) or Single-Event Functional Interrupts (SEFIs) [12]. In the former, the SEE manifests itself as a bit-flip in memory, whilst in the latter, a bit-flip in the control logic causes a loss of device functionality. While neither physically nor permanently harms the device, a radiation event causing a MOSFET transistor related to power regulation to remain open can result in a destructive Single-Event Latch-up (SEL). Most SEEs, including SELs, are transient errors and can be cleared with a power cycle.
Radiation testing is crucial to assessing electronic devices' reliability in safety-critical systems. These tests use protons, neutrons, heavy ions, two-photon absorption [13] or gamma radiation (also known as Total Ionizing Dose (TID) testing) to accelerate the appearance of SEEs for analysis. TID testing can also shine a light on the device's ageing by accumulating radiation faster than it would naturally occur. Neutrons are commonly used for terrestrial applications, while protons and heavy ions are preferred for space systems.
The EDGX-1 targets the NVIDIA Jetson Orin NX. Although no radiation test has been performed on this device, some of the older modules from NVIDIA have been tested. For example, the Jetson TX2 [14] and the Jetson Xavier NX [8] have undergone radiation testing. The latter is relevant for the EDGX-1 since the Xavier directly preceded the Orin. Even though we cannot extrapolate the radiation characteristics to the Orin, most of its subsystems and processor architecture are similar to the Xavier. Radiation tests conducted on the Xavier [8], utilising the Reliability, Availability, and Serviceability (RAS) features from ARM, pointed out that the cache tags were the leading cause of SEFIs. These tags are only protected by a single parity bit, causing complete reboots due to the inability to recover. In the Orin family, NVIDIA changed the processor to the Cortex-A78AE [15]. 1-bit correction and 2-bit detection error correction codes now protect the cache tags. This change could decrease the sensitivity to SEFIs, but further radiation tests of the Orin will need to verify this hypothesis.
C. Availability of System in Space Environment
Devices built with these COTS components have a high chance of encountering SEFIs, which could affect their availability. The lack of availability can cause the loss of data or mission return. Availability is an open problem that has also been explored in the automotive domain. Possible solutions have been proposed, like redundant kernels in the GPU section [16] or register protection for the CPU [17]. A more resilient external device could also act as the interface between the payload and the COTS device, ensuring no data loss in the case of a SEFI [18]. Luckily, because the EDGX-1 is a very computationally capable device, it can catch up and re-compute at an increased rate. This way, the system still processes all data safeguarded by the resilient supervisor. This concept has been explored by Kritikakou et al. [19] with promising results.
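To make the supervisor concept concrete, here is a minimal, purely illustrative sketch of a resilient supervisor loop that buffers payload data, monitors the COTS compute node through a heartbeat, power-cycles it on a suspected SEFI, and replays the buffered backlog. All interfaces, names and timeouts are assumptions, not part of the EDGX-1 design.

```python
# Illustrative supervisor sketch: buffer payload data, watch for a missing
# heartbeat from the COTS compute node, power-cycle it on a suspected SEFI,
# and replay the buffered backlog so no data is lost. All objects used here
# (payload, node, power_switch) are hypothetical placeholders.
import time
from collections import deque

HEARTBEAT_TIMEOUT_S = 2.0   # assumed heartbeat deadline
buffer: deque = deque()     # data safeguarded by the resilient supervisor

def supervise(payload, node, power_switch):
    last_heartbeat = time.monotonic()
    while True:
        # Always capture payload data into the supervisor's buffer first.
        sample = payload.read()
        if sample is not None:
            buffer.append(sample)

        if node.heartbeat_received():
            last_heartbeat = time.monotonic()
        elif time.monotonic() - last_heartbeat > HEARTBEAT_TIMEOUT_S:
            # Suspected SEFI: clear it with a power cycle, then wait for reboot.
            power_switch.cycle()
            node.wait_until_ready()
            last_heartbeat = time.monotonic()

        # The compute node catches up at an increased rate after recovery.
        while buffer and node.ready_for_work():
            node.process(buffer.popleft())
```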
D. In-orbit flexibility, adaptability and AI retrainability
Space missions are subject to a wide array of factors that can affect the performance of AI algorithms. Variations in data distributions, changes in the spacecraft's position, sensor drift, unknown new data, unforeseen new environment parameters and evolving mission objectives can all contribute to making pre-trained AI models less effective over time. Therefore, the challenge lies in developing a mechanism to update and adapt AI models without requiring frequent costly communication with ground stations. Enabling AI models to adapt and retrain in orbit autonomously is crucial for maintaining optimal performance throughout the lifetime of the mission.
Ensuring flexibility and adaptability in the data processing part of the payload is crucial for securing a future-proof satellite design for next-generation missions. The world of AI and processing technology is changing rapidly, rendering today's data processing units obsolete tomorrow. Implementing future technology is critical to allow for changing business models and to keep relevance in today's fast-paced environment.
The EDGX-1 DPU addresses the challenge of in-orbit adaptability through its innovative capability to retrain AI algorithms directly onboard the spacecraft. This capability stems from the heterogeneous architecture that combines various processing units, which can be reprogrammed, enabling dynamic adjustments and updates to AI models without extensive external intervention. Conventional AI retrainability as well as continual online learning through neuromorphic technology are available onboard the EDGX-1 DPU and are further explained in Section IV.
Incorporating onboard AI retraining within the EDGX-1 DPU enhances the overall capabilities of space missions by ensuring that AI models remain effective and accurate in ever-changing conditions. This dynamic adaptability not only optimizes decision-making but also reduces the dependence on ground-based intervention for model updates.
Fig. 1. Hardware Architecture.
III. EDGX-1 HARDWARE ARCHITECTURE
A low-power heterogeneous computing design that combines a GPU, FPGA, CPU, TPU, and NPU can potentially solve the problems with current data processing units and boost onboard AI computing performance for satellites.
The general hardware architecture consists of multiple printed circuit boards (PCBs), designed according to the PCIe/104 form factor specification, which brings a large range of benefits for embedded computing applications, including high-speed connectivity, compact size, rugged design, scalability, and interoperability.
Internal power and data transmission occur through the PCIe/104 triple-branch stacking connector, which acts as a structural and electrical backbone for the full DPU. This data bus provides power at 3.3 V, 5 V and 12 V and has both high-speed (PCIe x1, PCIe x4, USB 2.0/3.0) and low-speed (CAN, I2C) data links.
The overall DPU is therefore arranged as a modular stack of these PCB modules, as seen in Figure 1, where each PCB provides specialized computing capabilities and can be seamlessly integrated (or not) according to the needs and constraints of the given mission. Multiples of the same type of unit can be stacked to provide enhanced computational power.
Every processing board provides two high-speed external interfaces (ETH, USB 2.0/3.0) for data transfer as well as two low-speed interfaces (CAN, UART) for command and control. Additional GPIO pins are available for customisation. The power supply and distribution on the EDGX-1 includes overvoltage, overcurrent and latch-up protection.
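As a purely illustrative aid (not part of the paper), the following sketch models how a mission-specific DPU stack could be composed from these module types, reflecting the rule that the Core module is mandatory and the other modules are optional and stackable. All class names, fields and wattages are assumptions.

```python
# Hypothetical model of an EDGX-1-style modular DPU stack. Names, fields and
# power figures are illustrative assumptions; the paper only specifies the
# PCIe/104 stack, the mandatory Core module and optional stackable modules.
from dataclasses import dataclass, field
from typing import List

@dataclass
class Module:
    kind: str                 # "core", "soc", "fpga", "npu", "storage"
    power_budget_w: float     # configured worst-case power draw (placeholder)

@dataclass
class DpuStack:
    modules: List[Module] = field(default_factory=list)

    def add(self, module: Module) -> None:
        self.modules.append(module)

    def validate(self) -> None:
        kinds = [m.kind for m in self.modules]
        if kinds.count("core") != 1:
            raise ValueError("exactly one Core module is required")

    def total_power_w(self) -> float:
        return sum(m.power_budget_w for m in self.modules)

# Example: a notional Earth-observation stack with one SoC, two FPGAs and an NPU.
stack = DpuStack()
for m in [Module("core", 3.0), Module("soc", 15.0),
          Module("fpga", 8.0), Module("fpga", 8.0), Module("npu", 1.5)]:
    stack.add(m)
stack.validate()
print(f"Configured stack power: {stack.total_power_w()} W")
```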
Fig. 2. Core module schematic.
A. Core Module
This module serves as the core of the DPU stack and is the only essential module. It provides the primary interface between the EDGX-1 DPU and the spacecraft OBC. A ruggedized radiation-hardened-by-design (RHBD) CPU is responsible for the command and data handling dispatching of the DPU stack, as well as acting as a watchdog over the different DPU modules. The CPU controls the PCIe switch and is responsible for establishing and managing the PCIe network set up within the PCIe/104 stacking connectors.
As seen in Figure 2, the Core module power supply includes redundant power connectors and latch-up protection, as well as power management of the DPU modules, which is handled by the CPU. The board contains boot drives in hot redundancy. The SSD storage on the Core board can be accessed by other DPU modules in the stack to serve as cold storage. In the event that additional cold storage is necessary, data storage expansion modules can be integrated into the DPU stack and are accessible via PCIe links.
B. System-on-Chip Module
The System-on-Chip (SoC) module is designed to serve as a general AI workhorse, supporting a wide variety of ML tasks via its powerful and diverse computing architecture. The processing capabilities of CPUs, GPUs and Tensor Processing Units (TPUs) are embedded into the NVIDIA Jetson Orin NX. It combines the Cortex-A78AE, the highest-performing CPU from ARM to date with built-in safety features, with the high performance offered by the Ampere GPU and the two NVDLA 2.0 AI inference cores. The SoC runs a custom operating system built on Jetson Linux to fully leverage the capabilities of the hardware and provide a development- and integration-friendly environment while minimising computational overhead.
Fig. 3. System-on-Chip module schematic.
Several features of this SoC make it a suitable choice for onboard AI acceleration in a space mission context. The SoC is designed for edge processing in a harsh industrial environment from a thermal and mechanical perspective. From the software side, CPU safety features include the Reliability, Availability and Serviceability (RAS) protocols, which allow for error tracking and correction and thus enable the SoC to handle single-event effects (SEEs) due to radiation exposure. Variable power modes can be configured to attain optimal power budgets for different applications and mission/spacecraft constraints.
Fig. 4. FPGA module schematic.
C. FPGA Module
The FPGA module has the primary function of providing flexible and customizable hardware acceleration for specific AI/ML tasks, as well as general-purpose high-throughput data processing as commonly found in spacecraft DPUs. The programmable logic and Intellectual Property (IP) cores can be tailored to the specific use case, such as Digital Signal Processing and ML inferencing.
The FPGA module layout described in Figure 4 is based around the Xilinx Kintex UltraScale series of FPGAs. Several FPGA modules can be used in a DPU stack, and the dedicated high-speed inter-board connection interface allows for direct data exchange between FPGA units in the DPU stack through high-speed serial links using protocols such as Aurora.
D. Neuromorphic Module
The Neuromorphic module will contain a dedicated Neuromorphic Processing Unit (NPU), which enables the implementation, training and execution of neuromorphic AI algorithms. The hardware implementation of these algorithms could be supported through various implementations of NPUs, ranging from FPGAs running neuromorphic IP cores and programmable logic of neuromorphic compute units to dedicated neuromorphic application-specific integrated circuits (ASICs).
Fig. 5. Neuromorphic module schematic.
The current module design uses the BrainChip Akida AKD1000, a neuromorphic ASIC. This chip has a fabric of 80 separate neuron cores interconnected with each other, providing next-generation classic CNN acceleration, efficient continuous online and on-chip learning, one-shot learning and the execution of spiking neural network models. Leveraging these technologies results in a module with high execution speed, low power consumption and enhanced operability, which is suitable for energy-efficient and low-latency or real-time data processing applications.
Figure 5 shows a detailed schematic of the Neuromorphic module. The main communication interface for the AKD1000 is the PCIe x1 lane, although USB 2.0 and I2C are also natively supported. This module additionally contains a local CPU which stores the drivers and carries out data handling for the module's external data interfaces.
E. Overall Architectural Evaluation
By combining these different processing units, a low-power heterogeneous computing design can provide a balance of performance, flexibility, and energy efficiency that is necessary for onboard AI computing in satellites. This approach can address the limited processing power, storage capacity, power constraints, and radiation hardening requirements while also maximizing bandwidth and handling the extreme environmental conditions of space.
Fig. 6. EDGX-1 Operational and Retraining Environment.
However, implementing such a design would require careful
consideration of factors such as power consumption, size,
weight, and cost, as well as the specific requirements of the
satellite mission. Additionally, developing software that can
effectively utilize and coordinate the different processing units
would be crucial for achieving optimal performance.
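As a purely illustrative example of what such coordination software could look like, the sketch below routes pipeline stages to whichever processing unit is declared best suited for them, with a CPU fallback. The task-to-unit mapping, function names and pipeline are assumptions, not the EDGX-1's actual scheduler.

```python
# Hypothetical dispatcher for a heterogeneous DPU: each pipeline stage declares
# a preferred unit (GPU, TPU, FPGA, NPU, CPU) and falls back to the CPU if the
# preferred unit is unavailable. The mapping is illustrative, not from the paper.
from typing import Callable, List, Tuple

AVAILABLE_UNITS = {"cpu", "gpu", "tpu", "fpga", "npu"}

Stage = Tuple[str, Callable]   # (preferred unit, stage function)

def run_pipeline(stages: List[Stage], data, available=AVAILABLE_UNITS):
    for preferred, fn in stages:
        unit = preferred if preferred in available else "cpu"   # graceful fallback
        data = fn(data, unit)
    return data

# Example: a notional Earth-observation pipeline (all names are placeholders).
def decompress(frame, unit):    return frame    # e.g. an FPGA IP core
def infer(frame, unit):         return frame    # e.g. GPU/NVDLA inference
def classify_events(x, unit):   return x        # e.g. an NPU spiking model

pipeline = [("fpga", decompress), ("gpu", infer), ("npu", classify_events)]
result = run_pipeline(pipeline, data=object())
```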
IV. EDGX-1 ARTIFICIAL INTELLIGENCE
The EDGX-1 has a heterogeneous design that enables users to create flexible, powerful AI pipelines, catering to a wide range of complex tasks. Its high-performance components even allow for onboard AI inference and retraining inside the pipelines, ensuring that the AI models can adapt and improve over time without requiring constant updates from external sources. Furthermore, the neuromorphic capabilities of the EDGX-1 not only empower it with advanced processing abilities but also allow for low-power onboard continual learning during operations. In this way, the EDGX-1 can pioneer the development of autonomous onboard AI.
A. Retraining
The SoC board supports complete, unsupervised onboard retraining to remove the need for frequent on-ground updates. Onboard retraining spares the EDGX-1 from wasting valuable bandwidth on sending over weight updates. Instead, a new model is trained onboard in parallel with operational execution. Once the new model finishes training, it can take over operations. In this way, the model calibrates itself autonomously to real-life data. As Figure 6 depicts, an iterative student-teacher cycle drives the onboard retraining. Whilst the NN inferences the incoming data, the system stores all data and predictions as pseudo-labels in a dedicated dataset. At moments when enough power is available, an identical NN trains on the pseudo-labelled dataset. In this way, the onboard retraining takes the form of a self-supervised Knowledge Distillation (KD) process. If desired, the system can mix in a part of the original dataset to combat catastrophic forgetting.
The training makes full use of the available GPU whilst the inferencing process continues to work uninterrupted on the AI accelerators of the SoC. The student is iteratively evaluated on a reference dataset and takes over as the teacher should it outperform it. Depending on whether full retraining or partial fine-tuning is needed, the system can freeze the feature layers of the student. The system thus exploits a Transfer Learning (TL) approach to maximise efficiency.
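For concreteness, here is a minimal sketch of such a self-supervised knowledge-distillation retraining cycle in PyTorch. The model structure (a `.features` backbone), the data loaders and the promotion criterion are illustrative assumptions and do not correspond to the EDGX-1's actual software stack.

```python
# Illustrative student-teacher retraining cycle with stored pseudo-labels
# (knowledge distillation) and optional feature-layer freezing (transfer
# learning). Architecture, loaders and promotion criterion are assumptions.
import copy
import torch
import torch.nn.functional as F

def retrain_cycle(teacher, pseudo_loader, reference_loader,
                  freeze_features=True, epochs=1, lr=1e-4):
    """Train a copy of the operational model on pseudo-labelled data,
    then promote it only if it outperforms the teacher."""
    # The student starts as an identical copy of the operational teacher model.
    student = copy.deepcopy(teacher)
    if freeze_features:
        # Assumes the network exposes a `.features` backbone (transfer learning).
        for p in student.features.parameters():
            p.requires_grad = False
    optimizer = torch.optim.Adam(
        (p for p in student.parameters() if p.requires_grad), lr=lr)

    student.train()
    for _ in range(epochs):
        # pseudo_loader yields (input, stored teacher softmax output) pairs.
        for x, pseudo in pseudo_loader:
            loss = F.kl_div(F.log_softmax(student(x), dim=1),
                            pseudo, reduction="batchmean")
            optimizer.zero_grad()
            loss.backward()
            optimizer.step()

    # Promote the student only if it beats the teacher on the reference dataset.
    if accuracy(student, reference_loader) > accuracy(teacher, reference_loader):
        return student
    return teacher

@torch.no_grad()
def accuracy(model, loader):
    model.eval()
    correct = total = 0
    for x, y in loader:
        correct += (model(x).argmax(dim=1) == y).sum().item()
        total += y.numel()
    return correct / max(total, 1)
```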
B. Continual Learning
The NPU board naturally supports continual learning to adapt the neuromorphic model to any drifts in data. Hence, users can balance the plasticity and stability of their models. During inference, the neuromorphic models automatically learn to cope with dynamic conditions. In such a way, the EDGX-1 can guarantee smooth operations even under sudden catastrophic changes, such as a sensor failure.
As Figure 6 depicts, a Spiking Neural Network (SNN) learns completely unsupervised from the datastreams it perceives. The local weight update paradigm, Spike-Timing-Dependent Plasticity (STDP), will enable the SNN to recognise new patterns and behaviours in the data. Because the field of neuromorphic computing is still in the research phase, initially coupling it with conventional AI will allow for early integration in real applications. The SoC can utilise a traditional NN encoder to transform the data into a latent space representation. This latent space facilitates converting the data to input spikes for the SNN.
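To illustrate the idea of encoding latent features into spikes and adapting weights locally with STDP, here is a small self-contained sketch. The rate-coding scheme, time constants and learning rates are generic textbook choices, not parameters of the EDGX-1 or the Akida chip.

```python
# Illustrative rate coding of a latent vector into spike trains, followed by a
# simplified pairwise STDP update on a single layer of synaptic weights.
# All constants are generic textbook values, not EDGX-1 or Akida parameters.
import numpy as np

rng = np.random.default_rng(0)

def rate_encode(latent, timesteps=100, max_rate=0.5):
    """Convert a latent vector in [0, 1] into Bernoulli spike trains."""
    probs = np.clip(latent, 0.0, 1.0) * max_rate
    return rng.random((timesteps, latent.size)) < probs   # shape (T, n_in)

def stdp_update(weights, pre_spikes, post_spikes,
                a_plus=0.01, a_minus=0.012, tau=20.0):
    """Pairwise STDP using exponential pre- and post-synaptic traces."""
    pre_trace = np.zeros(weights.shape[0])
    post_trace = np.zeros(weights.shape[1])
    for t in range(pre_spikes.shape[0]):
        pre_trace = pre_trace * np.exp(-1.0 / tau) + pre_spikes[t]
        post_trace = post_trace * np.exp(-1.0 / tau) + post_spikes[t]
        # Potentiate synapses whose presynaptic activity preceded a post spike,
        # depress those whose postsynaptic activity preceded a pre spike.
        weights += a_plus * np.outer(pre_trace, post_spikes[t])
        weights -= a_minus * np.outer(pre_spikes[t], post_trace)
    return np.clip(weights, 0.0, 1.0)

# Example: encode a hypothetical 16-dimensional latent vector and adapt weights.
latent = rng.random(16)
pre = rate_encode(latent)                   # spikes fed to the SNN input layer
post = rng.random((100, 8)) < 0.1           # stand-in for the SNN layer's output
w = rng.random((16, 8)) * 0.5
w = stdp_update(w, pre, post)
```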
V. CONCLUSION
As the exponential growth of AI and high-performance data processing continues, the demand for more powerful and more efficient data processing units increases equally fast. This work details some of the greatest challenges for onboard AI in the space industry and corresponding potential solutions. The paper introduces EDGX-1, a new onboard AI computer with a heterogeneous design and neuromorphic capabilities: a new architecture that solves the challenges the industry faces whilst keeping power dissipation low and pushing the limits of high-performance AI computing.