We have previously commented on both the Google Coral and the NVIDIA Jetson Nano as machine-learning acceleration platforms. Now we would like to wrap things up with a comparison post.

Differing philosophies lead to different designs.

The comparison is hard to resist, yet it is also particularly hard to make, since the two products are not technically equivalent even though their application domains are virtually identical.

It makes a case for fundamental cultural differences between the two companies: for every question that arises, they seem to have chosen different answers.

  • For example, when it comes to architecture, NVIDIA favors flexibility, with its many general-purpose GPU cores, whereas Google favors efficiency, with a stripped-down core that supports only 8-bit integers. Jetson and Coral use TensorRT and the TFLite converter, respectively, to optimize a network in their favor (see the quantization sketch below this list). Google, of course, chose to disrupt, and therefore seems to lead in power efficiency.
  • When it comes to the development environment, the Jetson Nano ships with a fully fledged Ubuntu running on the device, complete with a GUI, whereas the Coral depends more on a host system. (There is a distribution called Mendel, but developing locally on it is advised against.)
  • The Jetson Nano developer kit relies on an SD card for storage, while the production modules will ship with eMMC; the Coral ships with eMMC in both cases. This may be partly because NVIDIA was under pressure to release its dev kit under $100.
  • An important point: the Jetson Nano lacks native wireless connectivity, whereas the Coral includes it, in a fairly future-proof form. However, the promotional rendering of the Jetson Nano shows an unpopulated RF section, which suggests that a version of the module incorporating wireless connectivity will follow. It is a long shot, but the footprint resembles that of the Cypress (formerly Broadcom) CYW43340.
  • In the few benchmarks NVIDIA has made available, Google excels in application performance, except on the Inception-v4 model. Given its tiny size, the Coral surprises many, and that is its selling point.
Promotional image of the Jetson Nano module in all its glory.
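Since both toolchains ultimately revolve around this optimization step, here is a minimal sketch of the Coral side of it: post-training full-integer quantization with the TFLite converter, assuming a TensorFlow 2.x SavedModel. The model path and the random calibration data are placeholders; a real pipeline would feed representative input images.

```python
import numpy as np
import tensorflow as tf

# Hypothetical SavedModel path; substitute your own trained network.
converter = tf.lite.TFLiteConverter.from_saved_model("mobilenet_v2_saved_model")

# Request post-training quantization; the Edge TPU executes int8 ops only.
converter.optimizations = [tf.lite.Optimize.DEFAULT]

def representative_dataset():
    # Calibration samples used to pick the quantization ranges.
    # Random data is a placeholder -- use real images in practice.
    for _ in range(100):
        yield [np.random.rand(1, 224, 224, 3).astype(np.float32)]

converter.representative_dataset = representative_dataset
converter.target_spec.supported_ops = [tf.lite.OpsSet.TFLITE_BUILTINS_INT8]
converter.inference_input_type = tf.uint8
converter.inference_output_type = tf.uint8

with open("mobilenet_v2_int8.tflite", "wb") as f:
    f.write(converter.convert())
```

The resulting .tflite file still has to pass through Google's edgetpu_compiler before the Edge TPU will accept it; TensorRT plays the analogous role on the Jetson side.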

In short, NVIDIA seems to favor upgradability and development flexibility, whereas Google favors performance and connectivity. This is because, in its debut, Google wants to bite off and hold a portion of the market that NVIDIA cannot address efficiently. Google has therefore polished its advantages and avoided the points where NVIDIA currently excels. NVIDIA is the cautious party here (call them the incumbent, if you will), with a device in the middle ground, whereas Google has devised a board aimed squarely at IoT, boldly presenting its case (which makes them the disruptor).

One might also say that NVIDIA already has customers to satisfy, whereas Google has come to impress and capture early adopters.

First things first: NVIDIA already has a community and the toolchains required for development on GPUs. Google, with its application-specific chip, dubbed the Edge TPU, seems to be driving change with a novel core architecture. They are trying to guess the exact architecture that will drive demand over the next few years; accordingly, they trimmed the fat from the architecture and accepted the resulting limitations in their development environment.

I think it is safe to say that, from an embedded engineer's perspective, the Jetson Nano and the Intel Movidius NCS are similar to DSPs and GPUs, in that they still operate on instruction sets, whereas the Edge TPU is more distributed, closer to something you would achieve in an FPGA.
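To make that concrete: on the Coral, every supported operation is mapped ahead of time into a single offloaded subgraph, and the host merely hands frames to it through a delegate. Below is a minimal inference sketch using the tflite_runtime package; the model filename is a placeholder for an edgetpu_compiler-compiled model.

```python
import numpy as np
import tflite_runtime.interpreter as tflite

# Route the pre-compiled subgraph to the Edge TPU via its delegate library.
interpreter = tflite.Interpreter(
    model_path="mobilenet_v2_int8_edgetpu.tflite",
    experimental_delegates=[tflite.load_delegate("libedgetpu.so.1")],
)
interpreter.allocate_tensors()

input_details = interpreter.get_input_details()
output_details = interpreter.get_output_details()

# Dummy uint8 frame matching the quantized model's input shape.
frame = np.zeros(input_details[0]["shape"], dtype=np.uint8)
interpreter.set_tensor(input_details[0]["index"], frame)
interpreter.invoke()
scores = interpreter.get_tensor(output_details[0]["index"])
```

Notice there is no instruction-level control here; anything the compiler could not map to the Edge TPU runs on the CPU instead.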

Finale

To begin working on machine learning on embedded systems (or, as some call it, AI on the edge), I would recommend the Jetson Nano. Though there is a sense of urgency around the topic and a powerful impetus toward inference performance, the development environment still counts. Switching later, if necessary, would not be much of an issue.
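For a taste of the Jetson workflow, here is a rough sketch of building a TensorRT engine from an ONNX model in Python. It assumes a TensorRT release that still provides builder.build_engine and the builder-config API (the exact calls have shifted between versions), and the filenames are placeholders.

```python
import tensorrt as trt

logger = trt.Logger(trt.Logger.WARNING)
builder = trt.Builder(logger)
network = builder.create_network(
    1 << int(trt.NetworkDefinitionCreationFlag.EXPLICIT_BATCH))
parser = trt.OnnxParser(network, logger)

# Parse a trained network exported to ONNX (placeholder filename).
with open("mobilenet_v2.onnx", "rb") as f:
    if not parser.parse(f.read()):
        raise RuntimeError(str(parser.get_error(0)))

config = builder.create_builder_config()
config.set_flag(trt.BuilderFlag.FP16)  # the Nano's Maxwell GPU supports FP16

# Build and serialize the optimized engine for later deployment.
engine = builder.build_engine(network, config)
with open("mobilenet_v2.engine", "wb") as f:
    f.write(engine.serialize())
```

Unlike the Edge TPU flow, nothing here forces 8-bit integers; FP16 and FP32 kernels remain available, which is exactly the flexibility-versus-efficiency trade-off described above.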

We, on the other hand, are designing with both of them, and in fact with many others as well, including the Intel Movidius Neural Compute Stick and plain STM32 ARM Cortex microcontrollers with machine-learning optimizations. These platforms cover a wide range of the ML inference domain and should be chosen specifically for the application.