Recogni, a three-year-old California startup developing AI-based vision processors, appears to be attracting growing interest from venture capitalists and the investment arms of Tier 1s and OEMs in the auto industry. Recogni (pronounced re-cog-nye) revealed Wednesday that it has garnered $48.9 million in its Series B funding.
The lead investor in this round is WRVI Capital. New investors include the Mayfield Fund, Continental and Robert Bosch Venture Capital. They join existing Recogni investors GreatPoint Ventures, Toyota AI Ventures, BMW i Ventures, Fluxunit-OSRAM, and DNS Capital — who provided $25 million in Series A funding in the summer of 2019.
Inference engine next to image sensors
Recogni’s new AI processor is an inference engine targeted at the “edge” of next-generation driver-assist and autonomous driving vehicles. The startup’s objective is to deliver an AI-based vision processor with extremely high compute performance at very low power consumption.
If this pitch sounds familiar, that’s because Recogni is hardly the first to make such a promise.
Separating Recogni from the pack, though, might be its approach to solving autonomous vehicles’ perception-pipeline issues. Rather than beefing up the central compute power of an SoC — sometimes described as the AV’s “brain” — Recogni wants carmakers to place the company’s 1,000-TOPS processor at the vehicle’s edge, right next to its “eye,” a CMOS image sensor.
Recogni’s chip, when tightly coupled with a perception sensor, will process visual data in real time, at high frame rates in high resolution, according to the company.
Given the multiple workloads an AV platform must support, raw processing speed — measured in trillions of operations per second (or TOPS) — of an SoC isn’t necessarily the most useful metric. However, Recogni’s proposal to put a 1000 TOPS processor per camera, for processing vision data, is very different from its competitors’ approach. Nvidia, for example, is promoting its Orin SoC, which achieves 200 TOPS, to handle the many applications and deep neural networks that run simultaneously in autonomous vehicles.
On its website, Recogni touts its perception processor as “the only multi-ocular camera system architecture purpose-built for object recognition that extracts passive stereoscopic depth at the pixel level.”
The startup also claims that its processor “achieves greater processing efficiency & speed by storing weights (parameters) of the object library on-chip, where the computational analysis is performed.”
Recogni said its module is “pipelined and operates at greater than 8-megapixel images at 60 frames per second, where it is able to recognize (detect, segment & classify) objects, fuse depth-sensor information into the objects, and provide the intelligence to the central system within a few milliseconds.”
Asked about Recogni, Egil Juliussen, a veteran automotive industry analyst and a columnist for EE Times, told us, “It absolutely makes sense to bring a processor very close to sensors.” Doing so substantially reduces the data that must be transferred from the sensors to the central computer. Judging from the claims on Recogni’s website, he noted, the perception processor appears to have on-chip memory. “That’s a big advantage” for AI processing, he added.
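Juliussen’s point about data reduction can be made concrete with a back-of-envelope calculation. All figures below (12-bit raw pixels, roughly 100 detected objects of 64 bytes each per frame) are illustrative assumptions, not numbers from Recogni:

```python
# Back-of-envelope: raw bandwidth of one 8-megapixel camera at 60 fps,
# versus sending only object metadata after inference at the edge.
# Assumed figures (12-bit raw pixels, ~100 objects of ~64 bytes each
# per frame) are illustrative, not from Recogni.

def raw_stream_gbps(megapixels: float, fps: int, bits_per_pixel: int = 12) -> float:
    """Uncompressed sensor bandwidth in gigabits per second."""
    return megapixels * 1e6 * fps * bits_per_pixel / 1e9

def metadata_stream_mbps(objects_per_frame: int, bytes_per_object: int, fps: int) -> float:
    """Bandwidth in megabits per second if only detected-object records leave the camera module."""
    return objects_per_frame * bytes_per_object * 8 * fps / 1e6

raw = raw_stream_gbps(8, 60)               # ~5.76 Gb/s per camera, uncompressed
meta = metadata_stream_mbps(100, 64, 60)   # ~3.07 Mb/s per camera
print(f"raw: {raw:.2f} Gb/s, metadata: {meta:.2f} Mb/s")
```

Under these assumptions, processing at the sensor shrinks the per-camera link by roughly three orders of magnitude, which is the kind of reduction that makes high-resolution, high-frame-rate capture practical across many cameras.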
In a one-on-one interview with EE Times, R K Anand, founder and CEO of Recogni, declined to either detail the processor’s architecture or preview the product. However, he did explain what his team has identified as “the hard problem” for automakers developing ADAS and self-driving vehicles. The issue, as he sees it, is in “processing at the edge visual data that comes in at high frame rates, high resolution.” This becomes especially tough in fully autonomous vehicles, he added, because it must happen in real time.
Phil Magney, founder and president of VSI Labs, observed, “We are seeing a lot of innovation right now aimed at improving the ML vision pipeline for ADAS and AD applications. This applies to both the training and inference models.”
While neural network instructions are being better optimized for energy efficiency and computational power density, Magney said, “The industry is seeking improvements in the whole pipeline and trying to figure where they can trim some fat.” Given the massive data streams they must handle while identifying the important items to save, down-sample and store for further network training, it’s hardly surprising that “some of this is being pushed to the edge where resident models can make decisions about the data such as what to set aside versus what goes downstream.”
Why use one-megapixel sensors?
A teardown of Tesla’s Model 3 reveals that its Triple Forward Camera features three CMOS image sensors of the ON Semi AR0136A — with 3.75 µm Pixel size at 1280×960 (1.2 megapixel) resolution. The Tesla Model 3 Driver Assist Autopilot Control Module Unit (or TM3DAACMU) offers front-image capture up to 250 meters.
But here’s the rub. Why is Tesla content driving around with one-megapixel sensors? “It just makes no sense to us because in our pockets, we have these phones with cameras with eight, 10 or 12-megapixel sensors,” Anand said.
Automakers are using one-megapixel sensors not because they are cheaper, but because their vehicles lack the processing capacity for more, Anand said. And even if some carmakers tout two-, four- or eight-megapixel sensors, their vehicles are simply taking down-sampled information from those sensors, he explained.
Asked if down-sampling is common, VSI Labs’ Magney told us, “This is true and has been the case for a while. The newer GPUs can handle this, but again those are not optimized for an embedded system.”
In other words, the cost of sensors isn’t preventing carmakers from using higher-end image sensors, Anand said. “The problem is the cost of the compute, the amount of compute and the power of the compute.” If there were no solutions to these three problems, there would be no point in upgrading to a higher-end sensor, “because you can’t process it.” Anand calls this a “Mount Everest problem” that Recogni is poised to climb.
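The resolution argument can be sketched numerically with a simple pinhole-camera model. The values here (a 0.5 m-wide pedestrian, a 50-degree horizontal field of view) are assumptions for illustration and do not come from Recogni or Tesla:

```python
import math

def pixels_on_target(target_width_m: float, distance_m: float,
                     hfov_deg: float, h_res_px: int) -> float:
    """Horizontal pixels a target occupies, simple pinhole-camera model
    (linear angle-to-pixel mapping, a fair approximation at small angles)."""
    angle_deg = math.degrees(2 * math.atan(target_width_m / (2 * distance_m)))
    return angle_deg * h_res_px / hfov_deg

# 0.5 m-wide pedestrian at 200 m, 50-degree horizontal FOV (assumed values)
low_res = pixels_on_target(0.5, 200, 50, 1280)    # ~1.2 MP-class sensor width
high_res = pixels_on_target(0.5, 200, 50, 3840)   # ~8 MP-class sensor width
print(f"1280 px wide: {low_res:.1f} px on target; 3840 px wide: {high_res:.1f} px")
```

At these assumed values, the pedestrian spans only about 4 pixels on the lower-resolution sensor versus about 11 on the higher-resolution one — the difference between a smudge and a recognizable object for a detector.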
The higher the resolution, the better the inference
Inference is directly related to information load, said Anand. “So, when a machine has been trained with more information, the higher the likelihood that you can infer it correctly.”
Picture a pedestrian at 200 meters, a stop sign, or a soccer ball bouncing off the curb — these are things that matter. “If you don’t catch them early, the autonomous vehicle doesn’t have time to stop because stopping a car is purely physics,” he said. “The higher the resolution, and the higher the frame rate, the better the inference will be. But that also means the higher the compute demand will be.”
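The “purely physics” point can be illustrated with the standard stopping-distance formula: reaction distance plus braking distance, v²/(2a). The reaction time and deceleration below are assumed round numbers, not figures from the article:

```python
# Stopping distance = reaction distance + braking distance (v^2 / 2a).
# Assumed values for illustration: 1.5 s combined detection/reaction delay,
# 7 m/s^2 braking deceleration (roughly dry-road hard braking).

def stopping_distance_m(speed_kmh: float, reaction_s: float = 1.5,
                        decel_ms2: float = 7.0) -> float:
    v = speed_kmh / 3.6                      # convert km/h to m/s
    return v * reaction_s + v ** 2 / (2 * decel_ms2)

for speed in (50, 100, 130):
    print(f"{speed} km/h -> {stopping_distance_m(speed):.0f} m to stop")
```

Under these assumptions, a car at 100 km/h needs nearly 100 m to stop, and at 130 km/h close to 150 m — so a pedestrian first resolved at 200 m leaves only a modest margin, and every frame of earlier detection counts.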
Wouldn’t Nvidia’s SoCs be able to do the job?
Anand said, “Nvidia is a great company and it has made fantastic innovations over the last 10 to 15 years. They’ve done an amazing job both on the development of their chips and on the software.”
“But their chips, which they tell their customers to use in their car, are designed for processing information in batches,” he said.
In his opinion, batch processing won’t work in an AV because the car needs real-time action. Nvidia is handicapped by its architecture, approach and software, he said. In other words, “Nvidia doesn’t have the luxury that we have as a startup, where we can think about the problem from a clean sheet.”
What about Mobileye, then?
“Mobileye is slightly different,” he said. “Mobileye is a computer vision company. The approach they took in EyeQ4, EyeQ5 and EyeQ6 was a processor thinking. So, they took more processor engines to do the AI work.” Mobileye, said Anand, is a company consisting of “legacy computer vision guys trying to become AI guys.”
Advancements in AI networks
Magney sees that vision processing is improving all the time, thanks to advances in ML methods and optimization of the vision pipeline. However, he said, “What you can do on a development system is a far cry from series production.” Automakers want “super-efficient compute platforms that can run these data-intensive ML applications.”
Against that backdrop, “Automakers are pushing some of the processing to the edge,” instead of opting for “the raw data approach,” Magney explained. Now that an AV stack may have dozens of neural networks, automakers are thinking “why not push those out to the dedicated processor on the edge whereby you reduce the load on the domain controller.”
Still unclear, however, is how something like Recogni’s solution — a proprietary processor and software tightly coupled with sensors — will be received by AV pioneers such as Waymo, Argo, Cruise or others. Juliussen, for one, wonders how easy it might be for those AV pioneers to replace their existing solutions with Recogni’s chip. “Time will tell on this,” said Magney.
While Recogni is pushing its new chip for fully autonomous vehicles, Magney suspects ADAS is the likelier entry point: “I think for ADAS applications you have a better chance with these tightly coupled architectures. For the L3-L5 stuff you need a common compute architecture and OEMs are weighing in more on that.”
Magney cautioned, “The giant chip companies are not slowing down. They’re building their portfolio to be scalable with neural network accelerators and edge processing too.”
Recogni’s chip schedule
About a year ago, Ashwini Choudhary, Recogni’s co-founder, indicated in an interview that the chip Recogni is developing will provide 1,000 TOPS at 10 watts. Anand confirmed that this remains the goal, but declined to disclose further performance specs. The chip, to be fabricated at Taiwan Semiconductor Manufacturing Co. on a 7nm process technology, is “close to tape out,” he added.
The post Recogni to Push out High TOPS AI Chip to AV’s Very Edge appeared first on EETimes.