Like so many software developers, Elias Sorensen has been studying AI. Now he and his 10-member team are teaching it to robots.
When the AI specialists at Mobile Industrial Robots, based in Odense, Denmark, are done, the first graduating class of autonomous machines will be on their way to factories and warehouses, powered by NVIDIA Jetson Xavier NX GPUs.
“The ultimate goal is to make the robots behave in ways humans understand, so it’s easier for humans to work alongside them. And Xavier NX is at the bleeding edge of what we are doing,” said Sorensen, who will give an online talk about MiR’s work at GTC Digital.
MiR’s low-slung robots carry pallets weighing as much as 2,200 pounds. They sport lidar and proximity sensors, as well as multiple cameras the team is now linking to Jetson Xavier GPUs.
Inferencing Their Way Forward
The new digital brains will act as pilots. They’ll fuse sensor data to let the bots navigate around people, forklifts and other objects, dynamically re-mapping safety zones and changing speeds as needed.
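To make the idea of changing speeds around obstacles concrete, here is a minimal sketch of one way a robot could scale its speed with the distance to the nearest detected object. It is not MiR’s implementation: the function names, the sensor labels and every threshold (stop distance, slow-down distance, top speed) are hypothetical values chosen for illustration.

```python
# Illustrative sketch only. All names and numbers are assumptions,
# not values from MiR's software.

def fused_distance(readings):
    """Fuse per-sensor distances (meters) conservatively: take the nearest one.

    `readings` maps a sensor name to a distance, or None if that sensor
    currently sees no obstacle.
    """
    valid = [d for d in readings.values() if d is not None]
    return min(valid) if valid else float("inf")


def speed_limit(distance_m, stop_m=0.5, slow_m=3.0, max_speed=1.5):
    """Return an allowed speed in m/s for a given obstacle distance.

    Inside `stop_m` the robot halts, beyond `slow_m` it may run at
    `max_speed`, and in between the limit ramps up linearly.
    """
    if distance_m <= stop_m:
        return 0.0
    if distance_m >= slow_m:
        return max_speed
    fraction = (distance_m - stop_m) / (slow_m - stop_m)
    return max_speed * fraction


# Hypothetical readings from three sensor types.
readings = {"lidar": 2.0, "camera": None, "proximity": 1.75}
nearest = fused_distance(readings)
allowed = speed_limit(nearest)
```

Taking the minimum across sensors is the most cautious fusion rule; a production system would likely weigh sensor confidence and object type as well.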
The smart bots use NVIDIA’s DeepStream and TensorRT software to run AI inference jobs on Xavier NX, based on models trained on NVIDIA GPUs in the AWS cloud.
MiR chose Xavier for its combination of high performance at low power and price, as well as its wealth of software.
“Lowering the cost and power consumption for AI processing was really important for us,” said Sorensen. “We make small, battery-powered robots and our price is a major selling point for us.” He noted that MiR has deployed more than 3,000 robots to date to users such as Ford, Honeywell and Toyota.
The new autonomous models are working prototypes. The team is training their object-detection models in preparation for first pilot tests.
Jetson Nano Powers Remote Vision
It’s MiR’s first big AI product, but not its first ever. Since November, the company has shipped smart, standalone cameras powered by Jetson Nano GPUs.
The Nano-based cameras process video at 15 frames per second to detect objects. They’re networked with each other and other robots to enhance the robots’ vision and help them navigate.
Both the Nano cameras and Xavier-powered robots process all camera data locally, only sending navigation decisions over the network. “That’s a major benefit for such a small but powerful module because many of our customers are very privacy minded,” Sorensen said.
MiR developed a tool its customers use to train the camera by simply showing it pictures of objects such as robots and forklifts. The ease of customizing the cameras is a big measure of the product’s success, he added.
AI Training with Simulations
The company hopes its smart robots will be equally easy to train for non-technical staff at customer sites.
But here the challenge is greater. Public roads have standard traffic signs, but every factory and warehouse is unique with different floor layouts, signs and types of pallets.
MiR’s AI team aims to create a simulation tool that places robots in a virtual work area that users can customize. Such a simulation could let users who are not AI specialists train their smart robots like they train their smart cameras today.
The journey into the era of autonomous machines is just starting for MiR. Its parent company, Teradyne, announced in February it is investing $36 million to create a hub for developing collaborative robots, aka co-bots, in Odense as part of a partnership with MiR’s sister company, Universal Robots.
Market watchers at ABI Research predict the co-bot market could expand to $12 billion by 2030. In 2018, Danish companies including MiR and Universal captured $995 million of that emerging market, according to Damvad, a Danish analyst firm.
With such potential and strong ingredients from companies like NVIDIA, “it’s a great time in the robotics industry,” Sorensen said.
This year’s event in San Jose, March 22-26, is no exception, with at least six autonomous machines expected on the show floor. Like C3PO and BB8, each one is different.
Among what you’ll see at GTC 2020:
a robotic dog that sniffs out trouble in complex environments such as construction sites
a personal porter that lugs your stuff while it follows your footsteps
a man-sized bot that takes inventory quickly and accurately
a short, squat bot that hauls as much as 2,200 pounds across a warehouse
a delivery robot that navigates sidewalks to bring you dinner
“What I find interesting this year is just how much intelligence is being incorporated into autonomous machines to quickly ingest and act on data while navigating around unstructured environments that sometimes are not safe for humans,” said Amit Goel, senior product manager for autonomous machines at NVIDIA and robot wrangler for GTC 2020.
The ANYmal C from ANYbotics AG (pictured above), based in Zurich, is among the svelte navigators, detecting obstacles and finding its own shortest path forward thanks to its Jetson AGX Xavier GPU. The four-legged bot can slip through passages just 23.6 inches wide and climb stairs as steep as 45 degrees on a factory floor to inspect industrial equipment with its depth, wide-angle and thermal cameras.
The folks behind the Vespa scooter will show Gita, a personal robot that can carry up to 40 pounds of your gear for four hours on a charge. It runs computer vision algorithms on a Jetson TX2 GPU to identify and follow its owner’s legs on any hard surfaces.
Say cheese. Bossa Nova Robotics will show its retail robot that can scan a 40-foot supermarket aisle in 60 seconds, capturing 4,000 images that it turns into inventory reports with help from its NVIDIA Turing architecture RTX GPU. Walmart plans to use the bots in at least 650 of its stores.
Mobile Industrial Robots A/S, based in Odense, Denmark, will give a talk at GTC about how it’s adding AI with Jetson Xavier to its pallet-toting robots to expand their work repertoire. On the show floor, it will demonstrate one of the robots from its MiR family that can carry payloads up to 2,200 pounds while using two 3D cameras and other sensors to navigate safely around people and objects in a warehouse.
From the other side of the globe, ExaWizards Inc. (Tokyo) will show its multimodal AI technology running on robotic arms from Japan’s Denso Robotics. It combines multiple sensors to learn human behaviors and perform jobs such as weighing a set portion of water.
Rounding out the cast, the Serve delivery robot from Postmates will make a return engagement at GTC. It can carry 50 pounds of goods for 30 miles, using a Jetson AGX Xavier and Ouster lidar to navigate sidewalks like a polite pedestrian. In a talk, a Postmates engineer will share lessons learned in its early deployments.
Many of the latest systems reflect the trend toward collaborative robotics that NVIDIA CEO Jensen Huang demonstrated in a keynote in December. He showed ways humans can work with and teach robots directly, thanks to an updated NVIDIA Isaac developers kit that also speeds development by using AI and simulations to train robots, now part of NVIDIA’s end-to-end offering in robotics.
Just for fun, GTC also will host races of AI-powered DIY robotic cars, zipping around a track on the show floor at speeds approaching 50 mph. You can sign up here if you want to bring your own Jetson-powered robocar to the event.
We’re saving at least one surprise in robotics for those who attend. To get in on the action, register here for GTC 2020.
Genomics is finally poised to go mainstream, with help from deep learning and accelerated-computing technologies from NVIDIA.
Since the first human genome was sequenced in 2003, the cost of whole genome sequencing has steadily shrunk, far faster than suggested by Moore’s law. From sequencing the genomes of newborn babies to conducting national population genomics programs, the field is gaining momentum and getting more personal by the day.
Advances in sequencing technology have led to an explosion of genomic data. The total amount of sequence data is doubling every seven months. This breakneck pace could see genomics in 2025 surpass by 10x the amount of data generated by other big data sources such as astronomy, Twitter and YouTube — hitting the double-digit exabyte range.
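The doubling claim is easy to turn into numbers. The short sketch below assumes pure exponential growth at the seven-month doubling time quoted above; the function names are ours, and real data volumes will not follow the curve exactly.

```python
import math

DOUBLING_MONTHS = 7  # doubling time quoted in the text


def growth_factor(months, doubling_months=DOUBLING_MONTHS):
    """Multiplier on total sequence data after `months` of steady doubling."""
    return 2 ** (months / doubling_months)


def months_to_grow(factor, doubling_months=DOUBLING_MONTHS):
    """Months needed for the data volume to grow by `factor`."""
    return doubling_months * math.log2(factor)


yearly = growth_factor(12)          # a little over 3x per year
months_for_10x = months_to_grow(10)  # time to grow tenfold
```

At this pace, the data volume more than triples every year, which is why storage and analysis throughput become the limiting factors so quickly.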
New sequencing systems, like the DNBSEQ-T7 from BGI Group, the world’s largest genomics research group, are pushing the technology into broad use. The system generates a whopping 60 genomes per day, equaling 6 terabytes of data.
With advancements in BGI’s flow cell technology and acceleration by a pair of NVIDIA V100 Tensor Core GPUs, DNBSEQ-T7 sequencing is sped up 50x, making it the highest throughput genome sequencer to date.
Getting Past the Genome Analysis Bottleneck: GPU-Accelerated GATK
The genomics community continues to extract new insights from DNA. Recent breakthroughs include single-cell sequencing to understand mutations at a cellular level, and liquid biopsies that detect and monitor cancer by testing blood for circulating DNA.
But genomic analysis has traditionally been a computational bottleneck in the sequencing pipeline — one that can be surmounted using GPU acceleration.
To deliver a roadmap of continuing GPU acceleration for key genomic analysis pipelines, the team at Parabricks — an Ann Arbor, Michigan-based developer of GPU software for genomics — is joining NVIDIA’s healthcare team, NVIDIA founder and CEO Jensen Huang shared today onstage at GTC China.
In work with BGI, Parabricks’ software can analyze a genome in under an hour. Using a server with eight NVIDIA T4 Tensor Core GPUs, BGI showed the throughput could lower the cost of genome sequencing to $2, less than half the cost of existing systems.
See More, Do More with Smart Medical Devices
New medical devices are being invented across the healthcare industry. United Imaging Healthcare has introduced two industry-first medical devices. The uEXPLORER is the world’s first total body PET-CT scanner. Its pioneering ability to image an individual in one position enables it to carry out fast, continuous tracking of tracer distribution over the entire body.
The total-body coverage of uEXPLORER can significantly shorten scan time. Scans as brief as 30 seconds provide good image quality, compared to traditional systems requiring over 20 minutes of scan time. uEXPLORER is also setting a new benchmark in tracer dose — imaging at about 1/50 of the regular dose, without compromising image quality.
The FDA-approved system uses 16 NVIDIA V100 Tensor Core GPUs and eight 56 GB/s InfiniBand network links from Mellanox to process movie-like scans that can acquire up to a terabyte of data. The system is already deployed in the U.S. at the University of California, Davis, where scientists helped design the system. It’s also the subject of an article in Nature, as well as videos watched by nearly half a million viewers on YouTube.
United’s other groundbreaking system, the uRT-Linac, is the first instrument to support a full radiation therapy suite, from detection to prevention.
With this instrument, a patient from a remote village can make the long trek to the nearest clinic just once to get diagnostic tests and treatment. The uRT-Linac combines CT imaging, AI processing to assist in treatment planning, and simulation with the radiation therapy delivery system. Using multi-modal technologies and AI, United has changed the nature of delivering cancer treatment.
Further afield, a growing number of smart medical devices are using AI for enhanced signal and image processing, workflow optimizations and data analysis.
And on the horizon are patient monitors that can sense when a patient is in danger and smart endoscopes that can guide surgeons during surgery. It’s no exaggeration to state that, in the future, every sensor in the hospital will have AI-infused capabilities.
Our recently announced NVIDIA Clara AGX developer kit helps address this trend. Clara AGX comprises hardware based on NVIDIA Xavier SoCs and Volta Tensor Core GPUs, along with a Clara AGX software development kit, to enable the proliferation of smart medical devices that make healthcare both smarter and more personal.
Editor’s note: This is the latest post in our NVIDIA DRIVE Labs series, which takes an engineering-focused look at individual autonomous vehicle challenges and how NVIDIA DRIVE addresses them. Catch up on all of our automotive posts, here.
A self-driving car’s view of the world often includes bounding boxes — cars, pedestrians and stop signs neatly encased in red and green rectangles.
In the real world, however, not everything fits in a box.
For highly complex driving scenarios, such as a construction zone marked by traffic cones, a sofa chair or other road debris in the middle of the highway, or a pedestrian unloading a moving van with cargo sticking out the back, it’s helpful for the vehicle’s perception software to provide a more detailed understanding of its surroundings.
Such fine-grained results can be obtained by segmenting image content with pixel-level accuracy, an approach known as panoptic segmentation.
With panoptic segmentation, the image can be accurately parsed for both semantic content (which pixels represent cars vs. pedestrians vs. drivable space), as well as instance content (which pixels represent the same car vs. different car objects).
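One common way to picture the two kinds of content is as a pair of per-pixel maps that get merged into a single label. The sketch below is a toy illustration, not NVIDIA’s DNN output format: the class IDs, the tiny 3x4 image and the `class_id * 1000 + instance_id` encoding are all assumptions made for the example.

```python
# Toy illustration of panoptic labels. Class IDs and the encoding
# scheme are assumptions, not the DRIVE software's actual format.

CLASS_NAMES = {0: "drivable space", 1: "car", 2: "pedestrian"}
INSTANCE_STRIDE = 1000  # assumed: fewer than 1000 instances per class


def encode_panoptic(semantic, instance):
    """Merge a semantic map and an instance map into one label per pixel."""
    return [
        [s * INSTANCE_STRIDE + i for s, i in zip(sem_row, inst_row)]
        for sem_row, inst_row in zip(semantic, instance)
    ]


def decode_pixel(label):
    """Split a panoptic label back into (class_id, instance_id)."""
    return divmod(label, INSTANCE_STRIDE)


# Semantic content: which class each pixel belongs to.
semantic = [
    [1, 1, 0, 1],
    [1, 1, 0, 1],
    [0, 0, 0, 2],
]
# Instance content: which object of that class (0 = no instance).
instance = [
    [1, 1, 0, 2],
    [1, 1, 0, 2],
    [0, 0, 0, 1],
]

panoptic = encode_panoptic(semantic, instance)
car_instances = {
    label for row in panoptic for label in row
    if label // INSTANCE_STRIDE == 1
}
```

Here the two cars share the same semantic class but receive different panoptic labels, so downstream code can tell them apart.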
Planning and control modules can use panoptic segmentation results from the perception system to better inform autonomous driving decisions. For example, the detailed object shape and silhouette information helps improve object tracking, resulting in a more accurate input for both steering and acceleration. It can also be used in conjunction with dense (pixel-level) distance-to-object estimation methods to help enable high-resolution 3D depth estimation of a scene.
Single DNN Approach
NVIDIA’s approach achieves pixel-level semantic and instance segmentation of a camera image using a single, multi-task learning deep neural network. This approach enables us to train a panoptic segmentation DNN that understands the scene as a whole versus piecewise.
It’s also efficient. Just one end-to-end DNN can extract all this rich perception information while achieving per-frame inference times of approximately 5ms on our embedded in-car NVIDIA DRIVE AGX platform. This is much faster than state-of-the-art segmentation methods.
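To put the 5ms figure in context, the arithmetic below converts it into a maximum frame rate and into a share of a camera’s per-frame time budget. The 30 frames-per-second camera rate is an assumption for illustration; the post does not state the camera rate used.

```python
INFERENCE_MS = 5.0  # approximate per-frame inference time quoted in the text


def max_fps(inference_ms):
    """Frames per second the DNN could process if it ran back to back."""
    return 1000.0 / inference_ms


def budget_share(inference_ms, camera_fps):
    """Fraction of one camera frame interval consumed by inference."""
    frame_interval_ms = 1000.0 / camera_fps
    return inference_ms / frame_interval_ms


ASSUMED_CAMERA_FPS = 30  # hypothetical camera rate, not from the source
fps_ceiling = max_fps(INFERENCE_MS)
share = budget_share(INFERENCE_MS, ASSUMED_CAMERA_FPS)
```

Under that assumption the network uses about 15 percent of each frame interval, leaving the rest for other DNNs and perception functions.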
DRIVE AGX makes it possible to simultaneously run the panoptic segmentation DNN along with many other DNN networks and perception functions, localization, and planning and control software in real time.
As shown above, the DNN segments a scene into several object classes and detects different instances of those classes, indicated by the unique colors and numbers in the bottom panel.
On-Point Training and Perception
The rich pixel-level information provided by each frame also reduces training data volume requirements. Specifically, because more pixels per training image represent useful information, the DNN is able to learn using fewer training images.
Moreover, based on the pixel-level detection results and post-processing, we’re also able to compute the bounding box for each object detection. All the perception advantages offered by pixel-level segmentations require processing, which is why we developed the powerful NVIDIA DRIVE AGX Xavier.
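The post does not describe the post-processing step, but the basic idea of deriving a box from pixel-level results can be sketched in a few lines. The example below assumes a small grid in which each nonzero value is an instance ID and 0 is background; the function name and the inclusive (row, column) box convention are our own choices.

```python
def bounding_boxes(instance_map):
    """Return {instance_id: (row_min, col_min, row_max, col_max)}.

    Coordinates are inclusive pixel indices. Pixels labeled 0 are
    treated as background and ignored.
    """
    boxes = {}
    for r, row in enumerate(instance_map):
        for c, label in enumerate(row):
            if label == 0:
                continue
            if label not in boxes:
                boxes[label] = (r, c, r, c)
            else:
                r0, c0, r1, c1 = boxes[label]
                boxes[label] = (min(r0, r), min(c0, c), max(r1, r), max(c1, c))
    return boxes


# Hypothetical 3x4 instance map with two objects.
instance_map = [
    [0, 1, 1, 0],
    [0, 1, 0, 2],
    [0, 0, 2, 2],
]
boxes = bounding_boxes(instance_map)
```

Because the box is computed from the mask, it is always the tightest axis-aligned rectangle around the object, while the mask itself keeps the detailed silhouette.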
As a result, the pixel-level details panoptic segmentation provides make it possible to better perceive the visual richness of the real world in support of safe and reliable autonomous driving.
In unveiling the specs of his new self-driving car computer at today’s Tesla Autonomy Day investor event, Elon Musk made several things very clear to the world.
First, Tesla is raising the bar for all other carmakers.
Second, Tesla’s self-driving cars will be powered by a computer based on two of its new AI chips, each equipped with a CPU, GPU, and deep-learning accelerators. The computer delivers 144 trillion operations per second (TOPS), enabling it to collect data from a range of surround cameras, radars and ultrasonics and power deep neural network algorithms.
Third, Tesla is working on a next-generation chip, which suggests 144 TOPS isn’t enough.
At NVIDIA, we have long believed in the vision Tesla reiterated today: self-driving cars require computers with extraordinary capabilities.
Which is exactly why we designed and built the NVIDIA Xavier SoC several years ago. The Xavier processor features a programmable CPU, GPU and deep learning accelerators, delivering 30 TOPS. We built a computer called DRIVE AGX Pegasus based on a two-chip solution, pairing Xavier with a powerful GPU to deliver 160 TOPS, and then put two sets of them on the computer to deliver a total of 320 TOPS.
And as we announced a year ago, we’re not sitting still. Our next-generation processor Orin is coming.
That’s why NVIDIA is the standard Musk compares Tesla to—we’re the only other company framing this problem in terms of trillions of operations per second, or TOPS.
But while we agree with him on the big picture—that this is a challenge that can only be tackled with supercomputer-class systems—there are a few inaccuracies in Tesla’s Autonomy Day presentation that we need to correct.
It’s not useful to compare the performance of Tesla’s two-chip Full Self Driving computer against NVIDIA’s single-chip driver assistance system. Tesla’s two-chip FSD computer at 144 TOPS would compare against the NVIDIA DRIVE AGX Pegasus computer, which runs at 320 TOPS for AI perception, localization and path planning.
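The comparison comes down to simple arithmetic on the figures quoted in this post. The sketch below only restates those numbers; the variable names are ours, and peak TOPS is a coarse yardstick that says nothing about delivered performance on a given workload.

```python
# Figures as quoted in this post.
XAVIER_TOPS = 30            # single Xavier processor
PEGASUS_SET_TOPS = 160      # one Xavier paired with one GPU
PEGASUS_SETS = 2            # two such pairs on the Pegasus computer
TESLA_FSD_TOPS = 144        # Tesla's two-chip FSD computer


def pegasus_tops():
    """Total peak TOPS for DRIVE AGX Pegasus from its two chip pairs."""
    return PEGASUS_SET_TOPS * PEGASUS_SETS


pegasus_vs_fsd = pegasus_tops() / TESLA_FSD_TOPS   # like-for-like comparison
fsd_vs_xavier = TESLA_FSD_TOPS / XAVIER_TOPS       # the mismatched comparison
```

On these figures, Pegasus offers a bit more than twice the peak throughput of the FSD computer, while the FSD computer is nearly five times a single Xavier, which is why the choice of comparison matters.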
Additionally, while Xavier delivers 30 TOPS of processing, Tesla erroneously stated that it delivers 21 TOPS. Moreover, a system with a single Xavier processor is designed for assisted driving AutoPilot features, not full self-driving. Self-driving, as Tesla asserts, requires a good deal more compute.
Tesla, however, has the most important issue fully right: Self-driving cars—which are key to new levels of safety, efficiency, and convenience—are the future of the industry. And they require massive amounts of computing performance.
Indeed Tesla sees this approach as so important to the industry’s future that it’s building its future around it. This is the way forward. Every other automaker will need to deliver this level of performance.
There are only two places where you can get that AI computing horsepower: NVIDIA and Tesla.
And only one of these is an open platform that’s available for the industry to build on.
AI Robots. Chicken kebabs. Ridiculously fast technology. Ah, the good things in life.
Last night we hosted our second public autonomous-machines meetup at NVIDIA’s new headquarters building, an event that spanned everything from drones that autonomously fly through the air to bots that navigate our sidewalks to the software and semiconductors that power it all.
“Compute is critically important for autonomous machines,” NVIDIA’s Jesse Clayton said, welcoming more than 300 developers, entrepreneurs, investors and others to the event. “Jetson is how you get there.”
Developers can use Jetson AGX Xavier to build the autonomous machines that will solve some of the world’s toughest problems, and help transform a broad range of industries, Clayton explained. Millions are expected to come onto the market in the years ahead.
Consuming as little as 10 watts — about as much as a clock radio — the module enables companies to go into volume production with applications developed on the Jetson AGX Xavier developer kit, bringing next-gen robots and other autonomous machines to life.
Jetson AGX Xavier comes as the Jetson ecosystem around it is growing fast. The number of developers using NVIDIA’s Jetson platform has grown 5x since March of 2017, while the number of Jetson customers has grown 6x to 1,800 over the same timespan, Clayton said.
Twenty-five of those customers and ecosystem partners were at the event to tell their story, as the crowd noshed on chicken kebabs and house-made pita chips with tzatziki and olive tapenade. A few highlights:
Farm of the future: SMART AG’s Autocart module gives farm equipment an autonomous upgrade. The Iowa startup uses Jetson TX2 to help its system see as it navigates farms. It’s already allowing tractors to hustle out during harvest to pick up loads of freshly harvested grain, giving farm productivity an autonomous boost.
Self-flying camera: The Skydio R1 is more than just a camera. It uses six pairs of cameras to build a 3D map of its environment. This lets it identify, localize, and track people and cars, and predict movement up to four seconds in the future. The result: stunning videos of your latest and greatest adventures.
Going the extra mile: Marble’s building a “last mile delivery solution,” or, put another way, this San Francisco-based startup creates robots that save you a schlep by bringing food to your doorstep. Its intelligent delivery robots reliably and securely transport things that you need and want in a way that is safe and accessible to everyone. The robots use advanced sensors, including lidar and cameras, to carefully navigate sidewalks.
Attendees were impressed. “It’s like living in the future,” Terry Smith of Liquid Robotics said as he looked around the meetup. Smith’s company makes autonomous vehicles that rely on wind and solar power to roam the oceans. He said Jetson TX2 has “revolutionized” what his company can do.
Others were scouting for breakthrough technology for even wilder projects. Attendees included Tad Morgan, whose company, Made In Space, is working to bring manufacturing to outer space; and Randy Gobbel, who works on deep learning at genomics startup Illumina and is already experimenting with Jetson TX2 and Xavier processors to “see what they can do.”
NVIDIA today unveiled the NVIDIA Clara platform, a combination of hardware and software that brings AI to the next generation of medical instruments as a powerful new tool for early detection, diagnosis and treatment of diseases.
At the heart of the platform are NVIDIA Clara AGX — a revolutionary computing architecture based on the NVIDIA Xavier AI computing module and NVIDIA Turing GPUs — and the Clara software development kit for developers to create a wide range of AI-powered applications for processing data from existing systems.
The Clara platform addresses the great challenge of medical instruments: processing the massive sea of data — tens to thousands of gigabytes worth — generated each second so it can be interpreted by doctors and scientists. Achieving this level of supercomputing has traditionally required three computing architectures: FPGAs, CPUs and GPUs.
Clara AGX simplifies this to a single, GPU-based architecture that delivers the world’s fastest AI inferencing on NVIDIA Tensor Cores; acceleration through CUDA, the world’s most widely adopted accelerated computing platform; and state-of-the-art NVIDIA RTX graphics. Its flexible design enables it to scale from entry-level devices to the most demanding 3D instruments.
Clara also addresses a fundamental disconnect between legacy medical instruments — which typically have a lifespan of over 10 years — and their ability to run modern applications, which benefit from the 1,000x acceleration of GPU computing over the past decade.
It achieves this by enabling the installed base of instruments to connect to the latest NVIDIA GPU servers through its ability to process raw instrument data. The most advanced imaging applications — like iterative reconstruction for CT and X-ray, beamforming for ultrasound and compressed sensing for MRI — can run on 10-year-old instruments.
The Clara SDK provides medical-application developers with a set of GPU-accelerated libraries for computing, graphics and AI; example applications for reconstruction, image processing and rendering; and computational workflows for CT, MRI and ultrasound. These all leverage containers and Kubernetes to virtualize and scale medical instrument applications for any instrument.
Support from Medical Imaging Developers
Medical imaging developers around the world are discovering numerous ways to use AI to automate workflows, make instruments run faster and improve image quality, in addition to assisting doctors in detecting and diagnosing disease. More than 400 AI healthcare startups have launched in the past five years, and the Clara platform will be able to help them harness AI to transform healthcare workloads.
For instance, Subtle Medical, a member of the NVIDIA Inception virtual accelerator program, is working on MRI applications that acquire images in one-quarter the time while requiring just one-tenth the contrast dosage to patients. Subtle Medical developers got their application running in a few hours, with an immediate speedup of 10x for AI inferencing.
“We are using AI to improve workflow for MRI and PET exams,” said Enhao Gong, founder of Subtle Medical. “NVIDIA’s Clara platform will enable us to seamlessly scale our technology to reduce risks from contrast and radiation, taking imaging efficiency and safety to the next level.”
ImFusion, also an Inception member, can create 3D ultrasound from a traditional 2D acquisition, and then visualize the ultrasound fused with CT. ImFusion developers ported their application to Clara in less than two days and take advantage of Clara’s inferencing, cinematic rendering engine and virtualization capability.
“We specialize in accelerated medical image computing and guided surgery,” said Wolfgang Wein, founder and CEO of ImFusion. “NVIDIA’s Clara platform gives us the ability to turn 2D medical images into 3D and deploy our technology virtually.”
The NVIDIA Clara platform is available now to early access partners, with a targeted beta planned for the second quarter of 2019.
Press a button on your smartphone and go. Daimler, Bosch and NVIDIA have joined forces to bring fully automated and driverless vehicles to city streets, and the effects will be felt far beyond the way we drive.
While the world’s billion cars travel 10 trillion miles per year, most of the time these vehicles are sitting idle, taking up valuable real estate while parked. And when driven, they are often stuck on congested roadways. Mobility services will solve these issues plaguing urban areas, capture underutilized capacity and revolutionize the way we travel.
All over the globe we are seeing a rapid adoption of new mobility services from companies like Uber, Lyft, Didi, and Ola. But now the availability of drivers threatens to limit their continued growth.
The answer is the driverless car — a vehicle rich with sensors, powered by an extremely energy efficient supercomputer, and running AI software that acts as a virtual driver.
The collaboration of Daimler, Bosch, and NVIDIA, announced Tuesday, promises to unleash what auto industry insiders call Level 4 and Level 5 autonomy — cars that can drive themselves.
The benefits of mobility services built on autonomous vehicles are enormous. These AI-infused vehicles will improve traffic flow, enhance safety, and offer greater access to mobility. In addition, analysts predict it will cost a mere 17 cents a mile to ride in a driverless car you can summon anytime. And commuters will be able to spend their drive to work actually working, recapturing an estimated $99 billion worth of lost productivity each year.
Driving the convenience of transportation up, and costs down, is a huge opportunity. By 2030, driverless vehicles and services will be a $1 trillion industry, according to KPMG.
To reap these benefits, the great automotive brands will need to weave the latest technology into everything they do. And NVIDIA DRIVE, our AV computing platform, promises to help them stitch all the breakthroughs of our time — deep learning, sensor fusion, image recognition, cloud computing and more — into this fabric.
Our collaboration with Daimler and Bosch will unite each company’s strengths. NVIDIA brings leadership in AI and self-driving platforms. Bosch, the world’s largest tier 1 automotive supplier, brings its hardware and system expertise. Mercedes-Benz parent Daimler brings total vehicle expertise and a global brand that’s synonymous with safety and quality.
Street Smarts Needed
Together, we’re tackling an enormous challenge. Pedestrians, bicyclists, traffic lights, and other vehicles make navigating congested urban streets stressful for even the best human drivers.
Demand for computational horsepower in this chaotic, unstructured environment adds up fast. Just a single video camera generates 100 gigabytes of data per kilometer, according to Bosch.
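To see how quickly that adds up, the sketch below converts the per-kilometer figure into a sustained data rate and a per-trip volume. The 100 gigabytes per kilometer comes from the text; the 36 km/h urban speed and the 10 km trip are assumptions picked for round numbers.

```python
GB_PER_KM = 100  # single video camera, figure attributed to Bosch in the text


def data_rate_gb_per_s(speed_kmh, gb_per_km=GB_PER_KM):
    """Sustained data rate in GB/s for one camera at a given speed."""
    return gb_per_km * speed_kmh / 3600.0


def trip_data_gb(distance_km, gb_per_km=GB_PER_KM):
    """Total data in GB one camera produces over a trip."""
    return gb_per_km * distance_km


ASSUMED_SPEED_KMH = 36   # hypothetical city speed
ASSUMED_TRIP_KM = 10     # hypothetical trip length
rate = data_rate_gb_per_s(ASSUMED_SPEED_KMH)
trip = trip_data_gb(ASSUMED_TRIP_KM)
```

Under those assumptions a single camera produces about a gigabyte every second and a terabyte over a short cross-town trip, before any other sensor is counted.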
Now imagine a fully automated vehicle or robotaxi with a suite of sensors wrapped around the car: high resolution camera, lidar, and radar that are configured to sense objects from afar, combined with diverse sensors that are specialized for seeing color, measuring distance, and detecting motion across a wide range of conditions. Together these systems provide levels of diversity to increase safety and redundancy to provide backup in case of failure. However, this vast quantity of information needs to be deciphered, processed, and put to work by multiple layers of neural networks almost instantaneously.
A massive amount of computing performance is required to run the dozens of complex algorithms concurrently, executing within milliseconds so that the car can navigate safely and comfortably.
Daimler and Bosch Select DRIVE Pegasus
NVIDIA DRIVE Pegasus is the AI supercomputer designed specifically for autonomous vehicles, delivering 320 TOPS (trillions of operations per second) to handle these diverse and redundant algorithms. At just the size of a license plate, it has the performance equivalent to six synchronized deskside workstations.
This is the most energy efficient supercomputer ever created — performing one trillion operations per watt. By minimizing the amount of energy consumed, we can translate that directly to increased operating range.
Pegasus is architected for safety, as well as performance. This automotive-grade, functional safety production solution uses two NVIDIA Xavier SoCs and two of our next-generation GPUs designed for AI and vision processing. This co-designed hardware and software platform is created to achieve ASIL-D ISO 26262, the industry’s highest level of automotive functional safety. Even when a fault is detected, the system will still operate.
From the Car to the Cloud
NVIDIA AV solutions go beyond what can be put on wheels. NVIDIA DGX AI supercomputers for the data center are used to train the deep neural nets that enable a vehicle to deliver superhuman levels of perception. The new DGX-2, with its two petaflops of performance, enables deep learning training in a fraction of the time, space, energy, and cost of CPU servers.
Once the neural networks are trained on powerful GPU-based servers, the NVIDIA DRIVE Constellation AV simulator can be used to test and validate the complete software “stack” that will ultimately be placed inside the vehicle. This high performance software stack includes every aspect of piloting an autonomous vehicle, from object detection through deep learning and computer vision, to map localization and path planning, and it all runs on DRIVE Pegasus.
In the years to come, DRIVE Pegasus will be key to helping automakers meet a surge in demand. The mobility-as-a-service industry will purchase more than 10 million cars in 2040, up from 300,000 in 2017, market research firm IHS Markit projects.
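For a sense of scale, the projection can be expressed as a compound annual growth rate. The calculation below uses only the two data points in the text and assumes smooth year-over-year growth between them, which is our simplification rather than part of the IHS Markit forecast.

```python
def cagr(start_value, end_value, years):
    """Compound annual growth rate between two values over `years` years."""
    return (end_value / start_value) ** (1.0 / years) - 1.0


START_YEAR, START_UNITS = 2017, 300_000
END_YEAR, END_UNITS = 2040, 10_000_000

rate = cagr(START_UNITS, END_UNITS, END_YEAR - START_YEAR)
overall_multiple = END_UNITS / START_UNITS
```

The result is a growth rate of roughly 16 to 17 percent a year sustained over 23 years, for an overall increase of about 33x.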
“The partnership with Bosch and Daimler illustrates that the NVIDIA DRIVE Pegasus architecture solves the critical needs of automakers as they tackle the challenge of automated driving,” said IHS Markit Senior Research Director for Artificial Intelligence Luca De Ambroggi. “The combination of NVIDIA’s AI silicon, software, integrated platforms, and tools for simulation and validation adds value for AV development.”
A Thriving Ecosystem for Mobility-as-a-Service
The NVIDIA DRIVE ecosystem continues to expand in all areas of autonomous driving, from robotaxis to trucking to delivery vehicles, as more than 370 companies have already adopted the DRIVE platform. And now our work with Daimler and Bosch will create innovative new driverless vehicles and services that will do more than just transform our streets; they’ll transform our lives.
Automotive safety isn’t a box you check. It’s not a feature. Safety is the whole point of autonomous vehicles. And it starts with a new class of computer, a new type of software and a new breed of chips.
Safety is designed into the NVIDIA DRIVE computer for autonomous vehicles from the ground up. Experts architect safety technology into every aspect of our computing system, from the hardware to the software stack. Tools and methods are developed to create software that performs as intended, reliably and with backups. Stringent engineering processes are developed to ensure no corners are cut.
“Safety-first” computer design is equal parts expertise, architecture, design, tools, methods and best practices. Safety needs to be everywhere — permeating our engineering culture.
Top Experts Agree – Xavier Is Architected for Safety
We didn’t stop there. We invited the world’s top automotive safety and reliability company, TÜV SÜD, to perform a safety concept assessment of our new NVIDIA Xavier system-on-chip (SoC). The 150-year-old German firm’s 24,000 employees assess compliance to national and international standards for safety, durability and quality in cars, as well as for factories, buildings, bridges and other infrastructure.
“NVIDIA Xavier is one of the most complex processors we have evaluated,” said Axel Köhnen, Xavier lead assessor at TÜV SÜD RAIL. “Our in-depth technical assessment confirms the Xavier SoC architecture is suitable for use in autonomous driving applications and highlights NVIDIA’s commitment to enable safe autonomous driving.”
Feeds and Speeds Built Around a Single Need: Safety
Let’s walk through what that means.
As the world’s first autonomous driving processor, Xavier is the most complex SoC ever created. Its 9 billion transistors enable Xavier to process vast amounts of data. Its GMSL (gigabit multimedia serial link) high-speed IO connects Xavier to the largest array of lidar, radar and camera sensors of any chip ever built.
Inside the SoC, six types of processors — ISP (image signal processor), VPU (video processing unit), PVA (programmable vision accelerator), DLA (deep learning accelerator), CUDA GPU, and CPU — process nearly 40 trillion operations per second, 30 trillion for deep learning alone. This level of processing is 10x more powerful than our previous generation DRIVE PX 2 reference design, which is used in today’s most advanced production cars.
These aren’t feeds and speeds we enabled just because we could. They’re essential to safety.
1 Chip, 6 Processors, 40 TOPS – Diversity and Redundancy Need Performance
Xavier is the brain of the self-driving car. From a safety perspective, this means building in diversity, redundancy and fault detection from end to end. From sensors, to specialized processors, to algorithms, to the computer, all the way to the car’s actuation — each function is performed using multiple methods, which gives us diversity. And each vital function has a fallback system, which ensures redundancy.
For example, objects detected by radar, lidar or cameras are handled with different processors and perceived using a variety of computer vision, signal processing and point cloud algorithms. Multiple deep learning networks run concurrently to recognize objects that should be avoided, while other networks determine where it’s safe to drive, achieving both diversity and redundancy. Different processors, running diverse algorithms in parallel, backing each other up, reduce the chance of an undetected single point of failure.
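A rough way to see why diverse, parallel detectors help: if each detector fails independently, the chance that all of them miss the same object is the product of their individual miss rates. The sketch below is illustrative only. The function name `joint_miss_probability` and the miss rates are hypothetical, and the independence assumption is a simplification rather than a claim about Xavier.

```python
def joint_miss_probability(miss_rates):
    """Probability that every detector misses, assuming independent failures."""
    p = 1.0
    for rate in miss_rates:
        p *= rate
    return p


# Hypothetical per-detector miss rates (camera, radar, lidar); not measured values.
rates = [0.01, 0.02, 0.05]
combined = joint_miss_probability(rates)

# The combined miss rate is far smaller than that of the best single detector.
print(f"best single detector: {min(rates):.2%}, all three together: {combined:.5%}")
```

In practice, failures are rarely fully independent, which is one reason the text stresses diverse algorithms and processors rather than simple copies.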
Xavier also includes many types of hardware diagnostics. Key areas of logic are duplicated and voted in hardware using lockstep comparators. Error-correcting codes on memories detect faults and improve availability. A unique built-in self-test helps to find faults in the diagnostics, wherever they may be on chip.
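The lockstep idea can be illustrated with a small software model. This is not NVIDIA's hardware design. It is a hypothetical Python sketch in which the same input runs through two copies of a logic block and a comparator flags any disagreement. The names `lockstep_execute`, `LockstepFault` and `faulty_square` are assumptions made for illustration.

```python
from typing import Callable


class LockstepFault(Exception):
    """Raised when the two redundant copies disagree."""


def lockstep_execute(primary: Callable[[int], int],
                     shadow: Callable[[int], int],
                     value: int) -> int:
    """Run the same input through two copies of the logic and compare results."""
    a = primary(value)
    b = shadow(value)
    if a != b:
        # Real hardware would signal a fault handler; this model just raises.
        raise LockstepFault(f"mismatch: {a} != {b}")
    return a


def square(x: int) -> int:
    return x * x


def faulty_square(x: int) -> int:
    # Simulate a single flipped bit in the result (hypothetical fault injection).
    return (x * x) ^ 0b100


# Healthy case: both copies agree, so the result is passed through.
result = lockstep_execute(square, square, 7)

# Faulty case: the comparator catches the disagreement.
try:
    lockstep_execute(square, faulty_square, 7)
    fault_detected = False
except LockstepFault:
    fault_detected = True
```

The comparator cannot tell which copy is wrong. It can only report that a fault occurred, which is why detection is paired with fallback systems elsewhere in the design.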
Xavier’s safety architecture was created over several years by more than 300 architects, designers and safety experts who analyzed over 150 safety-related modules. With Xavier, the auto industry can achieve the highest functional safety rating: ASIL-D.
Building in the diversity and redundancy needed for safety demands a huge amount of extra processing. For self-driving cars, processing power translates to safety.
Measuring Up to the Highest Standards
Thousands of engineers writing millions of lines of code — how do we ensure Xavier does what we designed it to do?
We created DRIVE as an open platform so that the experts in the world’s best car companies can engage our platform to make it industrial strength. We also turned to TÜV SÜD, among the world’s most respected safety experts, who measured Xavier against the automotive industry’s standard for functional safety — ISO 26262.
Established by the International Organization for Standardization, the world’s chief standards body, ISO 26262 is the definitive global standard for the functional safety — a system’s ability to avoid, identify and manage failures — of road vehicles’ systems, hardware and software.
To meet that standard, an SoC must have an architecture that doesn’t just detect hardware failures during operation. It also needs to be developed in a process that mitigates potential systematic faults. That is, the SoC must avoid failures whenever possible, but detect and respond to them if they cannot be avoided.
TÜV SÜD’s team determined Xavier’s architecture meets the ISO 26262 requirements to avoid unreasonable risk in situations that could result in serious injury.
Our Journey to Zero Accidents
Inventing technology that will one day eliminate accidents on our roads is one of NVIDIA’s most important endeavors. We are inspired to tackle this grand computing challenge that will have great social impact.
We had to re-invent every aspect of computing, starting with the Xavier processor. We created processing power not for speed, but for safety. We benchmarked ourselves against the highest standards: ASIL-D and ISO 26262. And we engaged every expert — from the best car companies to TÜV SÜD — to test and challenge us.
The journey is long, but the destination is worth every step.
The remarkable success of our GPU Technology Conference this month demonstrated to anyone still in doubt the extraordinary momentum of the AI revolution.
Throughout the four-day event here in Silicon Valley, attendees from the world’s leading companies in media and entertainment, manufacturing, healthcare and transportation shared stories of their breakthroughs made possible by GPU computing.
The numbers tell a powerful story. With more than 7,000 attendees, 150 exhibitors and 600 technical sessions, our eighth annual GPU Technology Conference this month was our largest yet. The world’s top 15 tech companies were there, as were the world’s top 10 automakers, and more than 100 startups focusing on AI and VR.
Behind these numbers is a confluence of powerful trends. AI is being driven forward by leaps in computing power that defy the slowdown in Moore’s law. AI developers are racing to build new frameworks to tackle some of the greatest challenges of our time. They want to run their AI software on everything from powerful cloud services to devices at the edge of the cloud.
The Era of AI Computing – The Era of GPU Computing
At GTC, we unveiled Volta, our greatest generational leap since the invention of CUDA. It incorporates 21 billion transistors. It’s built on a 12nm NVIDIA-optimized TSMC process. It includes the fastest HBM memories from Samsung. Volta features a new numeric format and CUDA instruction that performs 4×4 matrix operations – an elemental deep learning operation – at super high speeds.
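For readers who want to picture a 4×4 matrix operation, here is a minimal Python sketch of a matrix multiply-accumulate, D = A×B + C. The multiply-accumulate form and the function name `matmul_accumulate` are assumptions for illustration. The sketch uses ordinary Python numbers and does not model Volta's numeric format or its speed.

```python
def matmul_accumulate(a, b, c):
    """Return D = A x B + C for 4x4 matrices given as nested lists."""
    n = 4
    return [
        [sum(a[i][k] * b[k][j] for k in range(n)) + c[i][j] for j in range(n)]
        for i in range(n)
    ]


# Example inputs: the identity matrix, a counting matrix, and a matrix of ones.
identity = [[1 if i == j else 0 for j in range(4)] for i in range(4)]
counting = [[4 * i + j for j in range(4)] for i in range(4)]
ones = [[1] * 4 for _ in range(4)]

# Multiplying by the identity leaves `counting` unchanged, then `ones` is added.
d = matmul_accumulate(identity, counting, ones)
```

Deep learning layers are largely built from many small products like this one, which is why a dedicated instruction for it pays off.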
Each Volta GPU delivers 120 teraflops of deep learning performance. And our DGX-1 AI supercomputer interconnects eight Tesla V100 GPUs to generate nearly one petaflops of deep learning performance.
Also last week, Google announced its TPU2 chip at its I/O conference, with 45 teraflops of performance.
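The "nearly one petaflops" figure follows directly from the per-GPU number. A quick check, using only the figures quoted above and assuming peak throughput simply adds across the eight GPUs:

```python
TERAFLOPS_PER_GPU = 120  # per-GPU figure quoted in the text
GPUS_PER_DGX1 = 8        # Tesla V100 GPUs in one DGX-1

total_teraflops = TERAFLOPS_PER_GPU * GPUS_PER_DGX1
total_petaflops = total_teraflops / 1000  # 1 petaflops = 1,000 teraflops

print(f"{total_teraflops} teraflops = {total_petaflops} petaflops")
```

That comes to 960 teraflops, or 0.96 petaflops.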
It’s great to see the two leading teams in AI computing race while we collaborate deeply across the board – tuning TensorFlow performance, and accelerating the Google cloud with NVIDIA CUDA GPUs. AI is the greatest technology force in human history. Efforts to democratize AI and enable its rapid adoption are great to see.
Powering Through the End of Moore’s Law
The AI revolution has arrived despite the fact Moore’s law – the combined effect of Dennard scaling and CPU architecture advance – began slowing nearly a decade ago. Dennard scaling, whereby reducing transistor size and voltage allowed designers to increase transistor density and speed while maintaining power density, is now limited by device physics.
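The classical Dennard scaling argument can be written out in a few lines. The symbols here are introduced for illustration and do not appear in the original: $C$ is capacitance, $V$ is voltage, $f$ is clock frequency, $A$ is transistor area and $k > 1$ is the scaling factor.

```latex
\[
P \propto C V^{2} f
\]
\[
C \to \frac{C}{k}, \qquad V \to \frac{V}{k}, \qquad f \to k f, \qquad A \to \frac{A}{k^{2}}
\]
\[
P' \propto \frac{C}{k} \cdot \frac{V^{2}}{k^{2}} \cdot k f = \frac{C V^{2} f}{k^{2}},
\qquad
\frac{P'}{A'} = \frac{P / k^{2}}{A / k^{2}} = \frac{P}{A}
\]
```

Each shrink packs in $k^{2}$ times more transistors running $k$ times faster at the same power density. Once voltage can no longer be reduced in step with size, that balance breaks.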
CPU architects can harvest only modest ILP – instruction-level parallelism – but with large increases in circuitry and energy. So, in the post-Moore’s law era, a large increase in CPU transistors and energy results in a small increase in application performance. Performance recently has increased by only 10 percent a year, versus 50 percent a year in the past.
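Compounding shows how large the gap between those two growth rates becomes. The sketch below applies the 10 percent and 50 percent figures from the text over a decade. The 10-year horizon and the function name `cumulative_speedup` are assumptions chosen for illustration.

```python
def cumulative_speedup(annual_gain: float, years: int) -> float:
    """Total speedup after compounding a fixed annual gain for `years` years."""
    return (1.0 + annual_gain) ** years


past = cumulative_speedup(0.50, 10)    # roughly 57.7x over a decade
recent = cumulative_speedup(0.10, 10)  # roughly 2.6x over a decade

print(f"50% a year: {past:.1f}x, 10% a year: {recent:.1f}x")
```

At the older rate, a decade yields more than a 50x gain. At the recent rate, it yields less than 3x.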
The accelerated computing approach we pioneered targets specific domains of algorithms; adds a specialized processor to offload the CPU; and engages developers in each industry to accelerate their application by optimizing for our architecture. We work across the entire stack of algorithms, solvers and applications to eliminate all bottlenecks and achieve the speed of light.
That’s why Volta unleashes incredible speedups for AI workloads. In peak teraflops, it provides a 5x improvement over Pascal, the current-generation NVIDIA GPU architecture, and 15x over the Maxwell architecture, launched just two years ago. That is well beyond what Moore’s Law would have predicted.
Accelerate Every Approach to AI
Such leaps in performance have drawn innovators from every industry. The number of startups building GPU-driven AI services has grown more than 4x over the past year to 1,300.
No one wants to miss the next breakthrough. Software is eating the world, as Marc Andreessen said, but AI is eating software.
The number of software developers following the leading AI frameworks on the GitHub open-source software repository has grown to more than 75,000 from fewer than 5,000 over the past two years.
Deep learning is a strategic imperative for every major tech company. It increasingly permeates every aspect of work from infrastructure, to tools, to how products are made. We partner with every framework maker to wring out the last drop of performance. By optimizing each framework for our GPU, we can improve engineer productivity by hours and days for each of the hundreds of iterations needed to train a model. Every framework, including Caffe2, Chainer, Microsoft Cognitive Toolkit, MXNet, PyTorch and TensorFlow, will be meticulously optimized for Volta.
We want to create an environment that lets developers do their work anywhere, and with any framework. For companies that want to keep their data in-house, we introduced powerful new workstations and servers at GTC.
Perhaps the most vibrant environment is the $247 billion market for public cloud services. Over the past six months, Alibaba, Amazon, Baidu, Facebook, Google, IBM, Microsoft and Tencent have added NVIDIA GPUs to their data centers.
To help innovators move seamlessly to cloud services such as these, at GTC we launched the NVIDIA GPU Cloud platform, which contains a registry of pre-configured and optimized stacks of every framework. Each layer of software and all of the combinations have been tuned, tested and packaged up into an NVDocker container. We will continuously enhance and maintain it. We fix every bug that comes up. It all just works.
A Cambrian Explosion of Autonomous Machines
Deep learning’s ability to detect features from raw data has created the conditions for a Cambrian explosion of autonomous machines – IoT with AI. There will be billions, perhaps trillions, of devices powered by AI.
At GTC, we announced that one of the 10 largest companies in the world, and one of the most admired, Toyota, has selected NVIDIA for their autonomous car.
We also announced Isaac, a virtual robot that helps make robots. Today’s robots are hand programmed, and do exactly and only what they were programmed to do. Just as convolutional neural networks gave us the computer vision breakthrough needed to tackle self-driving cars, reinforcement learning and imitation learning may be the breakthroughs we need to tackle robotics.
Once trained, the brain of the robot would be downloaded into Jetson, our AI supercomputer in a module. The robot would stand and adapt to any differences between the virtual and real worlds. A new robot is born. For GTC, Isaac learned how to play hockey and golf.
Finally, we’re open-sourcing the DLA, Deep Learning Accelerator – our version of a dedicated inferencing TPU – designed into our Xavier superchip for AI cars. We want to see the fastest possible adoption of AI everywhere. No one else needs to invest in building an inferencing TPU. We have one for free – designed by some of the best chip designers in the world.
Enabling the Einsteins and Da Vincis of Our Era
These are just the latest examples of how NVIDIA GPU computing has become the essential tool of the da Vincis and Einsteins of our time. For them, we’ve built the equivalent of a time machine. Building on the insatiable technology demand of 3D graphics and market scale of gaming, NVIDIA has evolved the GPU into the computer brain that has opened a floodgate of innovation at the exciting intersection of virtual reality and artificial intelligence.