Filmmaker Achieves Razer-Sharp Graphics and Real-Time Rendering with NVIDIA Studio Laptop

Film studios worldwide have embraced real-time technology to create stunning graphics faster than before.

Now independent creative professionals are tapping those same capabilities to design stunning visuals wherever they go, thanks to powerful NVIDIA RTX Studio laptops and mobile workstations.

Filmmaker Hasraf “HaZ” Dulull writes, directs and produces TV, film and animation projects. He’s worked on titles such as indie sci-fi film The Beyond on Netflix, and directed episodes for Disney’s Fast Layne miniseries. More recently, he’s been working on Battlesuit, a pilot episode for an animated series based on the graphic novel Theory.

When I saw the quality of real-time visuals, I thought to myself: with a little love, real-time ray tracing, and a powerful RTX GPU, you can make really cool animated content at a cinematic level.
—  HaZ Dulull, filmmaker

With only four months and a small team to put this project together, Dulull used a Quadro RTX-powered Razer Blade 15 Studio Edition laptop to harness the power of real-time ray tracing and produce the entire animated episode in Unreal Engine.

Quadro RTX Brings the Mobile Power

Dulull’s main workspace is his home studio, which he also uses to connect to his team remotely. But for the pilot, he needed something compact that still delivered impressive graphics power and allowed him to use virtual production filmmaking techniques for his animations.

The Quadro RTX 5000 graphics card built into the svelte Razer Blade 15 Studio Edition laptop provided Dulull with the flexibility, speed and power to design and test shot ideas quickly and easily.

One scene required a gripping action sequence, so Dulull used DragonFly by Glassbox as a virtual camera. Built for Unreal Engine, DragonFly allowed Dulull to take the animations he receives from his team and quickly shoot different takes to achieve the final result.

DragonFly is directly connected via Wi-Fi to his Razer laptop, where the animations are playing back in real time and being streamed to his camera viewport.

Dulull constantly pushed the laptop with scenes that packed in demanding amounts of geometry, effects and lighting. A standard VFX shot typically runs about 100 frames, but some of Dulull’s scenes stretched to 700 frames. With the Quadro RTX, he rendered in real time and produced final graphics directly in Unreal Engine.

“Traditional CGI animation can be expensive because it includes multiple workflow steps in the process, including rendering several passes and compositing them afterwards,” said Dulull. “But with the Quadro RTX, I was able to use real-time rendering and create every single shot with final pixels in Unreal Engine. There was no need for compositing or post-processing, apart from color grading, because what you see is what you get.”

More Time for Style and Iterations

When working on a proof of concept or a pilot like Battlesuit, filmmakers need to get the animations and graphics to a level that is acceptable for networks and studios. Traditionally, teams will create storyboards or mockups to get an idea of what the style of the project will look like.

But with GPU ray tracing, filmmakers can find the style in real time. For Battlesuit, Dulull went into Unreal Engine and explored different styles on his own. He didn’t need to rely on other artists to create shaders or mood boards, or to send notes or comments back and forth.

“While we’re in the film, we were able to play around with style and find the look we wanted to go for,” said Dulull. “It was easy to explore and change the look in real time, and this is only possible through NVIDIA GPUs.”

The ability to make changes quickly also allowed Dulull to iterate as much as he wanted, since he didn’t have to wait for scenes or images to render. He could make revisions and tweak shots up until the final product was delivered.

With the power of real-time ray tracing at his fingertips, Dulull could push the graphic quality as far as he wanted to quickly achieve the animated film he envisioned without making any creative compromises.

Battlesuit is available to watch here.

Learn more about Quadro RTX and other NVIDIA solutions that are helping professionals work from home.

AI to Hit Mars, Blunt Coronavirus, Play at the London Symphony Orchestra

AI is the rocket fuel that will get us to Mars. It’s the vaccine that will save us on Earth. And it’s the people who aspire to make a dent in the universe.

Our latest “I Am AI” video, unveiled during NVIDIA CEO Jensen Huang’s keynote address at the GPU Technology Conference, pays tribute to the scientists, researchers, artists and many others making historic advances with AI.

To grasp AI’s global impact, consider: the technology is expected to generate $2.9 trillion worth of business value by 2021, according to Gartner.

It’s on course to classify 2 trillion galaxies to understand the universe’s origin, and to zero in on the molecular structure of the drugs needed to treat coronavirus and cancer.

As depicted in the latest video, AI has an artistic side, too. It can paint as well as Bob Ross. And its ability to assist in the creation of original compositions is worthy of the London Symphony Orchestra, which plays the accompanying theme music, a piece that started out written by a recurrent neural network.

AI is also capable of creating text-to-speech synthesis for narrating a short documentary. And that’s just what it did.

These fireworks and more are the story of I Am AI. Sixteen companies and research organizations are featured in the video. The action moves fast, so grab a bowl of popcorn, kick back and enjoy this tour of some of the highlights of AI in 2020.

Reaching Into Outer Space

Understanding how structure formed in the universe and how much matter it contains requires observing and classifying celestial objects such as galaxies. With an estimated 2 trillion galaxies to examine in the observable universe, it’s what cosmologists call a “computational grand challenge.”

The recent Dark Energy Survey collected data from over 300 million galaxies. To study them with unprecedented precision, the Center for Artificial Intelligence Innovation at the National Center for Supercomputing Applications at the University of Illinois at Urbana-Champaign teamed up with the Argonne Leadership Computing Facility at the U.S. Department of Energy’s Argonne National Laboratory.

NCSA tapped the Galaxy Zoo project, a crowdsourced astronomy effort that labeled millions of galaxies observed by the Sloan Digital Sky Survey. Using that data, an AI model with 99.6 percent accuracy can now chew through unlabeled galaxies to ID them and accelerate scientific research.
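
At its core, this is supervised image classification: crowdsourced Galaxy Zoo labels serve as training targets for a convolutional network that then sorts unlabeled galaxies. The sketch below captures the idea in PyTorch; the folder layout, class names and network size are illustrative assumptions, not the NCSA team’s actual model.

```python
# Minimal sketch of supervised galaxy-morphology classification.
# The dataset layout, class names and network size are illustrative
# assumptions -- not the NCSA/Argonne model described in the article.
import torch
import torch.nn as nn
from torch.utils.data import DataLoader
from torchvision import datasets, transforms, models

CLASSES = ["spiral", "elliptical", "irregular"]  # hypothetical label set

transform = transforms.Compose([
    transforms.Resize((224, 224)),
    transforms.ToTensor(),
])

# Expects a folder of labeled galaxy cutouts, e.g. galaxy_zoo/train/spiral/*.png
train_set = datasets.ImageFolder("galaxy_zoo/train", transform=transform)
loader = DataLoader(train_set, batch_size=64, shuffle=True, num_workers=4)

device = "cuda" if torch.cuda.is_available() else "cpu"
model = models.resnet18(num_classes=len(CLASSES)).to(device)
optimizer = torch.optim.Adam(model.parameters(), lr=1e-4)
criterion = nn.CrossEntropyLoss()

model.train()
for epoch in range(10):
    for images, labels in loader:
        images, labels = images.to(device), labels.to(device)
        optimizer.zero_grad()
        loss = criterion(model(images), labels)
        loss.backward()
        optimizer.step()
```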

With Mars targeted for human travel, scientists are seeking the safest path. In that effort, the NASA Solar Dynamics Observatory takes images of the sun every 1.3 seconds. And researchers have developed an algorithm that removes errors from the images, which are placed into a growing archive for analysis.

Using such data, NASA is tapping into NVIDIA GPUs to analyze solar surface flows so that it can build better models for predicting the weather in space. NASA also aims to identify origins of energetic particles in Earth’s orbit that could damage interplanetary spacecraft, jeopardizing trips to Mars.

Restoring Voice and Limb

Voiceitt — a Tel Aviv-based startup that’s developed signal processing, speech recognition technologies and deep neural nets — offers a synthesized voice for those whose speech has been distorted. The company’s app converts unintelligible speech into easily understood speech.

The University of North Carolina at Chapel Hill’s Neuromuscular Rehabilitation Engineering Laboratory and North Carolina State University’s Active Robotic Sensing (ARoS) Laboratory develop experimental robotic limbs used in the labs.

The two research units have been working on walking-environment recognition, aiming to develop environment-adaptive controls for prostheses. They’ve been using convolutional neural networks (CNNs), running on NVIDIA GPUs, for prediction. And they aren’t alone.

Helping in Pandemic

Whiteboard Coordinator remotely monitors the temperature of people entering buildings to minimize exposure to COVID-19. The Chicago-based startup provides temperature-screening rates of more than 2,000 people per hour at checkpoints. Whiteboard Coordinator and NVIDIA bring AI to the edge of healthcare with NVIDIA Clara Guardian, an application framework that simplifies the development and deployment of smart sensors.

Viz.ai uses AI to inform neurologists about strokes much faster than traditional methods. With the onset of the pandemic, Viz.ai moved to help combat the new virus with an app that alerts care teams to positive COVID-19 results.

Axial3D is a Belfast, Northern Ireland, startup that enlists AI to accelerate the production time of 3D-printed models for medical images used in planning surgeries. Having redirected its resources at COVID-19, the company is now supplying face shields and is among those building ventilators for the U.K.’s National Health Service. It has also begun 3D printing of swab kits for testing as well as valves for respirators. (Check out their on-demand webinar.)

Autonomizing Contactless Help

KiwiBot, a cheery-eyed food delivery bot from Berkeley, Calif., has added COVID-19 services to its route. It’s autonomously delivering masks, sanitizers and other supplies with its robot-to-human service.

Masterpieces of Art, Compositions and Narration

Researchers from London-based startup Oxia Palus demonstrated in a paper, “Raiders of the Lost Art,” that AI could be used to recreate lost works of art that had been painted over. Beneath Picasso’s 1902 The Crouching Beggar lies a mountainous landscape that art curators believe is of Parc del Laberint d’Horta, near Barcelona.

They also know that Santiago Rusiñol painted Parc del Laberint d’Horta. Using a modified X-ray fluorescence image of The Crouching Beggar and Santiago Rusiñol’s Terraced Garden in Mallorca, the researchers applied neural style transfer, running on NVIDIA GPUs, to reconstruct the lost artwork, creating Rusiñol’s Parc del Laberint d’Horta.
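
Neural style transfer itself is a standard technique: a new image is optimized so its deep features match a content image while its feature statistics (Gram matrices) match a style image. The following is a minimal, generic sketch of that procedure, assuming placeholder file names for the processed X-ray fluorescence scan and the Rusiñol reference painting; it isn’t Oxia Palus’s actual pipeline.

```python
# Hedged sketch of Gatys-style neural style transfer: the "content" image
# stands in for the processed X-ray fluorescence scan and the "style" image
# for a Rusinol painting. File names are placeholders.
import torch
import torch.nn.functional as F
from torchvision import models, transforms
from PIL import Image

device = "cuda" if torch.cuda.is_available() else "cpu"
prep = transforms.Compose([transforms.Resize(512), transforms.ToTensor()])

def load(path):
    return prep(Image.open(path).convert("RGB")).unsqueeze(0).to(device)

content = load("xrf_hidden_landscape.png")   # placeholder file name
style = load("rusinol_terraced_garden.png")  # placeholder file name

vgg = models.vgg19(pretrained=True).features.to(device).eval()
for p in vgg.parameters():
    p.requires_grad_(False)

# Layers whose activations define "content" vs. "style" (illustrative choice).
layers = {1: "style", 6: "style", 11: "style", 20: "style", 22: "content"}

def features(x):
    feats = {}
    for i, layer in enumerate(vgg):
        x = layer(x)
        if i in layers:
            feats.setdefault(layers[i], []).append(x)
    return feats

def gram(f):
    b, c, h, w = f.shape
    f = f.view(c, h * w)
    return f @ f.t() / (c * h * w)

with torch.no_grad():
    c_feats = features(content)["content"]
    s_grams = [gram(f) for f in features(style)["style"]]

image = content.clone().requires_grad_(True)
optimizer = torch.optim.Adam([image], lr=0.02)

for step in range(300):
    feats = features(image)
    content_loss = F.mse_loss(feats["content"][0], c_feats[0])
    style_loss = sum(F.mse_loss(gram(f), g) for f, g in zip(feats["style"], s_grams))
    loss = content_loss + 1e4 * style_loss
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
```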

For GTC a few years ago, Luxembourg-based AIVA AI composed the start — melodies and accompaniments — of what would become an original classical music piece meriting an orchestra. Since then we’ve found it one.

Late last year, the London Symphony Orchestra agreed to play the moving piece, which was arranged for the occasion by musician John Paesano and was recorded at Abbey Road Studios.

NVIDIA alum Helen was our voice-over professional for videos and events for years. When she left the company, we thought about how we might continue the tradition. We turned to what we know: AI. But there weren’t publicly available models up to the task.

A team from NVIDIA’s Applied Deep Learning Research group published the answer to the problem: Flowtron: an Autoregressive Flow-based Generative Network for Text-to-Speech Synthesis. Licensing Helen’s voice, we trained the network on dozens of hours of it.

First, Helen produced multiple takes, guided by our creative director. Then our creative director was able to generate multiple takes from Flowtron and adjust parameters of the model to get the desired outcome. And what you hear is “Helen” speaking in the I Am AI video narration.
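
Conceptually, generating “takes” from a flow-based TTS model comes down to re-sampling the latent prior at different temperatures for the same line of text. The sketch below illustrates that loop with a stand-in synthesize() function that merely returns noise so the script runs end to end; it is not the real Flowtron API, and the sigma values and file names are assumptions for illustration.

```python
# Hypothetical sketch of generating several narration "takes" from a trained
# flow-based TTS model. synthesize() is a stand-in, not the Flowtron API;
# the point illustrated is that re-sampling the flow's latent prior at
# different temperatures (sigma) yields different deliveries of one line.
import numpy as np
import soundfile as sf

SAMPLE_RATE = 22050

def synthesize(text, sigma):
    """Stand-in for real TTS inference: returns noise whose variance scales
    with sigma, so this illustrative script actually runs."""
    duration_s = 0.05 * len(text)
    return sigma * np.random.randn(int(SAMPLE_RATE * duration_s)).astype("float32")

line = "I am AI, brought to life by NVIDIA."

# Lower sigma -> flatter, more conservative reads; higher sigma -> more varied.
for take, sigma in enumerate([0.5, 0.75, 1.0], start=1):
    audio = synthesize(line, sigma=sigma)
    sf.write(f"take_{take}_sigma_{sigma:.2f}.wav", audio, SAMPLE_RATE)
```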

Revved Up Retail: Mercedes-Benz Consulting Optimizes Dealership Layout Using Modcam Store Analytics

Retailers are bringing the power of AI to their stores to better understand customer buying behavior and preferences and provide them a better experience.

AI startup Modcam, based in Sweden, uses smart sensors to provide detailed data on retail, showroom and office space traffic. These sensors, powered by NVIDIA Jetson Nano modules, run AI algorithms on that data in real time at the edge.

This allows retailers of all sorts to securely extract valuable insights regarding customer buying preferences.

Mercedes-Benz Consulting is working with the startup to hit the accelerator on the next generation of the automotive retail experience. Just outside its headquarters in Stuttgart, Germany, the company has constructed an experimental showroom equipped with Modcam sensors to test different layouts and new in-store technologies.

Since cars are an occasional, big-ticket purchase for most people, much of an automaker’s retail success relies on the showroom experience. As car companies invest in autonomous and intelligent driving technologies, they’re also looking to their storefronts to deliver an easy-to-navigate, optimized layout that enhances the shopping experience.

With the help of Modcam’s AI algorithms for edge computing, Mercedes-Benz Consulting has gained valuable insight into consumer behavior in both the show floor and service areas.

Smart Shopping

Modcam’s AI analyzes how people move through a space. This helps retailers spot patterns, like whether a certain layout or signage is effective, and identify customer interest in products that shoppers linger over.

It does so without collecting or storing private information. The deep neural networks running at the edge detect customers as people with non-identifying characteristics, and the smart sensors don’t store any of the images they analyze.

Modcam relies on the high-performance, energy-efficient Jetson Nano, optimized with the NVIDIA Metropolis application framework, to handle this real-time processing. The small yet powerful computer lets Modcam run multiple deep neural networks in parallel for applications such as object detection, segmentation and tracking — all in an easy-to-use platform that runs in as little as 5 watts.
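
The pattern at work is straightforward: several networks analyze each camera frame in parallel, and only anonymous, aggregate metrics ever leave the device. Here’s a minimal Python sketch of that flow, with hypothetical detect_people() and estimate_dwell_zones() placeholders standing in for the actual DNNs; it isn’t Modcam’s code or a Metropolis API.

```python
# Illustrative sketch of the edge pattern described above: multiple models
# run on each frame in parallel, only anonymous aggregate metrics are kept,
# and the raw image is discarded. The two model functions are hypothetical
# placeholders, not Modcam's or NVIDIA Metropolis's actual APIs.
from concurrent.futures import ThreadPoolExecutor
from typing import Dict, List, Tuple

Box = Tuple[int, int, int, int]  # x, y, w, h in pixels; no identity is stored

def detect_people(frame) -> List[Box]:
    """Placeholder for a person-detection DNN running on the Jetson."""
    return []

def estimate_dwell_zones(frame) -> Dict[str, float]:
    """Placeholder for a second DNN, e.g. segmenting areas where people linger."""
    return {}

def process_frame(frame) -> dict:
    # Run both networks in parallel on the same frame.
    with ThreadPoolExecutor(max_workers=2) as pool:
        people = pool.submit(detect_people, frame)
        zones = pool.submit(estimate_dwell_zones, frame)
        detections, dwell = people.result(), zones.result()
    # Only aggregate, non-identifying metrics leave the device;
    # the frame itself is never stored or transmitted.
    return {"people_in_view": len(detections), "dwell_zones": dwell}

print(process_frame(frame=None))  # stand-in frame; a real deployment feeds camera frames
```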

“Our previous generation of sensors and processors wasn’t powerful enough, so we upgraded to the NVIDIA Jetson Nano to deliver a 60x increase in neural network performance demanded by our next-generation systems,” said Andreas Nordgren, chief operating officer at Modcam.

And with this high-performance, intelligent edge solution, Modcam is helping retailers around the world deliver more optimized merchandising and a better shopping experience.

Modcam is a member of NVIDIA Inception, a virtual accelerator program that supports early-stage companies with fundamental tools, expertise and go-to-market support.

AI-Powered Luxury

With the concept store near its headquarters, Mercedes-Benz Consulting can test different floor layouts as well as touchscreen promotions and signage to display product information.

By outfitting the store with Modcam’s edge AI system, the automaker is able to measure the success of these layouts and campaigns, determining how much traffic flows to which models and how customers interact with different store configurations.

The luxury automaker can then extend these learnings to its dealerships around the world to optimize the customer buying and service experience.

And with the help of real-time edge computing, they can iterate quickly to consistently provide the best possible in-store experience.

Robotics Reaps Rewards at ICRA: NVIDIA’s Dieter Fox Named RAS Pioneer

Thousands of researchers from around the globe will be gathering — virtually — next week for the IEEE International Conference on Robotics and Automation.

As a flagship conference on all things robotics, ICRA has become a renowned forum since its inception in 1984. This year, NVIDIA’s Dieter Fox will receive the RAS Pioneer Award, given by the IEEE Robotics and Automation Society.

Fox is the company’s senior director of robotics research and head of the NVIDIA Robotics Research Lab, in Seattle, as well as a professor at the University of Washington Paul G. Allen School of Computer Science & Engineering and head of the UW Robotics and State Estimation Lab. At the NVIDIA lab, Fox leads over 20 researchers and interns, fostering collaboration with the neighboring UW.

He’s receiving the RAS Pioneer Award “for pioneering contributions to probabilistic state estimation, RGB-D perception, machine learning in robotics, and bridging academic and industrial robotics research.”

“Being recognized with this award by my research colleagues and the IEEE society is an incredible honor,” Fox said. “I’m very grateful for the amazing collaborators and students I had the chance to work with during my career. I also appreciate that IEEE sees the importance of connecting academic and industrial research — I believe that bridging these areas allows us to make faster progress on the problems we really care about.”

Fox will also give a talk at the conference, where researchers from NVIDIA Research will present a total of 19 papers investigating a variety of topics in robotics.

Here’s a preview of some of the show-stopping NVIDIA research papers that were accepted at ICRA:

Robotics Work a Finalist for Best Paper Awards

“6-DOF Grasping for Target-Driven Object Manipulation in Clutter” is a finalist for both the Best Paper Award in Robot Manipulation and the Best Student Paper.

The paper delves into the challenging robotics problem of grasping in cluttered environments, which is a necessity in most real-world scenes, said Adithya Murali, one of the lead researchers and a graduate student at the Robotics Institute at Carnegie Mellon University. Much current research considers only planar grasping, in which a robot grasps from the top down rather than moving in more dimensions.

Arsalan Mousavian, another lead researcher on the paper and a senior research scientist at the NVIDIA Robotics Research Lab, explained that they performed this research in simulation. “We weren’t bound by any physical robot, which is time-consuming and very expensive,” he said.

Mousavian and his colleagues trained their algorithms on NVIDIA V100 Tensor Core GPUs, then tested them on NVIDIA TITAN GPUs. For this particular paper, the training data was generated by simulating 750,000 robot-object interactions in less than half a day, and the models were trained in a week. Once trained, the robot was able to robustly manipulate objects in the real world.
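
One way to picture how such a system uses its training: a learned evaluator scores many sampled 6-DOF grasp poses against an encoding of the scene, and the robot executes the highest-scoring one. The sketch below shows that ranking step with a tiny, untrained stand-in network and random inputs; the architecture and dimensions are illustrative assumptions, not the published model.

```python
# Schematic sketch of ranking candidate 6-DOF grasps with a learned evaluator.
# The tiny MLP, feature sizes and random inputs are illustrative stand-ins.
import torch
import torch.nn as nn

class GraspEvaluator(nn.Module):
    """Scores a (scene feature, grasp pose) pair with a success probability."""
    def __init__(self, feat_dim=128, pose_dim=7):  # pose = 3D position + quaternion
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(feat_dim + pose_dim, 256), nn.ReLU(),
            nn.Linear(256, 1), nn.Sigmoid())

    def forward(self, scene_feat, grasp_pose):
        return self.net(torch.cat([scene_feat, grasp_pose], dim=-1)).squeeze(-1)

evaluator = GraspEvaluator()       # would be trained on simulated grasp outcomes
scene_feat = torch.randn(1, 128)   # stand-in for an encoded depth image / point cloud
candidates = torch.randn(64, 7)    # 64 sampled 6-DOF grasp poses

with torch.no_grad():
    scores = evaluator(scene_feat.expand(64, -1), candidates)
best = candidates[scores.argmax()]
print("best grasp pose:", best, "score:", scores.max().item())
```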

Replanning for Success

NVIDIA Research also considered how robots could plan to accomplish a wide variety of tasks in challenging environments, such as grasping an object that isn’t visible, in a paper called “Online Replanning in Belief Space for Partially Observable Task and Motion Problems.”

The approach makes a variety of tasks possible. Caelan Garrett, graduate student at MIT and a lead researcher on the paper, explained, “Our work is quite general in that we deal with tasks that involve not only picking and placing things in the environment, but also pouring things, cooking, trying to open doors and drawers.”

Garrett and his colleagues created an open-source algorithm, SS-Replan, that allows the robot to incorporate observations when making decisions, which it can adjust based on new observations it makes while trying to accomplish its goal.

They tested their work in NVIDIA Isaac Sim, a simulation environment used to develop, test and evaluate virtual robots, and on a real robot.
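
The core loop is easy to state: observe, update the belief about the partially observable scene, plan from that belief, execute, and repeat as new observations arrive. Below is a toy, self-contained sketch of that observe-plan-act-replan cycle; the World class and plan_in_belief_space() helper are hypothetical placeholders, not the SS-Replan interface.

```python
# Toy sketch of an observe-plan-act-replan loop for a partially observable
# task. World and plan_in_belief_space are hypothetical placeholders.
import random

class World:
    """Stand-in for a partially observable environment."""
    def __init__(self):
        self.goal_reached = False
        self.steps = 0

    def observe(self):
        # Returns a noisy, partial observation of the scene.
        return {"drawer_open_prob": random.random()}

    def execute(self, action):
        self.steps += 1
        self.goal_reached = action == "grasp_target" or self.steps >= 3

def plan_in_belief_space(belief):
    """Placeholder planner: choose the next action given the current belief."""
    return "open_drawer" if belief["drawer_open_prob"] < 0.5 else "grasp_target"

world = World()
while not world.goal_reached:
    belief = world.observe()               # fold the new observation into the belief
    action = plan_in_belief_space(belief)  # replan from the updated belief
    world.execute(action)                  # act, then loop and replan
print("goal reached after", world.steps, "actions")
```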

DexPilot: A Teleoperated Robotic Hand-Arm System

In another paper, NVIDIA researchers confronted the problem that current robotics algorithms don’t yet allow for a robot to complete precise tasks such as pulling a tea bag out of a drawer, removing a dollar bill from a wallet or unscrewing the lid off a jar autonomously.

In “DexPilot: Depth-Based Teleoperation of Dexterous Robotic Hand-Arm System,” NVIDIA researchers present a system in which a human can remotely operate a robotic system. DexPilot observes the human hand using cameras, and then uses neural networks to relay the motion to the robotic hand.

Whereas other systems require expensive equipment such as motion-capture systems, gloves and headsets, DexPilot achieves teleoperation through a combination of deep learning and optimization.

Once the data was collected, training took 15 hours on a single GPU, according to NVIDIA researchers Ankur Handa and Karl Van Wyk, two of the paper’s authors. They and their colleagues used the NVIDIA TITAN GPU for their research.
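
The optimization half of such a pipeline can be pictured as retargeting: given fingertip positions estimated from the depth cameras, solve for robot joint angles whose fingertips land in the same places. The sketch below uses a toy forward-kinematics function and SciPy’s optimizer to show the shape of that problem; it is an illustrative assumption, not the DexPilot implementation.

```python
# Conceptual sketch of hand-to-robot retargeting via optimization: find joint
# angles whose fingertips best match observed human fingertip positions.
# The toy forward kinematics below is a placeholder, not a real hand model.
import numpy as np
from scipy.optimize import minimize

N_JOINTS = 16       # e.g. 4 fingers x 4 joints on an Allegro-style hand
N_FINGERTIPS = 4

def robot_fingertips(joint_angles):
    """Placeholder forward kinematics mapping joint angles to fingertip xyz."""
    a = joint_angles.reshape(N_FINGERTIPS, -1)
    return np.stack([np.cos(a).sum(1), np.sin(a).sum(1), a.sum(1)], axis=1) * 0.02

def retarget(human_tips, init=None):
    """Solve for joint angles whose fingertips best match the observed tips."""
    x0 = np.zeros(N_JOINTS) if init is None else init
    cost = lambda q: np.sum((robot_fingertips(q) - human_tips) ** 2)
    res = minimize(cost, x0, method="L-BFGS-B", bounds=[(-1.5, 1.5)] * N_JOINTS)
    return res.x

# Stand-in for fingertip positions estimated by the hand-pose network.
human_tips = np.random.uniform(-0.05, 0.05, size=(N_FINGERTIPS, 3))
joint_angles = retarget(human_tips)
print("commanded joint angles:", np.round(joint_angles, 3))
```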

Learn all about these papers and more at ICRA 2020.

The NVIDIA research team has more than 200 scientists around the globe, focused on areas such as AI, computer vision, self-driving cars, robotics and graphics. Learn more at www.nvidia.com/research.

Qure.ai Helps Clinicians Answer Questions from COVID-19 Lung Scans

Qure.ai, a Mumbai-based startup, has been developing AI tools to detect signs of disease from lung scans since 2016. So when COVID-19 began spreading worldwide, the company raced to retool its solution to address clinicians’ urgent needs.

In use in more than two dozen countries, Qure.ai’s chest X-ray tool, qXR, was trained on 2.5 million scans to detect lung abnormalities — signs of tumors, tuberculosis and a host of other conditions.

As the first COVID-specific datasets were released by countries with early outbreaks — such as China, South Korea and Iran — the company quickly incorporated those scans, enabling qXR to mark areas of interest on a chest X-ray image and provide a COVID-19 risk score.

“Clinicians around the world are looking for tools to aid critical decisions around COVID-19 cases — decisions like when a patient should be admitted to the hospital, or be moved to the ICU, or be intubated,” said Chiranjiv Singh, chief commercial officer of Qure.ai. “Those clinical decisions are better made when they have objective data. And that’s what our AI tools can provide.”

While doctors have data like temperature readings and oxygen levels on hand, AI can help quantify the impact on a patient’s lungs — making it easier for clinicians to triage potential COVID-19 cases where there’s a shortage of testing kits, or compare multiple chest X-rays to track the progression of disease.

In recent weeks, the company deployed the COVID-19 version of its tool at some 50 sites worldwide, including hospitals in the U.K., India, Italy and Mexico. Healthcare workers in Pakistan are using qXR in medical vans that actively track cases in the community.

A member of the NVIDIA Inception program, which provides resources to help startups scale faster, Qure.ai uses NVIDIA TITAN GPUs on premises, and V100 Tensor Core GPUs through Amazon Web Services for training and inference of its AI models. The startup is in the process of seeking FDA clearance for qXR, which has received the CE mark in Europe.

Capturing an Image of COVID-19

For coronavirus cases, chest X-rays are just one part of the picture — because not every case shows impact on the lungs. But due to the wide availability of X-ray machines, including portable bedside ones, they’ve quickly become the imaging modality of choice for hospitals admitting COVID-19 patients.

“Based on the literature to date, we know certain indicators of COVID-19 are visible in chest X-rays. We’re seeing what’s called ground-glass opacities and consolidation, and noticed that the virus tends to settle in both sides of the lung,” Singh said. “Our AI model applies a positive score to these factors and relevant findings, and a negative score to findings like calcifications and pleural effusion that suggest it’s not COVID.”

The qXR tool provides clinicians with one of four COVID-19 risk scores: high, medium, low or none. Within a minute, it also labels and quantifies lesions, providing an objective measurement of lung impact.
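
To make the scoring idea concrete, here is a deliberately simplified sketch: each detected finding contributes a positive or negative weight, and the total maps to one of the four risk categories. The weights and thresholds are invented for illustration only; qXR’s actual scoring comes from a trained deep network, not a hand-written rule.

```python
# Illustrative scoring rule of the kind described above: findings that point
# toward COVID-19 add to a risk score, findings that point away subtract, and
# the total maps to one of four categories. Weights and thresholds are
# invented for illustration; they are not qXR's actual model.
FINDING_WEIGHTS = {
    "ground_glass_opacity":  +2.0,
    "consolidation":         +1.5,
    "bilateral_involvement": +1.0,
    "calcification":         -1.5,
    "pleural_effusion":      -1.0,
}

def covid_risk_category(findings):
    """findings: dict of finding name -> detection confidence in [0, 1]."""
    score = sum(FINDING_WEIGHTS.get(name, 0.0) * conf
                for name, conf in findings.items())
    if score >= 2.5:
        return "high"
    if score >= 1.0:
        return "medium"
    if score > 0.0:
        return "low"
    return "none"

example = {"ground_glass_opacity": 0.8, "bilateral_involvement": 0.9,
           "pleural_effusion": 0.1}
print(covid_risk_category(example))  # -> "medium" with these invented weights
```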

By rapidly processing chest X-ray images, qXR is helping some doctors triage patients with COVID-19 symptoms while they wait for test results. Others are using the tool to monitor disease progression by comparing multiple scans taken of the same patient over time. For ease of use, qXR integrates with radiologists’ existing workflows, including the PACS imaging system.

“Workflow integration is key, as the more you can make your AI solution invisible and smoothly embedded into the healthcare workflow, the more it’ll be adopted and used,” Singh said.

While the first version of qXR with COVID-19 analysis was trained and validated on around 11,500 scans specific to the virus, the team has been adding a couple thousand additional scans to the dataset each week, improving accuracy as more data becomes available.

Singh credits the company’s ability to pivot quickly in part to the diverse dataset of chest X-rays it’s collected over the years. In total, Qure.ai has almost 8 million studies, spread evenly across North America, Europe, the Middle East and Asia, as well as a mix of studies taken on equipment from different manufacturers and in different healthcare settings.

“The volume and variety of data helps our AI model’s accuracy,” Singh said. “You don’t want something built on perfect, clean data from a single site or country, where the moment it goes to a new environment, it fails.”

From the Cloud to Clinicians’ Hands

The Bolton NHS Foundation Trust in the U.K. and San Raffaele University Hospital in Milan are among dozens of sites that have deployed qXR to help radiologists monitor COVID-19 disease progression in patients.

Most clients can get up and running with qXR within an hour, with deployment over the cloud. In an urgent environment like the current pandemic, this allows hospitals to move quickly, even when travel restrictions make live installations impossible. Hospital customers with on-premises data centers can choose to use their onsite compute resources for inference.

Qure.ai’s next step, Singh said, “is to get this tool in the hands of as many radiologists and other clinicians directly interacting with patients around the world as we can.”

The company has also developed a natural language processing tool, qScout, that uses a chatbot to handle regular check-ins with patients who feel they may have the virus or are recovering at home. Keeping in contact with people in an outpatient setting helps monitor symptoms, alerting healthcare workers when a patient may need to be admitted to the hospital and tracking recovery without overburdening hospital infrastructure.

It took the team just six weeks to take qScout from a concept to its first customer: the Ministry of Health in Oman.

To learn more about Qure.ai, watch the recent COMPUTE4COVID webinar session, Healthcare AI Startups Against COVID-19. Visit our COVID page to explore how other startups are using AI and accelerated computing to fight the virus.

At a Crossroads: How AI Helps Autonomous Vehicles Understand Intersections

Editor’s note: This is the latest post in our NVIDIA DRIVE Labs series. With this series, we’re taking an engineering-focused look at individual autonomous vehicle challenges and how the NVIDIA DRIVE AV Software team is mastering them. Catch up on our earlier posts, here.

Intersections are common roadway features, whether four-way stops in a neighborhood or traffic-light-filled exchanges on busy multi-lane thoroughfares.

Given the frequency, variety and risk associated with intersections — more than 50 percent of serious accidents in the U.S. happen at or near them — it’s critical that an autonomous vehicle be able to accurately navigate them.

Handling intersections autonomously presents a complex set of challenges for self-driving cars. This includes the ability to stop accurately at an intersection wait line or crosswalk, correctly process and interpret right of way traffic rules in various scenarios, and determine and execute the correct path for a variety of maneuvers, such as proceeding straight through the intersection and unprotected intersection turns.

Earlier in the DRIVE Labs series, we demonstrated how we detect intersections, traffic lights and traffic signs with the WaitNet DNN, and how we classify traffic light state and traffic sign type with the LightNet and SignNet DNNs. In this episode, we go further to show how NVIDIA uses AI to perceive the variety of intersection structures that an autonomous vehicle could encounter on a daily drive.

Manual Mapmaking

Previous methods have relied on high-definition 3D semantic maps of an intersection and its surrounding area to understand the intersection structure and create paths to navigate safely.

Human labeling is heavily involved to create such a map, hand-encoding all potentially relevant intersection structure features, such as where the intersection entry/exit lines and dividers are located, where any traffic lights or signs are, and how many lanes there are in each direction. The more complex the intersection scenario, the more heavily the map would need to be manually annotated.

An important practical limitation of this approach is lack of scalability. Every intersection in the world would need to be manually labeled before an autonomous vehicle could navigate them, which would create highly impractical data collection, labeling, and cost challenges.

Another challenge lies in temporary conditions, such as construction zones. Because of the temporary nature of these scenarios, writing them into and out of a map can be highly complex.

In contrast, our approach is analogous to how humans drive. Humans use live perception rather than maps to understand intersection structure and navigate intersections.

A Structured Approach to Intersections

Our algorithm extends the capability of our WaitNet DNN to predict intersection structure as a collection of points we call “joints,” which are analogous to joints in a human body. Just as the actuation of human limbs is achieved through connections between our joints, in our approach, actuation of an autonomous vehicle through an intersection may be achieved by connecting the intersection structure joints into a path for the vehicle to follow.

Figure 1 illustrates intersection structure prediction using our DNN-based method. As shown, we can detect and classify intersection structure features into different classes, such as intersection entry and exit points for both the ego car and other vehicles on the scene, as well as pedestrian crossing entries and exits.

Figure 1. Prediction of intersection structure. Red = intersection entry wait line for ego car; Yellow = intersection entry wait line for other cars; Green = intersection exit line. In this figure, the green lines indicate all the possible ways the ego car could exit the intersection if arriving at it from the left-most lane – specifically, it could continue driving straight, take a left turn, or make a U-turn.

Rather than segment the contours of an image, our DNN is able to differentiate intersection entry and exit points for different lanes. Another key benefit of our approach is that the intersection structure prediction is robust to occlusions and partial occlusions, and it’s able to predict both painted and inferred intersection structure lines.

The intersection key points of figure 1 may also be connected into paths for navigating the intersection. By connecting intersection entry and exit points, paths and trajectories that represent the ego car movements can be predicted.
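
As a toy illustration of that “connect the joints” step, the sketch below takes a predicted entry wait point and a set of predicted exit points and enumerates one candidate center path per maneuver. The coordinates and the straight-line interpolation are assumptions for illustration, not the DRIVE AV planner.

```python
# Toy sketch of connecting predicted intersection "joints" into candidate
# paths: one sampled polyline per (entry, exit) pair. Point values and the
# straight-line interpolation are illustrative only.
import numpy as np

def candidate_paths(entry_xy, exit_points, n_samples=20):
    """Return one sampled polyline per (entry, exit) pair."""
    paths = {}
    for name, exit_xy in exit_points.items():
        t = np.linspace(0.0, 1.0, n_samples)[:, None]
        paths[name] = (1 - t) * np.asarray(entry_xy) + t * np.asarray(exit_xy)
    return paths

ego_entry = (0.0, 0.0)                    # predicted entry wait point (meters)
exits = {"straight": (0.0, 30.0),         # predicted exit points per maneuver
         "left_turn": (-15.0, 15.0),
         "u_turn": (-6.0, 0.0)}

for maneuver, path in candidate_paths(ego_entry, exits).items():
    length = np.linalg.norm(np.diff(path, axis=0), axis=1).sum()
    print(maneuver, "path length ~", round(length, 1), "m")
```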

Our live perception approach enables scalability for handling various types of intersections without the burden of manually labeling each intersection individually. It can also be combined with map information, where high-quality data is available, to create diversity and redundancy for complex intersection handling.

Our DNN-based intersection structure perception capability will become available to developers in an upcoming DRIVE Software release as an additional head of our WaitNet DNN. To learn more about our DNN models, visit our DRIVE Perception page.
