Non-Stop Shopping: Startup’s AI Lets Supermarkets Skip the Line

Eli Gorovici loves to take friends sailing on the Mediterranean. As the new pilot of Trigo, a Tel Aviv-based startup, he’s inviting the whole retail industry on a cruise to a future with AI.

“We aim to bring the e-commerce experience into the brick-and-mortar supermarket,” said Gorovici, who joined the company as its chief business officer in May.

The journey starts with the sort of shopping anyone who’s waited in a long checkout line has longed for.

You fill up your bags at the market and just walk out. Magically, the store knows what you bought, bills your account and sends you a digital receipt, all while preserving your privacy.

Trigo is building that experience and more. Its magic is an AI engine linked to cameras and a few weighted shelves for small items a shopper’s hand might completely cover.

With these sensors, Trigo builds a 3D model of the store. Neural networks recognize products customers put in their bags.

When shoppers leave, the system sends the grocer the tally and a number it randomly associated with them when they chose to swipe their smartphone as they entered the store. The grocer matches the number with a shopper’s account, charges it and sends off a digital bill.
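The privacy-preserving handoff described above can be sketched in a few lines of Python. This is a minimal illustration, not Trigo's implementation: the class names, the `account-123` identifier and the prices are all hypothetical. The point it shows is that the in-store system only ever sees a random token and a basket, while the grocer alone maps that token to an account.

```python
import secrets
from dataclasses import dataclass, field


@dataclass
class StoreSystem:
    """In-store side (hypothetical): knows random tokens and baskets, never accounts."""
    baskets: dict = field(default_factory=dict)

    def enter(self) -> str:
        # Issue a fresh random token when a shopper opts in at the entrance.
        token = secrets.token_hex(8)
        self.baskets[token] = []
        return token

    def add_item(self, token: str, name: str, price_cents: int) -> None:
        self.baskets[token].append((name, price_cents))

    def leave(self, token: str) -> tuple:
        # On exit, hand over only the token and the tally, then forget both.
        return token, self.baskets.pop(token)


@dataclass
class Grocer:
    """Grocer side (hypothetical): the only party that can map a token to an account."""
    token_to_account: dict = field(default_factory=dict)

    def register_entry(self, token: str, account_id: str) -> None:
        self.token_to_account[token] = account_id

    def bill(self, token: str, items: list) -> dict:
        account_id = self.token_to_account.pop(token)
        total = sum(price for _, price in items)
        return {"account": account_id, "items": items, "total_cents": total}


store = StoreSystem()
grocer = Grocer()

token = store.enter()                        # shopper swipes a phone at the door
grocer.register_entry(token, "account-123")  # made-up account identifier
store.add_item(token, "pasta", 199)
store.add_item(token, "tomato sauce", 249)

receipt = grocer.bill(*store.leave(token))
print(receipt["total_cents"])  # 448
```

Keeping the token-to-account table on the grocer's side is the design choice that lets the vision system stay anonymous.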

And that’s just the start.

An Online Experience in the Aisles

Shoppers get the same personalized recommendation systems they’re used to seeing online.

“If I’m standing in front of pasta, I may see on my handset a related coupon or a nice Italian recipe tailored for me,” said Gorovici. “There’s so much you can do with data, it’s mind blowing.”

The system lets stores fine-tune their inventory management systems in real time. Typical shrinkage rates from shoplifting or human error could sink to nearly zero.

AI Turns Images into Insights

Making magic is hard work. Trigo’s system gathers a petabyte of video data a day for an average-size supermarket.

It uses as many as four neural networks to process that data at mind-melting rates of up to a few hundred frames per second. (By contrast, your TV displays high-definition movies at 60 fps.)
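To put those figures in perspective, here is a back-of-the-envelope calculation. It assumes a decimal petabyte (10^15 bytes) and, purely for illustration, takes 300 fps as a stand-in for "a few hundred frames per second"; neither value comes from Trigo.

```python
SECONDS_PER_DAY = 24 * 60 * 60
PETABYTE = 10**15  # assumption: decimal petabyte

# Sustained data rate implied by one petabyte per day.
bytes_per_second = PETABYTE / SECONDS_PER_DAY
gigabytes_per_second = bytes_per_second / 10**9

# Assumption: 300 fps is a hypothetical value for "a few hundred".
ASSUMED_FPS = 300
frame_budget_ms = 1000 / ASSUMED_FPS

print(f"{gigabytes_per_second:.1f} GB/s sustained")                  # 11.6 GB/s sustained
print(f"{frame_budget_ms:.2f} ms per frame at {ASSUMED_FPS} fps")   # 3.33 ms per frame at 300 fps
```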

Trigo used a dataset of up to 500,000 2D product images to train its neural networks. In its daily operations, the system uses those models to run millions of inference tasks with help from NVIDIA TensorRT software.

The AI work requires plenty of processing muscle. A supermarket outside London testing the Trigo system uses servers in its back room with 40-50 NVIDIA RTX GPUs. To boost efficiency, Trigo plans to deliver edge servers using NVIDIA T4 Tensor Core GPUs and join the NVIDIA Metropolis ecosystem starting next year.

Trigo got early access to the T4 GPUs thanks to its participation in NVIDIA Inception, a program that gives AI startups traction with tools, expertise and go-to-market support. The program also aims to introduce Trigo to NVIDIA’s retail partners in Europe.

In 2021, Trigo aims to move some of the GPU processing to Google, Microsoft and other cloud services, keeping some latency- or privacy-sensitive uses inside the store. It’s the kind of distributed architecture businesses are just starting to adopt, thanks in part to edge computing systems such as NVIDIA’s EGX platform.

Big Supermarkets Plug into AI

Tesco, the largest grocer in the U.K., has plans to open its first market using Trigo’s system. “We’ve vetted the main players in the industry and Trigo is the best by a mile,” said Tesco CEO Dave Lewis.

Israel’s largest grocer, Shufersal, also is piloting Trigo’s system, as are other retailers around the world.

Trigo was founded in 2018 by brothers Michael and Daniel Gabay, leveraging tech and operational experience from their time in elite units of the Israeli military.

Seeking his next big opportunity in his field of video technology, Gorovici asked friends who were venture capitalists for advice. “They said Trigo was the future of retail,” Gorovici said.

Like sailing in the aqua-blue Mediterranean, AI in retail is a compelling opportunity.

“It’s a trillion-dollar market — grocery stores are among the biggest employers in the world. They are all being digitized, and selling more online now given the pandemic, so maybe this next stage of digital innovation for retail will now move even faster,” he said.

Calling AI: Researchers Dial in Machine Learning for 5G

5G researchers from three top institutions have joined NVIDIA in bringing AI to telecom.

The Heinrich Hertz Institute (HHI), the Technical University in Berlin (TU Berlin) and Virginia Tech are collaborating with NVIDIA to harness the power of GPUs for next-generation cellular networks.

The journey began in October at MWC Los Angeles, where NVIDIA and partners announced plans to enable virtual radio access networks (vRANs) for 5G with GPUs.

NVIDIA also debuted Aerial, a software development kit for accelerating vRANs. And partners Ericsson, Microsoft and Red Hat are working with us to deliver 5G at the edge of the network powered by GPUs.

These vRANs will bring cellular network operators the kind of operational efficiencies that cloud service providers already enjoy. Carriers will program network functions in high-level software languages, easing the work of adding new capabilities and deploying capacity where and when it’s needed.

Forging Wireless Ties

Our new research partnerships with HHI, TU Berlin and Virginia Tech will explore multiple ways to accelerate 5G with AI.

They’ll define novel techniques leveraging GPUs that help wireless networks use precious spectrum more efficiently. The work will span research in reinforcement learning and other techniques that build on the product plans announced in October.

HHI, founded in 1928, is part of Germany’s Fraunhofer Society and has a history of pioneering technologies in mobile and optical networking as well as video compression. The collaboration with TU Berlin includes a new 5G test bed with participation from a number of wireless companies in Germany.

“I want to redesign many algorithms in radio access networks (RAN) so we can perform tasks in parallel, and the GPU is a good architecture for this because it exploits massive parallelism,” said Slawomir Stanczak, a professor at TU Berlin and head of HHI’s wireless networking department.

Stanczak’s teams will explore use cases such as adapting AI to deliver improved 5G receivers. “If we are successful, they could offer a breakthrough in dramatically increasing performance and improving spectral efficiency, which is important because spectrum is very expensive,” he said.

In a session for GTC Digital, Stanczak recently described ways to apply AI to the private 5G campus networks that he believes will be market drivers for vRANs. Stanczak chairs a focus group on the use of AI in 5G for the ITU, a leading communications standards group. He’s also the author of a widely cited text on the math behind optimizing wireless networks.

Hitting 5G’s Tight Timing Targets

Work at Virginia Tech is led by Tom Hou, a professor of computer engineering whose team specializes in solving some of the most complex and challenging problems in telecom.

His Ph.D. student, Yan Huang, described in a 2018 paper how he used an NVIDIA Quadro P6000 GPU to solve a complex scheduling problem in a tight 100-microsecond window set by the 5G standard. His latest effort cut the time to 60 microseconds using an NVIDIA V100 Tensor Core GPU.

The work “got an enormous response because at that time people using traditional computational techniques would hit roadblocks — no one in the world could solve such a complex problem in 100 microseconds,” said Hou.

“Using GPUs transformed our research group. Now we are looking at AI techniques on top of our newly acquired parallel techniques,” he added.

Specifically, Virginia Tech researchers will explore how AI can automatically find and solve in real time thorny problems optimizing 5G networks. For instance, AI could uncover new ways to weave multiple services on a single frequency band, making much better use of spectrum.

“We have found that for some very hard telecom problems, there’s no math formulation, but AI can learn the problem models automatically, enhancing our GPU-based parallel solutions,” said Huang.

Groundswell Starts in AI for 5G

Other researchers, including two who presented papers at GTC Digital, are starting to explore the potential for AI in 5G.

Addressing one of 5G’s top challenges, researchers at Arizona State University showed a new method for directing millimeter wave beams, leveraging AI and the ray-tracing features in NVIDIA Turing GPUs.

And Professor Terng-Yin Hsu described a campus network at Taiwan’s National Chiao Tung University that ran a software-defined cellular base station on NVIDIA GPUs.

“We are very much at the beginning, especially in AI for vRAN,” said Stanczak. “In the end, I think we will use hybrid solutions that are driven both by data and domain knowledge.”

Compared to 4G LTE, 5G targets a much broader set of use cases with a much more complex air interface. “AI methods such as machine learning are promising solutions to tackle these challenges,” said Hou of Virginia Tech.

NVIDIA GPUs bring the programming flexibility of CUDA and cuDNN environments and the scalability of multiple GPUs connected on NVLink. That makes them the platform of choice for AI on 5G, he said.

Today we stand at a pivot point in the history of telecom. The traditional principles of wireless signal processing are based on decades-old algorithms. AI and deep learning promise a revolutionary new approach, and NVIDIA’s GPUs are at the heart of it.

The post Calling AI: Researchers Dial in Machine Learning for 5G appeared first on The Official NVIDIA Blog.

With DLSS 2.0, AI Continues to Revolutionize Gaming

From in-game physics to animation simulation to AI-assisted broadcasting, artificial intelligence is revolutionizing gaming.

DLSS 2.0, releasing this week in Control and MechWarrior 5: Mercenaries, represents another major advance for AI in gaming.

DLSS 2.0 — A Big Leap in AI Rendering

Powered by dedicated AI processors on GeForce RTX GPUs called Tensor Cores, DLSS 2.0 is an improved deep learning neural network that boosts frame rates while generating beautiful game images.

It gives gamers the performance headroom to maximize ray tracing settings and increase output resolutions.

DLSS 2.0 offers several key enhancements over the original version:

  • Superior Image Quality — DLSS 2.0 offers image quality comparable to native resolution while only having to render one quarter to one half of the pixels. It employs new temporal feedback techniques for sharper image details and improved stability from frame to frame.
  • Great Scaling Across All RTX GPUs and Resolutions — A new AI model uses Tensor Cores more efficiently to execute 2x faster than the original, improving frame rates and removing restrictions on supported GPUs, settings and resolutions.
  • One Network for All Games — The original DLSS required training the AI network for each new game. DLSS 2.0 trains using non-game-specific content, delivering a generalized network that works across games. This means faster game integrations and, ultimately, more DLSS games.
  • Customizable Options — DLSS 2.0 offers users three image quality modes (Quality, Balanced and Performance) that control render resolution, with Performance mode now enabling up to 4x super resolution (i.e., 1080p → 4K). This means more user choice and even bigger performance boosts.
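The pixel arithmetic behind those numbers is easy to check. The sketch below is only an illustration of the ratios quoted in the list: the function names are made up, and the source does not say which pixel fraction each mode other than Performance uses, so the one-half case is shown simply as the upper end of the stated range.

```python
import math


def pixel_count(width: int, height: int) -> int:
    return width * height


def upscale_factor(render: tuple, output: tuple) -> float:
    """How many output pixels are produced per rendered pixel."""
    return pixel_count(*output) / pixel_count(*render)


def render_resolution(out_width: int, out_height: int, pixel_fraction: float) -> tuple:
    """Render size that covers the given fraction of the output pixels.

    Each axis scales by the square root of the fraction, since pixels = width * height.
    """
    axis_scale = math.sqrt(pixel_fraction)
    return round(out_width * axis_scale), round(out_height * axis_scale)


# Performance mode as quoted above: 1080p rendered, 4K displayed.
print(upscale_factor((1920, 1080), (3840, 2160)))  # 4.0

# One quarter of the pixels at 4K output is exactly 1080p.
print(render_resolution(3840, 2160, 0.25))  # (1920, 1080)

# One half of the pixels at 4K output (upper end of the quoted range).
print(render_resolution(3840, 2160, 0.5))   # (2715, 1527)
```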

In addition to Control and MechWarrior 5, DLSS 2.0 has delivered big performance boosts to Deliver Us The Moon and Wolfenstein: Youngblood.

DLSS 2.0 is now also available to Unreal Engine 4 developers through the DLSS Developer Program, which will accelerate deployment in one of the world’s most popular game engines.

RTX Momentum Builds Across the Ecosystem

DLSS is one of several major graphics innovations, including ray tracing and variable rate shading, introduced in 2018 with the launch of our NVIDIA RTX GPUs.

Since then, more than 15 million NVIDIA RTX GPUs have been sold. More than 30 major games have been released or announced with ray tracing or NVIDIA DLSS powered by NVIDIA RTX. And ray tracing has been adopted by all major APIs and game engines.

That momentum continues with Microsoft’s announcement last week of DirectX 12 Ultimate, as well as NVIDIA RTX Global Illumination SDK, new tool support for the Vulkan graphics API, and new Photoshop texture tool plugins.

DirectX 12 Ultimate

Last week, Microsoft unveiled DirectX 12 Ultimate, the latest version of the widely used DirectX graphics standard.

DirectX 12 Ultimate codifies NVIDIA RTX’s innovative technologies, including ray tracing, variable rate shading, mesh shading, and texture space shading, as the standard for multi-platform, next-gen games.

Game developers can take full advantage of all these technologies knowing they’ll be compatible with PCs equipped with NVIDIA RTX GPUs and Microsoft’s upcoming Xbox Series X console.

RTX Global Illumination SDK 

The NVIDIA RTX Global Illumination SDK, released Monday, gives developers a scalable solution to implement beautiful, ray-traced indirect lighting in games while still achieving performance targets.

The RTX GI SDK is supported on any DirectX raytracing-enabled GPU. It’s an ideal starting point to bring the benefits of ray tracing to more games.

Vulkan Game Developers Get New Tools

Also on Monday, NVIDIA added support for the Vulkan graphics API to two of its most popular game development tools.

Nsight Aftermath, which provides precise information on where GPU crashes occur and why, is now available for the first time for Vulkan.

NVIDIA will also provide Vulkan developers with GPU Trace, a low-level profiler in Nsight Graphics that provides hardware unit metrics and precise timing information.

NVIDIA Texture Tools Exporter

The NVIDIA Texture Tools Exporter allows users to create highly compressed texture files that stay small on disk and in memory.

It lets game and app developers use higher-quality textures in their applications and gives users smaller, faster downloads.

It’s available as a standalone tool or as an Adobe Photoshop plug-in for game developers and texture artists.

More Advancements for Gamers

Add it all up: DLSS 2.0, support for RTX technologies with DirectX 12 Ultimate, the introduction of the NVIDIA RTX Global Illumination SDK, support for the Vulkan graphics API in more NVIDIA developer tools, as well as more texture tools. The result is more advancements in the hands of more gamers, developers, and creators than ever before.

Now That Everyone’s a Gamer, New Gaming Technologies Matter More Than Ever

Henry Cavill is a gamer. Keanu Reeves is a gamer. Gaming used to be a hobby for the few, appreciated by fewer still.

Now it’s woven into every aspect of popular culture, even as its global reach surpasses that of movies, sports or television.

And why not? Smartphones have brought gaming to billions. Consoles to hundreds of millions more. But it’s PC gaming — always big — that has become a cultural phenomenon, celebrated in esports arenas filled with tens of thousands of screaming fans — and a global audience of half a billion more online.

Esports alone will generate $1.1 billion in revenue this year.

With more than 2.7 billion gamers worldwide, there’s no place where gaming, as a cultural phenomenon, isn’t on the move — even as the technology underpinning it continues to accelerate.

The support for real-time ray tracing built into NVIDIA GeForce RTX GPUs, introduced in 2018, gives game developers control over light, shadows and reflections, once only available to top moviemakers.

Deep learning — born on the NVIDIA GPUs that got their start with gaming — now makes graphics sharper and games smarter.

Laptops from every major brand featuring NVIDIA Max-Q design and a new generation of desktop-class mobile GPUs deliver the performance of a console so gamers can play amazing games anywhere. As a result, gaming laptop revenue has surged 12x in six years.

And cloud gaming — thanks to our long, continued investment in our GeForce NOW service — makes high-quality games available on the next billion devices: underpowered PCs, Macs and smartphones.

NVIDIA’s led the way on all these next-generation technologies, which are now spilling out across the gaming industry and accelerating a cultural phenomenon that’s breaking out everywhere you look.

Lights, Camera, Action

Ray tracing — which models the way light moves around the real world — has long been a mainstay of movies. Films rely on powerful banks of servers, known as render farms, to shape each scene.
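For readers curious what modeling light looks like in code, here is a deliberately tiny sketch: one ray, one sphere and one light. It is a textbook illustration, not how RTX hardware or any game engine implements ray tracing, and the scene values are made up.

```python
import math


def intersect_sphere(origin, direction, center, radius):
    """Distance along a unit-length ray to the nearest hit in front of it, or None.

    Assumes the ray origin lies outside the sphere.
    """
    oc = tuple(o - c for o, c in zip(origin, center))
    b = 2.0 * sum(d * o for d, o in zip(direction, oc))
    c = sum(o * o for o in oc) - radius * radius
    discriminant = b * b - 4.0 * c  # quadratic with a == 1 for a unit direction
    if discriminant < 0:
        return None  # the ray misses the sphere
    t = (-b - math.sqrt(discriminant)) / 2.0
    return t if t > 0 else None


def diffuse_brightness(point, center, to_light):
    """Lambertian shading: cosine of the angle between surface normal and light."""
    normal = tuple(p - c for p, c in zip(point, center))
    length = math.sqrt(sum(n * n for n in normal))
    normal = tuple(n / length for n in normal)
    return max(0.0, sum(n * l for n, l in zip(normal, to_light)))


# Made-up scene: camera at the origin looking down -z at a unit sphere.
origin, direction = (0.0, 0.0, 0.0), (0.0, 0.0, -1.0)
center, radius = (0.0, 0.0, -5.0), 1.0

t = intersect_sphere(origin, direction, center, radius)
hit = tuple(o + t * d for o, d in zip(origin, direction))
brightness = diffuse_brightness(hit, center, (0.0, 0.0, 1.0))

print(t, hit, brightness)  # 4.0 (0.0, 0.0, -4.0) 1.0
```

A real renderer repeats this test for millions of rays per frame and follows their bounces, which is the workload dedicated hardware is built to accelerate.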

The hardware-based ray-tracing capabilities built into NVIDIA GeForce RTX GPUs brought this cinematic tool to gaming, letting developers create more immersive environments.

But the change goes beyond just realistic light, shadows and reflections.

Real-time ray tracing helps game developers move faster and gives gamers much more freedom. By modeling the way light moves, in real time, ray tracing promises to free developers from painstakingly “pre-baking” every scene.

Hardware-accelerated ray tracing also makes the task of producing that pre-computed lighting less onerous. GPUs accelerate that process by an order of magnitude over traditional approaches.

Just look to Microsoft’s Minecraft for an example.

Minecraft with RTX pours real-time ray tracing into the world-building game, transforming its blocky environment into a more cinematic world.

Modeling a few of the billions of possible configurations of the game would have been impossible just a few years ago.

Real-time ray tracing, however, means gamers get to play in a sandbox where the way a scene is lit reacts to the environment they create.

And we’re working to bring the benefits of ray tracing to as many gamers as possible.

We’ve worked to enable ray tracing on our RTX GPUs, and with Microsoft and the Khronos Group to advance the software standards that make it possible far beyond PCs.

That work continues with the developers and publishers creating the next generation of games across all platforms.

Faster Games

Games are also getting faster. That’s because the modern AI revolution — grounded in deep learning — was born on the NVIDIA GPUs gamers rely on. And now that revolution is coming back to where it all started, gaming, to make games faster still.

Powered by Turing’s Tensor Cores, which perform lightning-fast deep neural network processing, GeForce RTX GPUs support Deep Learning Super Sampling (DLSS). DLSS applies deep learning and AI to rendering techniques, resulting in crisp, smooth edges on rendered objects in games.

NVIDIA GPUs are being used to power sophisticated deep learning algorithms that allow the style from a work of art to be applied to an entire scene, changing the look and feel of a gaming experience.

More’s coming. That’s because the more AI races ahead, the more of these capabilities can be deployed in games, thanks to the NVIDIA GPUs that power them.

Imagine game developers being able to tap into powerful AI voice-generation algorithms to create realistic and emotive voices from a script — making games available in all the globe’s major languages.

Or, even wilder, imagine a conversational AI able to generate that script — and react to your actions — on the fly. Or having the game’s non-player characters actually learn from the players, making encounters constantly evolving and challenging.


Cloud gaming brings state-of-the-art experiences to every platform without waiting.

It extends the highest quality gaming to the billion people who are playing on phones and tablets.

And it gives these gamers the same freedom as with a PC, so they can play the games they already enjoy on their laptops and PCs in more places.

It’s another way to access the games they already own — and that’s what we’ve done with GeForce NOW, which we opened to all last month.

Everything Else

There’s much more, of course.

Driven by a host of NVIDIA design innovations, gaming laptops continue to get thinner and lighter.

NVIDIA G-SYNC display technology — now ubiquitous in competitive gaming — eliminates tearing and stuttering, giving competitive gamers more accuracy, so they compete more effectively.

Variable rate shading, a technology pioneered by NVIDIA, increases rendering performance and quality by varying the shading rate for different regions of the frame. It will be coming to the next-generation Xbox.

NVIDIA has pioneered a host of technologies designed to reduce latency, the lag between a gamer’s input and what they see on their screens.

And equipped with a new generation of NVIDIA-pioneered game-streaming tools, gamers on Twitch and YouTube wield influence that rivals the celebrities who now flaunt their gaming credentials.

Features like these — and performance like no other — are why GeForce gamers will continue to enjoy the best experiences first on the most anticipated games, such as Cyberpunk 2077.

There’s no part of this ecosystem that we’re not working to support. Even the parts that aren’t cool … yet.

Gamer? Dig in to our GeForce channel for all the latest news, and deep-dive videos detailing all the latest developments. 

Life of Pie: How AI Delivers at Domino’s

Some like their pies with extra cheese, extra sauce or double pepperoni. Zack Fragoso’s passion is for pizza with plenty of data.

Fragoso, a data science and AI manager at pizza giant Domino’s, got his Ph.D. in occupational psychology, a field that employs statistics to sort through the vagaries of human behavior.

“I realized I liked the quant part of it,” said Fragoso, whose nimbleness with numbers led to consulting jobs in analytics for the police department and symphony orchestra in his hometown of Detroit before landing a management job on Domino’s expanding AI team.

The pizza maker “has grown our data science team exponentially over the last few years, driven by the impact we’ve had on translating analytics insights into action items for the business team,” he said.

Making quick decisions is important when you need to deliver more than 3 billion pizzas a year — fast. So, Domino’s is exploring the use of AI for a host of applications, including more accurately predicting when an order will be ready.

Points for Pie, launched at last year’s Super Bowl, has been Domino’s highest profile AI project to date. Customers who snapped a smartphone picture of whatever pizza they were eating earned loyalty points toward a free pizza.

“There was a lot of excitement for it in the organization, but no one was sure how to recognize purchases and award points,” Fragoso recalled.

“The data science team said this is a great AI application, so we built a model that classified pizza images. The response was overwhelmingly positive. We got a lot of press and massive redemptions, so people were using it,” he added.

Domino’s trained its model on an NVIDIA DGX system equipped with eight V100 Tensor Core GPUs using more than 5,000 images, including pictures some customers sent in of plastic pizza dog toys. A survey sent in response to the pictures helped automate some of the job of labeling the unique dataset now considered a strategic corporate asset.

AI Knows When the Order Will Be Ready

More recently, Fragoso’s team hit another milestone, boosting accuracy from 75% to 95% for predictions of when an order will be ready. The so-called load-time model factors in variables such as how many managers and employees are working, the number and complexity of orders in the pipeline and current traffic conditions.

The improvement has been well received and could be the basis for future ways to advance operator efficiencies and customer experiences, thanks in part to NVIDIA GPUs.

“Domino’s does a very good job cataloging data in the stores, but until recently we lacked the hardware to build such a large model,” said Fragoso.

At first, it took three days to train the load-time model, too long to make its use practical.

“Once we had our DGX server, we could train an even more complicated model in less than an hour,” he said of the 72x speed-up. “That let us iterate very quickly, adding new data and improving the model, which is now in production in a version 3.0,” he added.
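The 72x figure follows directly from the two training times quoted. A quick check, treating "less than an hour" as exactly one hour (an assumption that makes 72x a lower bound):

```python
HOURS_PER_DAY = 24


def speedup(before_hours: float, after_hours: float) -> float:
    """Ratio of the old runtime to the new runtime."""
    return before_hours / after_hours


before_hours = 3 * HOURS_PER_DAY  # three days of training
after_hours = 1.0                 # assumption: "less than an hour" taken as one hour

training_speedup = speedup(before_hours, after_hours)
print(f"{training_speedup:.0f}x")  # 72x
```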

More AI in the Oven

The next big step for Fragoso’s team is tapping a bank of NVIDIA Turing T4 GPUs to accelerate AI inferencing for all Domino’s tasks that involve real-time predictions.

Some emerging use cases in the works are still considered secret ingredients at Domino’s. However, the data science team is exploring computer vision applications to make getting customers their pizza as quick and easy as possible.

“Model latency is extremely important, so we are building out an inference stack using T4s to host our AI models in production. We’ve already seen pretty extreme improvements with latency down from 50 milliseconds to sub-10ms,” he reported.
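Latency targets like these are usually tracked as percentiles rather than averages, since a few slow requests can hide behind a good mean. The sketch below shows one simple way to measure that in Python. It is not Domino's tooling: `dummy_predict` is a hypothetical stand-in for a deployed model, and the nearest-rank percentile is just one common definition.

```python
import math
import time


def percentile(samples, p):
    """Nearest-rank percentile of a list of numbers (p between 0 and 100)."""
    ordered = sorted(samples)
    rank = max(1, math.ceil(p / 100 * len(ordered)))
    return ordered[rank - 1]


def measure_latency_ms(predict, inputs):
    """Time each call to predict() and return the latencies in milliseconds."""
    latencies = []
    for x in inputs:
        start = time.perf_counter()
        predict(x)
        latencies.append((time.perf_counter() - start) * 1000.0)
    return latencies


def dummy_predict(x):
    # Hypothetical stand-in for a real model's inference call.
    return x * 2


latencies = measure_latency_ms(dummy_predict, range(1000))
print(f"p50={percentile(latencies, 50):.4f} ms  p95={percentile(latencies, 95):.4f} ms")
```

Swapping `dummy_predict` for a real inference call would show whether a service stays under a budget such as 10 ms at the tail, not just on average.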

Separately, Domino’s recently tapped BlazingSQL, open-source software to run data-science queries on GPUs. NVIDIA RAPIDS software eased the transition, supporting the APIs from a prior CPU-based tool while delivering better performance.

It’s delivering an average 10x speed-up across all use cases in the part of the AI process that involves building datasets.

“In the past some of the data-cleaning and feature-engineering operations might have taken 24 hours, but now we do them in less than an hour,” he said.

Try Out AI at NRF 2020

Domino’s is one of many forward-thinking companies using GPUs to bring AI to retail.

NVIDIA GPUs helped power Alibaba to $38 billion in revenue on Singles Day, the world’s largest shopping event. And the world’s largest retailer, Walmart, talked about its use of GPUs and NVIDIA RAPIDS at an event earlier this year.

Separately, IKEA uses AI software from NVIDIA partner Winnow to reduce food waste in its cafeterias.

You can learn more about best practices of using AI in retail at this week’s NRF 2020, the National Retail Federation’s annual event. NVIDIA and some of its 100+ retail partners will be on hand demonstrating our EGX edge computing platform, which scales AI to local environments where data is gathered — store aisles, checkout counters and warehouses.

The EGX platform’s real-time edge compute abilities can notify store associates to intervene during shrinkage, open new checkout counters when lines are getting long and deliver the best customer shopping experiences.

AWS Outposts Station a GPU Garrison in Your Datacenter

All the goodness of GPU acceleration on Amazon Web Services can now also run inside your own data center.

AWS Outposts powered by NVIDIA T4 Tensor Core GPUs are generally available starting today. They bring cloud-based Amazon EC2 G4 instances inside your data center to meet user requirements for security and latency in a wide variety of AI and graphics applications.

With this new offering, AI is no longer a research project.

Most companies still keep their data inside their own walls because they see it as their core intellectual property. But for deep learning to transition from research into production, enterprises need the flexibility and ease of development the cloud offers — right beside their data. That’s a big part of what AWS Outposts with T4 GPUs now enables.

With this new offering, enterprises can install a fully managed rack-scale appliance next to the large data lakes stored securely in their data centers.

AI Acceleration Across the Enterprise

To train neural networks, every layer of software needs to be optimized, from NVIDIA drivers to container runtimes and application frameworks. AWS services like SageMaker, Elastic MapReduce and many others designed on custom-built Amazon Machine Images require model development to start with training on large datasets. With the introduction of NVIDIA-powered AWS Outposts, those services can now be run securely in enterprise data centers.

The GPUs in Outposts accelerate deep learning as well as high performance computing and other GPU applications. They all can access software in NGC, NVIDIA’s hub for GPU-accelerated software optimization, which is stocked with applications, frameworks, libraries and SDKs that include pre-trained models.

For AI inference, the NVIDIA EGX edge-computing platform also runs on AWS Outposts and works with the AWS Elastic Kubernetes Service. Backed by the power of NVIDIA T4 GPUs, these services are capable of processing orders of magnitude more information than CPUs alone. They can quickly derive insights from vast amounts of data streamed in real time from sensors in an Internet of Things deployment, whether it’s in manufacturing, healthcare, financial services, retail or any other industry.

On top of EGX, the NVIDIA Metropolis application framework provides building blocks for vision AI, geared for use in smart cities, retail, logistics and industrial inspection, as well as other AI and IoT use cases, now easily delivered on AWS Outposts.

Alternatively, the NVIDIA Clara application framework is tuned to bring AI to healthcare providers whether it’s for medical imaging, federated learning or AI-assisted data labeling.

The T4 GPU’s Turing architecture uses TensorRT to accelerate the industry’s widest set of AI models. Its Tensor Cores support multi-precision computing that delivers up to 40x more inference performance than CPUs.

Remote Graphics, Locally Hosted

Users of high-end graphics have choices, too. Remote designers, artists and technical professionals who need to access large datasets and models can now get both cloud convenience and GPU performance.

Graphics professionals can benefit from the same NVIDIA Quadro technology that powers most of the world’s professional workstations not only on the public AWS cloud, but on their own internal cloud now with AWS Outposts packing T4 GPUs.

Whether they’re working locally or in the cloud, Quadro users can access the same set of hundreds of graphics-intensive, GPU-accelerated third-party applications.

The Quadro Virtual Workstation AMI, available in AWS Marketplace, includes the same Quadro driver found on physical workstations. It supports hundreds of Quadro-certified applications such as Dassault Systèmes SOLIDWORKS and CATIA; Siemens NX; Autodesk AutoCAD and Maya; ESRI ArcGIS Pro; and ANSYS Fluent, Mechanical and Discovery Live.

Learn more about AWS and NVIDIA offerings and check out our booth 1237 and session talks at AWS re:Invent.

SUPER Powers for All Gamers: Best In Class Performance, Ray Tracing, Latest Features

Most gamers don’t buy a GPU to accelerate a single game. They make investments. That’s why — with the deluge of new ray-traced games announced over the past few weeks — we’re doubling down on our gaming GPUs.

SUPER, our new line of faster Turing GPUs announced Tuesday — and maybe not our best kept secret — is perfect for them.

These new GPUs — the GeForce RTX 2080 SUPER, GeForce RTX 2070 SUPER, and GeForce RTX 2060 SUPER — deliver up to 25 percent faster performance than the original RTX 20 series.

They offer more cores and higher clock speeds, so gamers — who want the best performance they can afford — know they’ll be able to play the blockbusters today, and the ones on the horizon.

And with so many mega titles publicly embracing real-time ray tracing, why would anyone buy a GPU that doesn’t support it?

Game On, Literally

Dubbed the “graphics holy grail,” real-time ray tracing brings cinematic-quality lighting effects to interactive experiences for the first time.

The ecosystem now driving real-time ray tracing is immense – tens of millions of GPUs, industry standard APIs, leading game engines and an all-star roster of game franchises.

Turing, which was introduced last year, is a key part of that ecosystem. The world’s most advanced GPU architecture, it fuses next-generation shaders with real-time ray tracing and all-new AI capabilities.

Turing’s hybrid graphics capability represents the biggest generational leap ever in gaming GPUs, delivering up to 6x more performance than previous 10 Series Pascal GPUs.

And our killer line-up of SUPER GPUs — which represents a year of tweaking and tuning our Turing architecture — will deliver even more performance.

So, demanding PC gamers can be ready to take on the new generation of games that every gamer now knows are coming.

E3, Computex Kick Off a Deluge of New Ray-Traced Games

Last month’s Computex and Electronic Entertainment Expo marked a milestone for real-time ray tracing, as the developers of blockbuster after blockbuster announced that they would be using it to create stunning visuals in their titles.

Call of Duty: Modern Warfare, Control, Cyberpunk 2077, Doom Eternal, Sword and Fairy 7, Watch Dogs: Legion, and Wolfenstein: Youngblood joined the list of major titles that will be using ray tracing. 

Battlefield V, Metro Exodus, Quake II RTX, Shadow of the Tomb Raider, and Stay in the Light (early access) are already shipping with ray tracing support.

And more are coming.

Ray tracing is now supported in industry standard APIs, including Microsoft DirectX Raytracing and Vulkan.

The most popular game engines now support real-time ray tracing, including Unreal Engine, Unity, Frostbite, id Tech, Remedy, 4A and more.

Virtual Reality, More a Reality than Ever

More than just a ray tracing king, the RTX GPU series is also designed for virtual reality.

NVIDIA Adaptive Shading (NAS) technology is built into the Turing architecture. NAS supports Variable Rate Shading (VRS), including motion and content adaptive shading for the highest performance and image quality, as well as Foveated Rendering, which puts the detail where the gamer is looking.

These technologies support a booming ecosystem of headsets, developer tools — and, increasingly — games. Valve’s Index HMD and controller began shipping just days ago. That follows the launch of the highly-anticipated Oculus Rift S earlier this year, as well as the Vive Focus Plus in February.

A Trio of GPUs for the Latest AAA Games

Our refreshed lineup of Turing GPUs is ready, joining our existing entry-level GeForce RTX 2060, starting at $349, and our GeForce RTX 2080 Ti flagship, starting at $999. The new GPUs include:

  • GeForce RTX 2060 SUPER GPU – Starting at $399, Available July 9
    • Up to 22% faster (average 15%) than RTX 2060
    • 8GB GDDR6 – 2GB more than the RTX 2060
    • Faster than GTX 1080
    • 7+7 TOPs (FP32+INT32) and 57 Tensor TFLOPs
  • GeForce RTX 2070 SUPER GPU – Starting at $499, Available July 9
    • Up to 24% faster (average 16%) than RTX 2070, for the same price
    • Faster than GTX 1080 Ti
    • 9+9 TOPs (FP32+INT32) and 73 Tensor TFLOPs
  • GeForce RTX 2080 SUPER GPU – Starting at $699, Available July 23
    • More performance than RTX 2080, for the same price
    • Memory speed cranked up to 15.5Gbps
    • Faster than TITAN Xp
    • 11+11 TOPs (FP32+INT32) and 89 Tensor TFLOPs

NVIDIA GeForce RTX GPUs aren’t just the only gaming GPUs capable of real-time ray tracing; they’re the only ones to support other advanced gaming features, such as NVIDIA Adaptive Shading, Mesh Shading, Variable Rate Shading and NVIDIA Deep Learning Super Sampling, which uses AI to sharpen game visuals while increasing performance.

Future Proof

Gamers want to make a future-proof investment, and the future is clearly ray tracing. They want the best performance and features they can afford. SUPER offers all of that, today.

The post SUPER Powers for All Gamers: Best In Class Performance, Ray Tracing, Latest Features appeared first on The Official NVIDIA Blog.

What’s the Difference Between Hardware and Software Accelerated Ray Tracing?

You don’t need specialized hardware to do ray tracing, but you want it.

Software-based ray tracing, of course, is decades old. And it looks great: movie makers have been using ray-tracing for decades now.

But it’s now clear that specialized hardware, like the RT Cores built into NVIDIA’s latest Turing architecture, makes a huge difference if you’re doing ray tracing in real time. Games require real-time ray tracing.

Once considered the “holy grail” of graphics, real-time ray tracing brings the same techniques long used by moviemakers to gamers and creators.

Thanks to a raft of new AAA games developers have introduced this year — and the introduction last year of NVIDIA GeForce RTX GPUs — this once wild idea is mainstream.

Millions are now firing up PCs that benefit from the RT Cores and Tensor Cores built into RTX. And they’re enjoying ray-tracing enhanced experiences many thought would be years, or decades, away.

Real-time ray tracing, however, is possible without dedicated hardware. That’s because – while ray tracing has been around since the 1970s – the real trend is much newer: GPU-accelerated ray tracing with dedicated cores.

The use of GPUs to accelerate ray-tracing algorithms gained fresh momentum last year with the introduction of Microsoft’s DirectX Raytracing (DXR) API. And that’s great news for gamers and creators.

Ray Tracing Isn’t New

So what is ray tracing? Look around you. The objects you’re seeing are illuminated by beams of light. Now follow the path of those beams backwards from your eye to the objects that light interacts with. That’s ray tracing.
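To make the idea concrete, here is a minimal sketch of the core operation: following one ray from the eye into the scene and asking whether it hits an object. The sphere, the function name ray_sphere_hit and the sample coordinates are illustrative assumptions, not anything taken from a game engine or from NVIDIA’s software.

```python
import math

def ray_sphere_hit(origin, direction, center, radius):
    """Return the distance along a unit-length ray to the nearest hit, or None."""
    # Vector from the sphere center to the ray origin.
    oc = tuple(o - c for o, c in zip(origin, center))
    # Solve |origin + t*direction - center|^2 = radius^2 for t.
    # The quadratic's leading coefficient is 1 because direction is unit length.
    b = 2.0 * sum(d * v for d, v in zip(direction, oc))
    c = sum(v * v for v in oc) - radius * radius
    disc = b * b - 4.0 * c
    if disc < 0.0:
        return None  # the ray misses the sphere
    root = math.sqrt(disc)
    for t in ((-b - root) / 2.0, (-b + root) / 2.0):
        if t > 0.0:
            return t  # nearest intersection in front of the eye
    return None

eye = (0.0, 0.0, 0.0)
# A ray looking straight down -z toward a sphere five units away.
hit = ray_sphere_hit(eye, (0.0, 0.0, -1.0), center=(0.0, 0.0, -5.0), radius=1.0)
# A ray looking straight up, away from the sphere.
miss = ray_sphere_hit(eye, (0.0, 1.0, 0.0), center=(0.0, 0.0, -5.0), radius=1.0)
print(hit, miss)
```

A full renderer repeats this kind of test for every pixel, against every object a ray might reach, and then follows further rays for shadows and reflections. That is why the workload grows so quickly.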

It’s a technique first described by IBM’s Arthur Appel, in 1969, in “Some Techniques for Shading Machine Renderings of Solids.” Thanks to pioneers such as Turner Whitted, Lucasfilm’s Robert Cook, Thomas Porter and Loren Carpenter, Caltech’s Jim Kajiya, and a host of others, ray tracing is now the standard in the film and CG industry for creating lifelike lighting and images.

However, until last year, almost all ray tracing was done offline. It’s very compute intensive. Even today, the effects you see at movie theaters require sprawling, CPU-equipped server farms. Gamers want to play interactive, real-time games. They won’t wait minutes or hours per frame.

GPUs, by contrast, can move much faster, thanks to the fact they rely on larger numbers of computing cores to get complex tasks done more quickly. And, traditionally, they’ve used another rendering technique, known as “rasterization,” to display three-dimensional objects on a two-dimensional screen.

With rasterization, objects on the screen are created from a mesh of virtual triangles, or polygons, that create 3D models of objects. In this virtual mesh, the corners of each triangle — known as vertices — intersect with the vertices of other triangles of different sizes and shapes. It’s fast and the results have gotten very good, even if it’s still not always as good as what ray tracing can do.
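As a rough illustration of rasterization, the sketch below decides which pixels a single triangle covers by testing each pixel center against the triangle’s three edges. The function names, the tiny 8x8 “screen” and the triangle coordinates are assumptions made for this example. Real GPUs do this in fixed-function hardware, in parallel, for millions of triangles.

```python
def edge(a, b, p):
    """Signed area test: positive when p lies to the left of the edge a->b."""
    return (b[0] - a[0]) * (p[1] - a[1]) - (b[1] - a[1]) * (p[0] - a[0])

def rasterize(tri, width, height):
    """Return rows of '#'/'.' marking pixels whose centers fall inside the triangle."""
    v0, v1, v2 = tri
    rows = []
    for y in range(height):
        row = ""
        for x in range(width):
            p = (x + 0.5, y + 0.5)  # sample at the pixel center
            w0, w1, w2 = edge(v1, v2, p), edge(v2, v0, p), edge(v0, v1, p)
            # Inside when all three edge tests agree in sign (either winding order).
            inside = (w0 >= 0 and w1 >= 0 and w2 >= 0) or (w0 <= 0 and w1 <= 0 and w2 <= 0)
            row += "#" if inside else "."
        rows.append(row)
    return rows

# One triangle, given by its three vertices in screen coordinates.
image = rasterize(((1, 1), (7, 1), (1, 7)), width=8, height=8)
print("\n".join(image))
```

Each pixel needs only three multiply-and-subtract tests per triangle, which is a large part of why rasterization is so fast. What it does not do is follow light around the scene, so shadows and reflections have to be approximated with other techniques.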

GPUs Take on Ray Tracing

But what if you used these GPUs — and their parallel processing capabilities — to accelerate ray tracing? This is where GPU software-accelerated ray tracing comes in. NVIDIA OptiX, introduced in 2009, targeted design professionals with GPU-accelerated ray tracing. Over the next decade, OptiX rode the steady advance in speed delivered by successive generations of NVIDIA GPUs.

By 2015, NVIDIA was demonstrating at SIGGRAPH how ray tracing could turn a CAD model into a photorealistic image — indistinguishable from a photograph — in seconds, speeding up the work of architects, product designers and graphic artists.

That approach — GPU-accelerated software ray tracing — was endorsed by Microsoft early last year, with the introduction of DXR, which enables full support of NVIDIA’s RTX ray-tracing software through Microsoft’s DXR API.

Delivering high performance real time ray tracing required two innovations: dedicated ray tracing hardware, RT Cores; and Tensor Cores for high performance AI processing for advanced denoising, anti-aliasing, and super resolution.

RT Cores accelerate ray tracing by speeding up the process of finding out where a ray intersects with the 3D geometry of a scene. These specialized cores accelerate a tree-based ray tracing structure called a bounding volume hierarchy, or BVH, used to calculate where rays and the triangles that comprise a computer-generated image intersect.
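The sketch below shows, in software, the kind of work a bounding volume hierarchy saves. It is not NVIDIA’s implementation, and RT Cores do this in dedicated hardware. The two-leaf toy tree, the box coordinates and the triangle counts are all made up for illustration. The idea is that if a ray misses a box, every triangle inside that box can be skipped.

```python
def ray_hits_box(origin, direction, box_min, box_max):
    """Slab test: does the ray origin + t*direction (t >= 0) pass through the box?"""
    t_near, t_far = 0.0, float("inf")
    for o, d, lo, hi in zip(origin, direction, box_min, box_max):
        if d == 0.0:
            if o < lo or o > hi:
                return False  # parallel to this pair of planes and outside them
            continue
        t0, t1 = (lo - o) / d, (hi - o) / d
        if t0 > t1:
            t0, t1 = t1, t0
        t_near, t_far = max(t_near, t0), min(t_far, t1)
        if t_near > t_far:
            return False  # the per-axis intervals do not overlap
    return True

def count_triangle_tests(origin, direction, node):
    """Walk a toy BVH and count how many triangles still need an exact test."""
    if not ray_hits_box(origin, direction, node["min"], node["max"]):
        return 0  # the whole subtree is skipped
    if "triangles" in node:
        return node["triangles"]
    return sum(count_triangle_tests(origin, direction, child) for child in node["children"])

# Hypothetical scene: a root box with two leaf boxes of 1,000 triangles each.
bvh = {
    "min": (-10, -10, -10), "max": (10, 10, 10),
    "children": [
        {"min": (-10, -10, -10), "max": (-1, 10, 10), "triangles": 1000},
        {"min": (1, -10, -10), "max": (10, 10, 10), "triangles": 1000},
    ],
}
tests_needed = count_triangle_tests((5.0, 0.0, 20.0), (0.0, 0.0, -1.0), bvh)
print(tests_needed)
```

In this toy scene the ray passes through only one of the two leaf boxes, so half of the triangles never need to be tested. A real BVH is many levels deep, and the savings compound at each level.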

Tensor Cores, first unveiled in 2017 with NVIDIA’s Volta architecture to accelerate AI algorithms in enterprise and scientific computing, further accelerate graphically intense workloads. They do so through a special AI technique called NVIDIA DLSS, short for Deep Learning Super Sampling. RTX’s Tensor Cores make this possible.

Turing at Work

You can see how this works by comparing how quickly Turing and our previous generation Pascal architecture render a single frame of Metro Exodus.

On Pascal, a large stretch in the middle of the time it takes to render one frame of Metro Exodus is spent on ray tracing.

On Turing, several things happen within the frame. One is that the RT Cores kick in. The same ray tracing done on a Pascal GPU is done in 20 percent of the time on Turing.

Reinventing graphics, NVIDIA and our partners have been driving Turing to market through a stack of products that now ranges from the highest-performance product, at $999, all the way down to an entry-level gaming product, at $199. The RTX products, with RT Cores and Tensor Cores, start at $349.

Broad Support

There’s no question that real-time ray tracing is the next generation of gaming.

Some of the most important ecosystem partners have announced their support, and are now opening the floodgates for real-time ray tracing in games.

Inside of Microsoft’s DirectX 12 multimedia programming interfaces is a ray tracing component they call DirectX Raytracing (DXR). So every PC, if enabled by the GPU, is capable of accelerated ray tracing.

At the Game Developers Conference this past March, we turned on DXR-accelerated ray tracing on our Pascal and Turing GTX GPUs.

To be sure, earlier GPU architectures, such as Pascal, were designed to accelerate DirectX 12. So on this hardware, these calculations are performed on the programmable shader cores, a resource shared with many other graphics functions of the GPU.

So while your mileage will vary — since there are many ways ray tracing can be implemented — Turing will consistently perform better when playing games that make use of ray-tracing effects.

And that performance advantage on the most popular games is only going to grow.

EA’s AAA engine, Frostbite, supports ray tracing. Unity and Unreal, which together power 90 percent of the world’s games, now support Microsoft’s DirectX Raytracing in the engine.

Collectively, that opens up an easy path for thousands and thousands of game developers to implement ray tracing in their games.

All told, NVIDIA’s engaged somewhere in excess of 100 developers who are working on ray traced games.

To date, millions of gamers are playing on RTX hardware, with hardware-accelerated ray tracing, today.

And, thanks to ray tracing, that number is growing every week.

The post What’s the Difference Between Hardware and Software Accelerated Ray Tracing? appeared first on The Official NVIDIA Blog.

Intel Highlighted Why NVIDIA Tensor Core GPUs Are Great for Inference

It’s not every day that one of the world’s leading tech companies highlights the benefits of your products.

Intel did just that last week, comparing the inference performance of two of their most expensive CPUs to NVIDIA GPUs.

To achieve the performance of a single mainstream NVIDIA V100 GPU, Intel combined two power-hungry, highest-end CPUs with an estimated price of $50,000-$100,000, according to AnandTech. Intel’s performance comparison also highlighted the clear advantage of NVIDIA T4 GPUs, which are built for inference. When compared to a single highest-end CPU, they’re not only faster but also 7x more energy-efficient and an order of magnitude more cost-efficient.

Inference performance is crucial, as AI-powered services are growing exponentially. And Intel’s latest Cascade Lake CPUs include new instructions that improve inference, making them the best CPUs for inference. However, they’re hardly competitive with NVIDIA’s deep learning-optimized Tensor Core GPUs.

Inference (also known as prediction), in simple terms, is the “pattern recognition” that a neural network does after being trained. It’s where AI models provide intelligent capabilities in applications, like detecting fraud in financial transactions, conversing in natural language to search the internet, and predictive analytics to fix manufacturing breakdowns before they even happen.

While most AI inference today happens on CPUs, NVIDIA Tensor Core GPUs are rapidly being adopted across the full range of AI models. Tensor Cores, a breakthrough innovation, have transformed NVIDIA GPUs into highly efficient and versatile AI processors. They do multi-precision calculations at high rates to provide optimal precision for diverse AI models and have automatic support in popular AI frameworks.

It’s why a growing list of consumer internet companies, Microsoft, PayPal, Pinterest, Snap and Twitter among them, are adopting GPUs for inference.

Compelling Value of Tensor Core GPUs for Computer Vision

First introduced with the NVIDIA Volta architecture, Tensor Core GPUs are now in their second generation with NVIDIA Turing. Tensor Cores perform extremely efficient computations for AI for a full range of precision — from 16-bit floating point with 32-bit accumulate to 8-bit and even 4-bit integer operations with 32-bit accumulate.
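To see why the “32-bit accumulate” part matters, here is a small software illustration. It emulates half and single precision by round-tripping values through Python’s struct module; the function names and the test vector of 4,096 ones are assumptions chosen for this example. It models only the arithmetic, not how Tensor Cores are built or programmed.

```python
import struct

def to_fp16(x):
    """Round a Python float to the nearest IEEE 754 half-precision value."""
    return struct.unpack("<e", struct.pack("<e", x))[0]

def to_fp32(x):
    """Round a Python float to the nearest IEEE 754 single-precision value."""
    return struct.unpack("<f", struct.pack("<f", x))[0]

def dot(a, b, accumulate):
    """Multiply FP16 inputs and round the running sum with the given function."""
    total = 0.0
    for x, y in zip(a, b):
        # The product of two FP16 values is exact in a Python float (a double).
        product = to_fp16(x) * to_fp16(y)
        total = accumulate(total + product)
    return total

ones = [1.0] * 4096
fp16_sum = dot(ones, ones, accumulate=to_fp16)  # FP16 inputs, FP16 running sum
fp32_sum = dot(ones, ones, accumulate=to_fp32)  # FP16 inputs, FP32 running sum
print(fp16_sum, fp32_sum)
```

With a half-precision running sum, the total stops growing at 2,048, because neighboring FP16 values are 2 apart from that point on and adding 1 rounds back down. Keeping the running sum in 32 bits gives the correct 4,096 while the inputs stay in compact 16-bit form.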

They’re designed to accelerate both AI training and inference, and are easily enabled using automatic mixed precision features in the TensorFlow and PyTorch frameworks. Developers can achieve 3x training speedups by adding just two lines of code to their TensorFlow projects.

On computer vision, as the table below shows, when comparing the same number of processors, the NVIDIA T4 is faster, 7x more power-efficient and far more affordable. NVIDIA V100, designed for AI training, is 2x faster and 2x more energy efficient than CPUs on inference.

Table 1: Inference on ResNet-50.

Metric | Intel Xeon 9282 | NVIDIA V100 | NVIDIA T4
ResNet-50 Inference (images/sec) | 7,878 | 7,844 | 4,944
# of Processors | 2 | 1 | 1
Total Processor TDP | 800 W | 350 W | 70 W
Energy Efficiency (using TDP) | 10 img/sec/W | 22 img/sec/W | 71 img/sec/W
Performance per Processor (images/sec) | 3,939 | 7,844 | 4,944
GPU Performance Advantage | 1.0 (baseline) | 2.0x | 1.3x
GPU Energy-Efficiency Advantage | 1.0 (baseline) | 2.3x | 7.2x

Source: Intel Xeon performance; NVIDIA GPU performance
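The derived rows in Table 1 can be reproduced from the throughput, processor count and TDP figures alone. The short check below assumes the three columns correspond, in order, to the Intel Xeon 9282, the NVIDIA V100 and the NVIDIA T4, as the surrounding text implies. The dictionary layout and names are just for this example.

```python
# Throughput (images/sec), total processor TDP (watts) and processor count from Table 1.
systems = {
    "Intel Xeon 9282": {"images_per_sec": 7878, "tdp_watts": 800, "processors": 2},
    "NVIDIA V100": {"images_per_sec": 7844, "tdp_watts": 350, "processors": 1},
    "NVIDIA T4": {"images_per_sec": 4944, "tdp_watts": 70, "processors": 1},
}

def derived(row):
    """Compute per-processor throughput and throughput per watt of TDP."""
    return {
        "per_processor": row["images_per_sec"] / row["processors"],
        "per_watt": row["images_per_sec"] / row["tdp_watts"],
    }

baseline = derived(systems["Intel Xeon 9282"])
results = {}
for name, row in systems.items():
    d = derived(row)
    results[name] = {
        "per_watt": d["per_watt"],
        "speedup": d["per_processor"] / baseline["per_processor"],
        "efficiency": d["per_watt"] / baseline["per_watt"],
    }
    r = results[name]
    print(f"{name}: {r['per_watt']:.0f} img/sec/W, "
          f"{r['speedup']:.1f}x per processor, {r['efficiency']:.1f}x per watt")
```

Note that the performance advantage is calculated per processor, so the two-CPU result is halved before the comparison, while energy efficiency uses the total TDP of all processors in each configuration.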

Compelling Value of Tensor Core GPUs for Understanding Natural Language

AI has been moving at a frenetic pace. This rapid progress is fueled by teams of AI researchers and data scientists who continue to innovate and create highly accurate and exponentially more complex AI models.

Over four years ago, computer vision was among the first applications where AI from Microsoft was able to perform at superhuman accuracy using models like ResNet-50. Today’s advanced models perform even more complex tasks like understanding language and speech at superhuman accuracy. BERT, a highly complex AI model open-sourced by Google last year, can now understand prose and answer questions with superhuman accuracy.

A measure of the complexity of AI models is the number of parameters they have. Parameters in an AI model are the variables that store information the model has learned. While ResNet-50 has 25 million parameters, BERT has 340 million, a 13x increase.

On an advanced model like BERT, a single NVIDIA T4 GPU is 59x faster than a dual-socket CPU server and 240x more power-efficient.

Table 2: Inference on BERT. Workload: Fine-Tune Inference on BERT Large dataset.

Metric | Dual Intel Xeon Gold 6240 | NVIDIA T4
BERT Inference, Question-Answering (sentences/sec) | |
Processor TDP | 300 W (150 W x 2) | 70 W
Energy Efficiency (using TDP) | 0.007 sentences/sec/W | 1.7 sentences/sec/W
GPU Performance Advantage | 1.0 (baseline) | 59x
GPU Energy-Efficiency Advantage | 1.0 (baseline) | 240x

CPU server: Dual-socket Xeon Gold 6240@2.6GHz; 384GB system RAM; FP32 precision; with Intel’s TF Docker container v. 1.13.1. Note: Batch-size 4 results yielded the best CPU score.

GPU results: T4: Dual-socket Xeon Gold 6240@2.6GHz; 384GB system RAM; mixed precision; CUDA 10.1.105; NCCL 2.4.3, cuDNN, cuBLAS 10.1.105; NVIDIA driver 418.67; on TensorFlow using automatic mixed precision and XLA compiler; batch-size 4 and sequence length 128 used for all platforms tested. 

Compelling Value of Tensor Core GPUs for Recommender Systems

Another key usage of AI is in recommendation systems, which are used to provide relevant content recommendations on video sharing sites, news feeds on social sites and product recommendations on e-commerce sites.

Neural collaborative filtering, or NCF, is a recommender system that uses the prior interactions of users with items to provide recommendations. When running inference on the NCF model that is a part of the MLPerf 0.5 training benchmark, NVIDIA T4 brings 10x more performance and 20x higher energy efficiency than CPUs.
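For a feel of what such a model computes at inference time, here is a toy forward pass in the spirit of NCF: look up an embedding for the user and one for the item, concatenate them, run them through a small neural network and squash the result into a probability. Every number below, including the embeddings, weights and biases, is hand-picked for illustration. A real model learns them from interaction data and is far larger than this, and this is not the MLPerf benchmark model.

```python
import math

# Hypothetical 2-dimensional embeddings (a trained model would learn these).
user_embeddings = {"alice": [0.9, 0.1], "bob": [0.1, 0.8]}
item_embeddings = {"sci-fi movie": [0.8, 0.2], "cooking show": [0.1, 0.9]}

# Hypothetical weights: one hidden layer (4 inputs -> 2 units) and one output unit.
hidden_weights = [[1.0, 0.0, 1.0, 0.0], [0.0, 1.0, 0.0, 1.0]]
hidden_biases = [-1.0, -1.0]
output_weights = [4.0, 4.0]
output_bias = -1.0

def score(user, item):
    """Predict an interaction probability from concatenated user and item embeddings."""
    x = user_embeddings[user] + item_embeddings[item]  # list concatenation
    hidden = [
        max(0.0, sum(w * v for w, v in zip(row, x)) + bias)  # ReLU activation
        for row, bias in zip(hidden_weights, hidden_biases)
    ]
    logit = sum(w * h for w, h in zip(output_weights, hidden)) + output_bias
    return 1.0 / (1.0 + math.exp(-logit))  # sigmoid

for user in user_embeddings:
    ranked = sorted(item_embeddings, key=lambda item: score(user, item), reverse=True)
    print(user, "->", ranked[0])
```

Serving recommendations means running this kind of scoring for huge numbers of user and item pairs, which is why throughput in samples per second is the figure of merit.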

Table 3: Inference on NCF.

Metric | Single Intel Xeon Gold 6140 | NVIDIA T4
Recommender Inference Throughput (MovieLens) (thousands of samples/sec) | 2,860 | 27,800
Processor TDP | 150 W | 70 W
Energy Efficiency (using TDP) (thousands of samples/sec/W) | 19 | 397
GPU Performance Advantage | 1.0 (baseline) | 10x
GPU Energy-Efficiency Advantage | 1.0 (baseline) | 20x

CPU server: Single-socket Xeon Gold 6240@2.6GHz; 384GB system RAM; Used Intel Benchmark for NCF on TensorFlow with Intel’s TF Docker container version 1.13.1; FP32 precision. Note: Single-socket CPU config used for CPU tests as it yielded a better score than dual-socket.

GPU results: T4: Single-socket Xeon Gold 6140@2.3GHz; 384GB system RAM; CUDA 10.1.105; NCCL 2.4.3, cuDNN, cuBLAS 10.1.105; NVIDIA driver 418.40.04; on TensorFlow using automatic mixed precision and XLA compiler; batch-size: 2,048 for CPU, 1,048,576 for T4; precision: FP32 for CPU, mixed precision for T4. 

Unified Platform for AI Training and Inference

The use of AI models in applications is an iterative process designed to continuously improve their performance. Data scientist teams constantly update their models with new data and algorithms to improve accuracy. These models are then updated in applications by developers.

Updates can happen monthly, weekly and even on a daily basis. Having a single platform for both AI training and inference can dramatically simplify and accelerate this process of deploying and updating AI in applications.

NVIDIA’s data center GPU computing platform leads the industry in performance by a large margin for AI training, as demonstrated by the standard AI benchmark, MLPerf. And the NVIDIA platform provides compelling value for inference, as the data presented here attests. That value increases with the growing complexity and progress of modern AI.

To help fuel the rapid progress in AI, NVIDIA has deep engagements with the ecosystem and constantly optimizes software, including key frameworks like TensorFlow, PyTorch and MXNet as well as inference software like TensorRT and TensorRT Inference Server.

NVIDIA also regularly publishes pre-trained AI models for inference and model scripts for training models using your own data. All of this software is made freely available as containers, ready to download and run from NGC, NVIDIA’s hub for GPU-accelerated software.

Get the full story about our comprehensive AI platform.

The post Intel Highlighted Why NVIDIA Tensor Core GPUs Are Great for Inference appeared first on The Official NVIDIA Blog.

Turing Now Starts at $149: Introducing GeForce GTX 1650

Welcome to GeForce GTX PC gaming.

NVIDIA today launched worldwide the GeForce GTX 1650, giving you smooth performance on the latest games.

With the arrival of the GTX 1650, our acclaimed Turing family of GeForce GTX GPUs now starts at $149.

The GTX 1650 is the perfect choice for gamers looking for a quick, easy upgrade, or those building a compact, power-efficient system able to play modern games.

It offers 10x the performance of integrated graphics, up to twice the performance of our GTX 950 and up to 1.7x the performance of our previous generation GTX 1050.

The GeForce GTX 1650 uses the “TU117” Turing GPU, which has been carefully architected to balance performance, power and cost.

And of course, GTX 1650 includes all of the new Turing shader innovations that improve performance and efficiency, including support for concurrent floating point and integer operations, a unified cache architecture with larger L1 cache, and adaptive shading.

Power and Efficiency

As a result, the GeForce GTX 1650 excels in modern games with complex shaders. Yet thanks to Turing’s efficiency, the GPU consumes less than 75 watts of power.

It doesn’t even require an external power connector. So for those still running on older GPUs, this will be the easiest upgrade they will ever have to perform. Just plug and play.

Game Ready

With the GTX 1650, you’ll be ready for major game releases. NVIDIA works with developers to boost performance and fix bugs.

GeForce Experience automatically notifies you when new drivers are available. With one click, it lets you update to the latest drivers.

GeForce Experience unlocks a host of features. NVIDIA Freestyle allows you to customize your game’s experience in real time.

NVIDIA ShadowPlay and NVIDIA Highlights let you record and capture your greatest moments.

And NVIDIA Ansel lets you take professional-grade photographs of your games with powerful image capture and sharing tools.

Availability and Pricing

GeForce GTX 1650 boards are available starting today from the world’s leading add-in card providers, including ASUS, Colorful, EVGA, Gainward, Galaxy, Gigabyte, Innovision 3D, MSI, Palit, PNY and Zotac. Pricing and features will vary based on partner designs and region.

The post Turing Now Starts at $149: Introducing GeForce GTX 1650 appeared first on The Official NVIDIA Blog.