NVIDIA Introduces GeForce RTX 30 Series Laptops, RTX 3060 Graphics Cards, New RTX Games & Features in Special Event

Bringing more gaming capabilities to millions more gamers, NVIDIA on Tuesday announced that more than 70 new laptops will feature GeForce RTX 30 Series Laptop GPUs, and unveiled the NVIDIA GeForce RTX 3060 graphics card for desktops.

All are powered by the award-winning NVIDIA Ampere GPU architecture, the second generation of RTX with enhanced Ray Tracing Cores, Tensor Cores, and new streaming multiprocessors.

The announcements were among the highlights of a streamed presentation from Jeff Fisher, senior vice president of NVIDIA’s GeForce business.

Amid the unprecedented challenges of 2020, “millions of people tuned into gaming — to play, create and connect with one another,” Fisher said. “More than ever, gaming has become an integral part of our lives.” Among the stats he cited:

  • Steam saw its number of concurrent users more than double from 2018
  • Discord, a messaging and social networking service most popular with gamers, has seen monthly active users triple to 140 million from two years ago
  • In 2020 alone, more than 100 billion hours of gaming content have been watched on YouTube
  • Also in 2020, viewership of esports reached half a billion people

Meanwhile, NVIDIA has been delivering a series of major gaming advancements, Fisher explained.

RTX ‘the New Standard’

Two years ago, NVIDIA introduced a breakthrough in graphics: real-time ray tracing and AI-based DLSS (deep learning super sampling), together called RTX, he said.

NVIDIA quickly partnered with Microsoft and top developers and game engines to bring the visual realism of movies to fully interactive gaming, Fisher said.

In fact, 36 games are now powered by RTX. They include the #1 Battle Royale game, the #1 RPG, the #1 MMO and the #1 best-selling game of all time – Minecraft.

Fisher also announced more games that support RTX technology, including DLSS, which is coming to both Call of Duty: Warzone and Square Enix’s new IP, Outriders. And Five Nights at Freddy’s: Security Breach and F.I.S.T.: Forged in Shadow Torch will be adding ray tracing and DLSS.

For more details, read our January 2021 RTX Games article.

“The momentum is unstoppable,” Fisher said. “As consoles and the rest of the ecosystem are now onboard — ray tracing is the new standard.”

Last year, NVIDIA launched its second generation of RTX, the GeForce RTX 30 Series GPUs. Based on the NVIDIA Ampere architecture, it represents “our biggest generational leap ever,” Fisher said.

NVIDIA built NVIDIA Reflex to deliver the lowest system latency for competitive gamers – from mouse click to display. Since Reflex’s launch in September, a dozen games have added support.

Fisher announced that Overwatch and Rainbow Six Siege are also adopting NVIDIA Reflex. Now, 7 of the top 10 competitive shooters support Reflex.

And over the past four months, NVIDIA has launched four NVIDIA Ampere architecture-powered graphics cards, from the ultimate BFGPU — the GeForce RTX 3090 priced at $1,499 — to the GeForce RTX 3060 Ti at $399.

“Ampere has been our fastest selling architecture ever, selling almost twice as much as our prior generation,” Fisher said.

GeForce RTX 3060: An NVIDIA Ampere GPU for Every Gamer

With gaming now a key part of global culture, the new GeForce RTX 3060 brings the power of the NVIDIA Ampere architecture to every gamer, Fisher said.

“The RTX 3060 offers twice the raster performance of the GTX 1060 and 10x the ray-tracing performance,” Fisher said, noting that the GTX 1060 is the world’s most popular GPU. “The RTX 3060 powers the latest games with RTX On at 60 frames per second.”

The RTX 3060 has 13 shader teraflops, 25 RT teraflops for ray tracing, and 101 tensor teraflops to power DLSS, an NVIDIA technology introduced in 2019 that uses AI to accelerate games. And it boasts 12 gigabytes of GDDR6 memory.

“With most of the installed base underpowered for the latest games, we’re bringing RTX to every gamer with the GeForce RTX 3060,” Fisher said.

The GeForce RTX 3060 starts at just $329 and will be available worldwide in late February.

“Amazing Gaming Doesn’t End at the Desktop”

NVIDIA also announced a new generation of NVIDIA Ampere architecture-powered laptop GPUs.

Laptops, Fisher explained, are the fastest-growing gaming platform. There are now 50 million gaming laptops, which powered over 14 billion gaming hours last year.

“Amazing gaming doesn’t end at the desktop,” Fisher said.

These high-performance machines also meet the needs of 45 million creators and everyone working and studying from home, Fisher said.

The new generation of NVIDIA Ampere architecture-powered laptops, with second-generation RTX and third-generation Max-Q technologies, deliver twice the power efficiency of previous generations.

Efficiency Is Paramount in Laptops

That’s why NVIDIA introduced Max-Q four years ago, Fisher explained.

Max-Q is a system design approach that delivers high performance in thin and light gaming laptops.

“It has fundamentally changed how laptops are built. Every aspect — the CPU, GPU, software, PCB design, power delivery, thermals — is optimized for power and performance,” Fisher said.

NVIDIA’s third-gen Max-Q technologies use AI and new system optimizations to make high-performance gaming laptops faster and better than ever, he said.

Fisher introduced Dynamic Boost 2.0, which for the first time uses AI to shift power between the CPU, GPU and now, GPU memory.

“So your laptop is constantly optimizing for maximum performance,” Fisher said.

Fisher also introduced WhisperMode 2.0, which delivers a new level of acoustic control for gaming laptops.

Pick your desired acoustics, and WhisperMode 2.0’s AI-powered algorithms manage the CPU, GPU, system temperatures and fan speeds to “deliver great acoustics at the best possible performance,” Fisher explained.

Another new feature, Resizable BAR, uses the advanced capabilities of PCI Express to boost gaming performance.

Games use GPU memory for textures, shaders and geometry — constantly updating as the player moves through the world.

Today, only part of the GPU’s memory can be accessed at any one time by the CPU, requiring many memory updates, Fisher explained.

With Resizable BAR, the game can access the entire GPU memory, allowing for multiple updates at the same time, improving performance, Fisher said.

Resizable BAR will also be supported on GeForce RTX 30 Series graphics cards for desktops, starting with the GeForce RTX 3060. NVIDIA and GPU partners are readying VBIOS updates for existing GeForce RTX 30 series graphics cards starting in March.

Finally, NVIDIA DLSS offers a breakthrough for gaming laptops. It uses AI and RTX Tensor Cores to deliver up to 2x the performance in the same power envelope.

World’s Fastest Laptops for Gamers and Creators

Starting at $999, RTX 3060 laptops are “faster than anything on the market today,” Fisher said.

They’re 30 percent faster than the PlayStation 5 and deliver 90 frames per second on the latest games at 1080p ultra settings, Fisher said.

Starting at $1,299, GeForce RTX 3070 laptops are “a 1440p gaming beast.”

Boasting twice the pixels of 1080p, this new generation of laptops “provides the perfect mix of high-fidelity graphics and great performance.”

And starting at $1,999, GeForce RTX 3080 laptops will come with up to 16 gigabytes of GDDR6 memory.

They’re “the world’s fastest laptop for gamers and creators,” Fisher said, delivering hundreds of frames per second with RTX on.

As a result, laptop gamers will be able to play at 240 frames per second, across top titles like Overwatch, Rainbow Six Siege, Valorant and Fortnite, Fisher said.


Manufacturers worldwide, starting Jan. 26, will begin shipping over 70 different GeForce RTX gaming and creator laptops featuring GeForce RTX 3080 and GeForce RTX 3070 laptop GPUs, followed by GeForce RTX 3060 laptop GPUs on Feb. 2.

The GeForce RTX 3060 graphics card will be available in late February, starting at $329, as custom boards — including stock-clocked and factory-overclocked models — from top add-in card providers such as ASUS, Colorful, EVGA, Gainward, Galaxy, Gigabyte, Innovision 3D, MSI, Palit, PNY and Zotac.

Look for GeForce RTX 3060 GPUs at major retailers and etailers, as well as in gaming systems by major manufacturers and leading system builders worldwide.

“RTX is the new standard, and the momentum continues to grow,” Fisher said.

The post NVIDIA Introduces GeForce RTX 30 Series Laptops, RTX 3060 Graphics Cards, New RTX Games & Features in Special Event appeared first on The Official NVIDIA Blog.

All AIs on Quality: Startup’s NVIDIA Jetson-Enabled Inspections Boost Manufacturing

Once the founder of a wearable computing startup, Arye Barnehama understands the toils of manufacturing consumer devices. He moved to Shenzhen in 2014 to personally oversee production lines for his brain-wave-monitoring headband, Melon.

It was an experience that left an impression: manufacturing needed automation.

His next act is Elementary Robotics, which develops robotics for manufacturing. Elementary Robotics, based in Los Angeles, was incubated at Pasadena’s Idealab.

Founded in 2017, Elementary Robotics recently landed a $12.7 million Series A round of funding, including investment from customer Toyota.

Elementary Robotics is in deployment with customers who track thousands of parts. Its system is constantly retraining algorithms for improvements to companies’ inspections.

“Using the NVIDIA Jetson edge AI platform, we put quite a bit of engineering effort into tracking for 100 percent of inferences, at high frame rates,” said Barnehama, the company’s CEO.

Jetson for Inspections

Elementary Robotics has developed its own hardware and software for inspections used in manufacturing. It offers a Jetson-powered robot that can examine parts for defects. It aims to improve quality with better tracking of parts and problems.

Detecting the smallest of defects on a fast-moving production line requires processing high-resolution camera data with AI in real time. This is made possible by the embedded CUDA-enabled GPU and the CUDA-X AI software on Jetson. As the Jetson platform makes decisions from video streams, the results are all ingested into the company’s cloud database so that customers can observe and query the data.

The results, along with the live video, are also then published to the Elementary Robotics web application, which can be accessed from anywhere.
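Barnehama’s point about “tracking for 100 percent of inferences” comes down to recording a result for every frame, not just the defective ones. Below is a minimal sketch of such a loop with a stubbed-out model; the function and field names are hypothetical illustrations, not Elementary Robotics’ actual API:

```python
import time
from dataclasses import dataclass

@dataclass
class InspectionResult:
    frame_id: int
    defect_found: bool
    confidence: float
    timestamp: float

def run_model(frame):
    # Stand-in for the real GPU-accelerated defect detector.
    return {"defect_found": False, "confidence": 0.98}

def inspect_stream(frames):
    """Record one result per frame so every inference is traceable later."""
    log = []
    for i, frame in enumerate(frames):
        pred = run_model(frame)
        log.append(InspectionResult(i, pred["defect_found"],
                                    pred["confidence"], time.time()))
    return log

results = inspect_stream([b"frame0", b"frame1", b"frame2"])
print(len(results))  # 3: one record per frame
```

In production, each record would be pushed to the cloud database the article mentions; here it simply accumulates in a list.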

Elementary Robotics’ system also enables companies to inspect parts from suppliers before putting them into the production line, avoiding costly failures. It is used for inspections of assemblies on production lines as well as for quality control at post-production.

Its applications include inspections of electronic printed circuit boards and assemblies, automotive components, and gears for light industrial use. Elementary Robotics customers also use its platform in packaging and consumer goods such as bottles, caps and labels.

“Everyone’s demand for quality is always going up,” said Barnehama. “We run real-time inference on the edge with NVIDIA systems for inspections to help improve quality.”

The Jetson platform recently demonstrated leadership in MLPerf AI inference benchmarks in SoC-based edge devices for computer vision and conversational AI use cases.

Elementary Robotics is a member of NVIDIA Inception, a virtual accelerator program that helps startups in AI and data science get to market faster.

Traceability of Operations

The startup’s Jetson-enabled machine learning system can handle split-second anomaly detection to catch mistakes on the production lines. And when there’s a defective part returned, companies that rely on Elementary Robotics can try to understand how it happened. Use cases include electronics, automotive, medical, consumer packaged goods, logistics and other applications.

For manufacturers, such traceability of operations is important so that companies can go back and find and fix the causes of problems for improved reliability, said Barnehama.

“You want to be able to say, ‘OK, this defective item got returned, let me look up when it was inspected and make sure I have all the inspection data,’”  added Barnehama.

NVIDIA Jetson is used by enterprise customers, developers and DIY enthusiasts for creating AI applications, as well as students and educators for learning and teaching AI.


Pinterest Trains Visual Search Faster with Optimized Architecture on NVIDIA GPUs

Pinterest now has more than 440 million reasons to offer the best visual search experience. That’s how many monthly active users its popular image sharing and social media service now counts.

Visual search enables Pinterest users to search for images using text, screenshots or camera photos. It’s the core AI behind how people build their Boards of Pins — collections of images by themes —  around their interests and plans. It’s also how people on Pinterest can take action on the inspiration they discover, such as shopping and making purchases based on the products within scenes.

But tracking more than 240 billion images and 5 billion Boards is no small data trick.

This requires visual embeddings — mathematical representations of objects in a scene. Models generate these embeddings and compare them to evaluate how similar two images are — say, a sofa in a TV show’s living room compared with ones for sale at retailers.
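A common way to score how similar two embeddings are is cosine similarity: vectors pointing in nearby directions score close to 1. A toy illustration with made-up three-dimensional vectors (real embeddings have hundreds or thousands of dimensions, and this is not Pinterest’s actual similarity function):

```python
import math

def cosine_similarity(a, b):
    # Similar images map to embedding vectors with nearby directions.
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)

sofa_in_show = [0.9, 0.1, 0.3]   # hypothetical embedding of a sofa in a scene
sofa_for_sale = [0.8, 0.2, 0.35] # hypothetical embedding of a retail sofa
lamp = [0.1, 0.9, 0.0]           # hypothetical embedding of an unrelated item

# The two sofas should score closer to each other than to the lamp.
print(cosine_similarity(sofa_in_show, sofa_for_sale)
      > cosine_similarity(sofa_in_show, lamp))  # True
```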

Pinterest is improving its search results by pretraining its visual embeddings on a smaller dataset. The overall goal is to develop one unified visual embedding that performs well across its key business features.

Powered by NVIDIA V100 Tensor Core GPUs, this technique pre-trains Pinterest’s neural nets on a subset of about 1.3 billion images to yield improved relevancy across the wider set of hundreds of billions of images.

Improving results on the unified visual embedding in this fashion can benefit all applications on Pinterest, said Josh Beal, a machine learning researcher for Visual Search at the company.

“This model is fine-tuned on various multitask datasets. And the goal of this project was to scale the model to a large scale,” he said.

Benefitting Shop the Look 

With so many visuals, and new ones coming in all the time, Pinterest is continuously training its neural networks to identify them in relation to others.

A popular visual search feature, Pinterest’s Shop the Look enables people to shop for home and fashion items. By tapping into visual embeddings, Shop the Look can identify items in Pins and connect Pinners to those products online.

Product matches are key to its visual-driven commerce. And it isn’t an easy problem to solve at Pinterest scale.

Yet it matters. Another Pinterest visual feature is the ability to search for specific products within an image, or Pin. Improving the accuracy of recommendations with visual embeddings improves the magic factor in matches, boosting people’s experience of discovering relevant products and ideas.

An additional feature, Pinterest’s Lens camera search, aims to recommend visually relevant Pins based on the photos Pinners take with their cameras.

“Unified embedding for visual search benefits all these downstream applications,” said Beal.

Making Visual Search More Powerful

Several Pinterest teams have been working to improve visual search on the hundreds of billions of images within Pins. But given the massive scale of the effort, along with cost and engineering resource constraints, Pinterest wanted to optimize its existing architecture.

With some suggested ResNeXt-101 architecture optimizations and by simply upgrading to the latest releases of NVIDIA libraries, including cuDNN v8, automated mixed precision and NCCL, Pinterest was able to improve training performance of their models by over 60 percent.
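One reason automated mixed precision helps is simple arithmetic: half-precision values occupy two bytes instead of four, halving memory traffic for the parts of the network that run in FP16. A quick check using Python’s struct format codes (`'e'` is IEEE half precision, `'f'` single precision); the tensor shape is illustrative, not Pinterest’s actual model:

```python
import struct

# Bytes per element in each floating-point format.
fp16_bytes = struct.calcsize('e')  # 2
fp32_bytes = struct.calcsize('f')  # 4

# For a 256 x 2048 activation tensor (an illustrative shape), storing it in
# half precision moves half as many bytes through memory.
elements = 256 * 2048
print((elements * fp32_bytes) // (elements * fp16_bytes))  # 2
```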

NVIDIA’s GPU-accelerated libraries are constantly being updated to enable companies such as Pinterest to get more performance out of their existing hardware investment.

“It has improved the quality of the visual embedding, so that leads to more relevant results in visual search,” said Beal.


NVIDIA Boosts Academic AI Research for Business Innovation

Academic researchers are developing AI to solve challenging problems with everything from agricultural robotics to autonomous flying machines.

To help AI research like this make the leap from academia to commercial or government deployment, NVIDIA today announced the Applied Research Accelerator Program. The program supports applied research on NVIDIA platforms for GPU-accelerated application deployments.

The program will initially focus on robotics and autonomous machines. Worldwide spending on robotics systems and drones is forecast to reach $241 billion by 2023, an 88 percent increase from the $128.7 billion in spending expected for 2020, according to IDC. The program will also extend to other domains such as Data Science, NLP, Speech and Conversational AI in the months ahead.

The new program will support researchers and the organizations they work with in rolling out the next generation of applications developed on NVIDIA AI platforms, including the Jetson developer kits and SDKs like DeepStream and Isaac.

Researchers working with sponsoring organizations will also gain support from NVIDIA through technical guidance, hardware grants, funding, grant application support, AI training programs, and networking and marketing opportunities.

NVIDIA is now accepting applications to the program from researchers working to apply robotics and AI for automation in collaboration with enterprises seeking to deploy new technologies in the market.

Accelerating and Deploying AI Research

The NVIDIA Applied Research Accelerator Program’s first group of participants have already demonstrated AI capabilities meriting further development for agriculture, logistics and healthcare.

  • The University of Florida is developing AI applications for smart sprayers used in agriculture, and working with Chemical Containers Inc. to deploy AI on machines running NVIDIA Jetson to reduce the amount of plant protection products applied to tree crops.
  • The Institute for Factory Automation and Production Systems at Friedrich-Alexander-University Erlangen-Nuremberg, based in Germany, is working with materials handling company KION and the intralogistics research association IFL to design drones for warehouse autonomy using NVIDIA Jetson.
  • The Massachusetts Institute of Technology is developing AI applications for disinfecting surfaces with UV-C light using NVIDIA Jetson. It’s also working with Ava Robotics to deploy autonomous disinfection on robots to minimize human supervision and additional risk of exposure to COVID-19.

Applied Research Accelerator Program Benefits  

NVIDIA offers hardware grants along with funding in some cases for academic researchers who can demonstrate AI feasibility in practical applications. The program also provides letters of support for third-party grant applications submitted by researchers.

Members will also have access to technical guidance on using NVIDIA platforms, including Jetson, as well as Isaac and DeepStream.

Membership in the new program includes access to training courses via the Deep Learning Institute to help researchers master a wide range of AI technologies.

NVIDIA also offers researchers opportunities to present and network at the GPU Technology Conferences.

Interested researchers can apply today for the Applied Research Accelerator Program.


Supercomputing Chops: Tsinghua U. Takes Top Flops in SC20 Student Cluster Battle

Props to team top flops.

Virtual this year, the SC20 Student Cluster Competition was still all about teams vying for top supercomputing performance in the annual battle for HPC bragging rights.

That honor went to Beijing-based Tsinghua University, whose six-member undergraduate student team clocked in at 300 teraflops of processing performance.

A one teraflop computer can process one trillion floating-point operations per second.
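To make the unit concrete, sustained FLOPS can be estimated from any workload whose operation count is known; a dense n x n matrix multiply costs about 2n^3 floating-point operations. A sketch with illustrative numbers (not the competition’s actual measurement method):

```python
def teraflops(n, seconds):
    """Sustained teraflops for an n x n dense matrix multiply."""
    flops = 2 * n ** 3  # multiply-add count for C = A @ B
    return flops / seconds / 1e12

# e.g. an 8192 x 8192 multiply finishing in 0.1 s (illustrative numbers):
print(round(teraflops(8192, 0.1), 2))  # ~11 TFLOPS sustained
```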

The Virtual Student Cluster Competition was this year’s battleground for 19 teams. Competitors consisted of either high school or undergraduate students. Teams were made up of six members, an adviser and vendor partners.

Real-World Scenarios

In the 72-hour competition, student teams designed and built virtual clusters running NVIDIA GPUs in the Microsoft Azure cloud. Students completed a set of benchmarks and real-world scientific workloads.

Teams ran the GROMACS molecular dynamics application, tackling COVID-19 research. They also ran the CESM application to work on optimizing climate modeling code. The “reproducibility challenge” called on the teams to replicate results from an SC19 research paper.

Among other hurdles, teams were tossed a surprise exascale computing project mini-application, miniVite, to test their chops at compiling, running and optimizing.

A leaderboard tracked performance results of their submissions and the amount of money spent on Microsoft Azure as well as the burn rate of their spending by the hour on cloud resources.

Roller-Coaster Computing Challenges

The Georgia Institute of Technology competed for its second time. This year’s squad, dubbed Team Phoenix, had the good fortune of landing advisor Vijay Thakkar, a Gordon Bell Prize nominee this year.

Half of the team members were teaching assistants for introductory systems courses at Georgia Tech, said team member Sudhanshu Agarwal.

Georgia Tech used NVIDIA GPUs “wherever it was possible, as GPUs reduced computation time,” said Agarwal.

“We had a lot of fun this year and look forward to participating in SC21 and beyond,” he said.

Pan Yueyang, a junior in computer science at Peking University, joined his university’s supercomputing team before taking the leap to participate in the SC20 battle. But it was full of surprises, he noted.

He said that during the competition his team ran into a series of unforeseen hiccups. “Luckily it finished as required and the budget was slightly below the limitation,” he said.

Jacob Xiaochen Li, a junior in computer science at the University of California, San Diego, said his team was relying on NVIDIA GPUs for the MemXCT portion of the competition to reproduce the scaling experiment along with memory bandwidth utilization. “Our results match the original chart closely,” he said, noting there were some hurdles along the way.

Po Hao Chen, a sophomore in computer science at Boston University, said he committed to the competition because he’s always enjoyed algorithmic optimization. Like many, he had to juggle the competition with the demands of courses and exams.

“I stayed up for three whole days working on the cluster,” he said. “And I really learned a lot from this competition.”

Teams and Flops

Tsinghua University, China

ETH Zurich

Southern University of Science and Technology

Texas A&M University

Georgia Institute of Technology

Nanyang Technological University, Singapore

University of Warsaw

University of Illinois

Massachusetts Institute of Technology

Peking University

University of California, San Diego

North Carolina State University

Clemson University

Friedrich-Alexander University Erlangen-Nuremberg

Northeastern University

Shanghai Jiao Tong University

ShanghaiTech University

University of Texas

Wake Forest University: 9.172 TFLOPS



Take the A100 Train: HPC Centers Worldwide Jump Aboard NVIDIA AI Supercomputing Fast Track

Supercomputing centers worldwide are onboarding NVIDIA Ampere GPU architecture to serve the growing demands of heftier AI models for everything from drug discovery to energy research.

Joining this movement, Fujitsu has announced a new exascale system for Japan-based AI Bridging Cloud Infrastructure (ABCI), offering 600 petaflops of performance at the National Institute of Advanced Industrial Science and Technology.

The debut comes as model complexity has surged 30,000x in the past five years, with booming use of AI in research. With scientific applications, these hulking datasets can be held in memory, helping to minimize batch processing as well as to achieve higher throughput.

To fuel this next research ride, NVIDIA on Monday introduced the NVIDIA A100 80GB GPU with HBM2e technology. It doubles the A100 40GB GPU’s high-bandwidth memory to 80GB and delivers over 2 terabytes per second of memory bandwidth.

New NVIDIA A100 80GB GPUs let larger models and datasets run in-memory at faster memory bandwidth, enabling higher compute and faster results on workloads. Reducing internode communication can boost AI training performance by 1.4x with half the GPUs.

NVIDIA also introduced new NVIDIA Mellanox 400G InfiniBand architecture, doubling data throughput and offering new in-network computing engines for added acceleration.

Europe Takes Supercomputing Ride

Europe is leaping in. Italian inter-university consortium CINECA announced the Leonardo system, the world’s fastest AI supercomputer. It taps 14,000 NVIDIA Ampere architecture GPUs and NVIDIA Mellanox InfiniBand networking for 10 exaflops of AI. France’s Atos is set to build it.

Leonardo joins a growing pack of European systems on NVIDIA AI platforms supported by the EuroHPC initiative. Its German neighbor, the Jülich Supercomputing Center, recently launched the first NVIDIA GPU-powered AI exascale system to come online in Europe, delivering the region’s most powerful AI platform. The new Atos-designed Jülich system, dubbed JUWELS, is a 2.5 exaflops AI supercomputer that captured No. 7 on the latest TOP500 list.

Those also getting on board include Luxembourg’s MeluXina supercomputer; IT4Innovations National Supercomputing Center, the most powerful supercomputer in the Czech Republic; and the Vega supercomputer at the Institute of Information Science in Maribor, Slovenia.

Linköping University is planning to build Sweden’s fastest AI supercomputer, dubbed BerzeLiUs, based on the NVIDIA DGX SuperPOD infrastructure. It’s expected to provide 300 petaflops of AI performance for cutting-edge research.

NVIDIA is building Cambridge-1, an 80-node DGX SuperPOD with 400 petaflops of AI performance. It will be the fastest AI supercomputer in the U.K. It’s planned to be used in collaborative research within the country’s AI and healthcare community across academia, industry and startups.

Full Steam Ahead in North America

North America is taking the exascale AI supercomputing ride. NERSC (the U.S. National Energy Research Scientific Computing Center) is adopting NVIDIA AI for projects on Perlmutter, its system packing 6,200 A100 GPUs. NERSC now lays claim to 3.9 exaflops of AI performance.

NVIDIA Selene, a cluster based on the DGX SuperPOD, provides a public reference architecture for large-scale GPU clusters that can be deployed in weeks. The NVIDIA DGX SuperPOD system landed the top spot on the Green500 list of most efficient supercomputers, achieving a new world record in power efficiency of 26.2 gigaflops per watt, and it has set eight new performance milestones for MLPerf inference.

The University of Florida and NVIDIA are building the world’s fastest AI supercomputer in academia, aiming to deliver 700 petaflops of AI performance. The partnership puts UF among leading U.S. AI universities, advances academic research and helps address some of Florida’s most complex challenges.

At Argonne National Laboratory, researchers will use a cluster of 24 NVIDIA DGX A100 systems to scan billions of drugs in the search for treatments for COVID-19.

Los Alamos National Laboratory, Hewlett Packard Enterprise and NVIDIA are teaming up to deliver next-generation technologies to accelerate scientific computing.

All Aboard in APAC

Supercomputers in APAC will also be fueled by NVIDIA Ampere architecture. Korean search engine NAVER and Japanese messaging service LINE are using a DGX SuperPOD built with 140 DGX A100 systems with 700 petaflops of peak AI performance to scale out research and development of natural language processing models and conversational AI services.

The Japan Agency for Marine-Earth Science and Technology, or JAMSTEC, is upgrading its Earth Simulator with NVIDIA A100 GPUs and NVIDIA InfiniBand. The supercomputer is expected to have 624 petaflops of peak AI performance with a maximum theoretical performance of 19.5 petaflops of HPC performance, which today would rank high among the TOP500 supercomputers.

India’s Centre for Development of Advanced Computing, or C-DAC, is commissioning the country’s fastest and largest AI supercomputer, called PARAM Siddhi – AI. Built with 42 DGX A100 systems, it delivers 200 petaflops of AI performance and will address challenges in healthcare, education, energy, cybersecurity, space, automotive and agriculture.

Buckle up. Scientific research worldwide has never enjoyed such a ride.


What Is Computer Vision?

Computer vision has become so good that the days of managers screaming at umpires in baseball games in disputes over pitches may become a thing of the past.

That’s because developments in image classification along with parallel processing make it possible for computers to see a baseball whizzing by at 95 miles per hour. Pair that with image detection to help geolocate balls, and you’ve got a potent umpire tool that’s hard to argue with.

But computer vision doesn’t stop at baseball.

What Is Computer Vision?

Computer vision is a broad term for the work done with deep neural networks to develop human-like vision capabilities for applications, most often run on NVIDIA GPUs. It can include specific training of neural nets for segmentation, classification and detection using images and videos for data.

Major League Baseball is testing AI-assisted calls at the plate using computer vision. Judging balls and strikes on baseballs that can take just 0.4 seconds to reach the plate isn’t easy for human eyes. It could be better handled by a camera feed run through image networks on NVIDIA GPUs that can process split-second decisions at a rate of more than 60 frames per second.
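The arithmetic behind that frame-rate claim is straightforward: at 60 frames per second, a 0.4-second pitch yields only about two dozen frames for the model to analyze.

```python
# At 95 mph a pitch takes roughly 0.4 seconds to reach the plate.
pitch_time_s = 0.4
fps = 60
frames_available = int(pitch_time_s * fps)
print(frames_available)  # 24 frames in which to judge the pitch
```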

Hawk-Eye, based in London, is making this a reality in sports. Hawk-Eye’s NVIDIA GPU-powered ball tracking and SMART software is deployed in more than 20 sports, including baseball, basketball, tennis, soccer, cricket, hockey and NASCAR.

Yet computer vision can do much more than just make sports calls.

What Is Computer Vision Beyond Sports?

Computer vision can handle many more tasks. Developed with convolutional neural networks, computer vision can perform segmentation, classification and detection for a myriad of applications.

Computer vision has infinite applications. With industry changes from computer vision spanning sports, automotive, agriculture, retail, banking, construction, insurance and beyond, much is at stake.

3 Things to Know About Computer Vision

  • Segmentation: Image segmentation is about classifying pixels to belong to a certain category, such as a car, road or pedestrian. It’s widely used in self-driving vehicle applications, including the NVIDIA DRIVE software stack, to show roads, cars and people. Think of it as a sort of visualization technique that makes what computers do easier to understand for humans.
  • Classification: Image classification is used to determine what’s in an image. Neural networks can be trained to identify dogs or cats, for example, or many other things with a high degree of precision given sufficient data.
  • Detection: Image detection allows computers to localize where objects exist. It places rectangular bounding boxes that fully contain each object. A detector might be trained to see where cars or people are within an image, for instance.
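For detection in particular, the standard way to score how well a predicted bounding box matches a labeled one is intersection-over-union (IoU). A minimal, library-free Python sketch (the boxes and threshold are illustrative):

```python
def iou(box_a, box_b):
    """Intersection-over-union of two axis-aligned boxes.

    Boxes are (x1, y1, x2, y2) with x1 < x2 and y1 < y2.
    A predicted box is typically counted as a hit when its IoU
    with the ground-truth box exceeds a threshold (e.g. 0.5).
    """
    ax1, ay1, ax2, ay2 = box_a
    bx1, by1, bx2, by2 = box_b

    # Overlap rectangle (zero-sized if the boxes don't intersect).
    ix1, iy1 = max(ax1, bx1), max(ay1, by1)
    ix2, iy2 = min(ax2, bx2), min(ay2, by2)
    inter = max(0, ix2 - ix1) * max(0, iy2 - iy1)

    union = ((ax2 - ax1) * (ay2 - ay1)
             + (bx2 - bx1) * (by2 - by1)
             - inter)
    return inter / union if union else 0.0

# A predicted car box vs. the labeled one:
print(iou((0, 0, 10, 10), (5, 0, 15, 10)))  # 0.333...
```

An IoU of 1.0 means a perfect match; 0.0 means the boxes don’t overlap at all.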

What You Need to Know: Segmentation, Classification and Detection

  • Segmentation: good at delineating objects; used in self-driving vehicles
  • Classification: answers “Is it a cat or a dog?”; classifies with precision
  • Detection: answers “Where does it exist in space?”; recognizes things for safety


NVIDIA’s Deep Learning Institute offers courses such as Getting Started with Image Segmentation and Fundamentals of Deep Learning for Computer Vision.

The post What Is Computer Vision? appeared first on The Official NVIDIA Blog.

Bada Bing Bada Boom: Microsoft Turns to Turing-NLG, NVIDIA GPUs to Instantly Suggest Full-Phrase Queries

Hate hunting and pecking away at your keyboard every time you have a quick question? You’ll love this.

Microsoft’s Bing search engine has turned to Turing-NLG and NVIDIA GPUs to suggest full sentences for you as you type.

Turing-NLG is a cutting-edge, large-scale unsupervised language model that has achieved strong performance on language modeling benchmarks.

It’s just the latest example of an AI technique called unsupervised learning, which makes sense of vast quantities of data by extracting features and patterns without the need for humans to provide any pre-labeled data.

Microsoft calls this Next Phrase Prediction, and it can feel like magic, making full-phrase suggestions in real time for long search queries.

Turing-NLG is among several innovations — from model compression to state caching and hardware acceleration — that Bing has harnessed with Next Phrase Prediction.

Over the summer, Microsoft worked with engineers at NVIDIA to optimize Turing-NLG for its needs, accelerating the model on NVIDIA GPUs to power the feature for users worldwide.

A key part of this optimization was running the massive AI model fast enough to power a real-time search experience. With a combination of hardware and model optimizations, Microsoft and NVIDIA achieved an average latency below 10 milliseconds.

By contrast, it takes more than 100 milliseconds to blink your eye.

Learn more about the next wave of AI innovations at Bing.

Before the introduction of Next Phrase Prediction, the approach for handling query suggestions for longer queries was limited to completing the current word being typed by the user.

Now type in “The best way to replace,” and you’ll immediately see three suggestions for completing the phrase: wood, plastic and metal. Type in “how can I replace a battery for,” and you’ll see “iphone, samsung, ipad and kindle” all suggested.
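Bing’s actual feature is powered by the Turing-NLG language model. As a rough illustration of the idea (extending a query by whole phrases rather than just the current word), here is a toy word-bigram completer over a tiny, hypothetical query log:

```python
from collections import defaultdict

# Toy stand-in for full-phrase suggestion. Bing's Next Phrase
# Prediction uses Turing-NLG; this sketch is only a simple
# word-bigram table built from a tiny, made-up query log.
QUERY_LOG = [
    "how can i replace a battery for iphone",
    "how can i replace a battery for samsung",
    "the best way to replace wood",
    "the best way to replace plastic",
]

bigrams = defaultdict(lambda: defaultdict(int))
for query in QUERY_LOG:
    words = query.split()
    for prev, nxt in zip(words, words[1:]):
        bigrams[prev][nxt] += 1

def suggest_phrase(prefix, max_words=3):
    """Greedily extend the prefix one most-likely word at a time."""
    words = prefix.lower().split()
    for _ in range(max_words):
        followers = bigrams.get(words[-1])
        if not followers:
            break
        words.append(max(followers, key=followers.get))
    return " ".join(words)

print(suggest_phrase("how can i"))
# "how can i replace a battery" -- a multi-word extension,
# not just a completion of the current word
```

A real language model generates far richer continuations, but the contrast is the same: the suggestion extends the query by several words at once.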

With Next Phrase Prediction, Bing can now present users with full-phrase suggestions.

The more characters you type, the closer Bing gets to what you probably want to ask.

And because these suggestions are generated on the fly rather than retrieved, they’re not limited to previously seen queries or to completing just the current word being typed.

So, for some queries, Bing won’t just save you a few keystrokes — but multiple words.

As a result of this work, the coverage of autosuggestion completions increases considerably, Microsoft reports, improving the overall user experience “significantly.”


NVIDIA AI on Microsoft Azure Machine Learning to Power Grammar Suggestions in Microsoft Editor for Word

It’s been said that good writing comes from editing. Fortunately for discerning readers everywhere, Microsoft is putting an AI-powered grammar editor at the fingertips of millions of people.

Like any good editor, it’s quick and knowledgeable. That’s because Microsoft Editor’s grammar refinements in Microsoft Word for the web can now tap into NVIDIA Triton Inference Server, ONNX Runtime and Microsoft Azure Machine Learning, which is part of Azure AI, to deliver this smart experience.

Speaking at the digital GPU Technology Conference, NVIDIA CEO Jensen Huang announced the news during the keynote presentation on October 5.

Everyday AI in Office

Microsoft is on a mission to wow users of Office productivity apps with the magic of AI. New, time-saving experiences will include real-time grammar suggestions, question-answering within documents — think Bing search for documents beyond “exact match” — and predictive text to help complete sentences.

Such productivity-boosting experiences are only possible with deep learning and neural networks. For example, unlike services built on traditional rules-based logic, when it comes to correcting grammar, Editor in Word for the web is able to understand the context of a sentence and suggest the appropriate word choices.


And these deep learning models, which can involve hundreds of millions of parameters, must be scalable and provide real-time inference for an optimal user experience. Microsoft Editor’s AI model for grammar checking in Word on the web alone is expected to handle more than 500 billion queries a year.

Deployment at this scale could blow up deep learning budgets. Thankfully, NVIDIA Triton’s dynamic batching and concurrent model execution features, accessible through Azure Machine Learning, slashed the cost by about 70 percent and achieved a throughput of 450 queries per second on a single NVIDIA V100 Tensor Core GPU, with less than 200-millisecond response time. Azure Machine Learning provided the required scale and capabilities to manage the model lifecycle such as versioning and monitoring.
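Triton’s configuration details aside, the core idea of dynamic batching can be sketched in a few lines: requests arriving within a short window are grouped, so the GPU runs fewer, larger inference calls. This simulation is purely conceptual, with made-up arrival times, and does not use Triton’s actual API:

```python
# Conceptual sketch of dynamic batching (the idea behind one of the
# Triton features mentioned above -- not Triton's actual API).
# Requests that arrive within a short window are grouped into one
# batch, so the GPU runs fewer, larger inference calls.

def batch_requests(arrival_times_ms, window_ms=5, max_batch=32):
    """Group request timestamps into batches.

    A batch is closed when it is full or when the next request
    falls outside the batching window of the batch's first request.
    """
    batches, current = [], []
    for t in sorted(arrival_times_ms):
        if current and (t - current[0] > window_ms or len(current) == max_batch):
            batches.append(current)
            current = []
        current.append(t)
    if current:
        batches.append(current)
    return batches

# Eight requests landing within ~13 ms collapse into three GPU
# calls instead of eight:
arrivals = [0, 1, 2, 6, 7, 8, 12, 13]
print([len(b) for b in batch_requests(arrivals)])  # [3, 3, 2]
```

The trade-off is a few milliseconds of added queueing delay in exchange for much higher GPU utilization, which is where the cost savings come from.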

High Performance Inference with Triton on Azure Machine Learning

Machine learning models have expanded in size, and GPUs have become necessary during model training and deployment. For AI deployment in production, organizations are looking for scalable inference serving solutions, support for multiple framework backends, optimal GPU and CPU utilization and machine learning lifecycle management.

The NVIDIA Triton and ONNX Runtime stack in Azure Machine Learning delivers scalable, high-performance inferencing. Azure Machine Learning customers can take advantage of Triton’s support for multiple frameworks; real-time, batch and streaming inferencing; dynamic batching; and concurrent execution.

Writing with AI in Word

Author and poet Robert Graves was quoted as saying, “There is no good writing, only good rewriting.”  In other words, write, and then edit and improve.

Editor in Word for the web lets you do both simultaneously. And while Editor is the first feature in Word to gain the speed and breadth of advances enabled by Triton and ONNX Runtime, it is likely just the start of more to come.


It’s not too late to get access to hundreds of live and on-demand talks at GTC. Register now through Oct. 9 using promo code CMB4KN to get 20 percent off.



American Express Adopts NVIDIA AI to Help Prevent Fraud and Foil Cybercrime

Financial fraud is surging along with waves of cybersecurity breaches.

Cybercrime costs the global economy $600 billion annually, or 0.8 percent of worldwide GDP, according to a 2018 estimate from McAfee. And consulting firm Accenture forecasts cyberattacks could cost companies $5.2 trillion worldwide by 2024.

Credit and bank cards are a major target. American Express, which handles more than eight billion transactions a year, is using deep learning on the NVIDIA GPU computing platform to combat fraud.

American Express has now deployed deep-learning-based models optimized with NVIDIA TensorRT and running on NVIDIA Triton Inference Server to detect fraud, NVIDIA CEO Jensen Huang announced at the GPU Technology Conference on Monday.

NVIDIA TensorRT is a high performance deep learning inference optimizer and runtime that minimizes latency and maximizes throughput.

NVIDIA Triton Inference Server software simplifies model deployment at scale and can be used as a microservice that enables applications to run AI models in data center production.

“Our fraud algorithms monitor in real time every American Express transaction around the world for more than $1.2 trillion spent annually, and we generate fraud decisions in mere milliseconds,” said Manish Gupta, vice president of Machine Learning and Data Science Research at American Express.

Online Shopping Spree

Online shopping has spiked since the pandemic. In the U.S. alone, online commerce rose 49 percent in April compared with early March, according to Adobe’s Digital Economy Index.

That means less cash, more digital dollars. And more digital dollars demand bank and credit card usage, which has already seen increased fraud.

“Card fraud netted criminals $3.88 billion more in 2018 than in 2017,” said David Robertson, publisher of The Nilson Report, which tracks information about the global payments industry.

American Express, with more than 115 million active credit cards, has maintained the lowest fraud rate in the industry for 13 years in a row, according to The Nilson Report.

“Having our card members and merchants’ back is our top priority, so keeping our fraud rates low is key to achieving that goal,” said Gupta.

Anomaly Detection with GPU Computing

With online transactions rising, fraudsters are waging more complex attacks as financial firms step up security measures.

One area that is easier to monitor is anomalous spending patterns. These types of transactions on one card — known as “out of pattern” — could show a coffee was purchased in San Francisco and then five minutes later a tank of gas was purchased in Los Angeles.
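American Express’s production system uses deep learning, but the simplest version of an “out of pattern” check is an impossible-travel test: compute the speed implied by two card-present purchases. A minimal sketch, with illustrative coordinates and an assumed speed threshold:

```python
from math import radians, sin, cos, asin, sqrt

def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance between two points, in kilometers."""
    lat1, lon1, lat2, lon2 = map(radians, (lat1, lon1, lat2, lon2))
    a = (sin((lat2 - lat1) / 2) ** 2
         + cos(lat1) * cos(lat2) * sin((lon2 - lon1) / 2) ** 2)
    return 2 * 6371 * asin(sqrt(a))

def out_of_pattern(tx1, tx2, max_speed_kmh=900):
    """Flag two card-present transactions the holder couldn't
    physically make; 900 km/h (~airliner speed) is an assumed cap."""
    (lat1, lon1, t1), (lat2, lon2, t2) = tx1, tx2
    hours = abs(t2 - t1) / 3600
    distance = haversine_km(lat1, lon1, lat2, lon2)
    if hours == 0:
        return distance > 0
    return distance / hours > max_speed_kmh

# Coffee in San Francisco, then gas in Los Angeles 5 minutes later:
sf = (37.7749, -122.4194, 0)    # (lat, lon, unix seconds)
la = (34.0522, -118.2437, 300)
print(out_of_pattern(sf, la))   # True -- ~560 km in 5 minutes
```

Rules like this catch only the crudest cases; the neural networks described here learn far subtler spending patterns from the transaction sequence itself.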

Such anomalies are red-flagged using recurrent neural networks, or RNNs, which are particularly good at guessing what comes next in a sequence of data.

American Express has deployed long short-term memory networks, or LSTMs, a variant of RNNs that can provide improved performance.

And that can mean closing gaps in latency and accuracy, two areas where American Express has made leaps. The teams there used NVIDIA DGX systems to accelerate the building and training of these LSTM models on mountains of structured and unstructured data using TensorFlow.

50x Gains Over CPUs

The recently released TensorRT-optimized LSTM network aids the system that analyzes transaction data on tens of millions of daily transactions in real time. This LSTM is now deployed using the NVIDIA Triton Inference Server on NVIDIA T4 GPUs for split-second inference.

Results are in: American Express was able to implement this enhanced, real-time fraud detection system for improved accuracy. It operates within a tight two-millisecond latency requirement, and this new system delivers a 50x improvement over a CPU-based configuration, which couldn’t meet the goal.

The financial services giant’s GPU-accelerated LSTM deep neural network combined with its long-standing gradient boosting machine (GBM) model — used for regression and classification — has improved fraud detection accuracy by up to six percent in specific segments.

Accuracy matters. A false positive that denies a customer’s legitimate transaction is an unpleasant experience for card members and merchants alike, says American Express.

“Especially in this environment, our customers need us now more than ever, so we’re supporting them with best-in-class fraud protection and servicing,” Gupta said.

