BERT Does Europe: AI Language Model Learns German, Swedish

BERT is at work in Europe, tackling natural-language processing jobs in multiple industries and languages with help from NVIDIA’s products and partners.

The AI model formally known as Bidirectional Encoder Representations from Transformers debuted just last year as a state-of-the-art approach to machine learning for text. Though new, BERT is already finding use in avionics, finance, semiconductor and telecom companies on the continent, said developers optimizing it for German and Swedish.

“There are so many use cases for BERT because text is one of the most common data types companies have,” said Anders Arpteg, head of research for Peltarion, a Stockholm-based developer that aims to make the latest AI techniques such as BERT inexpensive and easy for companies to adopt.

Natural-language processing will outpace today’s AI work in computer vision because “text has way more apps than images — we started our company on that hypothesis,” said Milos Rusic, chief executive of deepset in Berlin. He called BERT “a revolution, a milestone we bet on.”

Deepset is working with PricewaterhouseCoopers to create a system that uses BERT to help strategists at a chip maker query piles of annual reports and market data for key insights. In another project, a manufacturing company is using NLP to search technical documents to speed maintenance of its products and predict needed repairs.

Peltarion, a member of NVIDIA’s Inception program that nurtures startups with access to its technology and ecosystem, packed support for BERT into its tools in November. It is already using NLP to help a large telecom company automate parts of its process for responding to product and service requests. And it’s using the technology to let a large market research company more easily query its database of surveys.

Work in Localization

Peltarion is collaborating with three other organizations on a three-year, government-backed project to optimize BERT for Swedish. Interestingly, a new model from Facebook called XLM-R suggests training on multiple languages at once could be more effective than optimizing for just one.

“In our initial results, XLM-R, which Facebook trained on 100 languages at once, outperformed a vanilla version of BERT trained for Swedish by a significant amount,” said Arpteg, whose team is preparing a paper on their analysis.

Nevertheless, the group hopes to have before summer a first version of a Swedish BERT model that performs really well, said Arpteg, who headed up an AI research group at Spotify before joining Peltarion three years ago.

An analysis by deepset of its German version of BERT.

In June, deepset released as open source a version of BERT optimized for German. Although its performance is only a couple of percentage points ahead of the original model, two winners in an annual NLP competition in Germany used the deepset model.

Right Tool for the Job

BERT also benefits from optimizations for specific tasks such as text classification, question answering and sentiment analysis, said Arpteg. Peltarion researchers plan to publish in 2020 results of an analysis of gains from tuning BERT for areas with their own vocabularies such as medicine and law.

The question-answering task has become so strategic for deepset it created Haystack, a version of its FARM transfer-learning framework to handle the job.

In hardware, the latest NVIDIA GPUs are among the favorite tools both companies use to tame big NLP models. That’s not surprising given NVIDIA recently broke records lowering BERT training time.

“The vanilla BERT has 100 million parameters and XLM-R has 270 million,” said Arpteg, whose team recently purchased systems using NVIDIA Quadro and TITAN GPUs with up to 48GB of memory. It also has access to NVIDIA DGX-1 servers because “for training language models from scratch, we need these super-fast systems,” he said.

More memory is better, said Rusic, whose German BERT models weigh in at 400MB. Deepset taps into NVIDIA V100 Tensor Core GPUs on cloud services and uses another NVIDIA GPU locally.
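
The figures quoted above line up with a simple back-of-the-envelope calculation. The sketch below assumes each parameter is stored as a 32-bit float (4 bytes), which the article does not state; the function name is made up for illustration.

```python
# Rough estimate of a model's weight footprint from its parameter count.
# Assumption (not stated in the article): weights are stored as 32-bit floats.
BYTES_PER_FP32 = 4


def weights_size_mb(num_parameters: int, bytes_per_parameter: int = BYTES_PER_FP32) -> float:
    """Return the approximate size of the weights in megabytes (1 MB = 10**6 bytes)."""
    return num_parameters * bytes_per_parameter / 1_000_000


# Parameter counts as quoted by Arpteg above.
models = {"vanilla BERT": 100_000_000, "XLM-R": 270_000_000}

for name, params in models.items():
    print(f"{name}: ~{weights_size_mb(params):.0f} MB of weights")
```

Under that assumption, 100 million parameters come to about 400MB, in line with the size Rusic cites. Training typically needs several times more GPU memory than the weights alone, since gradients, optimizer state and activations also have to fit.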

The post BERT Does Europe: AI Language Model Learns German, Swedish appeared first on The Official NVIDIA Blog.

AWS Outposts Station a GPU Garrison in Your Datacenter

All the goodness of GPU acceleration on Amazon Web Services can now also run inside your own data center.

AWS Outposts powered by NVIDIA T4 Tensor Core GPUs are generally available starting today. They bring cloud-based Amazon EC2 G4 instances inside your data center to meet user requirements for security and latency in a wide variety of AI and graphics applications.

With this new offering, AI is no longer a research project.

Most companies still keep their data inside their own walls because they see it as their core intellectual property. But for deep learning to transition from research into production, enterprises need the flexibility and ease of development the cloud offers — right beside their data. That’s a big part of what AWS Outposts with T4 GPUs now enables.

With this new offering, enterprises can install a fully managed rack-scale appliance next to the large data lakes stored securely in their data centers.

AI Acceleration Across the Enterprise

To train neural networks, every layer of software needs to be optimized, from NVIDIA drivers to container runtimes and application frameworks. AWS services like SageMaker, Elastic MapReduce and many others designed on custom-built Amazon Machine Images require model development to start with training on large datasets. With the introduction of NVIDIA-powered AWS Outposts, those services can now be run securely in enterprise data centers.

The GPUs in Outposts accelerate deep learning as well as high performance computing and other GPU applications. They all can access software in NGC, NVIDIA’s hub for GPU-accelerated software optimization, which is stocked with applications, frameworks, libraries and SDKs that include pre-trained models.

For AI inference, the NVIDIA EGX edge-computing platform also runs on AWS Outposts and works with the AWS Elastic Kubernetes Service. Backed by the power of NVIDIA T4 GPUs, these services are capable of processing orders of magnitude more information than CPUs alone. They can quickly derive insights from vast amounts of data streamed in real time from sensors in an Internet of Things deployment, whether it’s in manufacturing, healthcare, financial services, retail or any other industry.

On top of EGX, the NVIDIA Metropolis application framework provides building blocks for vision AI, geared for use in smart cities, retail, logistics and industrial inspection, as well as other AI and IoT use cases, now easily delivered on AWS Outposts.

Alternatively, the NVIDIA Clara application framework is tuned to bring AI to healthcare providers whether it’s for medical imaging, federated learning or AI-assisted data labeling.

The T4 GPU’s Turing architecture uses TensorRT to accelerate the industry’s widest set of AI models. Its Tensor Cores support multi-precision computing that delivers up to 40x more inference performance than CPUs.

Remote Graphics, Locally Hosted

Users of high-end graphics have choices, too. Remote designers, artists and technical professionals who need to access large datasets and models can now get both cloud convenience and GPU performance.

Graphics professionals can benefit from the same NVIDIA Quadro technology that powers most of the world’s professional workstations not only on the public AWS cloud, but on their own internal cloud now with AWS Outposts packing T4 GPUs.

Whether they’re working locally or in the cloud, Quadro users can access the same set of hundreds of graphics-intensive, GPU-accelerated third-party applications.

The Quadro Virtual Workstation AMI, available in AWS Marketplace, includes the same Quadro driver found on physical workstations. It supports hundreds of Quadro-certified applications such as Dassault Systèmes SOLIDWORKS and CATIA; Siemens NX; Autodesk AutoCAD and Maya; ESRI ArcGIS Pro; and ANSYS Fluent, Mechanical and Discovery Live.

Learn more about AWS and NVIDIA offerings and check out our booth 1237 and session talks at AWS re:Invent.


NVIDIA Clara Federated Learning to Deliver AI to Hospitals While Protecting Patient Data

With over 100 exhibitors at the annual Radiological Society of North America conference using NVIDIA technology to bring AI to radiology, 2019 looks to be a tipping point for AI in healthcare.

Despite AI’s great potential, a key challenge remains: gaining access to the huge volumes of data required to train AI models while protecting patient privacy. Partnering with the industry, we’ve created a solution.

Today at RSNA, we’re introducing NVIDIA Clara Federated Learning, which takes advantage of a distributed, collaborative learning technique that keeps patient data where it belongs — inside the walls of a healthcare provider.

Clara Federated Learning (Clara FL) runs on our recently announced NVIDIA EGX intelligent edge computing platform.

Federated Learning — AI with Privacy

Clara FL is a reference application for distributed, collaborative AI model training that preserves patient privacy. Running on NVIDIA NGC-Ready for Edge servers from global system manufacturers, these distributed client systems can perform deep learning training locally and collaborate to train a more accurate global model.

Here’s how it works: The Clara FL application is packaged into a Helm chart to simplify deployment on Kubernetes infrastructure. The NVIDIA EGX platform securely provisions the federated server and the collaborating clients, delivering everything required to begin a federated learning project, including application containers and the initial AI model.

NVIDIA Clara Federated Learning uses distributed training across multiple hospitals to develop robust AI models without sharing patient data.

Participating hospitals label their own patient data using the NVIDIA Clara AI-Assisted Annotation SDK integrated into medical viewers like 3D Slicer, MITK, Fovia and Philips IntelliSpace Discovery. Using pre-trained models and transfer learning techniques, NVIDIA AI assists radiologists in labeling, reducing the time for complex 3D studies from hours to minutes.

NVIDIA EGX servers at participating hospitals train the global model on their local data. The local training results are shared back to the federated learning server over a secure link. This approach preserves privacy by only sharing partial model weights and no patient records in order to build a new global model through federated averaging.

The process repeats until the AI model reaches its desired accuracy. This distributed approach delivers exceptional performance in deep learning while keeping patient data secure and private.
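
The aggregation step described above can be illustrated with a minimal sketch of federated averaging. This is not the Clara FL API: the function, the data layout and the two example hospitals are hypothetical, and for simplicity each client here sends its full weights rather than the partial weights the article mentions. Each client's contribution is weighted by the number of local samples it trained on.

```python
from typing import Dict, List, Tuple

# Hypothetical layout: layer name -> flat list of weight values.
Weights = Dict[str, List[float]]


def federated_average(updates: List[Tuple[Weights, int]]) -> Weights:
    """Average client weights, weighting each client by its local sample count."""
    total_samples = sum(n for _, n in updates)
    if total_samples <= 0:
        raise ValueError("at least one client must contribute samples")
    first_weights = updates[0][0]
    averaged: Weights = {}
    for layer, values in first_weights.items():
        # Weighted mean of each weight position across all clients.
        averaged[layer] = [
            sum(w[layer][i] * n for w, n in updates) / total_samples
            for i in range(len(values))
        ]
    return averaged


# Hypothetical updates from two hospitals. Only weights and sample counts
# leave each site; no patient records are part of the exchange.
hospital_a = ({"dense": [0.0, 2.0]}, 100)
hospital_b = ({"dense": [4.0, 6.0]}, 300)

global_weights = federated_average([hospital_a, hospital_b])
print(global_weights)  # {'dense': [3.0, 5.0]}
```

In a real deployment the server would send the averaged weights back to the clients for the next round of local training, repeating until the model is accurate enough.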

US and UK Lead the Way

Healthcare giants around the world — including the American College of Radiology, MGH and BWH Center for Clinical Data Science, and UCLA Health — are pioneering the technology. They aim to develop personalized AI for their doctors, patients and facilities where medical data, applications and devices are on the rise and patient privacy must be preserved.

ACR is piloting NVIDIA Clara FL in its AI-LAB, a national platform for medical imaging. The AI-LAB will allow the ACR’s 38,000 medical imaging members to securely build, share, adapt and validate AI models. Healthcare providers that want access to the AI-LAB can choose a variety of NVIDIA NGC-Ready for Edge systems, including from Dell, Hewlett Packard Enterprise, Lenovo and Supermicro.

UCLA Radiology is also using NVIDIA Clara FL to bring the power of AI to its radiology department. As a top academic medical center, UCLA can validate the effectiveness of Clara FL and extend it in the future across the broader University of California system.

Partners HealthCare in New England also announced a new initiative using NVIDIA Clara FL. Massachusetts General Hospital and Brigham and Women’s Hospital’s Center for Clinical Data Science will spearhead the work, leveraging data assets and clinical expertise of the Partners HealthCare system.

In the U.K., NVIDIA is partnering with King’s College London and Owkin to create a federated learning platform for the National Health Service. The Owkin Connect platform running on NVIDIA Clara enables algorithms to travel from one hospital to another, training on local datasets. It provides each hospital a blockchain-distributed ledger that captures and traces all data used for model training.

The project is initially connecting four of London’s premier teaching hospitals, offering AI services to accelerate work in areas such as cancer, heart failure and neurodegenerative disease, and will expand to at least 12 U.K. hospitals in 2020.

Making Everything Smart in the Hospital 

With the rapid proliferation of sensors, medical centers like Stanford Hospital are working to make every system smart. To make sensors intelligent, devices need a powerful, low-power AI computer.

That’s why we’re announcing NVIDIA Clara AGX, an embedded AI developer kit that can handle image and video processing at high data rates, bringing AI inference and 3D visualization to the point of care.

NVIDIA Clara AGX scales from small, embedded devices to sidecar systems to full-size servers.

Clara AGX is powered by NVIDIA Xavier SoCs, the same processors that control self-driving cars. They consume as little as 10W, making them suitable for embedding inside a medical instrument or running in a small adjacent system.

A perfect showcase of Clara AGX is Hyperfine, the world’s first portable point-of-care MRI system. The revolutionary Hyperfine system will be on display in NVIDIA’s booth at this week’s RSNA event.

Hyperfine’s system is among the first of many medical instruments, surgical suites, patient monitoring devices and smart medical cameras expected to use Clara AGX. We’re witnessing the beginning of an AI-enabled internet of medical things.

Hyperfine’s mobile MRI system uses an NVIDIA GPU and will be on display at NVIDIA’s booth.

The NVIDIA Clara AGX SDK will be available soon through our early access program. It includes reference applications for two popular uses — real-time ultrasound and endoscopy edge computing.


Visit NVIDIA and our many healthcare partners in booth 10939 in the RSNA AI Showcase. We’ll be showing our latest AI-driven medical imaging advancements, including keeping patient data secure with AI at the edge.

Find out from our deep learning experts how to use AI to advance your research and accelerate your clinical workflows. See the full lineup of talks and learn more on our website.



Life Observed: Nobel Winner Sees Biology’s Future with GPUs

Five years ago, when Eric Betzig got the call that he had won a Nobel Prize for inventing a microscope that could see features as small as 20 nanometers, he was already working on a new one.

The new device captures the equivalent of 3D video of living cells — and now it’s using NVIDIA GPUs and software to see the results.

Betzig’s collaborator at the University of California at Berkeley, Srigokul Upadhyayula (aka Gokul), helped refine the so-called Lattice Light Sheet Microscopy (LLSM) system. It generated 600 terabytes of data while exploring part of the visual cortex of a mouse in work published earlier this year in Science magazine. A 1.3TB slice of that effort was on display at NVIDIA’s booth at last week’s SC19 supercomputing show.

Attendees got a glimpse of how tomorrow’s scientists may unravel medical mysteries. Researchers, for example, can use LLSM to watch how protein coverings on nerve axons degrade as diseases such as multiple sclerosis take hold.

Future of Biology: Direct Visualization

“It’s our belief we will never understand complex living systems by breaking them into parts,” Betzig said of methods such as biochemistry and genomics. “Only optical microscopes can look at living systems and gather information we need to truly understand the dynamics of life, the mobility of cells and tissues, how cancer cells migrate. These are things we can now directly observe.

“The future of biology is direct visualization of living things rather than piecing together information gleaned by very indirect means,” he added.

It Takes a Cluster — and More

Such work comes with heavy computing demands. Generating the 600TB dataset for the Science paper “monopolized our institution’s computing cluster for days and weeks,” said Betzig.

“These microscopes produce beautifully rich data we often cannot visualize because the vast majority of it sits in hard drives, completely useless,” he said. “With NVIDIA, we are finding ways to start looking at it.”

The SC19 demo — a multi-channel visualization of a preserved slice of mouse cortex — ran remotely on six NVIDIA DGX-1 servers, each packing eight NVIDIA V100 Tensor Core GPUs. The systems are part of an NVIDIA SATURNV cluster located near its headquarters in Santa Clara, Calif.

Berkeley researchers gave SC19 attendees a look inside the visual cortex of a mouse — visualized using NVIDIA IndeX.

The key ingredient for the demo and future visualizations is NVIDIA IndeX software, an SDK that allows scientists and researchers to see and interact in real time with massive 3D datasets.

Version 2.1 of IndeX debuted at SC19, sporting a host of new features, including GPUDirect Storage, as well as support for Arm and IBM POWER9 processors.

After seeing their first demos of what IndeX can do, the research team installed it on a cluster at UC Berkeley that uses a dozen NVIDIA TITAN RTX and four V100 Tensor Core GPUs. “We could see this had incredible potential,” Gokul said.

Closing a Big-Data Gap

The horizon holds plenty of mountains to climb. The Lattice scope generates as much as 3TB of data an hour, so visualizations are still often done on data that must be laboriously pre-processed and saved offline.

“In a perfect world, we’d have all the information for analysis as we get the data from the scope, not a month or six months later,” said Gokul. The time between collecting and visualizing data can stretch from weeks to months, but “we need to tune parameters to react to data as we’re collecting it” to make the scope truly useful for biologists, he added.

NVIDIA IndeX software, running on its increasingly powerful GPUs, helps narrow that gap.

In the future, the team aims to apply the latest deep learning techniques, but this too presents heady challenges. “There are no robust AI models to deploy for this work today,” Gokul said.

Making the data available to AI specialists who could craft AI models would require shipping crates of hard drives on an airplane, a slow and expensive proposition. That’s because the most recent work produced over half a petabyte of data, but cloud services often limit uploads and downloads to a terabyte or so per day.
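
A quick calculation shows why shipping drives can beat the network. The sketch below uses the article's round figures (a 600TB dataset, a cloud limit of about 1TB per day, a scope producing up to 3TB an hour); the function names are made up for illustration.

```python
def transfer_days(dataset_tb: float, daily_limit_tb: float) -> float:
    """Days needed to move a dataset when transfers are capped per day."""
    return dataset_tb / daily_limit_tb


def capture_hours(dataset_tb: float, rate_tb_per_hour: float) -> float:
    """Hours of microscope time needed to generate a dataset at a given rate."""
    return dataset_tb / rate_tb_per_hour


DATASET_TB = 600          # size of the dataset behind the Science paper
CLOUD_LIMIT_TB_DAY = 1    # "a terabyte or so per day"
SCOPE_RATE_TB_HOUR = 3    # peak output of the Lattice scope

print(f"Upload at the cloud limit: ~{transfer_days(DATASET_TB, CLOUD_LIMIT_TB_DAY):.0f} days")
print(f"Capture at peak scope rate: ~{capture_hours(DATASET_TB, SCOPE_RATE_TB_HOUR):.0f} hours")
```

At these rates, data that the scope could generate in roughly 200 hours at peak output would take about 600 days to upload.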

Betzig and Gokul are talking with researchers at cloud giants about new options, and they’re exploring new ways to leverage the power of GPUs because the potential of their work is so great.

Coping with Ups and Downs

“Humans are visual animals,” said Betzig. “When most people I know think about a hypothesis, they create mental visual models.

“The beautiful thing about microscopy is you can take a model in your head with all its biases and immediately compare it to the reality of living biological images. This capability already has and will continue to reveal surprises,” he said.

The work brings big ups and downs. Winning a Nobel Prize “was a shock,” Betzig said. “It kind of felt like getting hit by a bus. You feel like your life is settled and then something happens to change you in ways you wouldn’t expect — it has good and bad sides to it.”

Likewise, “in the last several years working with Gokul, every microscope had its limits that led us to the next one. You take five or six steps up to a plateau of success and then there is a disappointment,” he said.

In the partnership with NVIDIA, “we get to learn what we may have missed,” he added. “It’s a chance for us to reassess things, to understand the GPU from folks who designed the architecture, to see how we can merge our problem sets with new solutions,” he said.

Note: The picture at top shows Berkeley researchers Eric Betzig, Ruixian Gao and Srigokul Upadhyayula with the Lattice Light Sheet microscope.


Smart into Art: NVIDIA SC19 Booth Turns Computer Scientists into Art at News-Filled Show

Back in the day, the annual SC supercomputing conference was filled with tabletops hung with research posters. Three decades on, the show’s Denver edition this week was a sea of sharp-angled booths, crowned with three-dimensional signage, promoting logos in a multitude of blues and reds.

But nowhere on the SC19 show floor drew more of the show’s 14,000 attendees than NVIDIA’s booth, built around a broad, floor-to-ceiling triangle with 2,500 square feet of ultra-high def LED screens. With a packed lecture hall on one side and HPC simulations playing on a second, it was the third wall that drew the most buzz.

Cycling through was a collection of AI-enhanced photos of several hundred GPU developers — grad students, CUDA pioneers, supercomputing rockstars — together with descriptions of their work.

Like accelerated computing’s answer to baseball cards, they were rendered into art using AI style transfer technology inspired by various painters — from the classicism of Vermeer to van Gogh’s impressionism to Paul Klee’s abstractions.

Meanwhile, NVIDIA sprinted through the show, kicking things off with a news-filled keynote by founder and CEO Jensen Huang, helping to power research behind the two finalists nominated for the Gordon Bell prize, and joining in to celebrate its partner Mellanox.

And in its booth, 200 engineers took advantage of free AI training through the Deep Learning Institute, and leading researchers gave dozens of tech talks to audiences packed in shoulder to shoulder.

Wall in the Family 

Piecing together the Developer Wall project took a dozen NVIDIANs scrambling for weeks in their spare time. The team of designers, technologists and marketers created an app where developers could enter some background, which would be paired with their photo once it’s run through style filters at, a German startup that’s part of NVIDIA’s Inception startup incubator.

“What we’re trying to do is showcase and celebrate the luminaries in our field. The amazing work they’ve done is the reason this show exists,” said Doug MacMillian, a developer evangelist who helped run the big wall initiative.

Behind him flashed an image of Jensen Huang, rendered as if painted by Cezanne. Alongside him was John Stone, the legendary HPC researcher at the University of Illinois, as if painted by Vincent van Gogh. Close by was Erik Lindahl, who heads the international GROMACS molecular simulation project, right out of a Joan Miró painting. Paresh Kharya, a data center specialist at NVIDIA, looked like an abstracted sepia-tone circuit board.

Enabling the Best and Brightest 

That theme — how NVIDIA’s working to accelerate the work of people in an ever growing array of industries — continued behind the scenes.

In a final rehearsal hours before Huang’s keynote, Ashley Korzun — a Ph.D. engineer who’s spent years working on the manned mission to Mars set for the 2030s — saw for the first time a demo visualizing her life’s work at the space agency.

As she stood on stage, she witnessed an event she’s spent years simulating purely with data – the fiery path that the Mars lander, a capsule the size of a two-story condo, will take as it slows in seven dramatic minutes from 12,000 miles an hour to gently stick its landing on the Red Planet.

“This is amazing,” she quietly said through tears. “I never thought I’d be able to visualize this.”

Flurry of News

Huang later took the stage and, in a broad-sweeping two-hour keynote, set out a range of announcements that show how NVIDIA’s helping others do their life’s work.

Award-Winning Work

SC19 plays host to a series of awards throughout the show, and NVIDIA featured in a number of them.

Both finalists for the Gordon Bell Prize for outstanding achievement in high performance computing — the ultimate winner, ETH Zurich, as well as University of Michigan — ran their work on Oak Ridge National Laboratory’s Summit supercomputer, powered by nearly 28,000 V100 GPUs.

NVIDIA’s founding chief scientist, David Kirk, received this year’s Seymour Cray Computer Engineering Award, for innovative contributions to HPC systems. He was recognized for his path-breaking work around development of the GPU.

And a seminal paper that NVIDIA’s Vasily Volkov co-authored with UC Berkeley’s James Demmel 11 years ago was recognized with the Test of Time Award for a work of lasting impact. The paper, which has resulted in a new way of thinking and modeling algorithms on GPUs, has had nearly 1,000 citations.

Looking Further Ahead

If the SC show is about powering the future, no corner of the show was more forward looking than the annual Supercomputing Conference Student Cluster Competition.

This year, China’s Tsinghua University captured the top crown. It beat out 15 other undergrad teams using NVIDIA V100 Tensor Core GPUs in an immersive HPC challenge demonstrating the breadth of skills, technologies and science that it takes to build, maintain and use supercomputers. Tsinghua also won the IO500 competition, while two other prizes were won by Singapore’s Nanyang Technological University.

The teams came from a range of markets, including Germany, Latvia, Poland and Taiwan, in addition to China and Singapore.

Up Next: More Performance for the World’s Data Centers

NVIDIA’s frenetic week at SC19 ended with a look at what’s next, with Jensen joining Mellanox CEO Eyal Waldman on stage at an evening event hosted by the networking company, which NVIDIA agreed to acquire earlier this year.

Jensen and Eyal discussed how their partnership will enable the future of computing, with Jensen detailing the synergies between the companies. “Mellanox has an incredible vision,” Huang said. “In a couple years we’re going to bring more compute performance to data centers than all of the compute since the beginning of time.”


Expanding Universe for HPC, NVIDIA CEO Brings GPU Acceleration to Arm

Broadening support for GPU-accelerated supercomputing to a fast-growing new platform, NVIDIA founder and CEO Jensen Huang Monday introduced a reference design for building GPU-accelerated Arm servers, with wide industry backing.

Huang — speaking Monday at the SC19 supercomputing show in Denver — also announced that Microsoft has built NDv2, a “supersized instance” that’s the world’s largest GPU-accelerated cloud-based supercomputer — a supercomputer in the cloud — on its Azure cloud-computing platform.

He additionally unveiled NVIDIA Magnum IO, a suite of GPU-accelerated I/O and storage software to eliminate data transfer bottlenecks for AI, data science and HPC workloads.

In a two-hour talk, Huang wove together these announcements with an update on developments from around the industry, setting out a sweeping vision of how high performance computing is expanding out in all directions.

HPC Universe Expanding in All Directions

“The HPC universe is expanding in every single direction at the same time,” Huang told a standing-room only crowd of some 1,400 researchers and technologists at the start of the world’s biggest supercomputing event. “HPC is literally everywhere today. It’s in supercomputing centers, in the cloud, at the edge.”

Driving that expansion are factors such as streaming HPC from massive sensor arrays; using edge computing to do more sophisticated filtering; running HPC in the cloud; and using AI to accelerate HPC.

“All of these are undergoing tremendous change,” Huang said.

Putting an exclamation mark on his talk, Huang debuted the world’s largest interactive volume visualization: An effort with NASA to simulate a Mars landing in which a craft the size of a two-story condominium traveling at 12,000 miles an hour screeches safely to a halt in just seven minutes. And it sticks the landing.

Huang said the simulation enables 150 terabytes of data, equivalent to 125,000 DVDs, to be flown through at random access. “To do that, we’ll have a supercomputing analytics instrument that sits next to a supercomputer.”

Expanding the Universe for HPC

Kicking off his talk, Huang detailed how accelerated computing powers the work of today’s computational scientists, whom he calls the da Vincis of our time.

The first AI supercomputers already power scientific research into phenomena as diverse as fusion energy and gravitational waves, Huang explained.

Accelerated computing, meanwhile, powers exascale systems tackling some of the world’s most challenging problems.

They include efforts to identify extreme weather patterns at Lawrence Berkeley National Lab … Research into the genomics of opioid addiction at Oak Ridge National Laboratory … Nuclear waste remediation efforts led by LBNL, the Pacific Northwest National Lab and Brown University at the Hanford site … And cancer-detection research led by Oak Ridge National Laboratory and the State University of New York at Stony Brook.

At the same time, AI is being put to work across an ever-broader array of industries. Earlier this month, the U.S. Post Office, the world’s largest delivery service — which processes nearly 500 million pieces of mail a day — announced it’s adopting end-to-end AI technology from NVIDIA.

“It’s the perfect application for a streaming AI computer,” Huang said.

And last month, in partnership with Ericsson, Microsoft, Red Hat and others, Huang revealed that NVIDIA is powering AI at the edge of enterprise and 5G telco networks with the NVIDIA EGX Edge Supercomputing platform.

Next up for HPC: harnessing vast numbers of software-defined sensors to relay data to programmable edge computers, which in turn pass on the most interesting data to supercomputers able to wring insights out of oceans of real-time data.

Arm in Arm: GPU-Acceleration Speeds Emerging HPC Architecture

Monday’s news marks a milestone for the Arm community. The processor architecture — ubiquitous in smartphones and IoT devices — has long been the world’s most popular. Arm has more than 100 billion computing devices and will cross the trillion mark in the coming years, Huang predicted.

NVIDIA’s moving fast to bring HPC tools of all kinds to this thriving ecosystem.

“We’ve been working with the industry, all of you, and the industry has really been fantastic, everybody is jumping on,” Huang said, adding that 30 applications are already up and running. “This is going to be a great ecosystem — basically everything that runs in HPC should run on any CPU as well.”

World-leading supercomputing centers have already begun testing GPU-accelerated Arm-based computing systems, Huang said. This includes Oak Ridge and Sandia National Laboratories, in the United States; the University of Bristol, in the United Kingdom; and Riken, in Japan.

NVIDIA’s reference design for GPU-accelerated Arm servers — comprising both hardware and software building blocks — has already won support from key players in HPC and Arm ecosystems, Huang said.

In the Arm ecosystem, NVIDIA is teaming with Arm, Ampere, Fujitsu and Marvell. NVIDIA is also working with Cray, a Hewlett Packard Enterprise company, and HPE. A wide range of HPC software companies are already using NVIDIA CUDA-X libraries to bring their GPU-enabled management and monitoring tools to the Arm ecosystem.

The reference platform’s debut follows NVIDIA’s announcement earlier this year that it will bring its CUDA-X software platform to Arm. Fulfilling this promise, NVIDIA is previewing its Arm-compatible software developer kit — available for download now — consisting of NVIDIA CUDA-X libraries and development tools for accelerated computing.

Microsoft Brings GPU-Powered Supercomputer to Azure

“This puts a supercomputer in the hands of every scientist in the world,” Huang said.

Giving HPC researchers and others instant access to unprecedented amounts of GPU computing power, Huang announced NDv2, a GPU-powered supercomputer now available on Microsoft Azure that ranks among the world’s fastest.

“Now you can open up an instance, you grab one of the stacks … in the container, you launch it, on Azure, and you’re doing science,” Huang said. “It’s really quite fantastic.”

Built to handle the most demanding AI and HPC applications, the Azure NDv2 instance can scale up to 800 NVIDIA V100 Tensor Core GPUs interconnected with Mellanox InfiniBand.

For the first time, researchers and others can rent an entire AI supercomputer on demand, matching the capabilities of large-scale, on-premises supercomputers that can take months to deploy.

AI researchers needing fast solutions can quickly spin up multiple Azure NDv2 instances and train complex conversational AI models in just hours, Huang explained.

For example, Microsoft and NVIDIA engineers used 64 NDv2 instances on a pre-release version of the cluster to train BERT, a popular conversational AI model, in roughly three hours.
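To put those figures in perspective, here is a back-of-the-envelope sketch of the scale involved. It assumes eight V100 GPUs per NDv2 instance, a number the article does not state, and it treats "roughly three hours" as exactly three, so the results are approximations rather than reported measurements.

```python
# Rough sizing of the NDv2 figures quoted above.
GPUS_PER_NDV2_INSTANCE = 8   # assumption: not stated in the article
MAX_GPUS = 800               # from the article: scales up to 800 V100 GPUs
BERT_RUN_INSTANCES = 64      # from the article: instances used to train BERT
BERT_RUN_HOURS = 3           # "roughly three hours", rounded for illustration

# How many instances make up the largest configuration.
max_instances = MAX_GPUS // GPUS_PER_NDV2_INSTANCE

# How many GPUs, and GPU-hours, the BERT training run would represent.
bert_gpus = BERT_RUN_INSTANCES * GPUS_PER_NDV2_INSTANCE
bert_gpu_hours = bert_gpus * BERT_RUN_HOURS

print(f"Largest configuration: {max_instances} instances")
print(f"BERT run: {bert_gpus} GPUs, about {bert_gpu_hours} GPU-hours")
```

Under that assumption, the BERT run used a little under two-thirds of the largest configuration the article describes.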

Magnum IO Software

Helping AI researchers and data scientists move data in minutes, rather than hours, Huang introduced the NVIDIA Magnum IO software suite.

A standing-room-only crowd of some 1,400 researchers and technologists came to hear NVIDIA’s keynote at the start of SC19, the world’s top supercomputing event.

Delivering up to 20x faster data processing for multi-server, multi-GPU computing nodes, Magnum IO eliminates a key bottleneck faced by those carrying out complex financial analysis, climate modeling and other high-performance workloads.

“This is an area that is going to be rich with innovation, and we are going to be putting a lot of energy into helping you move information in and out of the system,” Huang said.

A key feature of Magnum IO is NVIDIA GPUDirect Storage, which provides a direct data path between GPU memory and storage, enabling data to bypass CPUs and travel unencumbered on “open highways” offered by GPUs, storage and networking devices.

NVIDIA developed Magnum IO in close collaboration with industry leaders in networking and storage, including DataDirect Networks, Excelero, IBM, Mellanox and WekaIO.

The post Expanding Universe for HPC, NVIDIA CEO Brings GPU Acceleration to Arm appeared first on The Official NVIDIA Blog.

DC Startup Casts an AI Net to Stop Phishing and Malware

When the price went way up on a key service a small Washington, D.C., firm was using to protect its customers’ internet connectivity, the company balked.

After not finding a suitable alternative, the company decided to build its own. The result was a whole new business, called DNSFilter, which is casting a wide net around the market to combat phishing and malware.

Its innovation: It ditched the crowdsourcing model that has served for more than a decade as the bedrock for identifying whether websites are valid or corrupt. It opted, instead, for GPU-powered AI to make web surfing safer by identifying threats and objectionable content much faster than traditional offerings.

“We figured that if we built a whole new DNS from the ground up, built on artificial intelligence and machine learning, we could find threats faster and more effectively,” said Rustin Banks, chief revenue officer and one of four principals at DNSFilter.

Spinning Up Phishing Protection

DNS, or domain name system, is the naming system for computers, phones and services that connect to the internet. DNSFilter’s aim is to protect these assets from malicious websites and attacks.

The company’s algorithm takes seconds to compare websites to a machine learning model generated from 30,000 known phishing sites. To date, its AI prevents over 90 percent of new requests to visit potentially corrupt sites.

It’s this speed that largely separates DNSFilter from the rest of the industry, Banks said. It gets results in near real time, while competitors typically take around 24 hours.

The company’s algorithm has been built and trained in the cloud using NVIDIA P4 GPU clusters.

“NVIDIA GPUs allow us to rapidly train AI, while being able to use cutting-edge frameworks. It’s not a job I would want to do without them,” said Adam Spotton, chief data scientist at DNSFilter.

Inferencing occurs at 48 locations worldwide, hosted by 10 vendors who’ve passed DNSFilter’s rigorous security standards.

Banks said the company’s rivals primarily use a company in the Philippines that has a team of 150 people classifying sites all day. But for DNSFilter, the more corrupt sites it identifies, the faster and more accurate its algorithm becomes. (Disclosure: NVIDIA is one of the company’s biggest customers.)

Moreover, DNSFilter’s solution works at the network level so there’s no plug-in necessary and the solution works with any email client, protecting organizations regardless of where employees are or what device they’re using.

“If the CFO uses his Yahoo mail on his mobile device, it doesn’t matter,” said Banks. “It’s built right into the fabric of the internet request.”
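The idea of filtering at the network level can be sketched in a few lines. This is a minimal illustration, not DNSFilter's implementation: the resolver wrapper, the toy token-based classifier standing in for a trained model, the upstream records and the block-page address are all hypothetical.

```python
from typing import Callable, Dict

# Hypothetical address of a block page (taken from a documentation-only IP range).
BLOCK_PAGE_IP = "192.0.2.1"


def make_filtering_resolver(
    upstream: Dict[str, str],
    is_malicious: Callable[[str], bool],
) -> Callable[[str], str]:
    """Wrap a lookup table so every query passes through a classifier first."""

    def resolve(domain: str) -> str:
        # Normalize case and strip the trailing dot of a fully qualified name.
        name = domain.lower().rstrip(".")
        if is_malicious(name):
            # The client is steered to the block page, whatever app asked.
            return BLOCK_PAGE_IP
        return upstream.get(name, "NXDOMAIN")

    return resolve


# Toy stand-in for a trained model: flags names containing suspicious tokens.
SUSPICIOUS_TOKENS = ("login-", "verify", "secure-")


def toy_classifier(name: str) -> bool:
    return any(token in name for token in SUSPICIOUS_TOKENS)


# Hypothetical upstream records, invented for illustration.
upstream_records = {
    "example.com": "93.184.216.34",
    "secure-login-example.test": "203.0.113.7",
}

resolve = make_filtering_resolver(upstream_records, toy_classifier)
print(resolve("example.com"))
print(resolve("Secure-Login-Example.test."))
```

Because the decision happens when the name is resolved, the same check applies to a browser, an email client or a mobile app without any software installed on the device.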

Upping the Ante

Banks estimates that DNS filtering represents a billion-dollar market, and he’s confident that the $10 billion firewall market is in play for DNSFilter.

Already, the startup is fielding more than a billion DNS requests a day. Banks foresees that number rising to 10 billion by the end of 2020. He also expects accuracy will come to exceed 99 percent as the dataset of corrupt sites grows.

The company isn’t stopping there. More services are planned, including a logo-analysis product currently in beta. It scans logos on sites linked from phishing emails and compares them against a database of approved sites to determine whether the logo is real. It then blocks phishing sites in real time.

Eventually, Banks said, the company intends to evolve from its current machine learning feedback loop to a neural network with sufficient cognition to identify things that its algorithms can’t find.

This, he said, would be like having an extra pair of eyes inside an organization’s security team, constantly monitoring suspicious web surfing wherever employees may be working.

“This is taking phishing protection to a new level,” said Banks. “It’s like network-level protection that comes with you wherever you go.”

AI’s New Onramp: Meet the Data Science PC

The trip to AI and big-data analytics is now just a click away. Starting today, three NVIDIA partners are selling online a new class of computers we call data science PCs.

The systems bundle the hardware and software data scientists need to hit an “on” button and start managing datasets and models to make AI predictions. Data science PCs tap NVIDIA TITAN RTX GPUs and RAPIDS software to deliver 3-6x speed-ups compared to CPU-only desktops.

Three experts in building high-end PCs — Digital Storm, Maingear and Puget Systems — are offering the products now. They’re targeting an expanding class of independent data scientists to help them achieve better results faster.

A data science PC handled extract-transform-load (ETL) and XGBoost training on a dataset derived from New York City taxis, delivering end-to-end predictions in one-sixth the time of a CPU-only desktop.

Some of the world’s largest and most innovative organizations are already using GPU-accelerated servers and workstations to tackle their demanding data-science jobs.

For example, Walmart’s supermarket of the future can compute in real time more than 1.6 terabytes of data generated per second using NVIDIA’s EGX platform. The Summit system at Oak Ridge National Laboratory can tap its 27,648 NVIDIA V100 Tensor Core GPUs to drive 3.3 exaflops of mixed-precision horsepower on AI tasks.

But data science isn’t just for large enterprises. Startups, researchers, students and enthusiasts are jumping into this burgeoning field. They’re contributing to the corporate momentum making the role of data scientist one of the fastest growing jobs in the U.S.

The data science PC aims to fuel this growing class of independent data science practitioners. The combination of powerful, pre-configured systems and a tested software stack can jumpstart their work.

The Speeds and Feeds

Under the hood, a data science PC includes one or two TITAN RTX GPUs, each with up to 24GB of memory. NVLink high-speed interconnect technology connects the two GPUs to tackle datasets that demand more GPU memory.

The systems can accommodate 48-128GB of main memory and storage options include drives that range up to 10TB.

Each data science PC will ship with Linux and RAPIDS, NVIDIA’s data science software stack, powered by its popular CUDA-X AI programming libraries.

NVIDIA RAPIDS eases the job of porting existing code for GPU acceleration. Its APIs are modeled after popular libraries used in data science. In many cases, it’s only necessary to change a few lines of code in order to tap the potential of GPU acceleration.

Here are some of the key elements of RAPIDS:

  • cuDF is a Python GPU data-frame library for loading, joining, aggregating, filtering and otherwise manipulating data. The API is designed to be similar to Pandas, so existing code easily maps to the GPU.
  • cuML accelerates popular machine learning algorithms, including XGBoost, PCA, K-means, k-Nearest Neighbors and more. It is closely aligned with scikit-learn.
  • cuGraph is a library of graph algorithms, similar to NetworkX, that works with data stored in a GPU data frame.
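
As a rough illustration of the "few lines of code" point, the sketch below runs the same filter, derive and aggregate steps on cuDF when it is installed and on pandas otherwise. The tiny taxi-style table and the `xdf` alias are invented for this example, and it uses only calls that exist in both libraries; real workloads may need more adjustment than this suggests.

```python
# Prefer cuDF when RAPIDS is installed; otherwise fall back to pandas.
try:
    import cudf as xdf    # GPU data frames (RAPIDS)
except ImportError:
    import pandas as xdf  # CPU fallback exposing the same calls used below

# Hypothetical miniature trip table, invented for illustration.
trips = xdf.DataFrame({
    "zone": ["A", "A", "B", "B", "B"],
    "miles": [1.0, 3.0, 2.0, 0.0, 4.0],
    "fare": [6.0, 12.0, 9.0, 3.0, 15.0],
})

# Filter out zero-distance trips, derive a column, then aggregate per zone.
valid = trips[trips["miles"] > 0]
valid = valid.assign(fare_per_mile=valid["fare"] / valid["miles"])
summary = valid.groupby("zone").agg({"fare_per_mile": "mean"}).sort_index()

# Bring the small result back to the host if it lives on the GPU.
if hasattr(summary, "to_pandas"):
    summary = summary.to_pandas()

mean_fare_per_mile = summary["fare_per_mile"].to_dict()
print(mean_fare_per_mile)
```

The only backend-specific line is the import, which is the pattern the pandas-like API is meant to enable.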

An ecosystem of startups in Inception, NVIDIA’s virtual accelerator program for startups focused on AI and data science, provides applications and services that run on top of RAPIDS. They include companies, such as Graphistry and OmniSci, that offer big-data visualization tools.

Data scientists can also use NVIDIA’s data science developer forum to ask questions and learn more about data science on GPUs.

The data science PC is here, ready to propel you to an AI future. Learn more from our partners Digital Storm, Maingear and Puget Systems.

NVIDIA CEO Introduces Aerial — Software to Accelerate 5G on NVIDIA GPUs

Speeding the mass adoption of AI at the 5G edge, NVIDIA has introduced Aerial, a software developer kit enabling GPU-accelerated, software-defined wireless radio access networks.

In his keynote at Mobile World Congress Los Angeles, NVIDIA founder and CEO Jensen Huang detailed how Aerial, running on the NVIDIA EGX platform, enables AI services and immersive content at the edge of 5G networks.

5G offers plenty of speed, of course, delivering 10x lower latency, 1,000x the bandwidth and millions of connected devices per square kilometer. 5G also introduces the critical concept of “network slicing.” This allows telcos to dynamically — on a session-by-session basis — offer unique services to customers.

Traditional solutions cannot be reconfigured quickly, so telco operators need a new network architecture: one that’s high performance and reconfigurable by the second, Huang explained.

Such virtualized radio access networks run in the wireless infrastructure closest to customers, making them well suited to offer AI services at the edge. They’re critical to building a modern 5G infrastructure capable of running a range of applications that are dynamically provisioned on a common platform.

With NVIDIA Aerial, the same computing infrastructure required for 5G networking can be used to provide AI services such as smart cities, smart factories, AR/VR and cloud gaming.

Aerial provides two critical SDKs — CUDA Virtual Network Function (cuVNF) and CUDA Baseband (cuBB) — to simplify building highly scalable and programmable, software-defined 5G RAN networks using off-the-shelf servers with NVIDIA GPUs.

  • The NVIDIA cuVNF SDK provides optimized input/output and packet processing, sending 5G packets directly to GPU memory from GPUDirect-capable network interface cards.
  • The NVIDIA cuBB SDK provides a GPU-accelerated 5G signal processing pipeline, including cuPHY for L1 5G Phy, delivering unprecedented throughput and efficiency by keeping all physical layer processing within the GPU’s high-performance memory.

The NVIDIA Aerial SDK runs on the NVIDIA EGX stack, bringing GPU acceleration to carrier-grade Kubernetes infrastructure.

The NVIDIA EGX stack includes an NVIDIA driver, NVIDIA Kubernetes plug-in, NVIDIA Container runtime plug-in and NVIDIA GPU monitoring software.

To simplify the management of GPU-enabled servers, telcos can install all required NVIDIA software as containers that run on Kubernetes — open-source software widely used to speed the deployment and management of sophisticated software of all kinds.

In short, Aerial enables the highest return on investment by providing elasticity as network traffic changes throughout the day, as well as the flexibility to offer services based on changing customer needs.

Aerial is already endorsed by some of the world’s leading telcos and cloud infrastructure providers:

“The telco industry is eagerly adopting cloud-native architecture to meet the growing compute demands of 5G. We are learning firsthand how the remarkable compute performance of NVIDIA GPUs, together with NVIDIA’s Aerial SDKs, can address the challenges of building flexible, high-performance virtualized telecom networks. We look forward to Aerial’s continued development.”

— Yasuyuki Nakajima, president and CEO, KDDI Research, Inc.

“5G networks must rely on software-defined infrastructure from the core to the edge to enable a range of high-value services, like AI/ML, IoT and autonomous driving. Red Hat’s vision of extending cloud-native technologies to the edge combined with NVIDIA’s flexible Aerial SDK aims to bring GPU acceleration to 5G RAN. We’ve teamed up with NVIDIA to provide our customers with standardized 5G infrastructure that enables them to develop and deploy their edge applications faster.”

— Chris Wright, senior vice president and chief technology officer, Red Hat

“SoftBank Corp. has been focused over the past decade on building centralized radio access networks that guarantee high capacity and stability. We believe that our 5G network will be completed through a software approach, or softwarization, and that NVIDIA’s Aerial SDKs will play an instrumental role in this effort. It enables an open ecosystem for software-defined 5G networks delivering both flexibility and high performance, which will help SoftBank Corp. drive the digital transformation of the telco industry.”

— Ryuji Wakikawa, vice president and head of the Advanced Technology Division, SoftBank Corp.


NVIDIA Aerial is available to early access partners today. General availability is planned for year-end. Sign up here to receive more information.

Visit NVIDIA at MWC Los Angeles

To experience firsthand the power of the EGX platform with Aerial, visit us at booth 1745 in Hall South at MWC Los Angeles this week.

NVIDIA EGX Supercomputing Platform Simplifies AI Deployments to the Edge with Enterprise Kubernetes

AI is no longer just a research project. It’s solving real-world problems for organizations, which now need to figure out where to deploy their AI models to make faster decisions.

With the convergence of AI, the Internet of Things and the approaching 5G infrastructure, the opportunity is ripe for companies to push their models beyond the data center to the edge, where billions of sensors are streaming data and making real-time decisions is a reality.

Enterprises deploying AI workloads at scale are using a combination of on-premises data centers and the cloud, bringing the AI models to where the data is being collected. Deploying these workloads at the edge, say in a retail store or parking garage, can be very challenging when the IT expertise typically found in data centers is not on hand.

Kubernetes eliminates many of the manual processes involved in deploying, managing and scaling applications. It provides a consistent, cloud-native deployment approach across on-prem, the edge and the cloud.

However, setting up Kubernetes clusters to manage hundreds or even thousands of applications across remote locations can be cumbersome, especially when human expertise isn’t readily available at every edge locale. We’re addressing these challenges through the NVIDIA EGX Edge Supercomputing Platform.

Simplifying AI Deployments 

NVIDIA EGX is a cloud-native, software-defined platform designed to make large-scale hybrid-cloud and edge operations possible and efficient.

Within the platform is the EGX stack, which includes an NVIDIA driver, Kubernetes plug-in, NVIDIA container runtime and GPU monitoring tools, delivered through the NVIDIA GPU Operator. Operators codify operational knowledge and workflows to automate lifecycle management of containerized applications with Kubernetes.

The GPU Operator is a Helm chart-deployed, cloud-native method to standardize and automate the deployment of all necessary components for provisioning GPU-enabled Kubernetes systems. NVIDIA, Red Hat and others in the cloud-native community have collaborated on creating the GPU Operator.

The GPU Operator also allows IT teams to manage remote GPU-powered servers the same way they manage CPU-based systems. This makes it easy to bring up a fleet of remote systems with a single image and run edge AI applications without additional technical expertise on the ground.
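
Once those components are in place, a containerized workload typically asks Kubernetes for a GPU through the `nvidia.com/gpu` resource that the NVIDIA Kubernetes device plug-in exposes. The sketch below builds such a pod manifest as JSON, which `kubectl apply -f` accepts alongside YAML. The helper function, pod name and image are hypothetical placeholders, not part of the EGX stack.

```python
import json


def gpu_pod_manifest(name: str, image: str, gpus: int = 1) -> dict:
    """Build a minimal Pod manifest that requests NVIDIA GPUs."""
    return {
        "apiVersion": "v1",
        "kind": "Pod",
        "metadata": {"name": name},
        "spec": {
            "restartPolicy": "Never",
            "containers": [
                {
                    "name": name,
                    "image": image,
                    # GPUs are requested as an extended resource in "limits";
                    # the scheduler places the pod only on a node that has them.
                    "resources": {"limits": {"nvidia.com/gpu": gpus}},
                }
            ],
        },
    }


# Hypothetical pod name and image, used purely for illustration.
manifest = gpu_pod_manifest(
    "edge-inference",
    "registry.example.com/edge-inference:latest",
)
print(json.dumps(manifest, indent=2))
```

The same manifest works on any cluster where the device plug-in is running, which is part of what lets remote GPU servers be managed like the rest of a fleet.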

The EGX stack architecture is supported by hybrid-cloud management partners, such as Canonical, Cisco, Microsoft, Nutanix, Red Hat and VMware, to further simplify deployments and provide a consistent experience from cloud and data center to the edge.

NGC-Ready for Edge

NGC-Ready systems, offered by the world’s leading server manufacturers, are validated for functionality and performance of AI software from NGC, NVIDIA’s software hub for GPU-optimized containers.

Today at Mobile World Congress Los Angeles, we announced the expansion of the NGC-Ready program with NGC-Ready for Edge systems to support edge deployments. These systems undergo additional security and remote system management tests, which are fundamental requirements for edge deployments. Qualified systems like these are ideal for running the EGX stack, providing an easy onramp to hybrid deployments.

Validated NGC-Ready for Edge systems are available from the world’s leading manufacturers, including Advantech, Altos Computing, ASRock RACK, Atos, Dell Technologies, Fujitsu, GIGABYTE, Hewlett Packard Enterprise, Lenovo, MiTAC, QCT, Supermicro and TYAN.

Expanded NGC Registry

NGC now offers a Helm chart registry for deploying and managing AI software. Helm charts are powerful cloud-native tools to customize and automate how and where applications are deployed across Kubernetes clusters.

NGC’s Helm chart registry contains AI frameworks, NVIDIA software including the GPU Operator, NVIDIA Clara for medical imaging and NVIDIA Metropolis for smart cities, smart retail and industrial inspection. NGC also hosts Helm charts for third-party AI applications, including DeepVision for vehicle analytics, IronYun for video search and Kinetica for streaming analytics.

With NGC-Ready Support Services, developer and operations teams get access to a private Helm registry for their NGC-Ready for Edge systems to push and share their Helm charts. This lets the teams take advantage of consistent, secure and reliable environments to speed up continuous cycles of integration and deployment.

Deploy AI Software Today with NGC

To easily provision GPU-powered Kubernetes clusters across different platforms and quickly deploy AI applications with Helm charts and containers, go to
