How NVIDIA EGX Is Forming Central Nervous System of Global Industries

Massive change across every industry is being driven by the rising adoption of IoT sensors, including cameras for seeing, microphones for hearing, and a range of other smart devices that help enterprises perceive and understand what’s happening in the physical world.

The amount of data being generated at the edge is growing exponentially. The only way to process this vast data in real time is by placing servers near the point of action and by harnessing the immense computational power of GPUs.

The enterprise data center of the future won’t have 10,000 servers in one location, but one or more servers across 10,000 different locations. They’ll be in office buildings, factories, warehouses, cell towers, schools, stores and banks. They’ll detect traffic jams and forest fires, route traffic safely and prevent crime.

By placing a network of distributed servers where data is being streamed from hundreds of sensors, enterprises can use networks of data centers at the edge to drive immediate action with AI. Additionally, by processing data at the edge, privacy concerns are mitigated and data sovereignty concerns are put to rest.

Edge servers lack the physical security infrastructure that enterprise IT takes for granted. And companies lack the budget to invest in roaming IT personnel to manage these remote systems. So edge servers need to be designed to be self-secure and easy to update, manage and deploy from afar.

Plus, AI systems need to be running all the time, with zero downtime.

We’ve built the NVIDIA EGX Edge AI platform to ensure security and resiliency on a global scale. By simplifying deployment and management, NVIDIA EGX allows always-on AI applications to automate the critical infrastructure of the future. The platform is a Kubernetes and container-native software platform that brings GPU-accelerated AI to everything from dual-socket x86 servers to Arm-based NVIDIA Jetson SoCs.

To date, over 20 server vendors are building EGX-powered edge and micro-edge servers, including ADLINK, Advantech, Atos, AVerMedia, Cisco, Connect Tech, Dell Technologies, Diamond Systems, Fujitsu, Gigabyte, Hewlett Packard Enterprise, Inspur, Lenovo, Quanta Technologies and Supermicro. There are also dozens of hybrid-cloud and network security partners in the NVIDIA edge ecosystem, such as Canonical, Check Point, Excelero, Guardicore, IBM, Nutanix, Palo Alto Networks, Rancher, Red Hat, VMware, Weka and Wind River.

There are also hundreds of AI applications and integrated solutions vendors building on NVIDIA EGX to deliver industry-specific offerings to enterprises across the globe.

Enterprises running AI need to protect not just customer data, but also the AI models that transform the data into actions. By combining an NVIDIA Mellanox SmartNIC, the industry standard for secure, high-performance networking, with our AI processors into NVIDIA EGX A100, a converged accelerator, we’re introducing fundamental new innovations for edge AI.

Enhanced Security and Performance

A secure, authenticated boot of the GPU and SmartNIC from Hardware Root-of-Trust ensures the device firmware and lifecycle are securely managed. Third-generation Tensor Core technology in the NVIDIA Ampere architecture brings industry-leading AI performance. Specific to EGX A100, the confidential AI enclave uses a new GPU security engine to load encrypted AI models and encrypt all AI outputs, further preventing the theft of valuable IP.

As the edge moves to encrypted high-resolution sensors, SmartNICs support in-line cryptographic acceleration at the line rate. This allows encrypted data feeds to be decrypted and sent directly to the GPU memory, bypassing the CPU and system memory.

The edge also requires a greater level of security to protect against threats from other devices on the network. With dynamically reconfigurable firewall offloads in hardware, SmartNICs efficiently deliver the first line of defense for hybrid-cloud, secure service mesh communications.

NVIDIA Mellanox’s time-triggered transport technology for telco (5T for 5G) ensures commercial off-the-shelf solutions can meet the most time-sensitive use cases for 5G vRAN with our NVIDIA Aerial SDK. This will lead to a new wave of CloudRAN in the telecommunications industry.

With an NVIDIA Ampere GPU and Mellanox ConnectX-6 Dx on one converged product, the EGX A100 delivers low-latency, high-throughput packet processing for security and virtual network functions.

Simplified Deployment, Management and Security at Scale

Through NGC, NVIDIA’s catalog of GPU-optimized containers, we provide industry application frameworks and domain-specific AI toolkits to simplify getting started and for tuning AI applications to new edge environments. They can be used together or individually and open new possibilities for a variety of edge use cases.

And with the NGC private registry, applications can be signed before publication to ensure they haven’t been tampered with in transit, then authenticated before running at the edge. The NGC private registry also supports model versioning and encryption, so lightweight model updates can be delivered quickly and securely.
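
The sign-then-authenticate flow described above can be sketched in a few lines. This is only an illustration of the idea, not how the NGC private registry works internally: the text does not describe its mechanism, so the sketch substitutes a symmetric HMAC-SHA256 tag from the Python standard library, whereas real registries typically rely on asymmetric signatures. The key, the artifact bytes and both function names are hypothetical.

```python
import hashlib
import hmac


def sign_artifact(artifact: bytes, key: bytes) -> str:
    """Return a hex HMAC-SHA256 tag computed over the artifact bytes."""
    return hmac.new(key, artifact, hashlib.sha256).hexdigest()


def verify_artifact(artifact: bytes, key: bytes, tag: str) -> bool:
    """Recompute the tag and compare it in constant time."""
    expected = sign_artifact(artifact, key)
    return hmac.compare_digest(expected, tag)


# Hypothetical values, for illustration only.
publisher_key = b"demo-key-not-for-production"
container_image = b"example container image bytes"

# Publisher side: sign before publication.
tag = sign_artifact(container_image, publisher_key)

# Edge side: authenticate before running.
untouched_ok = verify_artifact(container_image, publisher_key, tag)
tampered_ok = verify_artifact(container_image + b"!", publisher_key, tag)
print(untouched_ok, tampered_ok)
```

An image that changes by even one byte in transit no longer matches its tag, so the edge node can refuse to run it.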

The future of edge computing requires secure, scalable, resilient, easy-to-manage fleets of AI-powered systems operating at the network edge. By bringing the combined acceleration of NVIDIA GPUs and NVIDIA Mellanox SmartNICs together with NVIDIA EGX, we’re building both the platform and the ecosystem to form the AI nervous system of every global industry.

The post How NVIDIA EGX Is Forming Central Nervous System of Global Industries appeared first on The Official NVIDIA Blog.

NVIDIA CEO Introduces NVIDIA Ampere Architecture, NVIDIA A100 GPU in News-Packed ‘Kitchen Keynote’

NVIDIA today set out a vision for the next generation of computing that shifts the focus of the global information economy from servers to a new class of powerful, flexible data centers.

In a keynote delivered in six simultaneously released episodes recorded from the kitchen of his California home, NVIDIA founder and CEO Jensen Huang discussed NVIDIA’s recent Mellanox acquisition, new products based on the company’s much-awaited NVIDIA Ampere GPU architecture and important new software technologies.

Original plans for the keynote to be delivered live at NVIDIA’s GPU Technology Conference in late March in San Jose were upended by the coronavirus pandemic.

Huang kicked off his keynote on a note of gratitude.

“I want to thank all of the brave men and women who are fighting on the front lines against COVID-19,” Huang said.

NVIDIA, Huang explained, is working with researchers and scientists to use GPUs and AI computing to treat, mitigate, contain and track the pandemic. Among those mentioned:

  • Oxford Nanopore Technologies has sequenced the virus genome in just seven hours.
  • Plotly is doing real-time infection rate tracing.
  • Oak Ridge National Laboratory and the Scripps Research Institute have screened a billion potential drug combinations in a day.
  • Structura Biotechnology, the University of Texas at Austin and the National Institutes of Health have reconstructed the 3D structure of the virus’s spike protein.

NVIDIA also announced updates to its NVIDIA Clara healthcare platform aimed at taking on COVID-19.

“Researchers and scientists applying NVIDIA accelerated computing to save lives is the perfect example of our company’s purpose — we build computers to solve problems normal computers cannot,” Huang said.

At the core of Huang’s talk was a vision for how data centers, the engine rooms of the modern global information economy, are changing, and how NVIDIA and Mellanox, acquired in a deal that closed last month, are together driving those changes.

“The data center is the new computing unit,” Huang said, adding that NVIDIA is accelerating performance gains from silicon, to the ways CPUs and GPUs connect, to the full software stack, and, ultimately, across entire data centers.

Systems Optimized for Data Center-Scale Computing

That starts with a new GPU architecture that’s optimized for this new kind of data center-scale computing, unifying AI training and inference, and making possible flexible, elastic acceleration.

NVIDIA A100, the first GPU based on the NVIDIA Ampere architecture, provides the greatest generational performance leap of NVIDIA’s eight generations of GPUs. It is also built for data analytics, scientific computing and cloud graphics, and is in full production and shipping to customers worldwide, Huang announced.

Eighteen of the world’s leading service providers and systems builders are incorporating them, among them Alibaba Cloud, Amazon Web Services, Baidu Cloud, Cisco, Dell Technologies, Google Cloud, Hewlett Packard Enterprise, Microsoft Azure and Oracle.

The A100, and the NVIDIA Ampere architecture it’s built on, boost performance by up to 20x over their predecessors, Huang said. He detailed five key features of A100, including:

  • More than 54 billion transistors, making it the world’s largest 7-nanometer processor.
  • Third-generation Tensor Cores with TF32, a new math format that accelerates single-precision AI training out of the box. NVIDIA’s widely used Tensor Cores are now more flexible, faster and easier to use, Huang explained.
  • Structural sparsity acceleration, a new efficiency technique harnessing the inherently sparse nature of AI math for higher performance.
  • Multi-instance GPU, or MIG, allowing a single A100 to be partitioned into as many as seven independent GPUs, each with its own resources.
  • Third-generation NVLink technology, doubling high-speed connectivity between GPUs, allowing A100 servers to act as one giant GPU.

The result of all this: 6x higher performance than NVIDIA’s previous generation Volta architecture for training and 7x higher performance for inference.
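
Of the features listed above, structural sparsity is the easiest to illustrate in a few lines. The keynote summary does not spell out the sparsity pattern, so this sketch assumes the 2:4 pattern NVIDIA has described for the Ampere architecture, in which at most two of every four consecutive weights are nonzero. The function name and the sample weights are hypothetical, and real pruning happens inside training frameworks rather than on Python lists.

```python
def prune_2_of_4(weights):
    """Zero the two smallest-magnitude values in each group of four."""
    if len(weights) % 4 != 0:
        raise ValueError("length must be a multiple of 4")
    pruned = []
    for start in range(0, len(weights), 4):
        group = weights[start:start + 4]
        # Indices of the two largest-magnitude entries in this group.
        keep = sorted(range(4), key=lambda i: abs(group[i]), reverse=True)[:2]
        pruned.extend(v if i in keep else 0.0 for i, v in enumerate(group))
    return pruned


# Hypothetical weight row, for illustration only.
row = [0.9, -0.1, 0.05, -0.7, 0.2, 0.3, -0.25, 0.01]
sparse_row = prune_2_of_4(row)
print(sparse_row)  # [0.9, 0.0, 0.0, -0.7, 0.0, 0.3, -0.25, 0.0]
```

Because exactly half the entries in every group are zero, hardware can skip them in a predictable way, which is where the speedup would come from.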

NVIDIA DGX A100 Packs 5 Petaflops of Performance

NVIDIA is also shipping a third generation of its NVIDIA DGX AI system based on NVIDIA A100 — the NVIDIA DGX A100 — the world’s first 5-petaflops server. And each DGX A100 can be divided into as many as 56 applications, all running independently.

This allows a single server to either “scale up” to race through computationally intensive tasks such as AI training, or “scale out,” for AI deployment, or inference, Huang said.
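
The figure of 56 independent applications follows from the MIG feature described earlier. The sketch below assumes eight A100 GPUs per DGX A100, a number not stated in the text above, and uses a hypothetical naming scheme for the instances.

```python
GPUS_PER_DGX_A100 = 8        # assumption: not stated in the text above
MIG_INSTANCES_PER_A100 = 7   # "as many as seven independent GPUs"


def instance_names(gpus, slices_per_gpu):
    """List one hypothetical name per GPU instance in the server."""
    return [f"gpu{g}-mig{s}" for g in range(gpus) for s in range(slices_per_gpu)]


# "Scale up": each GPU is used whole, e.g. for training.
scale_up = instance_names(GPUS_PER_DGX_A100, 1)

# "Scale out": every GPU is split into seven instances, e.g. for inference.
scale_out = instance_names(GPUS_PER_DGX_A100, MIG_INSTANCES_PER_A100)

print(len(scale_up), len(scale_out))  # 8 56
```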

Among initial recipients of the system are the U.S. Department of Energy’s Argonne National Laboratory, which will use the cluster’s AI and computing power to better understand and fight COVID-19; the University of Florida; and the German Research Center for Artificial Intelligence.

A100 will also be available for cloud and partner server makers as HGX A100.

A data center powered by five DGX A100 systems for AI training and inference running on just 28 kilowatts of power costing $1 million can do the work of a typical data center with 50 DGX-1 systems for AI training and 600 CPU systems consuming 630 kilowatts and costing over $11 million, Huang explained.
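
The comparison above reduces to two ratios. This short calculation uses only the figures Huang quoted; because the older configuration is described as costing "over $11 million," the cost ratio is a lower bound.

```python
# Figures as quoted in the keynote summary above.
dgx_a100_rack = {"power_kw": 28, "cost_musd": 1}
legacy_data_center = {"power_kw": 630, "cost_musd": 11}  # cost is "over" $11M

power_ratio = legacy_data_center["power_kw"] / dgx_a100_rack["power_kw"]
cost_ratio = legacy_data_center["cost_musd"] / dgx_a100_rack["cost_musd"]

print(f"{power_ratio:.1f}x less power, at least {cost_ratio:.0f}x lower cost")
# 22.5x less power, at least 11x lower cost
```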

“The more you buy, the more you save,” Huang said, in his common keynote refrain.

Need more? Huang also announced the next-generation DGX SuperPOD. Powered by 140 DGX A100 systems and Mellanox networking technology, it offers 700 petaflops of AI performance, Huang said, the equivalent of one of the 20 fastest computers in the world.

NVIDIA is expanding its own data center with four DGX SuperPODs, adding 2.8 exaflops of AI computing power — for a total of 4.6 exaflops of total capacity — to its SATURNV internal supercomputer, making it the world’s fastest AI supercomputer.
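
The SuperPOD and SATURNV numbers are consistent with one another, as a quick check shows. The last line is an inference from the totals quoted above, not a figure Huang stated.

```python
DGX_A100_PETAFLOPS = 5       # "the world's first 5-petaflops server"
SYSTEMS_PER_SUPERPOD = 140
NEW_SUPERPODS = 4
TOTAL_EXAFLOPS = 4.6

superpod_pf = DGX_A100_PETAFLOPS * SYSTEMS_PER_SUPERPOD   # petaflops per SuperPOD
added_ef = NEW_SUPERPODS * superpod_pf / 1000             # exaflops added

# Derived, not quoted: capacity implied to exist before the expansion.
implied_existing_ef = round(TOTAL_EXAFLOPS - added_ef, 1)

print(superpod_pf, added_ef, implied_existing_ef)  # 700 2.8 1.8
```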

Huang also announced the NVIDIA EGX A100, bringing powerful real-time cloud-computing capabilities to the edge. Its NVIDIA Ampere architecture GPU offers third-generation Tensor Cores and new security features. Thanks to its NVIDIA Mellanox ConnectX-6 SmartNIC, it also includes secure, lightning-fast networking capabilities.

Software for the Most Important Applications in the World Today

Huang also announced NVIDIA GPUs will power major software applications for accelerating three critical usages: managing big data, creating recommender systems and building real-time, conversational AI.

These new tools arrive as the effectiveness of machine learning has driven companies to collect more and more data. “That positive feedback is causing us to experience an exponential growth in the amount of data that is collected,” Huang said.

To help organizations of all kinds keep up, Huang announced support for NVIDIA GPU acceleration on Spark 3.0, describing the big data analytics engine as “one of the most important applications in the world today.”

Built on RAPIDS, Spark 3.0 shatters performance benchmarks for extracting, transforming and loading data, Huang said. It’s already helped Adobe Intelligent Services achieve a 90 percent compute cost reduction.

Key cloud analytics platforms — including Amazon SageMaker, Azure Machine Learning, Databricks, Google Cloud AI and Google Cloud Dataproc — will all accelerate with NVIDIA, Huang announced.

“We’re now prepared for a future where the amount of data will continue to grow exponentially from tens or hundreds of petabytes to exascale and beyond,” Huang said.

Huang also unveiled NVIDIA Merlin, an end-to-end framework for building next-generation recommender systems, which are fast becoming the engine of a more personalized internet. Merlin slashes the time needed to create a recommender system from a 100-terabyte dataset to 20 minutes from four days, Huang said.

And he detailed NVIDIA Jarvis, a new end-to-end platform for creating real-time, multimodal conversational AI that can draw upon the capabilities unleashed by NVIDIA’s AI platform.

Huang highlighted its capabilities with a demo that showed him interacting with a friendly AI, Misty, that understood and responded to a sophisticated series of questions about the weather in real time.

Huang also dug into NVIDIA’s swift progress in real-time ray tracing since NVIDIA RTX was launched at SIGGRAPH in 2018. He also announced that NVIDIA Omniverse, which allows “different designers with different tools in different places doing different parts of the same design” to work together simultaneously, is now available for early access customers.

Autonomous Vehicles

Autonomous vehicles are one of the greatest computing challenges of our time, Huang said, an area where NVIDIA continues to push forward with NVIDIA DRIVE.

NVIDIA DRIVE will use the new Orin SoC with an embedded NVIDIA Ampere GPU to achieve the energy efficiency and performance to offer a 5-watt ADAS system for the front windshield as well as scale up to a 2,000 TOPS, level-5 robotaxi system.

Now automakers have a single computing architecture and single software stack to build AI into every one of their vehicles.

“It’s now possible for a carmaker to develop an entire fleet of cars with one architecture, leveraging the software development across their whole fleet,” Huang said.

The NVIDIA DRIVE ecosystem now encompasses cars, trucks, tier one automotive suppliers, next-generation mobility services, startups, mapping services, and simulation.

And Huang announced NVIDIA is adding NVIDIA DRIVE RC for managing entire fleets of autonomous vehicles to its suite of NVIDIA DRIVE technologies.


NVIDIA also continues to push forward with its NVIDIA Isaac software-defined robotics platform, announcing that BMW has selected NVIDIA Isaac robotics to power its next-generation factories.

BMW’s 30 factories around the globe build one vehicle every 56 seconds: that’s 40 different models, each with hundreds of different options, made from 30 million parts flowing in from nearly 2,000 suppliers around the world, Huang explained.

BMW joins a sprawling NVIDIA robotics global ecosystem that spans delivery services, retail, autonomous mobile robots, agriculture, services, logistics, manufacturing and healthcare.

In the future, factories will, effectively, be enormous robots. “All of the moving parts inside will be driven by artificial intelligence,” Huang said. “Every single mass-produced product in the future will be customized.”

The post NVIDIA CEO Introduces NVIDIA Ampere Architecture, NVIDIA A100 GPU in News-Packed ‘Kitchen Keynote’ appeared first on The Official NVIDIA Blog.

What’s a Recommender System?

Search and you might find.

Spend enough time online, however, and what you want will start finding you just when you need it.

This is what’s driving the internet right now.

They’re called recommender systems, and they’re among the most important applications today.

That’s because there is an explosion of choice and it’s impossible to explore the large number of available options.

If a shopper were to spend just one second swiping past each of the two billion products available on one prominent ecommerce site, it would take roughly 63 years, almost an entire lifetime, to go through the entire catalog.
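
The estimate above is simple arithmetic. At one second per product with no breaks, two billion products come out at a little over 63 years, which is still lifetime-scale. The year length used here (365.25 days) is an assumption.

```python
PRODUCTS = 2_000_000_000
SECONDS_PER_PRODUCT = 1
SECONDS_PER_YEAR = 365.25 * 24 * 60 * 60  # assumption: average year length

years = PRODUCTS * SECONDS_PER_PRODUCT / SECONDS_PER_YEAR
print(round(years, 1))  # 63.4
```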

This is one of the major reasons why the Internet is now so personalized. Otherwise, it would be simply impossible for the billions of Internet users in the world to connect with the products, services and even expertise (among hundreds of billions of things) that matter to them.

They might be the most human, too. After all, what are you doing when you go to someone for advice? When you’re looking for feedback? You’re asking for a recommendation.

Now, driven by vast quantities of data about the preferences of hundreds of millions of individual users, recommender systems are racing to get better at doing just that.

The internet, of course, already knows a lot of facts: your name, your address, maybe your birthplace. But what the recommender systems seek to learn better, perhaps, than the people who know you are your preferences.

Looking to get started with recommender systems? Read more about NVIDIA Merlin, NVIDIA’s application framework for deep recommender systems.

Key to Success of Web’s Most Successful Companies

Recommender systems aren’t a new idea. Jussi Karlgren formulated the idea of a recommender system, or a “digital bookshelf,” in 1990. Over the next two decades researchers at MIT and Bellcore steadily advanced the technique.

The technology really caught the popular imagination starting in 2007, when Netflix — then in the business of renting out DVDs through the mail — kicked off an open competition with a $1 million prize for a collaborative filtering algorithm that could improve on the accuracy of Netflix’s own system by more than 10 percent, a prize that was claimed in 2009.

Over the following decade, such recommender systems would become critical to the success of Internet companies such as Netflix, Amazon, Facebook, Baidu and Alibaba.

Virtuous Data Cycle

And the latest generation of deep-learning powered recommender systems provide marketing magic, giving companies the ability to boost click-through rates by better targeting users who will be interested in what they have to offer.

Now the ability to collect this data, process it, use it to train AI models and deploy those models to help you and others find what you want is among the largest competitive advantages possessed by the biggest internet companies.

It’s driving a virtuous cycle — with the best technology driving better recommendations, recommendations which draw more customers and, ultimately, let these companies afford even better technology.

That’s the business model. So how does this technology work?

Collecting Information

Recommenders work by collecting information, by noting what you ask for, such as what movies you tell your video streaming app you want to see, ratings and reviews you’ve submitted, purchases you’ve made, and other actions you’ve taken in the past.

Perhaps more importantly, they can keep track of choices you’ve made: what you click on and how you navigate. How long you watch a particular movie, for example. Or which ads you click on or which friends you interact with.

All this information is streamed into vast data centers and compiled into complex, multidimensional tables that quickly balloon in size.

They can be hundreds of terabytes large — and they’re growing all the time.

That’s not so much because vast amounts of data are collected from any one individual, but because a little bit of data is collected from so many.

In other words, these tables are sparse — most of the information most of these services have on most of us for most of these categories is zero.

But, collectively these tables contain a great deal of information on the preferences of a large number of people.

And that helps companies make intelligent decisions about what certain types of users might like.
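
To make "sparse" concrete, here is a minimal sketch of a user-item table that stores only the interactions that actually happened. The users, items and ratings are invented for illustration; production systems use specialized sparse formats rather than Python dictionaries.

```python
# Only observed ratings are stored; everything else is implicitly zero.
interactions = {
    "alice": {"movie_a": 5, "movie_c": 3},
    "bob": {"movie_b": 4},
    "carol": {"movie_a": 4, "movie_d": 2},
}
items = ["movie_a", "movie_b", "movie_c", "movie_d", "movie_e"]


def rating(user, item):
    """Look up a rating, returning 0 when nothing was recorded."""
    return interactions.get(user, {}).get(item, 0)


stored = sum(len(row) for row in interactions.values())
total_cells = len(interactions) * len(items)
density = stored / total_cells

print(stored, total_cells, round(density, 2))  # 5 15 0.33
```

Even in this toy table two-thirds of the cells are empty. With millions of users and items, the share of empty cells grows far larger.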

Content Filtering, Collaborative Filtering

While there are a vast number of recommender algorithms and techniques, most fall into one of two broad categories: collaborative filtering and content filtering.

Collaborative filtering helps you find what you like by looking for users who are similar to you.

So while the recommender system may not know anything about your taste in music, if it knows you and another user share similar taste in books, it might recommend a song to you that it knows this other user already likes.
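
The books-and-songs example can be sketched as user-based collaborative filtering. This is one simple variant chosen for illustration (cosine similarity over shared items, then borrowing from the single nearest user), not a description of any production system. All names and ratings are hypothetical.

```python
from math import sqrt

# Hypothetical ratings: "you" have rated books but no songs.
ratings = {
    "you": {"book_1": 5, "book_2": 4},
    "user_b": {"book_1": 5, "book_2": 5, "song_x": 5},
    "user_c": {"book_1": 1, "book_2": 2, "song_y": 5},
}


def cosine(a, b):
    """Cosine similarity computed over the items both users rated."""
    shared = set(a) & set(b)
    if not shared:
        return 0.0
    dot = sum(a[i] * b[i] for i in shared)
    norm_a = sqrt(sum(a[i] ** 2 for i in shared))
    norm_b = sqrt(sum(b[i] ** 2 for i in shared))
    return dot / (norm_a * norm_b)


def recommend(target, all_ratings):
    """Suggest items the most similar user rated that the target has not."""
    others = [u for u in all_ratings if u != target]
    nearest = max(others, key=lambda u: cosine(all_ratings[target], all_ratings[u]))
    return sorted(i for i in all_ratings[nearest] if i not in all_ratings[target])


print(recommend("you", ratings))  # ['song_x']
```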

Content filtering, by contrast, works by understanding the underlying features of each product.

So if a recommender sees you liked the movies “You’ve Got Mail” and “Sleepless in Seattle,” it might recommend another movie to you starring Tom Hanks and Meg Ryan, such as “Joe Versus the Volcano.”
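
A minimal sketch of that idea scores each unseen title by how much its features overlap with the features of titles you liked. The feature tags below are illustrative guesses, the fourth title is made up, and Jaccard overlap is just one convenient similarity measure.

```python
# Illustrative feature tags; "Hypothetical Space Film" is invented.
catalog = {
    "You've Got Mail": {"Tom Hanks", "Meg Ryan", "romance", "comedy"},
    "Sleepless in Seattle": {"Tom Hanks", "Meg Ryan", "romance"},
    "Joe Versus the Volcano": {"Tom Hanks", "Meg Ryan", "comedy"},
    "Hypothetical Space Film": {"science fiction", "action"},
}
liked = ["You've Got Mail", "Sleepless in Seattle"]


def jaccard(a, b):
    """Share of features two sets have in common."""
    return len(a & b) / len(a | b)


# Build a taste profile from everything the user liked.
profile = set().union(*(catalog[title] for title in liked))

# Score every title the user has not seen yet.
scores = {
    title: jaccard(profile, features)
    for title, features in catalog.items()
    if title not in liked
}
best = max(scores, key=scores.get)
print(best, scores[best])  # Joe Versus the Volcano 0.75
```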

Those are extremely simplistic examples, to be sure.

Data as a Competitive Advantage

In reality, because these systems capture so much data, from so many people, and are deployed at such an enormous scale, they’re able to drive tens or hundreds of millions of dollars of business with even a small improvement in the system’s recommendations.

A business may not know what any one individual will do, but thanks to the law of large numbers, they know that, say, if an offer is presented to 1 million people, 1 percent will take it.
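
A small simulation shows why scale makes the outcome predictable. It assumes every person accepts independently with the same 1 percent probability, which is a simplification, and it fixes the random seed so the run is repeatable.

```python
import random


def simulate_take_rate(n_people, p_accept, seed=0):
    """Fraction of n_people who accept, each independently with p_accept."""
    rng = random.Random(seed)
    accepted = sum(1 for _ in range(n_people) if rng.random() < p_accept)
    return accepted / n_people


small = simulate_take_rate(100, 0.01)
large = simulate_take_rate(1_000_000, 0.01)
print(small, large)
```

With 100 people the observed rate can easily land at 0, 2 or 3 percent. With a million it stays very close to 1 percent.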

But while the potential benefits from better recommendation systems are big, so are the challenges.

Successful internet companies, for example, need to process ever more queries, faster, spending vast sums on infrastructure to keep up as the amount of data they process continues to swell.

Companies outside of technology, by contrast, need access to ready-made tools so they don’t have to hire whole teams of data scientists.

If recommenders are going to be used in industries ranging from healthcare to financial services, they’ll need to become more accessible.

GPU Acceleration

This is where GPUs come in.

NVIDIA GPUs, of course, have long been used to accelerate training times for neural networks — sparking the modern AI boom — since their parallel processing capabilities let them blast through data-intensive tasks.

But now, as the amount of data being moved continues to grow, GPUs are being harnessed more extensively. Tools such as RAPIDS, a suite of software libraries, accelerate data science and analytics pipelines so data scientists can get more work done much faster.

And NVIDIA’s just-announced Merlin recommender application framework promises to make GPU-accelerated recommender systems more accessible still with an end-to-end pipeline for ingesting, training and deploying GPU-accelerated recommender systems.

These systems will be able to take advantage of the new NVIDIA A100 GPU, built on our NVIDIA Ampere architecture, so companies can build recommender systems more quickly and economically than ever.

Our recommendation? If you’re looking to put recommender systems to work, now might be a good time to get started.

The post What’s a Recommender System? appeared first on The Official NVIDIA Blog.

BERT Does Europe: AI Language Model Learns German, Swedish

BERT is at work in Europe, tackling natural-language processing jobs in multiple industries and languages with help from NVIDIA’s products and partners.

The AI model formally known as Bidirectional Encoder Representations from Transformers debuted just last year as a state-of-the-art approach to machine learning for text. Though new, BERT is already finding use in avionics, finance, semiconductor and telecom companies on the continent, said developers optimizing it for German and Swedish.

“There are so many use cases for BERT because text is one of the most common data types companies have,” said Anders Arpteg, head of research for Peltarion, a Stockholm-based developer that aims to make the latest AI techniques such as BERT inexpensive and easy for companies to adopt.

Natural-language processing will outpace today’s AI work in computer vision because “text has way more apps than images — we started our company on that hypothesis,” said Milos Rusic, chief executive of deepset in Berlin. He called BERT “a revolution, a milestone we bet on.”

Deepset is working with PricewaterhouseCoopers to create a system that uses BERT to help strategists at a chip maker query piles of annual reports and market data for key insights. In another project, a manufacturing company is using NLP to search technical documents to speed maintenance of their products and predict needed repairs.

Peltarion, a member of NVIDIA’s Inception program that nurtures startups with access to its technology and ecosystem, packed support for BERT into its tools in November. It is already using NLP to help a large telecom company automate parts of its process for responding to product and service requests. And it’s using the technology to let a large market research company more easily query its database of surveys.

Work in Localization

Peltarion is collaborating with three other organizations on a three-year, government-backed project to optimize BERT for Swedish. Interestingly, a new model from Facebook called XLM-R suggests training on multiple languages at once could be more effective than optimizing for just one.

“In our initial results, XLM-R, which Facebook trained on 100 languages at once, outperformed a vanilla version of BERT trained for Swedish by a significant amount,” said Arpteg, whose team is preparing a paper on their analysis.

Nevertheless, the group hopes to have before summer a first version of a Swedish BERT model that performs really well, said Arpteg, who headed up an AI research group at Spotify before joining Peltarion three years ago.

An analysis by deepset of its German version of BERT.

In June, deepset released as open source a version of BERT optimized for German. Although its performance is only a couple percentage points ahead of the original model, two winners in an annual NLP competition in Germany used the deepset model.

Right Tool for the Job

BERT also benefits from optimizations for specific tasks such as text classification, question answering and sentiment analysis, said Arpteg. Peltarion researchers plan to publish in 2020 the results of an analysis of gains from tuning BERT for areas with their own vocabularies such as medicine and law.

The question-answering task has become so strategic for deepset it created Haystack, a version of its FARM transfer-learning framework to handle the job.

In hardware, the latest NVIDIA GPUs are among the favorite tools both companies use to tame big NLP models. That’s not surprising given NVIDIA recently broke records lowering BERT training time.

“The vanilla BERT has 100 million parameters and XLM-R has 270 million,” said Arpteg, whose team recently purchased systems using NVIDIA Quadro and TITAN GPUs with up to 48GB of memory. It also has access to NVIDIA DGX-1 servers because “for training language models from scratch, we need these super-fast systems,” he said.

More memory is better, said Rusic, whose German BERT models weigh in at 400MB. Deepset taps into NVIDIA V100 Tensor Core GPUs on cloud services and uses another NVIDIA GPU locally.
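
The parameter counts and the model size quoted above line up under one assumption: each parameter is stored as a 32-bit float. The article does not say how the models are stored, so treat this as a back-of-the-envelope sketch. It counts weights only, while training needs several times more memory for gradients, optimizer state and activations.

```python
BYTES_PER_FP32 = 4  # assumption: weights stored as 32-bit floats


def weights_megabytes(n_params, bytes_per_param=BYTES_PER_FP32):
    """Approximate size of the weights alone, using 1 MB = 10**6 bytes."""
    return n_params * bytes_per_param / 1e6


bert_mb = weights_megabytes(100_000_000)   # "vanilla BERT", as quoted
xlmr_mb = weights_megabytes(270_000_000)   # XLM-R, as quoted

print(bert_mb, xlmr_mb)  # 400.0 1080.0
```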

The post BERT Does Europe: AI Language Model Learns German, Swedish appeared first on The Official NVIDIA Blog.

AWS Outposts Station a GPU Garrison in Your Datacenter

All the goodness of GPU acceleration on Amazon Web Services can now also run inside your own data center.

AWS Outposts powered by NVIDIA T4 Tensor Core GPUs are generally available starting today. They bring cloud-based Amazon EC2 G4 instances inside your data center to meet user requirements for security and latency in a wide variety of AI and graphics applications.

With this new offering, AI is no longer a research project.

Most companies still keep their data inside their own walls because they see it as their core intellectual property. But for deep learning to transition from research into production, enterprises need the flexibility and ease of development the cloud offers — right beside their data. That’s a big part of what AWS Outposts with T4 GPUs now enables.

With this new offering, enterprises can install a fully managed rack-scale appliance next to the large data lakes stored securely in their data centers.

AI Acceleration Across the Enterprise

To train neural networks, every layer of software needs to be optimized, from NVIDIA drivers to container runtimes and application frameworks. AWS services like SageMaker, Elastic MapReduce and many others designed on custom-built Amazon Machine Images require model development to start with training on large datasets. With the introduction of NVIDIA-powered AWS Outposts, those services can now be run securely in enterprise data centers.

The GPUs in Outposts accelerate deep learning as well as high performance computing and other GPU applications. They all can access software in NGC, NVIDIA’s hub for GPU-accelerated software optimization, which is stocked with applications, frameworks, libraries and SDKs that include pre-trained models.

For AI inference, the NVIDIA EGX edge-computing platform also runs on AWS Outposts and works with the AWS Elastic Kubernetes Service. Backed by the power of NVIDIA T4 GPUs, these services are capable of processing orders of magnitude more information than CPUs alone. They can quickly derive insights from vast amounts of data streamed in real time from sensors in an Internet of Things deployment whether it’s in manufacturing, healthcare, financial services, retail or any other industry.

On top of EGX, the NVIDIA Metropolis application framework provides building blocks for vision AI, geared for use in smart cities, retail, logistics and industrial inspection, as well as other AI and IoT use cases, now easily delivered on AWS Outposts.

Alternatively, the NVIDIA Clara application framework is tuned to bring AI to healthcare providers whether it’s for medical imaging, federated learning or AI-assisted data labeling.

The T4 GPU’s Turing architecture uses TensorRT to accelerate the industry’s widest set of AI models. Its Tensor Cores support multi-precision computing that delivers up to 40x more inference performance than CPUs.

Remote Graphics, Locally Hosted

Users of high-end graphics have choices, too. Remote designers, artists and technical professionals who need to access large datasets and models can now get both cloud convenience and GPU performance.

Graphics professionals can benefit from the same NVIDIA Quadro technology that powers most of the world’s professional workstations not only on the public AWS cloud, but on their own internal cloud now with AWS Outposts packing T4 GPUs.

Whether they’re working locally or in the cloud, Quadro users can access the same set of hundreds of graphics-intensive, GPU-accelerated third-party applications.

The Quadro Virtual Workstation AMI, available in AWS Marketplace, includes the same Quadro driver found on physical workstations. It supports hundreds of Quadro-certified applications such as Dassault Systèmes SOLIDWORKS and CATIA; Siemens NX; Autodesk AutoCAD and Maya; ESRI ArcGIS Pro; and ANSYS Fluent, Mechanical and Discovery Live.

Learn more about AWS and NVIDIA offerings and check out our booth 1237 and session talks at AWS re:Invent.

The post AWS Outposts Station a GPU Garrison in Your Datacenter appeared first on The Official NVIDIA Blog.

NVIDIA Clara Federated Learning to Deliver AI to Hospitals While Protecting Patient Data

With over 100 exhibitors at the annual Radiological Society of North America conference using NVIDIA technology to bring AI to radiology, 2019 looks to be a tipping point for AI in healthcare.

Despite AI’s great potential, a key challenge remains: gaining access to the huge volumes of data required to train AI models while protecting patient privacy. Partnering with the industry, we’ve created a solution.

Today at RSNA, we’re introducing NVIDIA Clara Federated Learning, which takes advantage of a distributed, collaborative learning technique that keeps patient data where it belongs — inside the walls of a healthcare provider.

Clara Federated Learning (Clara FL) runs on our recently announced NVIDIA EGX intelligent edge computing platform.

Federated Learning — AI with Privacy

Clara FL is a reference application for distributed, collaborative AI model training that preserves patient privacy. Running on NVIDIA NGC-Ready for Edge servers from global system manufacturers, these distributed client systems can perform deep learning training locally and collaborate to train a more accurate global model.

Here’s how it works: The Clara FL application is packaged into a Helm chart to simplify deployment on Kubernetes infrastructure. The NVIDIA EGX platform securely provisions the federated server and the collaborating clients, delivering everything required to begin a federated learning project, including application containers and the initial AI model.

NVIDIA Clara Federated Learning uses distributed training across multiple hospitals to develop robust AI models without sharing patient data.

Participating hospitals label their own patient data using the NVIDIA Clara AI-Assisted Annotation SDK integrated into medical viewers like 3D Slicer, MITK, Fovia and Philips IntelliSpace Discovery. Using pre-trained models and transfer learning techniques, NVIDIA AI assists radiologists in labeling, reducing the time for complex 3D studies from hours to minutes.

NVIDIA EGX servers at participating hospitals train the global model on their local data. The local training results are shared back to the federated learning server over a secure link. This approach preserves privacy by only sharing partial model weights and no patient records in order to build a new global model through federated averaging.

The process repeats until the AI model reaches its desired accuracy. This distributed approach delivers exceptional performance in deep learning while keeping patient data secure and private.
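
To make the averaging step concrete, here is a minimal sketch of one round of federated averaging in plain Python. It is not the Clara FL implementation. The function name, the dictionary layout of the weights and the choice to weight each hospital by its number of local samples are all assumptions made for illustration, and for simplicity each client sends its full set of weights rather than a partial one.

```python
from typing import Dict, List, Tuple

# Hypothetical layout: a model is a dict of layer name -> flat list of weights.
Weights = Dict[str, List[float]]


def federated_average(updates: List[Tuple[Weights, int]]) -> Weights:
    """Combine client weights into a new global model.

    Each update is (local_weights, local_sample_count). Clients with more
    local samples contribute proportionally more to the average. Only
    weights and counts are exchanged, never the training records.
    """
    if not updates:
        raise ValueError("need at least one client update")

    total_samples = sum(count for _, count in updates)
    first_weights, _ = updates[0]
    averaged = {name: [0.0] * len(values) for name, values in first_weights.items()}

    for weights, count in updates:
        share = count / total_samples
        for name, values in weights.items():
            for i, value in enumerate(values):
                averaged[name][i] += share * value
    return averaged


# Two hypothetical hospitals that trained the same model on local data.
hospital_a = ({"layer1": [0.0, 2.0]}, 100)
hospital_b = ({"layer1": [4.0, 6.0]}, 300)

global_weights = federated_average([hospital_a, hospital_b])
print(global_weights)  # {'layer1': [3.0, 5.0]}
```

In a full training run, the server would send the new global weights back to every client, each client would train again on its own data, and the round would repeat.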

US and UK Lead the Way

Healthcare giants around the world — including the American College of Radiology, MGH and BWH Center for Clinical Data Science, and UCLA Health — are pioneering the technology. They aim to develop personalized AI for their doctors, patients and facilities where medical data, applications and devices are on the rise and patient privacy must be preserved.

ACR is piloting NVIDIA Clara FL in its AI-LAB, a national platform for medical imaging. The AI-LAB will allow the ACR’s 38,000 medical imaging members to securely build, share, adapt and validate AI models. Healthcare providers that want access to the AI-LAB can choose a variety of NVIDIA NGC-Ready for Edge systems, including from Dell, Hewlett Packard Enterprise, Lenovo and Supermicro.

UCLA Radiology is also using NVIDIA Clara FL to bring the power of AI to its radiology department. As a top academic medical center, UCLA can validate the effectiveness of Clara FL and extend it in the future across the broader University of California system.

Partners HealthCare in New England also announced a new initiative using NVIDIA Clara FL. Massachusetts General Hospital and Brigham and Women’s Hospital’s Center for Clinical Data Science will spearhead the work, leveraging data assets and clinical expertise of the Partners HealthCare system.

In the U.K., NVIDIA is partnering with King’s College London and Owkin to create a federated learning platform for the National Health Service. The Owkin Connect platform running on NVIDIA Clara enables algorithms to travel from one hospital to another, training on local datasets. It provides each hospital a blockchain-distributed ledger that captures and traces all data used for model training.

The project is initially connecting four of London’s premier teaching hospitals, offering AI services to accelerate work in areas such as cancer, heart failure and neurodegenerative disease, and will expand to at least 12 U.K. hospitals in 2020.

Making Everything Smart in the Hospital 

With the rapid proliferation of sensors, medical centers like Stanford Hospital are working to make every system smart. To make sensors intelligent, devices need a powerful, low-power AI computer.

That’s why we’re announcing NVIDIA Clara AGX, an embedded AI developer kit that can handle image and video processing at high data rates, bringing AI inference and 3D visualization to the point of care.

NVIDIA Clara AGX scales from small, embedded devices to sidecar systems to full-size servers.

Clara AGX is powered by NVIDIA Xavier SoCs, the same processors that control self-driving cars. They consume as little as 10W, making them suitable for embedding inside a medical instrument or running in a small adjacent system.

A perfect showcase of Clara AGX is Hyperfine, the world’s first portable point-of-care MRI system. The revolutionary Hyperfine system will be on display in NVIDIA’s booth at this week’s RSNA event.

Hyperfine’s system is among the first of many medical instruments, surgical suites, patient monitoring devices and smart medical cameras expected to use Clara AGX. We’re witnessing the beginning of an AI-enabled internet of medical things.

Hyperfine’s mobile MRI system uses an NVIDIA GPU and will be on display at NVIDIA’s booth.

The NVIDIA Clara AGX SDK will be available soon through our early access program. It includes reference applications for two popular uses — real-time ultrasound and endoscopy edge computing.


Visit NVIDIA and our many healthcare partners in booth 10939 in the RSNA AI Showcase. We’ll be showing our latest AI-driven medical imaging advancements, including keeping patient data secure with AI at the edge.

Find out from our deep learning experts how to use AI to advance your research and accelerate your clinical workflows. See the full lineup of talks and learn more on our website.



Life Observed: Nobel Winner Sees Biology’s Future with GPUs

Five years ago, when Eric Betzig got the call that he had won a Nobel Prize for inventing a microscope that could see features as small as 20 nanometers, he was already working on a new one.

The new device captures the equivalent of 3D video of living cells — and now it’s using NVIDIA GPUs and software to see the results.

Betzig’s collaborator at the University of California at Berkeley, Srigokul Upadhyayula (aka Gokul), helped refine the so-called Lattice Light Sheet Microscopy (LLSM) system. It generated 600 terabytes of data while exploring part of the visual cortex of a mouse in work published earlier this year in Science magazine. A 1.3TB slice of that effort was on display at NVIDIA’s booth at last week’s SC19 supercomputing show.

Attendees got a glimpse of how tomorrow’s scientists may unravel medical mysteries. Researchers, for example, can use LLSM to watch how protein coverings on nerve axons degrade as diseases such as multiple sclerosis take hold.

Future of Biology: Direct Visualization

“It’s our belief we will never understand complex living systems by breaking them into parts,” Betzig said of methods such as biochemistry and genomics. “Only optical microscopes can look at living systems and gather information we need to truly understand the dynamics of life, the mobility of cells and tissues, how cancer cells migrate. These are things we can now directly observe.

“The future of biology is direct visualization of living things rather than piecing together information gleaned by very indirect means,” he added.

It Takes a Cluster — and More

Such work comes with heavy computing demands. Generating the 600TB dataset for the Science paper “monopolized our institution’s computing cluster for days and weeks,” said Betzig.

“These microscopes produce beautifully rich data we often cannot visualize because the vast majority of it sits in hard drives, completely useless,” he said. “With NVIDIA, we are finding ways to start looking at it.”

The SC19 demo — a multi-channel visualization of a preserved slice of mouse cortex — ran remotely on six NVIDIA DGX-1 servers, each packing eight NVIDIA V100 Tensor Core GPUs. The systems are part of an NVIDIA SATURNV cluster located near its headquarters in Santa Clara, Calif.

Berkeley researchers gave SC19 attendees a look inside the visual cortex of a mouse — visualized using NVIDIA IndeX.

The key ingredient for the demo and future visualizations is NVIDIA IndeX software, an SDK that allows scientists and researchers to see and interact in real time with massive 3D datasets.

Version 2.1 of IndeX debuted at SC19, sporting a host of new features, including GPUDirect Storage, as well as support for Arm and IBM POWER9 processors.

After seeing their first demos of what IndeX can do, the research team installed it on a cluster at UC Berkeley that uses a dozen NVIDIA TITAN RTX and four V100 Tensor Core GPUs. “We could see this had incredible potential,” Gokul said.

Closing a Big-Data Gap

The horizon holds plenty of mountains to climb. The Lattice scope generates as much as 3TB of data an hour, so visualizations are still often done on data that must be laboriously pre-processed and saved offline.

“In a perfect world, we’d have all the information for analysis as we get the data from the scope, not a month or six months later,” said Gokul. The time between collecting and visualizing data can stretch from weeks to months, but “we need to tune parameters to react to data as we’re collecting it” to make the scope truly useful for biologists, he added.

NVIDIA IndeX software, running on its increasingly powerful GPUs, helps narrow that gap.

In the future, the team aims to apply the latest deep learning techniques, but this too presents heady challenges. “There are no robust AI models to deploy for this work today,” Gokul said.

Making the data available to AI specialists who could craft AI models would require shipping crates of hard drives on an airplane, a slow and expensive proposition. That’s because the most recent work produced over half a petabyte of data, but cloud services often limit uploads and downloads to a terabyte or so per day.
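
A quick back-of-the-envelope calculation shows why shipping drives can beat uploading. The sketch below uses the figures quoted in this story: a 600TB dataset, a scope that generates as much as 3TB an hour, and a cloud transfer limit of roughly 1TB a day. The exact limit varies by provider, so that value is an assumption.

```python
def transfer_days(dataset_tb: float, limit_tb_per_day: float) -> float:
    """Days needed to move a dataset through a fixed daily transfer cap."""
    return dataset_tb / limit_tb_per_day


DATASET_TB = 600.0             # size of the dataset behind the Science paper
CLOUD_LIMIT_TB_PER_DAY = 1.0   # assumed cap, "a terabyte or so per day"
SCOPE_RATE_TB_PER_HOUR = 3.0   # peak output of the Lattice scope

upload_days = transfer_days(DATASET_TB, CLOUD_LIMIT_TB_PER_DAY)
capture_hours = DATASET_TB / SCOPE_RATE_TB_PER_HOUR

print(f"Upload at the cap: {upload_days:.0f} days")
print(f"Capture at peak rate: {capture_hours:.0f} hours")
```

Under these assumptions, data the scope could produce in about 200 hours of peak operation would take around 600 days to upload.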

Betzig and Gokul are talking with researchers at cloud giants about new options, and they’re exploring new ways to leverage the power of GPUs because the potential of their work is so great.

Coping with Ups and Downs

“Humans are visual animals,” said Betzig. “When most people I know think about a hypothesis, they create mental visual models.

“The beautiful thing about microscopy is you can take a model in your head with all its biases and immediately compare it to the reality of living biological images. This capability already has and will continue to reveal surprises,” he said.

The work brings big ups and downs. Winning a Nobel Prize “was a shock,” Betzig said. “It kind of felt like getting hit by a bus. You feel like your life is settled and then something happens to change you in ways you wouldn’t expect — it has good and bad sides to it.”

Likewise, “in the last several years working with Gokul, every microscope had its limits that led us to the next one. You take five or six steps up to a plateau of success and then there is a disappointment,” he said.

In the partnership with NVIDIA, “we get to learn what we may have missed,” he added. “It’s a chance for us to reassess things, to understand the GPU from folks who designed the architecture, to see how we can merge our problem sets with new solutions,” he said.

Note: The picture at top shows Berkeley researchers Eric Betzig, Ruixian Gao and Srigokul Upadhyayula with the Lattice Light Sheet microscope.


Smart into Art: NVIDIA SC19 Booth Turns Computer Scientists into Art at News-Filled Show

Back in the day, the annual SC supercomputing conference was filled with tabletops hung with research posters. Three decades on, the show’s Denver edition this week was a sea of sharp-angled booths, crowned with three-dimensional signage, promoting logos in a multitude of blues and reds.

But nowhere on the SC19 show floor drew more of the show’s 14,000 attendees than NVIDIA’s booth, built around a broad, floor-to-ceiling triangle with 2,500 square feet of ultra-high def LED screens. With a packed lecture hall on one side and HPC simulations playing on a second, it was the third wall that drew the most buzz.

Cycling through was a collection of AI-enhanced photos of several hundred GPU developers — grad students, CUDA pioneers, supercomputing rockstars — together with descriptions of their work.

Like accelerated computing’s answer to baseball cards, they were rendered into art using AI style transfer technology inspired by various painters — from the classicism of Vermeer to van Gogh’s impressionism to Paul Klee’s abstractions.

Meanwhile, NVIDIA sprinted through the show, kicking things off with a news-filled keynote by founder and CEO Jensen Huang, helping to power research behind the two finalists nominated for the Gordon Bell prize, and joining in to celebrate its partner Mellanox.

And in its booth, 200 engineers took advantage of free AI training through the Deep Learning Institute, and leading researchers gave dozens of tech talks to audiences packed in shoulder to shoulder.

Wall in the Family 

Piecing together the Developer Wall project took a dozen NVIDIANs scrambling for weeks in their spare time. The team of designers, technologists and marketers created an app where developers could enter some background, which would be paired with their photo once it’s run through style filters at, a German startup that’s part of NVIDIA’s Inception startup incubator.

“What we’re trying to do is showcase and celebrate the luminaries in our field. The amazing work they’ve done is the reason this show exists,” said Doug MacMillian, a developer evangelist who helped run the big wall initiative.

Behind him flashed an image of Jensen Huang, rendered as if painted by Cezanne. Alongside him was John Stone, the legendary HPC researcher at the University of Illinois, as if painted by Vincent van Gogh. Close by was Erik Lindahl, who heads the international GROMACS molecular simulation project, right out of a Joan Miró painting. Paresh Kharya, a data center specialist at NVIDIA, looked like an abstracted sepia-tone circuit board.

Enabling the Best and Brightest 

That theme — how NVIDIA’s working to accelerate the work of people in an ever growing array of industries — continued behind the scenes.

In a final rehearsal hours before Huang’s keynote, Ashley Korzun — a Ph.D. engineer who’s spent years working on the manned mission to Mars set for the 2030s — saw for the first time a demo visualizing her life’s work at the space agency.

As she stood on stage, she witnessed an event she’s spent years simulating purely with data – the fiery path that the Mars lander, a capsule the size of a two-story condo, will take as it slows in seven dramatic minutes from 12,000 miles an hour to gently stick its landing on the Red Planet.

“This is amazing,” she quietly said through tears. “I never thought I’d be able to visualize this.”

Flurry of News

Huang later took the stage and, in a sweeping two-hour keynote, set out a range of announcements that show how NVIDIA’s helping others do their life’s work, including:

Award-Winning Work

SC19 plays host to a series of awards throughout the show, and NVIDIA featured in a number of them.

Both finalists for the Gordon Bell Prize for outstanding achievement in high performance computing — the ultimate winner, ETH Zurich, as well as University of Michigan — ran their work on Oak Ridge National Laboratory’s Summit supercomputer, powered by nearly 28,000 V100 GPUs.

NVIDIA’s founding chief scientist, David Kirk, received this year’s Seymour Cray Computer Engineering Award, for innovative contributions to HPC systems. He was recognized for his path-breaking work around development of the GPU.

And NVIDIA’s Vasily Volkov co-authored with UC Berkeley’s James Demmel a seminal paper 11 years ago that was recognized with the Test of Time Award for a work of lasting impact. The paper, which has resulted in a new way of thinking about and modeling algorithms on GPUs, has had nearly 1,000 citations.

Looking Further Ahead

If the SC show is about powering the future, no corner of the show was more forward looking than the annual Supercomputing Conference Student Cluster Competition.

This year, China’s Tsinghua University captured the top crown. It beat out 15 other undergrad teams using NVIDIA V100 Tensor Core GPUs in an immersive HPC challenge demonstrating the breadth of skills, technologies and science that it takes to build, maintain and use supercomputers. Tsinghua also won the IO500 competition, while two other prizes were won by Singapore’s Nanyang Technological University.

The teams came from xx different markets, including Germany, Latvia, Poland and Taiwan, in addition to China and Singapore.

Up Next: More Performance for the World’s Data Centers

NVIDIA’s frenetic week at SC19 ended with a look at what’s next, with Jensen joining Mellanox CEO Eyal Waldman on stage at an evening event hosted by the networking company, which NVIDIA agreed to acquire earlier this year.

Jensen and Eyal discussed how their partnership will enable the future of computing, with Jensen detailing the synergies between the companies. “Mellanox has an incredible vision,” Huang said. “In a couple years we’re going to bring more compute performance to data centers than all of the compute since the beginning of time.”


Expanding Universe for HPC, NVIDIA CEO Brings GPU Acceleration to Arm

Broadening support for GPU-accelerated supercomputing to a fast-growing new platform, NVIDIA founder and CEO Jensen Huang Monday introduced a reference design for building GPU-accelerated Arm servers, with wide industry backing.

Huang — speaking Monday at the SC19 supercomputing show in Denver — also announced that Microsoft has built NDv2, a “supersized instance” that’s the world’s largest GPU-accelerated cloud-based supercomputer — a supercomputer in the cloud — on its Azure cloud-computing platform.

He additionally unveiled NVIDIA Magnum IO, a suite of GPU-accelerated I/O and storage software to eliminate data transfer bottlenecks for AI, data science and HPC workloads.

In a two-hour talk, Huang wove together these announcements with an update on developments from around the industry, setting out a sweeping vision of how high performance computing is expanding out in all directions.

HPC Universe Expanding in All Directions

“The HPC universe is expanding in every single direction at the same time,” Huang told a standing-room only crowd of some 1,400 researchers and technologists at the start of the world’s biggest supercomputing event. “HPC is literally everywhere today. It’s in supercomputing centers, in the cloud, at the edge.”

Driving that expansion are factors such as streaming HPC from massive sensor arrays; using edge computing to do more sophisticated filtering; running HPC in the cloud; and using AI to accelerate HPC.

“All of these are undergoing tremendous change,” Huang said.

Putting an exclamation mark on his talk, Huang debuted the world’s largest interactive volume visualization: An effort with NASA to simulate a Mars landing in which a craft the size of a two-story condominium traveling at 12,000 miles an hour screeches safely to a halt in just seven minutes. And it sticks the landing.

Huang said the simulation enables 150 terabytes of data, equivalent to 125,000 DVDs, to be flown through at random access. “To do that, we’ll have a supercomputing analytics instrument that sits next to a supercomputer.”

Expanding the Universe for HPC

Kicking off his talk, Huang detailed how accelerated computing powers the work of today’s computational scientists, whom he calls the da Vincis of our time.

The first AI supercomputers already power scientific research into phenomena as diverse as fusion energy and gravitational waves, Huang explained.

Accelerated computing, meanwhile, powers exascale systems tackling some of the world’s most challenging problems.

They include efforts to identify extreme weather patterns at Lawrence Berkeley National Lab … Research into the genomics of opioid addiction at Oak Ridge National Laboratory … Nuclear waste remediation efforts led by LBNL, the Pacific Northwest National Lab and Brown University at the Hanford site … And cancer-detection research led by Oak Ridge National Laboratory and the State University of New York at Stony Brook.

At the same time, AI is being put to work across an ever-broader array of industries. Earlier this month, the U.S. Post Office, the world’s largest delivery service — which processes nearly 500 million pieces of mail a day — announced it’s adopting end-to-end AI technology from NVIDIA.

“It’s the perfect application for a streaming AI computer,” Huang said.

And last month, in partnership with Ericsson, Microsoft, Red Hat and others, Huang revealed that NVIDIA is powering AI at the edge of enterprise and 5G telco networks with the NVIDIA EGX Edge Supercomputing platform.

Next up for HPC: harnessing vast numbers of software-defined sensors to relay data to programmable edge computers, which in turn pass on the most interesting data to supercomputers able to wring insights out of oceans of real-time data.

Arm in Arm: GPU-Acceleration Speeds Emerging HPC Architecture

Monday’s news marks a milestone for the Arm community. The processor architecture — ubiquitous in smartphones and IoT devices — has long been the world’s most popular. Arm has more than 100 billion computing devices and will cross the trillion mark in the coming years, Huang predicted.

NVIDIA’s moving fast to bring HPC tools of all kinds to this thriving ecosystem.

“We’ve been working with the industry, all of you, and the industry has really been fantastic, everybody is jumping on,” Huang said, adding that 30 applications are already up and running. “This is going to be a great ecosystem — basically everything that runs in HPC should run on any CPU as well.”

World-leading supercomputing centers have already begun testing GPU-accelerated Arm-based computing systems, Huang said. This includes Oak Ridge and Sandia National Laboratories, in the United States; the University of Bristol, in the United Kingdom; and Riken, in Japan.

NVIDIA’s reference design for GPU-accelerated Arm servers — comprising both hardware and software building blocks — has already won support from key players in HPC and Arm ecosystems, Huang said.

In the Arm ecosystem, NVIDIA is teaming with Arm, Ampere, Fujitsu and Marvell. NVIDIA is also working with Cray, a Hewlett Packard Enterprise company, and HPE. A wide range of HPC software companies are already using NVIDIA CUDA-X libraries to bring their GPU-enabled management and monitoring tools to the Arm ecosystem.

The reference platform’s debut follows NVIDIA’s announcement earlier this year that it will bring its CUDA-X software platform to Arm. Fulfilling this promise, NVIDIA is previewing its Arm-compatible software developer kit — available for download now — consisting of NVIDIA CUDA-X libraries and development tools for accelerated computing.

Microsoft Brings GPU-Powered Supercomputer to Azure

“This puts a supercomputer in the hands of every scientist in the world,” Huang said as he announced NDv2, a GPU-powered supercomputer now available on Microsoft Azure.

Giving HPC researchers and others instant access to unprecedented amounts of GPU computing power, Huang announced NDv2, a GPU-powered supercomputer now available on Microsoft Azure that ranks among the world’s fastest.

“Now you can open up an instance, you grab one of the stacks … in the container, you launch it, on Azure, and you’re doing science,” Huang said. “It’s really quite fantastic.”

Built to handle the most demanding AI and HPC applications, the Azure NDv2 instance can scale up to 800 NVIDIA V100 Tensor Core GPUs interconnected with Mellanox InfiniBand.

For the first time, researchers and others can rent an entire AI supercomputer on demand, matching the capabilities of large-scale, on-premise supercomputers that can take months to deploy.

AI researchers needing fast solutions can quickly spin up multiple Azure NDv2 instances and train complex conversational AI models in just hours, Huang explained.

For example, Microsoft and NVIDIA engineers used 64 NDv2 instances on a pre-release version of the cluster to train BERT, a popular conversational AI model, in roughly three hours.

Magnum IO Software

Helping AI researchers and data scientists move data in minutes, rather than hours, Huang introduced the NVIDIA Magnum IO software suite.

A standing-room only crowd of some 1,400 researchers and technologists came to hear NVIDIA’s keynote at the start of SC19, the world’s top supercomputing event.

Delivering up to 20x faster data processing for multi-server, multi-GPU computing nodes, Magnum IO eliminates a key bottleneck faced by those carrying out complex financial analysis, climate modeling and other high-performance workloads.

“This is an area that is going to be rich with innovation, and we are going to be putting a lot of energy into helping you move information in and out of the system,” Huang said.

A key feature of Magnum IO is NVIDIA GPUDirect Storage, which provides a direct data path between GPU memory and storage, enabling data to bypass CPUs and travel unencumbered on “open highways” offered by GPUs, storage and networking devices.

NVIDIA developed Magnum IO in close collaboration with industry leaders in networking and storage, including DataDirect Networks, Excelero, IBM, Mellanox and WekaIO.


DC Startup Casts an AI Net to Stop Phishing and Malware

When the price went way up on a key service a small Washington, D.C., firm was using to protect its customers’ internet connectivity, the company balked.

After not finding a suitable alternative, the company decided to build its own. The result was a whole new business, called DNSFilter, which is casting a wide net around the market to combat phishing and malware.

Its innovation: It ditched the crowdsourcing model that has served for more than a decade as the bedrock for identifying whether websites are valid or corrupt. It opted, instead, for GPU-powered AI to make web surfing safer by identifying threats and objectionable content much faster than traditional offerings.

“We figured that if we built a whole new DNS from the ground up, built on artificial intelligence and machine learning, we could find threats faster and more effectively,” said Rustin Banks, chief revenue officer and one of four principals at DNSFilter.

Spinning Up Phishing Protection

DNS, or domain name system, is the naming system for computers, phones and services that connect to the internet. DNSFilter’s aim is to protect these assets from malicious websites and attacks.

The company’s algorithm takes seconds to compare websites to a machine learning model generated from 30,000 known phishing sites. To date, its AI prevents over 90 percent of new requests to visit potentially corrupt sites.

It’s this speed that largely separates DNSFilter from the rest of the industry, Banks said. It gets results in near real time, while competitors typically take around 24 hours.

The company’s algorithm has been built and trained in the cloud using NVIDIA P4 GPU clusters.

“NVIDIA GPUs allow us to rapidly train AI, while being able to use cutting-edge frameworks. It’s not a job I would want to do without them,” said Adam Spotton, chief data scientist at DNSFilter.

Inferencing occurs at 48 locations worldwide, hosted by 10 vendors who’ve passed DNSFilter’s rigorous security standards.

Banks said the company’s rivals primarily use a company in the Philippines that has a team of 150 people classifying sites all day. But for DNSFilter, the more corrupt sites it identifies, the faster and more accurate its algorithm becomes. (Disclosure: NVIDIA is one of the company’s biggest customers.)

Moreover, DNSFilter’s solution works at the network level so there’s no plug-in necessary and the solution works with any email client, protecting organizations regardless of where employees are or what device they’re using.

“If the CFO uses his Yahoo mail on his mobile device, it doesn’t matter,” said Banks. “It’s built right into the fabric of the internet request.”
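
As a rough illustration of what filtering at the DNS level means, the sketch below wraps a name lookup with a risk check, so that any application asking for a blocked domain gets a block-page address instead of the real one. It is a toy under stated assumptions, not DNSFilter’s system. The record table, the block-page address, the threshold and the keyword-based `toy_score` function (standing in for a trained model) are all hypothetical.

```python
from typing import Callable, Dict, Optional

BLOCK_PAGE_IP = "192.0.2.1"  # hypothetical address of a block page


def make_filtering_resolver(
    records: Dict[str, str],
    score: Callable[[str], float],
    threshold: float = 0.9,
) -> Callable[[str], Optional[str]]:
    """Build a resolver that checks every lookup against a risk score.

    `records` stands in for an upstream DNS server, and `score` stands in
    for a trained classifier returning a risk between 0.0 and 1.0.
    """

    def resolve(domain: str) -> Optional[str]:
        name = domain.strip().lower().rstrip(".")
        if score(name) >= threshold:
            return BLOCK_PAGE_IP   # risky: answer with the block page
        return records.get(name)   # safe: normal answer, None if unknown

    return resolve


def toy_score(domain: str) -> float:
    """Placeholder for a real model: flags a few suspicious substrings."""
    suspicious = ("login-", "verify-", "secure-")
    return 0.95 if any(token in domain for token in suspicious) else 0.05


upstream = {
    "example.com": "192.0.2.10",
    "secure-login.example": "192.0.2.66",
}
resolve = make_filtering_resolver(upstream, toy_score)

print(resolve("example.com"))           # 192.0.2.10
print(resolve("secure-login.example"))  # 192.0.2.1
```

Because the check sits in the lookup itself, the email client or browser that made the request needs no plug-in.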

Upping the Ante

Banks estimates that DNS filtering represents a billion-dollar market, and he’s confident that the $10 billion firewall market is in play for DNSFilter.

Already, the startup is fielding more than a billion DNS requests a day. Banks foresees that number rising to 10 billion by the end of 2020. He also expects accuracy will come to exceed 99 percent as the dataset of corrupt sites grows.

The company isn’t stopping there. More services are planned, including a logo-analysis product currently in beta. It scans logos on sites linked from phishing emails and compares them against a database of approved sites to determine whether the logo is real. It then blocks phishing sites in real time.

Eventually, Banks said, the company intends to evolve from its current machine learning feedback loop to a neural network with sufficient cognition to identify things that its algorithms can’t find.

This, he said, would be like having an extra pair of eyes inside an organization’s security team, constantly monitoring suspicious web surfing wherever employees may be working.

“This is taking phishing protection to a new level,” said Banks. “It’s like network-level protection that comes with you wherever you go.”
