NVIDIA Sets Six Records in AI Performance

NVIDIA has set six AI performance records with today’s release of the industry’s first broad set of AI benchmarks.

Backed by Google, Intel, Baidu, NVIDIA and dozens more technology leaders, the new MLPerf benchmark suite measures a wide range of deep learning workloads. Aiming to serve as the industry’s first objective AI benchmark suite, it covers such areas as computer vision, language translation, personalized recommendations and reinforcement learning tasks.

NVIDIA achieved the best performance in all six MLPerf benchmark categories it entered. These cover a variety of workloads and infrastructure scales, ranging from 16 GPUs on a single node to 640 GPUs across 80 nodes.

The six categories are image classification, object instance segmentation, object detection, non-recurrent translation, recurrent translation and recommendation systems. NVIDIA did not submit results for the seventh category, reinforcement learning, which does not yet take advantage of GPU acceleration.

A key benchmark on which NVIDIA technology performed particularly well was language translation, training the Transformer neural network in just 6.2 minutes. More details on all six submissions are available on the NVIDIA Developer news center.

NVIDIA engineers achieved their results on NVIDIA DGX systems, including NVIDIA DGX-2, the world’s most powerful AI system, featuring 16 fully connected V100 Tensor Core GPUs.

NVIDIA is the only company to have entered as many as six benchmarks, demonstrating the versatility of V100 Tensor Core GPUs for the wide variety of AI workloads deployed today.

“The new MLPerf benchmarks demonstrate the unmatched performance and versatility of NVIDIA’s Tensor Core GPUs,” said Ian Buck, vice president and general manager of Accelerated Computing at NVIDIA. “Exceptionally affordable and available in every geography from every cloud service provider and every computer maker, our Tensor Core GPUs are helping developers around the world advance AI at every stage of development.”

State-of-the-Art AI Computing Requires Full Stack Innovation

Performance on complex and diverse computing workloads takes more than great chips. Accelerated computing is about more than an accelerator. It takes the full stack.

NVIDIA’s stack includes NVIDIA Tensor Cores, NVLink, NVSwitch, DGX systems, CUDA, cuDNN, NCCL, optimized deep learning framework containers and NVIDIA software development kits.

NVIDIA’s AI platform is also the most accessible and affordable. Tensor Core GPUs are available on every cloud, from every computer maker and in every geography.

The same power of Tensor Core GPUs is also available on the desktop: NVIDIA TITAN RTX, the most powerful desktop GPU, costs just $2,500. Amortized over three years of continuous use, that works out to roughly ten cents per hour.
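That amortization arithmetic can be checked directly from the quoted price, assuming round-the-clock use over three years:

```python
# Amortize the $2,500 TITAN RTX list price over three years of 24/7 use.
price_usd = 2_500
hours = 3 * 365 * 24            # 26,280 hours in three years

cost_per_hour = price_usd / hours
print(f"${cost_per_hour:.3f} per hour")   # about $0.10 per hour
```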

And the software acceleration stacks are always updated on the NVIDIA GPU Cloud (NGC) cloud registry.

NVIDIA’s Record-Setting Platform Available Now on NGC

The software innovations and optimizations used to achieve NVIDIA’s industry-leading MLPerf performance are available free of charge in our latest NGC deep learning containers. Download them from the NGC container registry.

The containers include the complete software stack and the top AI frameworks, optimized by NVIDIA. Our 18.11 release of the NGC deep learning containers includes the exact software used to achieve our MLPerf results.

Developers can use them everywhere, at every stage of development:

  • For data scientists on desktops, the containers enable cutting-edge research with NVIDIA TITAN RTX GPUs.
  • For workgroups, the same containers run on NVIDIA DGX Station.
  • For enterprises, the containers accelerate the application of AI to their data in the cloud with NVIDIA GPU-accelerated instances from Alibaba Cloud, AWS, Baidu Cloud, Google Cloud Platform, IBM Cloud, Microsoft Azure, Oracle Cloud Infrastructure and Tencent Cloud.
  • For organizations building on-premises AI infrastructure, NVIDIA DGX systems and NGC-Ready systems from Atos, Cisco, Cray, Dell EMC, HP, HPE, Inspur, Lenovo, Sugon and Supermicro put AI to work.

To get started on your AI project, or to run your own MLPerf benchmark, download containers from the NGC container registry.

The post NVIDIA Sets Six Records in AI Performance appeared first on The Official NVIDIA Blog.

2019 CES

Intel technology is the foundation for the world’s most important innovations and advances.

At CES 2019, we will share our vision for the future of computing and explore advancements in the client, network, cloud and edge — designed to power the next era of computing in areas including PC innovation, artificial intelligence, 5G connectivity and autonomous driving.

Others predict the future. We’re building it.



Intel at 2018 CES


» Download all DAY 1 images (ZIP, 198 MB)
» Download all DAY 2 images (ZIP, 38 MB)
» Download all DAY 3 images (ZIP, 2 MB)

The post 2019 CES appeared first on Intel Newsroom.

Media Alert: Intel to Showcase Technology Innovation for the Next Era of Computing at CES 2019

Gregory Bryant (left) and Navin Shenoy will lead Intel’s news conference at 4 p.m. Jan. 7, 2019, at 2019 CES.

Intel technology is the foundation for many of the world’s most important innovations and advances. At CES 2019, Intel will highlight the future of computing and explore advancements in the client, network, cloud and edge designed to unlock human potential.

Intel News Conference – “Innovations and the Compute Foundation”

Join Intel executives, Client Computing Group Senior Vice President Gregory Bryant and Data Center Group Executive Vice President Navin Shenoy, who will take the stage to showcase news related to innovations in client computing, data center, artificial intelligence, 5G and more. During the event, Intel will touch on how expanding technology capabilities have a direct impact on human experiences. Please note, there is limited seating. Doors open to press and analysts at 3:30 p.m. PST.

Where: Mandalay Bay South Convention Center
Level 2, Ballrooms E & F
When: Jan. 7, 4-4:45 p.m. PST
Livestream:    Watch on the Intel Newsroom

Intel Press Breakfast & Booth Preview

Mix and mingle with Intel executives and take part in a guided tour of Intel’s booth to see the latest Intel technology in action – before the CES show floor opens to attendees. The CES opening keynote will be livestreamed at 8:30 a.m., and a light continental breakfast will be served. Please note, only credentialed press will be allowed access, and registration is requested at the link below.

Where:    Las Vegas Convention Center
Intel Booth in Central Hall South (Booth #10048)
When: Jan. 8, 7:30-9:30 a.m. PST


» Register for the Press Breakfast

Mobileye Press & Customer Conference – “An Hour with Amnon”

Join Mobileye CEO Amnon Shashua as he delivers a “state of the state” on automated driving technologies, along with a look at how these technologies are being delivered globally. Shashua will touch on Mobileye’s unique perspective on vision and mapping technologies, along with the company’s proposed model for industry-wide safety standards.

Where:    Las Vegas Convention Center Room S228
When: Jan. 8, 11:30 a.m.-12:30 p.m. PST


Visit Intel in the Central Hall South (Booth #10048)

Stop by our booth for an up-close look at how Intel’s technology is helping power the smart, connected data-centric world and how it is helping accelerate the expansion of human potential.

Where:    Las Vegas Convention Center
Central Hall South, C2 lobby entrance
When: Jan. 8, 10 a.m.-6 p.m. PST
Jan. 9 and Jan. 10, 9 a.m.-6 p.m. PST
Jan. 11, 9 a.m.-4 p.m. PST


Throughout CES:

  • Recharge in Intel’s Media-Only Lounge: Intel’s media lounge is a great getaway for you to enjoy. Located upstairs in the Intel booth, stop by to relax in our comfortable seating, snack, quench your thirst and use our dedicated internet access.
  • Spotlight Sessions: Throughout the week, Intel will host Spotlight Sessions at the Intel booth. They will focus on specific topics including 5G, autonomous driving, artificial intelligence and more. Final schedule will be uploaded to the CES press kit on the Intel Newsroom prior to the show.
  • Booth Demonstrations: Experience Intel technology and products.

Can’t Make It to CES 2019?

Visit our newsroom at newsroom.intel.com/2019-CES and follow us on social media at @IntelNews, @Intel and www.facebook.com/intel.

Media Contacts

Laurie Smith DeJong
(503) 313-6891

Erica Pereira Kubr
(415) 471-4970

The post Media Alert: Intel to Showcase Technology Innovation for the Next Era of Computing at CES 2019 appeared first on Intel Newsroom.

IBM and NVIDIA Deliver Proven Infrastructure for the AI Enterprise

A deluge of data is fueling AI innovation. It’s a trend that shows no sign of slowing.

As organizations in every industry around the globe attempt to streamline their data pipelines and maximize data science productivity, one challenge looms large: implementing AI initiatives effectively.

This is where IBM SpectrumAI with NVIDIA DGX comes in.

At the core of data science productivity is the infrastructure and software used for building and training machine learning and deep learning workloads.

With IBM and NVIDIA’s new converged infrastructure offering, organizations can take advantage of integrated compute, storage and networking: the latest systems and software to support the complete lifecycle of AI — from data preparation to training to inference.

IBM SpectrumAI with NVIDIA DGX is built on:

  • IBM Spectrum Scale v5: Software-defined to streamline data movement through the AI data pipeline
  • NVIDIA DGX-1 servers: Purpose-built for AI and machine learning
  • NVIDIA DGX software stack: Optimized for maximum GPU training performance
  • Proven data performance: Over 100 GB/s of throughput, supporting up to nine DGX-1 servers in a single rack

IBM SpectrumAI with NVIDIA DGX helps businesses deploy AI infrastructure quickly, efficiently and with top-tier performance — and it’s easier for IT teams to manage.

It’s the latest addition to our NVIDIA DGX-1 Reference Architecture lineup, which includes data center solutions from select storage technology partners. The solutions help enterprises and their data science teams:

  • Focus on innovation, research and transforming the world through their AI initiatives.
  • Minimize the design complexities behind architectures optimized for AI workloads.
  • Effortlessly scale AI workloads with predictable performance that’s also cost effective.

IBM software-defined storage offers performance, flexibility and extensibility for the AI data pipeline. NVIDIA DGX-1 provides the fastest path to machine learning and deep learning. Pairing the two results in an integrated, turnkey AI infrastructure solution with proven productivity, agility and scalability.

Register for our joint webinar on Jan. 29 to learn more.

The post IBM and NVIDIA Deliver Proven Infrastructure for the AI Enterprise appeared first on The Official NVIDIA Blog.

Twice as Nice: NVIDIA Powers Not One, But Two, Gordon Bell Prizes

Award-winning research on opioid addiction and climate change fueled by NVIDIA researchers and tens of thousands of NVIDIA GPUs.

There was a big surprise at the highly anticipated awards ceremony at SC18, the annual supercomputing show in Dallas this week. Usually, one team of researchers receives the coveted Gordon Bell Prize, the equivalent of an “Oscar” in the world of high performance computing.

But this year, that didn’t happen.

In a rare move, the award committee split the prize — honoring one team for its groundbreaking research on opioid addiction and another for breakthrough discoveries in climate change.

Conducting their research on Summit, the world’s fastest supercomputer, both teams had tapped into the power of tens of thousands of NVIDIA GPUs. Housed at the U.S. Department of Energy’s Oak Ridge National Laboratory, Summit features 27,648 NVIDIA V100 Tensor Core GPUs. The world’s most advanced data center GPU, V100s provide unique multi-precision computing capabilities that are ideal for AI and machine learning applications.

The climate change team benefited from another type of NVIDIA power — NVIDIA brain power. Jointly led with the Lawrence Berkeley National Laboratory, half of the 12-person team are NVIDIA deep learning computer scientists.

The team used deep learning algorithms to train a neural network to identify extreme weather patterns from high-resolution climate simulations.

Led by researchers at ORNL, the team studying opioid addiction developed a new “CoMet” algorithm that allows a supercomputer to process vast amounts of genetic data and identify genes that may be more susceptible to pain and opioid addiction — as well as promising treatments.

For more information about ORNL’s opioid epidemic project and Lawrence Berkeley National Laboratory’s climate change project, visit ACM’s website.

The post Twice as Nice: NVIDIA Powers Not One, But Two, Gordon Bell Prizes appeared first on The Official NVIDIA Blog.

That Was Fast: GPUs Now Accelerate Almost 600 HPC Apps

Just over 10 years ago accelerated applications didn’t exist. Today, almost 600 are accelerated by NVIDIA GPUs. The reason: GPU acceleration works. And that’s why it’s been put to work on the hardest computing jobs on earth.

These are apps that get work done in physics, bioscience, molecular dynamics, chemistry and weather forecasting. The world’s 15 most popular HPC applications are all GPU accelerated. In the last year, we’ve added more than 100 applications to our NVIDIA GPU Applications Catalog. More are coming.

A report by Intersect360 Research identified the key applications running in the data center. All of the top 15 apps were GPU accelerated. It’s a murderers’ row of hard-core science apps. They include:

  • GROMACS (Chemistry) – Molecular dynamics application for simulating Newtonian equations of motion for systems with hundreds to millions of particles.
  • ANSYS (Fluid Dynamics Analysis) – Simulates the interaction of liquids and gases with surfaces.
  • Gaussian (Chemistry) – Predicts energies, molecular structures and vibrational frequencies of molecular systems.
  • VASP (Chemistry) – Performs ab initio quantum-mechanical molecular dynamics simulations.
  • NAMD (Chemistry) – High-performance simulation of large biomolecular systems.
  • Simulia Abaqus (Structural Analysis) – Simulation and analysis of structural mechanics.
  • WRF (Weather/Environment Modeling) – Numerical weather prediction system designed for both atmospheric research and operational forecasting applications.
  • OpenFOAM (Fluid Dynamics Analysis) – Open-source solver library for general-purpose computational fluid dynamics.
  • ANSYS (Electromagnetics) – Models 3D full-wave electromagnetic fields in high-frequency and high-speed electronic components.
  • LS-DYNA (Structural Analysis) – Simulation and analysis tool for structural mechanics.
  • BLAST (Bioscience) – One of the most widely used bioinformatics tools.
  • LAMMPS (Chemistry) – A classical molecular dynamics package.
  • Amber (Chemistry) – A molecular dynamics application developed for the simulation of biomolecular systems.
  • Quantum Espresso (Chemistry) – An integrated suite of computer codes for electronic structure calculations and materials modeling at the nanoscale.
  • GAMESS (Chemistry) – Computational chemistry suite used to simulate atomic and molecular electronic structure.

These tools don’t get incremental performance gains. GPU acceleration changes the economics of the data center. Servers with NVIDIA GPUs typically speed up the application performance by 10x or more.

And since application performance does not scale linearly with the number of CPU servers, each GPU-accelerated server replaces even more CPU servers than the raw speed-up alone would imply. So you can meet the growing demand for computing — and save money.
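The sublinear-scaling point can be made concrete with a toy model. Suppose a GPU server delivers a 10x speedup, and that the aggregate throughput of n CPU servers grows only as n^0.8; both numbers are illustrative assumptions, not measured figures:

```python
# Toy model: n CPU servers deliver throughput ~ n**0.8 (sublinear scaling),
# so matching one GPU server's 10x speedup takes more than 10 CPU servers.
gpu_speedup = 10.0
scaling_exponent = 0.8          # illustrative; real efficiency varies by app

# Solve n**0.8 = 10  ->  n = 10**(1 / 0.8)
cpu_servers_needed = gpu_speedup ** (1 / scaling_exponent)
print(round(cpu_servers_needed))   # ~18 CPU servers, not 10
```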

Not bad for 10 years’ worth of work.

Predicting the Weather

Weather prediction looks hard. And it might be even harder than it looks. No surprise, then, that weather prediction is a big piece of HPC. Important, too. Reliable weather forecasts save lives. They also drive economic decisions in aviation, energy and utilities, insurance, retail and other industries.

But weather prediction requires massive computing resources. Two reasons: geometric scale (especially for global weather predictions), and the enormous number of variables that describe the state of the atmosphere.

Today, weather prediction is constrained by the computing power and application performance available, so models are limited to relatively coarse simulations, such as 12-km resolution.

But that leaves out important details, such as the impact of clouds, which play an important role in weather patterns by reflecting solar radiation. Going to 1-km cloud-resolving resolution can improve forecasting. But it requires 1,700x more application performance.
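The 1,700x figure is consistent with a common back-of-envelope scaling argument: refining the horizontal grid from 12 km to 1 km multiplies the number of grid columns by 12², and the time step must shrink by the same 12x to keep the simulation stable (this sketch assumes the vertical grid is unchanged):

```python
# Cost of refining a weather model's horizontal grid from 12 km to 1 km.
refinement = 12                 # ratio of old to new grid spacing

horizontal_cells = refinement ** 2   # 144x more grid columns (2D horizontal)
time_steps = refinement              # 12x more (and shorter) time steps

extra_cost = horizontal_cells * time_steps
print(extra_cost)               # 1728 -- roughly the 1,700x quoted
```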

GPU acceleration can heft weather forecasts over that gap.

Accelerating Aerodynamics Simulations with FUN3D

Aircraft, spacecraft, automobiles. If it goes fast, large-scale aerodynamic simulations can help it go faster — and more efficiently.

The NASA Langley Research Center develops FUN3D computational fluid dynamics software to simulate fluid flow for a broad range of aerodynamics applications. This application consumes more cycles at NASA’s Pleiades supercomputer than any other. And GPU acceleration enables a server with six NVIDIA V100 Tensor Core GPUs to provide 30x higher performance than a dual-socket CPU server while running these simulations.

Takeaway: performance on GPUs scales very well, enabling efficient computation of the largest and most complex simulations. NASA has shown that a thousand GPU servers on the Summit supercomputer can do the work of over a million CPU cores. And for a fraction of the energy cost.
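Those two claims are mutually consistent. Take the quoted 30x speedup over a dual-socket CPU server and assume roughly 40 cores per such server (an assumed core count, not a figure from the source):

```python
# Cross-check the FUN3D claims: 30x per server, and 1,000 GPU servers
# doing the work of over a million CPU cores.
speedup_per_server = 30
cores_per_cpu_server = 40       # assumed dual-socket core count; varies
gpu_servers = 1_000

core_equivalents = gpu_servers * speedup_per_server * cores_per_cpu_server
print(core_equivalents)         # 1,200,000 -> "over a million CPU cores"
```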

Performance That Keeps Growing

We have deep expertise in all accelerated computing domains. Combined with an ecosystem of more than 1 million developers, this results in a platform that’s constantly improving. This provides higher application performance over time on the same GPU-accelerated servers.

For instance, on a basket of 11 HPC applications, a server with four NVIDIA Tesla P100 GPUs now runs 2x faster than it did two years ago. Combine these software-stack improvements with GPU architecture advances and the performance gains are bigger still.

With a single platform, you can now accelerate applications across a variety of HPC domains — scientific computing, industrial simulations, deep learning and machine learning. The harder the job, the bigger the payoff. So go ahead and accomplish wonders — or get your work done fast enough to see your kids — with GPU-accelerated applications.

To see the full list of GPU-accelerated applications, check out the NVIDIA GPU Applications Catalog.

The post That Was Fast: GPUs Now Accelerate Almost 600 HPC Apps appeared first on The Official NVIDIA Blog.

NVIDIA Enables Next Wave of Growth — Accelerated Data Science — for High Performance Computing

NVIDIA is teaming up with the world’s largest tech companies and the U.S.’s top supercomputing labs to accelerate data analytics and machine learning, one of the fastest growing areas of high performance computing.

The new initiative marks a key moment in our work accelerating HPC, a market expected to grow considerably over the next few years. While the world’s data doubles each year, CPU computing has hit a brick wall with the end of Moore’s law.

Together with partners such as Microsoft, Cisco, Dell EMC, Hewlett Packard Enterprise, IBM, Oracle and others, we’ve already sped up data tasks for our customers by as much as 50x. And initial testing by the U.S. Department of Energy’s Oak Ridge National Laboratory is showing a remarkable 215x speedup related to climate prediction research.

A Rapid Evolution

Starting a decade ago, we brought acceleration to scientific computing. Since then, we’ve helped researchers — including multiple Nobel Prize winners — dramatically speed up their compute-intensive simulations, tackling some of the world’s greatest problems.

Then, just over five years ago, we enabled our GPU platform to accelerate deep learning through optimized software, setting in motion the AI revolution.

Now, through new open-source data science acceleration software released last month, a third wave is upon us.

At the center of this new movement is RAPIDS, an open-source data analytics and machine learning acceleration platform for executing end-to-end data science training pipelines completely on GPUs.

RAPIDS relies on NVIDIA CUDA primitives for low-level compute optimization, but exposes that GPU parallelism and high memory bandwidth through user-friendly Python interfaces. The RAPIDS dataframe library mimics the pandas API and is built on Apache Arrow to maximize interoperability and performance.
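Because the RAPIDS dataframe library (cuDF) mirrors the pandas API, an existing pipeline often ports by changing a single import. The sketch below runs on plain pandas; on a RAPIDS system, swapping the first line for `import cudf as pd` is the intended migration path (an assumption of API parity that varies by cuDF release):

```python
import pandas as pd  # on a RAPIDS system: import cudf as pd

# A typical ETL step of the kind RAPIDS accelerates: filter, group, aggregate.
df = pd.DataFrame({
    "user":   ["a", "b", "a", "c", "b", "a"],
    "clicks": [3, 1, 4, 1, 5, 9],
})

top = (
    df[df["clicks"] > 1]           # filter
    .groupby("user")["clicks"]     # group
    .sum()                         # aggregate
    .sort_values(ascending=False)
)
print(top)
```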

More Accelerated Machine Learning in the Cloud

Now we’re partnering with the world’s leading technology companies to bring accelerated machine learning to more users in more places.

Working closely with NVIDIA, Microsoft is introducing accelerated machine learning to its Azure Machine Learning customers.

“Azure Machine Learning is the leading platform for data scientists to build, train, manage and deploy machine learning models from the cloud to the edge,” said Eric Boyd, corporate vice president for Azure AI at Microsoft. “We’ve been partnering with NVIDIA to offer GPU-powered compute for data scientists and are excited to introduce software from the RAPIDS open source project to Azure users. I’m looking forward to seeing what the data science community can do with RAPIDS and Azure Machine Learning.”

More Systems for Accelerated Machine Learning

We’re also collaborating on a range of new products from leading computer makers based on the NVIDIA HGX-2 cloud-server platform for all AI and HPC workloads.

Delivering two petaflops of compute performance in a single node, NVIDIA HGX-2 can run machine learning workloads nearly 550x faster than a CPU-only server.

The first HGX-2 based servers are from Inspur, QCT and Supermicro. All three companies are featuring their new HGX-2 servers in the exhibit hall of SC18, the annual high performance computing show, in Dallas this week.

More Scientific Breakthroughs Using Accelerated Machine Learning

Our nation’s most important labs are engaged in work ranging from fusion research and human genomics to climate prediction — work that relies on scientific computing, deep learning and data science.

NVIDIA DGX-2, designed to handle the most compute-intensive applications, offers them performance breakthroughs in the most demanding areas. Now, paired with RAPIDS open-source machine learning software, DGX-2 is helping scientists at several U.S. Department of Energy laboratories accelerate their research.

Among those witnessing early success with DGX-2 and RAPIDS are researchers at Oak Ridge National Lab.

Massive amounts of observational data are available for building models that enhance energy security applications involving climate simulations. Historically, though, machine learning training on climate datasets has been compute limited and slow. Until now.

Using DGX-2 and RAPIDS, researchers at ORNL are already seeing dramatic improvements in the speed of applying machine learning to massive datasets. Running XGBoost on their DGX-2, ORNL reduced the time to train on a 224GB dataset from 21 hours on a CPU node down to just six minutes — a 215x speed-up.
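The ORNL run used XGBoost's GPU training path. A minimal sketch of the configuration involved, with the actual training call left commented since it needs a GPU and a real dataset (`gpu_hist` was the GPU histogram algorithm name in the XGBoost releases of that era; the other values are illustrative, not ORNL's settings):

```python
# XGBoost configuration for GPU-accelerated gradient boosting (sketch).
params = {
    "tree_method": "gpu_hist",       # GPU histogram algorithm (era naming)
    "max_depth": 8,                  # illustrative value
    "objective": "reg:squarederror", # illustrative value
}

# On a real multi-GPU system such as DGX-2:
#   import xgboost as xgb
#   dtrain = xgb.DMatrix(features, label=labels)
#   model = xgb.train(params, dtrain, num_boost_round=500)
print(params["tree_method"])
```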

All of the RAPIDS open-source libraries for accelerating machine learning and data analytics are available at no charge. To get started, visit the NGC container registry.

The post NVIDIA Enables Next Wave of Growth — Accelerated Data Science — for High Performance Computing appeared first on The Official NVIDIA Blog.

Images: Intel Optane DC Persistent Memory Readies for Widespread Deployment


Photo 1: Intel’s beta program for Intel Optane DC persistent memory allows original equipment manufacturers and cloud service providers to offer their customers early access to the revolutionary memory technology.

Photo 2: Leading original equipment manufacturers and cloud service providers have announced beta services and systems for early customer trials and deployments.

Photo 3: App Direct mode and Memory mode help developers fully harness the value of persistent memory.

More: Intel Optane DC Persistent Memory Readies for Widespread Deployment

The post Images: Intel Optane DC Persistent Memory Readies for Widespread Deployment appeared first on Intel Newsroom.

Intel Unveils the Intel Neural Compute Stick 2 at Intel AI Devcon Beijing for Building Smarter AI Edge Devices


» Download all images (ZIP, 59 MB)

What’s New: Intel is hosting its first artificial intelligence (AI) developer conference in Beijing on Nov. 14 and 15. The company kicked off the event with the introduction of the Intel® Neural Compute Stick 2 (Intel NCS 2), designed for building smarter AI algorithms and for prototyping computer vision at the network edge. Based on the Intel® Movidius™ Myriad™ X vision processing unit (VPU) and supported by the Intel® Distribution of OpenVINO™ toolkit, the Intel NCS 2 affordably speeds the development of deep neural network inference applications while delivering a performance boost over the previous-generation neural compute stick. The Intel NCS 2 enables deep neural network testing, tuning and prototyping, so developers can go from prototype to production, leveraging a range of Intel vision accelerator form factors in real-world applications.

“The first-generation Intel Neural Compute Stick sparked an entire community of AI developers into action with a form factor and price that didn’t exist before. We’re excited to see what the community creates next with the strong enhancement to compute power enabled with the new Intel Neural Compute Stick 2.”
–Naveen Rao, Intel corporate vice president and general manager of the AI Products Group

What It Does: Bringing computer vision and AI to Internet of Things (IoT) and edge device prototypes is easy with the enhanced capabilities of the Intel NCS 2. For developers working on a smart camera, a drone, an industrial robot or the next must-have smart home device, the Intel NCS 2 offers what’s needed to prototype faster and smarter.

What looks like a standard USB thumb drive hides much more inside. The Intel NCS 2 is powered by the latest generation of Intel VPU – the Intel Movidius Myriad X VPU. It is the first VPU to feature a neural compute engine – a dedicated hardware accelerator for neural network inference that delivers additional performance. Combined with the Intel Distribution of OpenVINO toolkit, which supports more networks, the Intel NCS 2 offers developers greater prototyping flexibility. Additionally, thanks to the Intel® AI: In Production ecosystem, developers can now port their Intel NCS 2 prototypes to other form factors and productize their designs.

How It Works: With a laptop and the Intel NCS 2, developers can have their AI and computer vision applications up and running in minutes. The Intel NCS 2 runs on a standard USB 3.0 port and requires no additional hardware, enabling users to seamlessly convert and then deploy PC-trained models to a wide range of devices natively and without internet or cloud connectivity.
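The convert-then-deploy flow described above maps onto two OpenVINO steps: the Model Optimizer converts a PC-trained model to the toolkit's intermediate representation, and a sample application runs it on the stick by selecting the MYRIAD device. A sketch of that flow as commands assembled in Python (the model name and sample script are placeholders, and the flags reflect the 2018-era toolkit):

```python
# Convert-then-deploy flow for the Intel NCS 2 with the OpenVINO toolkit.
model = "frozen_model.pb"        # a PC-trained TensorFlow model (placeholder)

# Step 1: Model Optimizer converts the model to OpenVINO IR, in FP16
# because the Myriad X VPU performs half-precision inference.
convert = f"python mo.py --input_model {model} --data_type FP16"

# Step 2: run inference on the stick by targeting the MYRIAD device;
# no internet or cloud connectivity is required.
infer = "python classification_sample.py -m frozen_model.xml -d MYRIAD"

print(convert)
print(infer)
```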

The first-generation Intel NCS, launched in July 2017, has fueled a community of tens of thousands of developers, has been featured in more than 700 developer videos and has been utilized in dozens of research papers. Now with greater performance in the NCS 2, Intel is empowering the AI community to create even more ambitious applications.

What is Happening at Intel AI DevCon Beijing: More than 1,000 AI developers, researchers and Intel customers and supporters are gathering at Intel® AI DevCon Beijing to collaborate on the advancement of AI and hear the latest updates on Intel’s AI portfolio of technologies, including:

  • Cascade Lake, a future Intel® Xeon® Scalable processor that will introduce Intel® Optane™ DC persistent memory and a set of new AI features called Intel DL Boost. This embedded AI accelerator is expected to speed deep learning inference workloads, with enhanced image recognition compared with current Intel Xeon Scalable processors. Cascade Lake is targeted to ship this year and ramp in 2019.
  • Intel’s Vision Accelerator Design Products targeted at AI inference and analytics performance on edge devices come in two forms: one that features an array of Intel Movidius VPUs and one built on the high-performance Intel® Arria® 10 FPGA. The accelerator solutions build on the OpenVINO toolkit that provides developers with improved neural network performance on a variety of Intel products and helps them further unlock cost-effective, real-time image analysis and intelligence within their IoT devices.
  • Spring Crest is the Intel® Nervana™ Neural Network Processor (NNP) that will be available in the market in 2019. The Intel Nervana NNP family leverages compute characteristics specific for AI deep learning, such as dense matrix multiplies and custom interconnects for parallelism.

More Context: Users Put Intel Movidius Neural Compute Stick to Real-Life Uses | New Intel Vision Accelerator Solutions Speed Deep Learning and Artificial Intelligence on Edge Devices | Introducing Myriad X: Unleashing AI at the Edge | Beyond the CPU or GPU: Why Enterprise-Scale Artificial Intelligence Requires a More Holistic Approach | Intel Democratizes Deep Learning Application Development with Launch of Movidius Neural Compute Stick | Artificial Intelligence at Intel

» Download video: “Intel Movidius Neural Compute Stick (B-Roll)”

Intel, the Intel logo, Xeon, Intel Optane, Intel Nervana, Arria and Movidius are trademarks of Intel Corporation in the U.S. and/or other countries.

The post Intel Unveils the Intel Neural Compute Stick 2 at Intel AI Devcon Beijing for Building Smarter AI Edge Devices appeared first on Intel Newsroom.

SC18: High Performance Computing Demand Continues to Surge, Accelerated by NVIDIA GPUs

NVIDIA’s playing a bigger role in high performance computing than ever, just as supercomputing itself has become central to meeting the biggest challenges of our time.

Speaking just hours ahead of the start of the annual SC18 supercomputing conference in Dallas, NVIDIA CEO Jensen Huang told 700 researchers, lab directors and execs about the forces driving the company to push into both “scale up” computing — focused on large supercomputing systems — and “scale out” efforts that let researchers, data scientists and developers harness the power of however many GPUs they need.

“The HPC industry is fundamentally changing,” he told the crowd. “It started out in scientific computing, and the architecture was largely scale up. Its purpose in life was to simulate from first principles in the laws of physics. In the future, we will continue to do that, but we have a new tool – this tool is called machine learning.”

Machine learning — which has caught fire over the past decade among businesses and researchers — can now take advantage of both the ability to scale up, with powerful GPU-accelerated machines, and the fast-growing ability to scale out workloads to sprawling GPU-powered data centers.

Scaling Up the World’s Supercomputers

That’s because data scientists are facing the same kinds of challenges that those in the hyperscale community have faced for more than a decade: the need to continue accelerating their work, even as Moore’s Law — which long drove increases in the computing power offered by CPUs — sputtered out.

This year 127 of the world’s top 500 supercomputers are powered by NVIDIA, according to the newly released Top500 list of the world’s fastest supercomputers. And fully half of their overall processing power is driven by GPUs.

In addition to powering the world’s two fastest supercomputers, NVIDIA GPUs power 22 of the top 25 machines on the Green500 list of the most energy-efficient supercomputers.

“The number one fastest supercomputer in the world, the fastest supercomputer in America, the fastest supercomputer in Europe and the fastest supercomputer in Japan, all powered by NVIDIA Volta V100,” Huang said.

One Architecture – Scaling Out, Scaling Up

NVIDIA is bringing that scale-up capability to the world’s data centers. Shortly after its introduction, NVIDIA’s new T4 Cloud GPU — with revolutionary multi-precision Tensor Cores — is being adopted at a record pace. Huang said that it’s now available on Google Cloud Platform, in addition to being featured in 57 server designs from the world’s leading computer makers.

“I am just blown away by how fast Google Cloud works; in the 30 days from production, it was deployed at scale on the cloud,” Huang said.

Another example of scaling up: NVIDIA’s DGX-2, a single node with 16 V100 GPUs connected by NVLink that together deliver two petaflops of processing power.
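A quick back-of-envelope check shows where the two-petaflops figure comes from, assuming each V100 delivers NVIDIA’s published peak of roughly 125 teraflops of mixed-precision Tensor Core throughput:

```python
# Sanity-check the DGX-2 "two petaflops" claim.
# Assumption: ~125 TFLOPS peak mixed-precision Tensor Core
# throughput per V100 (NVIDIA's published figure).
V100_TENSOR_TFLOPS = 125
GPUS_PER_DGX2 = 16

total_petaflops = V100_TENSOR_TFLOPS * GPUS_PER_DGX2 / 1000
print(total_petaflops)  # 2.0
```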

Huang showed off the system several times to the audience, joking that it’s “super heavy” and that you have to be “super strong” to hold it, while hefting the HGX-2 board — two of which sit at the heart of the DGX-2 and which are being sold by leading OEMs.

“This is the hallmark of ‘scale up,’ this is what ‘scale up’ computing is all about,” Huang said.

Because it all runs on a single software ecosystem that developers have painstakingly woven together over the past decade, Huang explained, developers can now instantly scale out across growing numbers of GPUs.

As part of that effort, Huang announced new multi-node HPC and visualization containers for the NGC container registry, which allow supercomputing users to run GPU-accelerated applications on large-scale clusters. NVIDIA also announced a new NGC-Ready program, including workstations and servers from top vendors.

These systems are ready to run a growing collection of software that just works, on one system or across many. “This isn’t the app store for apps you want, this is the app store for the apps you need,” Huang said. “It’s all lovingly maintained and tested and optimized.”

Compatible Architecture

All of this builds on the CUDA foundation NVIDIA has been developing for more than ten years. That foundation allows “investments made in the past to be carried into the future,” Huang explained.

Each new version of CUDA, now on version 10.0, offers faster performance than the last, and each new GPU architecture — Tesla, Fermi, Kepler, Maxwell, Pascal, Volta, and now Turing — accelerates software running on CUDA even further.

“Your investments in software last a long time, and the investment you make in hardware is instantly worthwhile,” Huang said.

GPUs Powering HPC Breakthroughs

This performance is paying off for top researchers. NVIDIA GPUs run on Summit, the newly commissioned world’s fastest supercomputer at Oak Ridge National Laboratory, and power five of the six finalists for the ACM Gordon Bell Prize, Huang said.

The winner of the coveted prize — which recognizes outstanding achievements in HPC — will be announced at the end of the show on Thursday.

Now GPUs are being put to work by growing numbers of businesses and researchers in machine learning. Huang noted that the entire computer industry is moving toward high performance computing.

RAPIDS, an open-source software suite for accelerating data analytics and machine learning, allows data scientists to execute end-to-end data science training pipelines completely on NVIDIA GPUs. It works the way data scientists work, Huang explained.

RAPIDS relies on NVIDIA CUDA primitives for low-level compute optimization, but exposes that GPU parallelism and high-memory bandwidth through user-friendly Python interfaces. The RAPIDS open source software library mimics the pandas API and is built on Apache Arrow to maximize interoperability and performance.
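Because RAPIDS mimics the pandas API, a data science pipeline reads almost identically in either library. Here is a minimal sketch of such a pipeline in plain pandas (used here since cuDF requires an NVIDIA GPU); the RAPIDS claim is that swapping the import for cuDF would run the same steps GPU-accelerated:

```python
import pandas as pd  # with RAPIDS on a GPU: import cudf as pd

# A toy end-to-end step: build, filter and aggregate tabular data.
df = pd.DataFrame({
    "user": ["a", "b", "a", "c", "b"],
    "clicks": [3, 1, 4, 1, 5],
})
active = df[df["clicks"] > 1]                       # filter rows
per_user = active.groupby("user")["clicks"].sum()   # aggregate
print(per_user.to_dict())  # {'a': 7, 'b': 5}
```

The same filter/groupby idiom is what RAPIDS accelerates end to end, keeping intermediate results in GPU memory via Apache Arrow’s columnar format.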

“We made it super easy to come onto this platform, and it’s completely open sourced,” Huang said.

Flower Power

Huang also showed the latest version of an inferencing demo that identifies flowers, which seems to grow more staggering every time he shows it. After first showing how a system running Intel CPUs can classify five images per second, he showed how NVIDIA’s new T4 GPU on Google Cloud can take advantage of NGC to recognize images of more than 50,000 flowers in a second.
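Taken at face value, the two throughput figures quoted in the demo imply roughly a four-order-of-magnitude gap:

```python
# Speedup implied by the demo's quoted figures (not a measured benchmark).
cpu_images_per_sec = 5        # CPU baseline quoted on stage
gpu_images_per_sec = 50_000   # T4-on-Google-Cloud figure quoted on stage

speedup = gpu_images_per_sec / cpu_images_per_sec
print(speedup)  # 10000.0
```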


And, in a striking model of galactic winds, Huang transported the audience to another galaxy — a dwarf galaxy one-fifth the size of the Milky Way — and showed how IndeX, an NVIDIA plug-in for the ParaView data analysis and visualization application, harnesses GPUs to deliver real-time performance on breathtakingly large datasets.

While the audience watched, with just a few keystrokes, a 7-terabyte dataset was turned into an interactive simulation of gases flying in all directions as a galaxy spun around its axis.

Huang also demonstrated how GPU-accelerated systems running key tools such as COSMO, WRF and MPAS can tackle one of HPC’s biggest challenges — weather prediction — creating visually stunning models of the weather over the Alps.

“We’re not looking at a video right now, we’re looking at the future of predicting microclimates,” Huang said.

It’s only the latest example of why researchers are clamoring for GPU-powered systems. And why, thanks to GPUs, demand for the world’s fastest computers is growing faster than ever.


The post SC18: High Performance Computing Demand Continues to Surge, Accelerated by NVIDIA GPUs appeared first on The Official NVIDIA Blog.