GTC 2019: Huang Kicks Off GTC, Focuses on NVIDIA Datacenter Momentum, Blue Chip Partners

NVIDIA’s message was unmistakable as it kicked off the 10th annual GPU Technology Conference: it’s doubling down on the datacenter.

Founder and CEO Jensen Huang delivered a sweeping opening keynote at San Jose State University, describing the company’s progress accelerating the sprawling datacenters that power the world’s most dynamic industries.

With a record GTC registered attendance of 9,000, he rolled out a spate of new technologies, detailed their broad adoption by industry leaders including Cisco, Dell, Hewlett Packard Enterprise and Lenovo, and highlighted how NVIDIA technologies are being used by some of the world’s biggest names, including Accenture, Amazon, Charter Spectrum, Microsoft and Toyota.

“The accelerated computing approach that we pioneered is really taking off,” said Huang, who exactly a week ago announced the company’s $6.9 billion acquisition of Mellanox, a leader in high-performance computing interconnect technology. “If you take a look at what we achieved last year, the momentum is absolutely clear.”

To be sure, Huang also detailed progress outside the data center, rolling out innovations targeting everything from robotics to pro graphics to the automotive industry.

Developers, Developers, Developers

The recurring theme, however, was how NVIDIA’s ability to couple software and silicon delivers the advances in computing power needed to transform torrents of data into insights and intelligence.

“Accelerated computing is not just about the chips,” Huang said. “Accelerated computing is a collaboration, a codesign, a continuous optimization between the architecture of the chip, the systems, the algorithm and the application.”

As a result, the GPU developer ecosystem is growing fast, Huang said. The number of developers has grown to more than 1.2 million from 800,000 last year; there are now 125 GPU-powered systems among the world’s 500 fastest supercomputers; and there are more than 600 applications powered by NVIDIA’s CUDA parallel computing platform.

Mellanox, whose interconnect technology helps power more than half of the world’s 500 fastest supercomputers, complements NVIDIA’s strength in datacenters and high-performance computing, Huang said, explaining why NVIDIA agreed to buy the company earlier this month.

Mellanox CEO Eyal Waldman, who joined Huang on stage, said: “We’re seeing a great growth in data, we’re seeing an exponential growth. The program-centric datacenter is changing into a data-centric datacenter, which means the data will flow and create the programs, rather than the programs creating the data.”

Bringing AI to Datacenters

These technologies are all finding their way into the world’s datacenters as enterprises build more powerful servers (“scaling up,” or “capability” systems, as Huang called them) and network their servers more closely together than ever (“scaling out,” or “capacity” systems) as businesses seek to turn data into a competitive advantage.

To help businesses move faster, Huang introduced CUDA-X AI, the world’s only end-to-end acceleration libraries for data science. CUDA-X AI arrives as businesses turn to AI — deep learning, machine learning and data analytics — to make data more useful, Huang explained.

The typical workflow for all these: data processing, feature determination, training, verification and deployment. CUDA-X AI unlocks the flexibility of our NVIDIA Tensor Core GPUs to uniquely address this end-to-end AI pipeline.

Matt Garman, vice president of computing services at Amazon Web Services, joined NVIDIA CEO Jensen Huang on stage Monday at GTC.

CUDA-X AI has been adopted by all the major cloud services, including Amazon Web Services, Google Cloud Platform, and Microsoft Azure. It’s been adopted by Charter, PayPal, SAS, and Walmart.

Huang also announced servers equipped with our NVIDIA T4 inferencing GPUs from all the world’s top computer and server makers. T4 will also be offered by Amazon Web Services.

“Think about not just the costs that they’re saving, but the most precious resource that these data scientists have: time and iterations,” said Matt Garman, vice president of computing services at Amazon Web Services.

Turing, RTX, and Omniverse

NVIDIA’s Turing GPU architecture, along with its RTX real-time ray tracing technology, is also being widely adopted. Huang highlighted more than 20 partners supporting NVIDIA RTX, including Adobe, Autodesk, Dassault Systèmes, Pixar, Siemens, Unity, Unreal and Weta Digital.

And to support the fast-growing numbers of creative professionals working across an increasingly complex pipeline around the globe, Huang introduced Omniverse, which lets them harness multiple applications to create and share scenes across different teams and locations. He described it as a collaboration tool, like Google Docs for 3D designers, who could be located anywhere in the world while working on the same project.

“We wanted to make a tool that made it possible for studios all around the world to collaborate,” Huang said. “Omniverse basically connects up all the designers in the studios, it works with every tool.”

To speed the work of graphics pros using these and other tools, Huang introduced the NVIDIA RTX Server, a reference architecture that will be delivered with top system vendors.

The massive power savings alone mean these machines don’t just accelerate your work, they pay for themselves. “I used to say ‘The more you buy the more you save,’ but I think I was wrong,” Huang said, with a smile. “RTX Servers are free.”

To accelerate data preparation, model training and visualization, Huang also introduced the NVIDIA-powered Data Science Workstation. Built with Quadro RTX GPUs and pre-installed with CUDA-X AI accelerated machine learning and deep learning software, these systems for data scientists are available from global workstation providers.

Bringing gaming technology to the datacenter as well, Huang announced the GeForce NOW Alliance. Built around specialized pods, each packing 1,280 GPUs in 10 racks and all interconnected with Mellanox high-speed interconnect technology, it expands NVIDIA’s GeForce NOW (GFN) online gaming service through partnerships with global telecom providers.

Together, GeForce NOW Alliance partners will scale GeForce NOW to serve millions more gamers, Huang said. Softbank and LG Uplus will be among the first partners to deploy RTX cloud gaming servers in Japan and Korea later this year.

To underscore his announcement, he rolled out a witty demo featuring characters in high-tech armor at a futuristic firing range, drawing broad applause from the audience. “Very few tech companies get to sit at the intersection of art and science and it’s such a thrill to be here,” Huang said. “NVIDIA is the ILM of real-time computer graphics and you can see it here.”


Inviting makers to build on NVIDIA’s platform, Huang announced Jetson Nano. It’s a small, powerful CUDA-X AI computer that delivers 472 GFLOPS of compute performance for running modern AI workloads while consuming just 5 watts. It supports the same architecture and software powering America’s fastest supercomputers.

Jetson Nano will come in two flavors: a $99 dev kit for makers, developers, learners and students, available now; and a $129 production-ready module for creating mass-market AI-powered edge systems, available June 2019.

“Here’s the amazing thing about this little thing,” Huang said. “It’s 99 dollars — the whole computer — and if you use Raspberry Pi and you just don’t have enough computer performance you just get yourself one of these, and it runs the entire CUDA X AI stack.”

Huang also announced the general availability of the Isaac SDK, a toolbox that saves manufacturers, researchers and startups hundreds of hours by making it easier to add AI for perception, navigation and manipulation into next-generation robots.

Autonomous Vehicles

Huang finished his keynote with a flurry of automotive news.

He announced that NVIDIA is collaborating with Toyota, Toyota Research Institute Advanced-Development in Japan and Toyota Research Institute in the United States on the entire end-to-end workflow of developing, training, and validating self-driving vehicles.

“Today we are announcing that the world’s largest car company is partnering with us from end to end,” Huang said.

The deal builds on an ongoing relationship with Toyota to utilize DRIVE AGX Xavier AV compute and expands the collaboration to new testing and validation using DRIVE Constellation, which is now available and allows automakers to simulate billions of miles of driving in all conditions.

And Huang announced Safety Force Field, a driving policy designed to shield self-driving cars from collisions: a sort of “cocoon” of safety.

“We have a computational method that detects the surrounding cars and predicts their natural path – knowing our own path – and computationally avoids traffic,” Huang said, adding that the open software has been validated in simulation and can be combined with any driving software.
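The idea in that quote (predict where surrounding cars will go, compare against our own path, and flag conflicts) can be illustrated with a toy sketch. This is not NVIDIA's Safety Force Field algorithm, which the keynote summary does not describe in detail. The constant-velocity motion model, the `Vehicle` class, the three-second horizon and the two-metre safety radius are all assumptions chosen for illustration.

```python
import math
from dataclasses import dataclass
from typing import Iterable, Tuple


@dataclass(frozen=True)
class Vehicle:
    """Hypothetical vehicle state: position in metres, velocity in m/s."""
    x: float
    y: float
    vx: float
    vy: float

    def position_at(self, t: float) -> Tuple[float, float]:
        # Assumed motion model: the vehicle keeps its current velocity.
        return (self.x + self.vx * t, self.y + self.vy * t)


def min_separation(ego: Vehicle, other: Vehicle,
                   horizon_s: float = 3.0, step_s: float = 0.1) -> float:
    """Smallest predicted distance between two vehicles over the horizon."""
    steps = int(round(horizon_s / step_s))
    best = float("inf")
    for i in range(steps + 1):
        t = i * step_s
        ex, ey = ego.position_at(t)
        ox, oy = other.position_at(t)
        best = min(best, math.hypot(ex - ox, ey - oy))
    return best


def is_safe(ego: Vehicle, others: Iterable[Vehicle],
            safety_radius_m: float = 2.0) -> bool:
    """True if no predicted path comes within the safety radius of ours."""
    return all(min_separation(ego, o) >= safety_radius_m for o in others)


if __name__ == "__main__":
    ego = Vehicle(0.0, 0.0, 10.0, 0.0)
    stopped_ahead = Vehicle(30.0, 0.0, 0.0, 0.0)      # in our lane, not moving
    parallel_lane = Vehicle(0.0, 10.0, 10.0, 0.0)     # same speed, offset lane
    print(is_safe(ego, [parallel_lane]))
    print(is_safe(ego, [stopped_ahead]))
```

A real driving policy would use richer motion models and would choose an avoiding action rather than only flagging the conflict. The sketch shows only the prediction-and-check step.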

The post GTC 2019: Huang Kicks Off GTC, Focuses on NVIDIA Datacenter Momentum, Blue Chip Partners appeared first on The Official NVIDIA Blog.

NVIDIA and Microsoft Create Edge-to-Cloud Real-Time Streaming Video Analytics Solution

It’s a huge challenge extracting actionable insights from the sea of data created by the world’s billions of cameras and sensors.

Bringing all this raw data to the cloud is inefficient because of bandwidth, cost and latency limitations. So NVIDIA and Microsoft are making it possible to process it at the edge.

The companies have integrated NVIDIA DeepStream and Microsoft Azure IoT Edge to enable real-time streaming and video analytics that extract powerful insights from thousands of cameras distributed over wide areas.

How DeepStream and Azure IoT Edge Work Together

The DeepStream SDK is a scalable framework to build high-performance streaming analytics applications and then deploy them on NVIDIA GPU platforms.

Microsoft Azure IoT Edge is a fully managed service that delivers cloud intelligence locally to enable the remote monitoring and management of NVIDIA-powered devices. Azure IoT Edge deploys applications and services built using DeepStream from the cloud to run on these edge devices.

Combining these powerful technologies brings device management, monitoring and custom business logic to millions of edge devices for real-time insights and easy deployment. This makes it easier than ever to detect and classify objects, recognize patterns and identify anomalies.

DeepStream and Microsoft Azure IoT Edge generate situational awareness over large-scale physical infrastructures such as underground parking garages, malls, smart factories and city-wide road networks through real-time streaming analytics.

See This Collaboration in Action at GTC 2019

At GTC in San Jose from March 18-21, Microsoft will be showing an early preview of DeepStream and Azure IoT Edge in booth 1122. Developers can learn more on Tuesday, March 19, at 9 a.m. PT during a DeepStream technical deep dive.

A DeepStream container supporting Azure IoT Edge will be available soon for developers in the Azure IoT Edge marketplace.

The post NVIDIA and Microsoft Create Edge-to-Cloud Real-Time Streaming Video Analytics Solution appeared first on The Official NVIDIA Blog.

NVIDIA CUDA-X AI Acceleration Libraries Speed Up Machine Learning in the Cloud by 20x; Available Now on Microsoft Azure

Data scientists can now accelerate their machine learning projects by up to 20x using NVIDIA CUDA-X AI, NVIDIA’s data science acceleration libraries, on Microsoft Azure.

With just a few clicks, businesses of all sizes can accelerate their data science, turning enormous amounts of data into their competitive advantage faster than ever before.

Microsoft Azure Machine Learning (AML) service is the first major cloud platform to integrate RAPIDS, a key component of NVIDIA CUDA-X AI. With access to the RAPIDS open source suite of libraries, data scientists can do predictive analytics with unprecedented speed using NVIDIA GPUs on AML service.

RAPIDS on AML service dramatically boosts performance for the many businesses across a wide range of industries that are using machine learning to create predictive AI models from their vast amounts of data. These include retailers that want to manage inventories better, financial institutions that want to make smarter financial projections, and healthcare organizations that want to detect disease faster and lower administration costs.

Businesses using RAPIDS on AML service can reduce the time it takes to train their AI models by up to 20x, slashing training times from days to hours or from hours to minutes, depending on their dataset size. This is the first time RAPIDS has been integrated natively into a cloud data science platform.

Walmart is an early adopter of RAPIDS, using it to improve the accuracy of its forecasts.

“RAPIDS software has the potential to significantly scale our feature engineering processes – enabling us to run our most complex machine learning models to further improve our forecast accuracy,” said Srini Venkatesan, senior vice president of Supply Chain Technology and Cloud at Walmart. “We’re excited that Azure Machine Learning service is partnering with NVIDIA to offer RAPIDS and GPU-powered compute for data scientists so we can run RAPIDS in the Azure cloud.”

RAPIDS on AML service comes in the form of a Jupyter Notebook that uses the AML service SDK to create a resource group, workspace, cluster and an environment with the right configurations and libraries for running RAPIDS code. Template scripts are provided to let the user experiment with different data sizes and numbers of GPUs, as well as to set up a CPU baseline.

“Our vision is to deliver the best technology that helps customers do transformative work,” said Eric Boyd, corporate vice president of Azure AI at Microsoft. “Azure Machine Learning service is the leading platform for building and deploying machine learning models, and we’re excited to help data scientists unlock significant performance gains with Azure paired with NVIDIA’s GPU acceleration.”

Learn more about NVIDIA CUDA-X AI acceleration libraries.

Check out Microsoft Azure’s blog or attend this GTC session to learn about using RAPIDS on Azure Machine Learning service.

The post NVIDIA CUDA-X AI Acceleration Libraries Speed Up Machine Learning in the Cloud by 20x; Available Now on Microsoft Azure appeared first on The Official NVIDIA Blog.

NVIDIA RTX Server Lineup Expands to Meet Growing Demand for Data Center and Cloud Graphics Applications

From Hollywood studios under pressure to create amazing content faster than ever, to the emerging demand for 5G-enabled cloud gaming and VR streaming — the need for computational horsepower has never been greater.

Previously, running servers powerful enough to deliver visually rich content in real time was too expensive. Not anymore.

NVIDIA RTX Servers, which include fully optimized software stacks available for OptiX RTX rendering, gaming, VR and AR, and professional visualization applications, can now deliver cinematic-quality graphics enhanced by ray tracing for far less than just the cost of electricity for a CPU-based rendering cluster with the same performance.

RTX Blade Servers: A Leap in Cloud-Rendered Density, Efficiency and Scalability

NVIDIA founder and CEO Jensen Huang unveiled the latest RTX Server configuration at our annual GPU Technology Conference today. It comprises 1,280 Turing GPUs on 32 RTX blade servers, which offer a monumental leap in cloud-rendered density, efficiency and scalability.

Each RTX blade server packs 40 GPUs into an 8U space and can be shared by multiple users with NVIDIA GRID vGaming or container software. Mellanox technology is used as the backbone storage and networking interconnect to deliver the apps and updates instantly to thousands of concurrent users.
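The configuration figures quoted above fit together, as a quick check shows. The sketch below uses only the numbers from the announcement (32 blades, 40 GPUs per blade, 8U per blade). The function names are illustrative.

```python
# Figures quoted in the announcement.
BLADES_PER_POD = 32   # RTX blade servers in the configuration
GPUS_PER_BLADE = 40   # Turing GPUs per blade
BLADE_HEIGHT_U = 8    # rack units occupied by one blade


def pod_gpu_count(blades: int = BLADES_PER_POD,
                  gpus_per_blade: int = GPUS_PER_BLADE) -> int:
    """Total GPUs in a pod built from identical blades."""
    return blades * gpus_per_blade


def pod_rack_units(blades: int = BLADES_PER_POD,
                   blade_height_u: int = BLADE_HEIGHT_U) -> int:
    """Rack units taken up by the blades alone (no switches or storage)."""
    return blades * blade_height_u


if __name__ == "__main__":
    print(pod_gpu_count())    # 32 * 40
    print(pod_rack_units())   # 32 * 8
```

The 1,280-GPU total matches the figure Huang gave on stage. The rack-unit number covers only the blades, since the announcement does not say how much space networking and storage take.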

We’ve optimized RTX servers for use by cloud gaming operators, enabling them to render and stream games at the performance levels of GeForce RTX 2080 GPUs to any client device.

AR and VR Applications Now a Cloud Reality

With low-latency access to RTX Servers at the network edge, cloud-rendered AR and VR applications become a reality. We’re showcasing AR/VR demos running on cloud-based hardware at GTC, including an RTX Server-powered demo from Envrmnt, the XR arm of Verizon.

We’re collaborating with AT&T and Ericsson to bring these experiences to life on mobile networks. At AT&T Foundry, using NVIDIA CloudVR software, we were able to play an interactive VR game streamed from an RTX Server over a 5G radio. The result was a great end-user experience, with only 5 ms of network delay and no observable performance loss. Cloud-based VR over 5G will be demonstrated next week at the AT&T and Ericsson 5G Designing the Edge event.

Global Telcos Join NVIDIA GeForce NOW Alliance to Deploy Cloud Gaming on 5G

The advent of 5G networks with high bandwidth and ultra-low latency has made rendering and streaming even the most complex applications from the cloud viable. With RTX Servers on 5G Edge networks, users will have access to cloud gaming services like GeForce NOW and AR and VR applications on just about any device.

Applying edge computing to cloud gaming benefits both fixed-line and mobile broadband networks by eliminating the physical distance between network hubs and game servers that can add to latency.
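To put the distance argument in rough numbers, the sketch below estimates the round-trip propagation delay that physical distance alone adds. It assumes light travels through optical fiber at about 200,000 km/s, roughly two-thirds of its speed in a vacuum. The distances are hypothetical, and real networks add switching and queuing delay on top of this lower bound.

```python
# Assumption: signal speed in optical fiber is about two-thirds of the
# speed of light in a vacuum, i.e. roughly 200,000 km/s.
FIBER_SPEED_KM_PER_S = 200_000.0


def round_trip_ms(distance_km: float) -> float:
    """Lower-bound round-trip propagation delay over fiber, in milliseconds."""
    if distance_km < 0:
        raise ValueError("distance_km must be non-negative")
    one_way_s = distance_km / FIBER_SPEED_KM_PER_S
    return 2 * one_way_s * 1000.0


if __name__ == "__main__":
    # Hypothetical distances between a player and the game server.
    for km in (10, 100, 1000):
        print(f"{km:>5} km -> {round_trip_ms(km):.2f} ms round trip")
```

Under these assumptions, a server 1,000 km away costs about 10 ms per round trip before any rendering or encoding happens, while one 10 km away costs about 0.1 ms. That gap is the motivation for placing servers at the network edge.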

Telecommunications operators deploying optimized RTX Servers with NVIDIA-managed GeForce NOW software in their data centers get an easy-to-deploy, turnkey solution to deliver computationally demanding content. Softbank (Japan) and LG Uplus (Korea) will be among the first partners to deploy RTX cloud gaming servers later this year.

We’re working with HTC to bring cloud gaming and VR wirelessly into homes. The HTC 5G Hub — a 5G hotspot, Android entertainment device and battery pack all-in-one — is ideal for fixed wireless access (FWA) for 5G broadband homes. The GeForce NOW app is being optimized for the HTC Hub to provide a low latency cloud gaming experience over 5G. We’re also working together to support CloudVR, enabling virtual reality apps to be rendered on RTX servers in cloud data centers and streamed to the HTC VIVE headset without a local PC or cables to enable a mobile, high-end VR experience.

Virtualized Production, Rendering, and Collaboration

Whether working on content creation on the desktop or batch and final frame rendering in the data center, users of NVIDIA RTX Servers can tap into GPU-accelerated rendering and performance at a fraction of the cost, space and power requirements of a CPU render farm. And they get the processing power artists need to explore more creative options and create even higher quality effects that might otherwise take too long to attempt.

Production processes for animation, visual effects and industrial design often require many people across multiple continents. But complex 3D pipelines make global collaboration challenging. NVIDIA’s Omniverse allows artists to see and interact in real time with changes made by other artists or colleagues working on the same content in a different application. Changes are reflected in multiple tools at the same time.

Support for Quadro Virtual Data Center Workstation Software (Quadro vDWS) provides another efficient and powerful option for artists and designers to run their content creation applications and design tools virtually for greater mobility and increased collaboration without compromising performance. Studios can have one, easy-to-manage server with multiple virtual workstations, so employees can share GPU resources and securely access their work from any location.

World’s Leading OEMs Unveil Latest RTX Servers

System makers worldwide, including Dell, HPE, Lenovo, ASUS and Supermicro, have unveiled newly validated NVIDIA RTX Servers for highly configurable, on-demand rendering and virtual workstation solutions from the data center.

“We are enabling content producers to create more visually-rich graphics and renderings faster than ever before. With the HPE Apollo 6500 Gen10, HPE ProLiant DL380 Gen10 and HPE ProLiant ML350 Gen10, we will offer the NVIDIA RTX Server to provide designers with GPU-accelerated power and performance for the most efficient end-to-end rendering solutions, from interactive sessions on the desktop to final batch rendering in the data center.” — Bill Mannel, vice president and general manager of HPC and AI Group, Hybrid IT, Hewlett Packard Enterprise

“NVIDIA RTX Server provides benefits to users and organizations with the best performing and most efficient end-to-end rendering solutions, from batch rendering to interactive rendering in a design viewport. With the flexibility of ASUS solutions such as ASUS ESC4000 G4 and ESC8000 G4, designers can leverage the new AI and ray-tracing features of NVIDIA’s enhanced RTX platform, enabling them to create impressive, stunning designs and visual effects faster than ever before.” — Jackie Hsu, corporate vice president and general manager of Worldwide Sales, ASUS

“NVIDIA RTX Server combines the ground-breaking Quadro RTX 8000 and RTX 6000 GPUs with Quadro vDWS to deliver a powerful and flexible architecture to meet the demands of creative professionals. Supermicro is proud to be an inaugural partner for the NVIDIA RTX Server program with the Supermicro SYS-4029GP-TRT.” — Michael McNerney, vice president of Marketing and Network Security, Supermicro

Scalability and Availability

The NVIDIA RTX platform comes in 2U, 4U and 8U form factors and supports multiple NVIDIA GPU options, from Quadro RTX GPUs and Quadro vDWS software for professional apps to NVIDIA GPUs with GRID vGaming software for cloud gaming and consumer AR/VR.

2U and 4U RTX servers are available from our OEM partners today. The new 8U RTX blade server will initially be available from NVIDIA in Q3.

Learn more about NVIDIA RTX.

The post NVIDIA RTX Server Lineup Expands to Meet Growing Demand for Data Center and Cloud Graphics Applications appeared first on The Official NVIDIA Blog.

Intel Announces First 58Gbps FPGA Transceiver in Volume Production Enabling 400G Ethernet Deployment

The Intel Stratix 10 TX FPGAs are the world’s first field programmable gate array with 58Gbps PAM4 transceiver technology enabling 400Gb Ethernet deployment. This technology doubles transceiver bandwidth performance when compared to traditional solutions. (Credit: Intel Corporation)

What’s New: At the Optical Fiber Communications (OFC) conference in San Diego this week, Intel’s Programmable Solutions Group is showcasing market-leading 58Gbps transceiver technology integrated on the Intel® Stratix® 10 TX FPGA — the world’s first field programmable gate array (FPGA) with 58Gbps PAM4 transceiver technology now shipping in volume production and enabling 400Gb Ethernet deployment.

“As we continue to deliver product innovations and capabilities that allow for higher data ingest and processing speeds critical for networking and data center applications, this is a powerful example of how Intel FPGAs bring real value to our customers.”
–Dan McNamara, Intel senior vice president and general manager of the Programmable Solutions Group

Why It’s Important: This industry-leading technology doubles transceiver bandwidth performance when compared to traditional solutions. It is critical for applications where high bandwidth is paramount, including optical transport networks, enterprise networking, cloud service providers and 5G. By supporting dual-mode modulation, 58Gbps PAM4 and 30Gbps NRZ, new infrastructure can reach 58Gbps data rates while staying backward-compatible with existing network infrastructure.

The Stratix 10 TX FPGA with 58Gbps PAM4 transceiver technology provides system architects with higher transceiver bandwidth and hardened IP to address the insatiable demand for faster and higher density connectivity.

“The 400Gb Ethernet and QSFP-DD market is evolving at a fast pace. And being first to market with a portable solution is instrumental to enable the transition from lab to the field. We were excited to work closely with Intel to deliver our next-generation test module with the only production FPGA technology supporting native 58Gbps PAM4,” says Ildefonso M. Polo, vice president of Product Marketing at VeEX.

What It Does: To facilitate the future of networking, Network Function Virtualization (NFV) and optical transport solutions, Intel Stratix 10 TX FPGAs provide up to 144 transceiver lanes with serial data rates of 1 to 58Gbps. This combination delivers a higher aggregate bandwidth than any current FPGA, enabling architects to scale to 100Gb, 200Gb and 400Gb delivery speeds.

A wide range of hardened intellectual property cores, including 100Gb MAC and FEC, deliver optimized performance, latency and power.

What It Delivers: Intel Stratix 10 FPGA 58Gbps transceivers are interoperable with 400G Ethernet FPGAs, using only eight channels to support new high-bandwidth requirements for routers, switches, active optical cables and direct attach cables, interconnects, and test and measurement equipment.
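The lane arithmetic behind these figures can be sketched briefly. PAM4 signals with four amplitude levels and so carries two bits per symbol, while NRZ carries one. The 50Gbps and 25Gbps payload-per-lane values below are assumptions for illustration (the release does not state them), on the understanding that part of each lane's line rate goes to encoding and error-correction overhead.

```python
import math

# Bits carried per symbol: NRZ uses 2 signal levels, PAM4 uses 4.
BITS_PER_SYMBOL = {"NRZ": 1, "PAM4": 2}


def symbol_rate_gbaud(line_rate_gbps: float, modulation: str) -> float:
    """Symbol rate needed to reach a given line rate with a modulation."""
    return line_rate_gbps / BITS_PER_SYMBOL[modulation]


def lanes_needed(target_gbps: float, payload_per_lane_gbps: float) -> int:
    """Whole number of lanes required to carry the target payload rate."""
    return math.ceil(target_gbps / payload_per_lane_gbps)


if __name__ == "__main__":
    # The two modes named in the release run at nearly the same symbol rate.
    print(symbol_rate_gbaud(58, "PAM4"))
    print(symbol_rate_gbaud(30, "NRZ"))
    # Assumed payload per lane: 50 Gbps for PAM4, 25 Gbps for older NRZ lanes.
    print(lanes_needed(400, 50))
    print(lanes_needed(400, 25))
```

Under these assumptions, 58Gbps PAM4 and 30Gbps NRZ need about the same symbol rate (29 versus 30 GBd), which fits the dual-mode transceiver described here. Eight PAM4 lanes cover 400Gb Ethernet, where 25Gbps-class lanes would need sixteen.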

What the Future Holds: At Intel Architecture Day, Intel unveiled a 112G PAM4 high-speed transceiver test chip built on 10nm process technology. The chip will be incorporated into Intel’s next-generation FPGA product families, supporting the most demanding bandwidth requirements in next-generation data center, enterprise and networking environments.

More Context: Intel FPGAs and Programmable Devices | Programmable Solutions Group News

The Small Print: Intel technologies’ features and benefits depend on system configuration and may require enabled hardware, software or service activation. Performance varies depending on system configuration. 28G is considered a traditional solution. Intel product can achieve over 57Gbps. No computer system can be absolutely secure. Check with your system manufacturer or retailer, or learn more at

The post Intel Announces First 58Gbps FPGA Transceiver in Volume Production Enabling 400G Ethernet Deployment appeared first on Intel Newsroom.

NVIDIA Enables Next Wave of Growth — Accelerated Data Science — for High Performance Computing

NVIDIA is teaming up with the world’s largest tech companies and the U.S.’s top supercomputing labs to accelerate data analytics and machine learning, one of the fastest growing areas of high performance computing.

The new initiative marks a key moment in our work accelerating HPC, a market expected to grow considerably over the next few years. While the world’s data doubles each year, CPU computing has hit a brick wall with the end of Moore’s law.

Together with partners such as Microsoft, Cisco, Dell EMC, Hewlett Packard Enterprise, IBM, Oracle and others, we’ve already sped up data tasks for our customers by as much as 50x. And initial testing by the U.S. Department of Energy’s Oak Ridge National Laboratory is showing a remarkable 215x speedup related to climate prediction research.

A Rapid Evolution

Starting a decade ago, we brought acceleration to scientific computing. Since then, we’ve helped researchers — including multiple Nobel Prize winners — dramatically speed up their compute-intensive simulations, tackling some of the world’s greatest problems.

Then, just over five years ago, we enabled our GPU platform to accelerate deep learning through optimized software, setting in motion the AI revolution.

Now, through new open-source data science acceleration software released last month, a third wave is upon us.

At the center of this new movement is RAPIDS, an open-source data analytics and machine learning acceleration platform for executing end-to-end data science training pipelines completely on GPUs.

RAPIDS relies on NVIDIA CUDA primitives for low-level compute optimization, but exposes that GPU parallelism and high memory bandwidth through user-friendly Python interfaces. The RAPIDS dataframe library mimics the pandas API and is built on Apache Arrow to maximize interoperability and performance.

More Accelerated Machine Learning in the Cloud

Now we’re partnering with the world’s leading technology companies to bring accelerated machine learning to more users in more places.

Working closely with NVIDIA, Microsoft is introducing accelerated machine learning to its Azure Machine Learning customers.

“Azure Machine Learning is the leading platform for data scientists to build, train, manage and deploy machine learning models from the cloud to the edge,” said Eric Boyd, corporate vice president for Azure AI at Microsoft. “We’ve been partnering with NVIDIA to offer GPU-powered compute for data scientists and are excited to introduce software from the RAPIDS open source project to Azure users. I’m looking forward to seeing what the data science community can do with RAPIDS and Azure Machine Learning.”

More Systems for Accelerated Machine Learning

We’re also collaborating on a range of new products from leading computer makers based on the NVIDIA HGX-2 cloud-server platform for all AI and HPC workloads.

Delivering two petaflops of compute performance in a single node, NVIDIA HGX-2 can run machine learning workloads nearly 550x faster than a CPU-only server.

The first HGX-2 based servers are from Inspur, QCT and Supermicro. All three companies are featuring their new HGX-2 servers in the exhibit hall of the annual high performance computing show, SC18, in Dallas this week.

More Scientific Breakthroughs Using Accelerated Machine Learning

Our nation’s most important labs are engaged in work ranging from fusion research and human genomics to climate prediction — work that relies on scientific computing, deep learning and data science.

NVIDIA DGX-2, designed to handle the most compute-intensive applications, offers them performance breakthroughs in the most demanding areas. Now, paired with RAPIDS open-source machine learning software, DGX-2 is helping scientists at several U.S. Department of Energy laboratories accelerate their research.

Among those witnessing early success with DGX-2 and RAPIDS are researchers at Oak Ridge National Lab.

Currently, there are massive amounts of observational data available to create models to enhance energy security applications involving climate simulations. However, historically, machine learning training on climate datasets has been compute limited and slow. Until now.

Using DGX-2 and RAPIDS, researchers at ORNL are already seeing massive improvements in the speed of applying machine learning to massive datasets. Running XGBoost on their DGX-2, ORNL reduced the time to train a 224GB model from 21 hours on a CPU node down to just six minutes — a 215x speed-up.
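The speed-up can be checked from the two times given. The sketch below uses only the figures in the text (21 hours, six minutes, 215x). That the six-minute figure is rounded is an inference, not something the post states.

```python
def speedup(baseline_minutes: float, accelerated_minutes: float) -> float:
    """How many times faster the accelerated run is than the baseline."""
    return baseline_minutes / accelerated_minutes


def implied_minutes(baseline_minutes: float, speedup_factor: float) -> float:
    """Accelerated run time implied by a quoted speed-up factor."""
    return baseline_minutes / speedup_factor


if __name__ == "__main__":
    cpu_minutes = 21 * 60  # 21 hours on a CPU node
    print(speedup(cpu_minutes, 6))
    print(round(implied_minutes(cpu_minutes, 215), 2))
```

With the rounded times, the ratio is 210x. The quoted 215x corresponds to a GPU run of about 5.86 minutes, so the two figures agree if "six minutes" is a rounding of the measured time.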

All of the RAPIDS open-source libraries for accelerating machine learning and data analytics are available at no charge. To get started, visit the NGC container registry.

The post NVIDIA Enables Next Wave of Growth — Accelerated Data Science — for High Performance Computing appeared first on The Official NVIDIA Blog.

NGC Containers Now Available for More Users, More Apps, More Platforms

Call it a virtuous circle. GPUs are accelerating increasing numbers of data science and HPC workloads. This has enabled a wide range of scientific breakthroughs, including five of this year’s six Gordon Bell Prize finalists. These advances boost mindshare — GPUs are featuring prominently in sessions, demos and new product offerings throughout SC18, taking place this week in Dallas.

And we’re completing the loop by making it easier to deploy software from our NGC container registry. Its pre-integrated and optimized containers bring the latest enhancements and performance improvements for industry-standard software to NVIDIA GPUs. As the registry grows — the number of containers has doubled in the last year — users have even more ways to take advantage of GPU computing.

More Applications, New Multi-Node Containers and Singularity

The NGC container registry now offers a total of 41 frameworks and applications (up from 18 last year) for deep learning, HPC and HPC visualization. Recent additions include CHROMA, Matlab, MILC, ParaView, RAPIDS and VMD. We’ve also increased their capabilities and made them easier to deploy.

At SC18, we announced new multi-node HPC and visualization containers, which allow supercomputing users to run workloads on large-scale clusters.

Large deployments often use the Message Passing Interface (MPI) to execute jobs across multiple servers. But building an application container that leverages MPI is challenging because so many variables define an HPC system (scheduler, networking stack, MPI and various driver versions).

The NGC container registry simplifies this with an initial rollout of five containers supporting multi-node deployment. This makes it significantly easier to run massive computational workloads on multiple nodes with multiple GPUs per node.

And to make deployment even easier, NGC containers can now be used natively in Singularity, a container technology that is widely adopted at supercomputing sites.

New NGC-Ready Program

To expand the places where people can run HPC applications, we’ve announced the new NGC-Ready program. This lets users of powerful systems with NVIDIA GPUs deploy with confidence. Initial NGC-Ready systems from server companies include:

  • ATOS BullSequana X1125
  • Cisco UCS C480ML
  • Cray CS Storm NX
  • Dell EMC PowerEdge C4140
  • HPE Apollo 6500
  • Supermicro SYS-4029GP-TVRT

NGC-Ready workstations equipped with NVIDIA Quadro GPUs provide a platform that offers the performance and flexibility that researchers need to rapidly build, train and evolve deep learning projects. NGC-Ready systems from workstation companies include:

  • HPI Z8
  • Lenovo ThinkStation P920

The combination of NGC containers and NGC-Ready systems from top vendors provides users a replicable, containerized way to roll out HPC applications from development to production.

Containers from the NGC container registry work across a wide variety of additional platforms, including Amazon EC2, Google Cloud Platform, Microsoft Azure, Oracle Cloud Infrastructure, NVIDIA DGX systems, and select NVIDIA TITAN and Quadro GPUs.

NGC Containers Deployed by Premier Supercomputing Centers

NGC container registry users represent a variety of industries and disciplines, from large corporations to individual researchers. Among these are two of the top education and research facilities in the country: Clemson University and the University of Arizona.

Research facilitators for Clemson’s Palmetto cluster continually received requests to support multiple versions of the same applications. Installing, upgrading and maintaining all of these versions was time consuming and resource intensive, which bogged down the support staff and hampered user productivity.

The Clemson team successfully tested HPC and deep learning containers such as GROMACS and TensorFlow from the NGC container registry on their Palmetto system. Now they recommend users leverage NGC containers for their projects. Additionally, the containers run in their Singularity deployment, making it easier to support across their systems. With NGC containers, Clemson’s Palmetto users can now run their preferred application versions without disrupting other researchers or relying on the system admins for deployment.

At the University of Arizona, system admins for the Ocelote cluster would be inundated with update requests whenever new versions of the TensorFlow deep learning framework came out. Due to the complexity of installing TensorFlow on HPC systems — which can take as long as a couple of days — this became a resource issue for their modest-sized team and often led to unhappy users.

“Our cluster environment by necessity does not get updated fast enough to keep up with the requirements of the deep learning workflows,” says Chris Reidy, principal HPC systems administrator at the University of Arizona. “We made a significant investment in NVIDIA GPUs, and the NGC containers leverage that investment. We have significant interest in various fields ranging from traditional molecular dynamics codes like NAMD to machine learning and deep learning, and the NGC containers are built with an optimized and fully tested software stack to provide a quick start to getting research done.”

Reidy tested various HPC, HPC visualization and deep learning containers from NGC in Singularity on their cluster. Following instructions available in the NGC documentation, he was able to easily get the NGC containers up and running. They’re now the preferred way of running these applications.

NGC containers are available to download at no charge. To get started, visit the NGC container registry.

The post NGC Containers Now Available for More Users, More Apps, More Platforms appeared first on The Official NVIDIA Blog.

Putting Biopsies Under AI Microscope: Pathology Startup Fuels Shift Away from Physical Slides

Hundreds of millions of tissue biopsies are performed worldwide each year — most of which are diagnosed as non-cancerous. But for the few days or weeks it takes a lab to provide a result, uncertainty weighs on patients.

“Patients suffer emotionally, and their cancer is progressing as the clock ticks,” said David West, CEO of digital pathology startup Proscia.

That turnaround time could shrink dramatically. In recent years, the biopsy process has begun to digitize, with more and more pathologists looking at digital scans of body tissue instead of physical slides under a microscope.

Proscia, a member of our Inception virtual accelerator program, is hosting these digital biopsy specimens in the cloud. This makes specimen analysis borderless, with one hospital able to consult a pathologist in a different region. It also creates the opportunity for AI to assist experts as they analyze specimens and make their diagnoses.

“If you have the opportunity to read twice as many slides in the same amount of time, it’s an obvious win for the laboratories,” said West.

The Philadelphia-based company recently closed an $8.3 million Series A funding round, which will power its AI development and software deployment. And a feasibility study published last week demonstrated that Proscia’s deep learning software scores over 99 percent accuracy for classifying three common types of skin pathologies.

Biopsy Analysis, Behind the Scenes

Pathologists have the weighty task of examining lab samples of body tissue to determine if they’re cancerous or benign. But depending on the type and stage of disease, two pathologists looking at the same tissue may disagree on a diagnosis more than half the time, West says.

These experts are also overworked and in short supply globally. Laboratories around the world have too many slides and not enough people to read them.

China has one pathologist per 80,000 patients, said West. And while the United States has one per 25,000 patients, it’s facing a decline as many pathologists are reaching retirement age. Many other countries have so few pathologists that they are “on the precipice of a crisis,” according to West.

He projects that 80 to 90 percent of major laboratories will have switched their biopsy analysis from microscopes to scanners in the next five years. Proscia’s subscription-based software platform aims to help pathologists more efficiently analyze these digital biopsy specimens, assisted by AI.

The company uses a range of NVIDIA Tesla GPUs through Amazon Web Services to power its digital pathology software and AI development. The platform is currently being used worldwide by more than 4,000 pathologists, scientists and lab managers to manage biopsy data and workflows.

Proscia’s digital pathology and AI platform displays a heat map analysis of this H&E stained skin tissue image.

In December, Proscia will release its first deep learning module, DermAI. This tool will be able to analyze skin biopsies and is trained to recognize roughly 70 percent of the pathologies a typical dermatology lab sees. Three other modules are currently under development.

Proscia works with both labeled and unlabeled data from clinical partners to train its algorithms. The labeled dataset, created by expert pathologists, is tagged with the overall diagnosis as well as more granular labels for specific tissue formations within the image.

While biopsies can be ordered at multiple stages of treatment, Proscia focuses on the initial diagnosis stage, when doctors are looking at tissue and making treatment decisions.

“The AI is checking those cases as a virtual second opinion behind the scenes,” said West. This could lower the chances of missing tricky-to-spot cancers like melanoma, and make diagnoses more consistent among pathologists.

The post Putting Biopsies Under AI Microscope: Pathology Startup Fuels Shift Away from Physical Slides appeared first on The Official NVIDIA Blog.

An AI for an Eye: How Deep Learning May Prevent Diabetes-Induced Blindness

There are many ways diabetes can be debilitating, even lethal. But one condition caused by the disease comes on without warning.

A patient “can go to sleep one night and wake up the next morning and be legally blind, with no previous symptoms,” said Jonathan Stevenson, chief strategy and information officer for Intelligent Retinal Imaging Systems (IRIS), speaking of the condition known as diabetic retinopathy.

While most complications of diabetes such as heart disease, kidney disease and nerve damage have overt symptoms, diabetic retinopathy can sneak up on a patient undetected, unless spotted early by regular eye exams.

IRIS has been applying GPU-powered deep learning and Azure Machine Learning Services to provide early and broad detection of diabetic retinopathy, and prevent patients from losing their eyesight.

Making Diabetic Eye Exams Widely Available

Fewer than 40 percent of the 370 million diabetics in the world get checked for diabetes-related eye conditions. To make matters worse, while the number of patients with diabetes has steadily grown in recent decades, the population of ophthalmologists has been shrinking.

IRIS is attempting to bridge this gap by making retinal exams quick, easy and widely available.

“We are trying to enable a workflow that gives the provider the data they need to make decisions, but not interrupt that sacred time spent with patients,” said Stevenson.

The idea that a patient with diabetes could go blind so suddenly and unnecessarily was too much for Dr. Sunil Gupta, who founded IRIS in 2011. What the young company subsequently discovered was that deep learning can detect early indicators of diabetic complications in the retina.

Now, IRIS is preparing to unleash an updated component to its cloud-based solution that quickly analyzes uploaded images and returns that analysis to caregivers, achieving a 97 percent success rate in matching the analysis of expert ophthalmologists.

Tapping Microsoft’s Latest Toolkits

Behind that solution is an approach combining NVIDIA GPUs and the TensorFlow machine learning library with Microsoft Azure Machine Learning Services and CNTK, which make it possible to write low-level, hardware-agnostic algorithms.

Jocelyn Desbiens, lead innovator and data scientist for IRIS, said the company was one of the first organizations to make use of the Microsoft toolkits in this way. IRIS also uses Kubernetes to orchestrate its cloud-based container, which runs on the Microsoft Azure platform.

To build its model, IRIS obtained a dataset of about 10,000 retinal images, sifting through them to reveal 8,000 high-quality images, 6,000 of which were used for training, while 2,000 were held out for validation.
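A split like the one described can be sketched in a few lines. This is a minimal illustration under stated assumptions, not IRIS’s actual pipeline: the `split_dataset` helper, the placeholder image IDs and the fixed random seed are all hypothetical.

```python
import random


def split_dataset(items, n_train, n_val, seed=0):
    """Shuffle items reproducibly and return (train, validation) lists."""
    if n_train + n_val > len(items):
        raise ValueError("not enough items for the requested split")
    shuffled = list(items)                  # copy so the input is left untouched
    random.Random(seed).shuffle(shuffled)   # seeded shuffle for repeatability
    return shuffled[:n_train], shuffled[n_train:n_train + n_val]


# Hypothetical stand-ins for the 8,000 high-quality retinal images.
image_ids = [f"img_{i:05d}" for i in range(8000)]

# 6,000 images for training, 2,000 held out for validation.
train_ids, val_ids = split_dataset(image_ids, n_train=6000, n_val=2000)
```

Holding the validation images out before training means the reported accuracy reflects images the model has never seen.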

The system can detect differences between the left and right eyes, as well as between diabetic and normal eyes. Ultimately, it recommends whether a patient needs to be referred to a physician or if the detected condition simply needs to be observed.

Newer GPUs Up the Ante

All training and inferencing occur on NVIDIA GPUs running in IRIS’s Azure instance. IRIS has been at it long enough that it’s benefited from incredible advances in performance.

A few years ago, adopting NVIDIA Tesla K80 GPU accelerators slashed the time it took to train the company’s model on 10,000 images from a month to a week. Switching to the Tesla P100 shrank that down to only a couple of days. And now with the Tesla V100, the process is down to half a day.

That time gain, Stevenson said, is how NVIDIA is enabling researchers and scientists to answer questions they’d never been able to tackle before — such as whether diabetic blindness can be identified ahead of time.

Even more Azure customers will soon be able to utilize these performance gains as Microsoft has announced the preview of two new N-series Virtual Machines with NVIDIA GPU capabilities.

Eventually, IRIS intends to apply its understanding of the retina to assist in the treatment of other conditions. The retina in many ways, Stevenson said, is a window into a person’s health, providing clues about everything from autoimmune disorders and cancers to cardiovascular diseases.

Without divulging specifics, he made it clear that IRIS’s work won’t stop with diabetic blindness.

“By looking at features within the retinas,” Stevenson said, “we’re able to see other conditions that aren’t necessarily related to the eye.”

The post An AI for an Eye: How Deep Learning May Prevent Diabetes-Induced Blindness appeared first on The Official NVIDIA Blog.

Alibaba and Intel Transforming Data-Centric Computing from Hyperscale Data Centers to the Edge

What’s New: At The Computing Conference 2018 hosted by Alibaba Group* in Hangzhou, China, Intel and Alibaba revealed how their deep collaboration is driving the creation of revolutionary technologies that power the era of data-centric computing – from hyperscale data centers to the edge, to accelerate the deployments of new applications such as autonomous vehicles and Internet of Things (IoT).

“Alibaba’s highly innovative data-centric computing infrastructure supported by Intel technology enables real-time insight for customers from the cloud to the edge. Our close collaboration with Alibaba from silicon to software to market adoption enables customers to benefit from a broad set of workload-optimized solutions.”
– Navin Shenoy, Intel executive vice president and general manager of the Data Center Group

What the Headlines Are: Intel and Alibaba Group are:

  • Launching a Joint Edge Computing Platform to accelerate edge computing development
  • Establishing the Apsara Stack Industry Alliance targeting on-premises enterprise cloud environments
  • Deploying the latest Intel technology at Alibaba to prepare for the 11/11 shopping festival
  • Bringing volumetric content to the Olympic Games Tokyo 2020 via OBS Cloud
  • Accelerating the commercialization of intelligent roads

“We are thrilled to have Intel as our long-term strategic partner, and are excited to expand our collaboration across a wide array of areas from edge computing to hybrid cloud, Internet of Things and smart mobility,” said Simon Hu, senior vice president of Alibaba Group and president of Alibaba Cloud. “By combining Intel’s leading technology services and Alibaba’s experience in driving digital transformation in China and the rest of Asia, we are confident that our clients worldwide will benefit from the technology innovation that comes from this partnership.”

How They Accelerate on the Edge: Intel and Alibaba Cloud launched a Joint Edge Computing Platform that allows enterprises to develop customizable device-to-cloud IoT solutions for different edge computing scenarios, including industrial manufacturing, smart building and smart community, among others. The Joint Edge Computing Platform is an open architecture that integrates Intel software, hardware and artificial intelligence (AI) technologies with Alibaba Cloud’s latest IoT products. The platform utilizes computer vision and AI to convert data at the edge into business insights. The Joint Edge Computing Platform was recently deployed in Chongqing Refine-Yumei Die Casting Co., Ltd. (Yumei) factories, where moving from manual to automatic detection increased defect detection speed fivefold [1].

How They Drive Hybrid Cloud Solutions: Intel and Alibaba Cloud established the Apsara Stack Industry Alliance, which focuses on building an ecosystem of hybrid cloud solutions for Alibaba Cloud’s Apsara Stack. Optimized for Intel® Xeon® Scalable processors, the Apsara Stack provides large- and medium-sized businesses with on-premises hybrid cloud services that function the same as hyperscale cloud computing and big data services provided by Alibaba public cloud. This alliance will also enable small- and medium-sized businesses (SMBs) to access technologies, infrastructure and security on par with that of large corporations, while offering them a path to greater levels of automation, self-service capabilities, cost efficiencies and governance.

How They Power eCommerce: In preparation for the upcoming 11/11 “Singles Day” global shopping festival – which generated in excess of 168.2 billion yuan ($25 billion) in spending during the 2017 celebration – Alibaba plans to trial the next-generation Intel Xeon Scalable processors and upcoming Intel® Optane® DC persistent memory with Alibaba’s Tair workload. This workload is a key-value data access and caching storage system developed by Alibaba and broadly deployed in many of Alibaba’s core applications such as Taobao and Tmall. Intel’s compute, memory and storage solutions are optimized for Alibaba’s highly interactive and data-intensive applications. These applications require the infrastructure to keep large amounts of hot, readily accessible data in the memory cache to achieve the desired throughput (queries per second) and deliver smooth, responsive user experiences, especially during peak hours of the 11/11 shopping festival.
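The idea of keeping hot data in an in-memory key-value cache can be illustrated with a small sketch. This is not Tair’s design or API: the `HotDataCache` class, its least-recently-used eviction policy and the sample keys are all assumptions chosen only to show the general technique.

```python
from collections import OrderedDict


class HotDataCache:
    """A tiny in-memory key-value cache with least-recently-used eviction."""

    def __init__(self, capacity):
        if capacity <= 0:
            raise ValueError("capacity must be positive")
        self.capacity = capacity
        self._data = OrderedDict()  # oldest entry first, most recent last
        self.hits = 0
        self.misses = 0

    def get(self, key, default=None):
        """Return the cached value and mark the key as recently used."""
        if key in self._data:
            self._data.move_to_end(key)
            self.hits += 1
            return self._data[key]
        self.misses += 1
        return default

    def put(self, key, value):
        """Store a value, evicting the least recently used entry if full."""
        self._data[key] = value
        self._data.move_to_end(key)
        if len(self._data) > self.capacity:
            self._data.popitem(last=False)


# Hypothetical product lookups against a cache that holds two entries.
cache = HotDataCache(capacity=2)
cache.put("item:1", "phone")
cache.put("item:2", "laptop")
cache.get("item:1")             # item:1 becomes the most recently used key
cache.put("item:3", "camera")   # cache is full, so item:2 is evicted
```

Frequently requested keys stay in memory while rarely used ones are dropped, so most queries are answered without touching slower storage.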

How They Accelerate the Olympics’ Digital Transformation: Also announced was a partnership aimed at advancing the digital transformation of the Olympics and delivering volumetric content over the OBS Cloud for the first time at the Olympic Games Tokyo 2020. As worldwide Olympic partners, Intel and Alibaba Cloud will collaborate with Olympic Broadcasting Services (OBS) to explore a more efficient and reliable pipeline for delivering immersive media to rights-holding broadcasters (RHBs) worldwide, improving the fan experience and bringing fans closer to the action via Intel’s volumetric and virtual reality technologies. This showcases the depth of Intel’s end-to-end capabilities, including the most advanced Intel Xeon Scalable processors powering OBS Cloud, compute power to process high volumes of data, and technology to create and deliver immersive media.

How They Accelerate the Commercialization of Intelligent Roads: Intel officially became one of the first strategic partners in Alibaba AliOS’ intelligent transportation initiative, which aims to support the construction of an intelligent road traffic network and build a digital, intelligent transportation system that realizes vehicle-road synergy. Intel and Alibaba will jointly explore vehicle-to-everything (V2X) usage models for 5G communication and edge computing, based on the Intel Network Edge Virtualization Software Development Kit (NEV SDK).

More Context: Intel and Alibaba Cloud Deliver Joint Computing Platform for AI Inference at the Edge

[1] Automated product quality data collected by Yumei using a JWIPC® model IX7 ruggedized, fanless edge compute node/industrial PC running an Intel® Core™ i7 CPU with integrated on-die GPU and the OpenVINO SDK, with 16GB of system memory, connected to a 5MP PoE Basler* camera, model acA 1920-40gc. Together these components, along with the Intel-developed computer vision and deep learning algorithms, provide Yumei factory workers with information on product defects in near real time (within 100 milliseconds). Sample size: more than 100,000 production units collected over 6 months in 2018.

The post Alibaba and Intel Transforming Data-Centric Computing from Hyperscale Data Centers to the Edge appeared first on Intel Newsroom.