Intel Highlighted Why NVIDIA Tensor Core GPUs Are Great for Inference

It’s not every day that one of the world’s leading tech companies highlights the benefits of your products.

Intel did just that last week, comparing the inference performance of two of their most expensive CPUs to NVIDIA GPUs.

To achieve the performance of a single mainstream NVIDIA V100 GPU, Intel combined two power-hungry, highest-end CPUs with an estimated price of $50,000-$100,000, according to AnandTech. Intel’s performance comparison also highlighted the clear advantage of NVIDIA T4 GPUs, which are built for inference. When compared to a single highest-end CPU, they’re not only faster but also 7x more energy-efficient and an order of magnitude more cost-efficient.

Inference performance is crucial, as AI-powered services are growing exponentially. And Intel’s latest Cascade Lake CPUs include new instructions that improve inference, making them the best CPUs for inference. But they’re hardly competitive with NVIDIA’s deep learning-optimized Tensor Core GPUs.

Inference (also known as prediction) is, in simple terms, the “pattern recognition” a neural network performs after being trained. It’s where AI models provide intelligent capabilities in applications, like detecting fraud in financial transactions, conversing in natural language to search the internet, and using predictive analytics to fix manufacturing breakdowns before they happen.

While most AI inference today happens on CPUs, NVIDIA Tensor Core GPUs are rapidly being adopted across the full range of AI models. Tensor Cores, a breakthrough innovation, have transformed NVIDIA GPUs into highly efficient and versatile AI processors. They perform multi-precision calculations at high throughput, providing optimal precision for diverse AI models, and are automatically supported in popular AI frameworks.

It’s why a growing list of consumer internet companies — Microsoft, Paypal, Pinterest, Snap and Twitter among them — are adopting GPUs for inference.

Compelling Value of Tensor Core GPUs for Computer Vision

First introduced with the NVIDIA Volta architecture, Tensor Core GPUs are now in their second generation with NVIDIA Turing. Tensor Cores perform extremely efficient computations for AI for a full range of precision — from 16-bit floating point with 32-bit accumulate to 8-bit and even 4-bit integer operations with 32-bit accumulate.

They’re designed to accelerate both AI training and inference, and are easily enabled using automatic mixed precision features in the TensorFlow and PyTorch frameworks. Developers can achieve 3x training speedups by adding just two lines of code to their TensorFlow projects.
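The exact mechanics depend on the framework version, but as a minimal sketch (assuming TensorFlow 1.14 or later, where the automatic mixed precision graph rewrite is exposed under tf.train.experimental), the change amounts to wrapping an existing optimizer:

```python
# Minimal sketch: enabling automatic mixed precision (AMP) in a TensorFlow 1.x
# training script. The rewrite runs matmuls and convolutions in FP16 on Tensor
# Cores while keeping numerically sensitive ops and accumulations in FP32.
import tensorflow as tf

optimizer = tf.train.AdamOptimizer(learning_rate=1e-4)

# The key line: wrap the optimizer with the mixed-precision graph rewrite,
# which also enables automatic loss scaling.
optimizer = tf.train.experimental.enable_mixed_precision_graph_rewrite(optimizer)

# ...build the model and call optimizer.minimize(loss) as usual...
# In NVIDIA's NGC TensorFlow containers, setting the environment variable
# TF_ENABLE_AUTO_MIXED_PRECISION=1 achieves a similar effect without code changes.
```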

For computer vision, as the table below shows, when comparing the same number of processors, the NVIDIA T4 is faster, 7x more power-efficient and far more affordable. NVIDIA V100, designed for AI training, is 2x faster and 2x more energy-efficient than CPUs on inference.

Table 1: Inference on ResNet-50.

Metric | Two-Socket Intel Xeon 9282 | NVIDIA V100 (Volta) | NVIDIA T4 (Turing)
ResNet-50 Inference (images/sec) | 7,878 | 7,844 | 4,944
# of Processors | 2 | 1 | 1
Total Processor TDP | 800 W | 350 W | 70 W
Energy Efficiency (using TDP) | 10 img/sec/W | 22 img/sec/W | 71 img/sec/W
Performance per Processor (images/sec) | 3,939 | 7,844 | 4,944
GPU Performance Advantage | 1.0 (baseline) | 2.0x | 1.3x
GPU Energy-Efficiency Advantage | 1.0 (baseline) | 2.3x | 7.2x

Source: Intel Xeon performance; NVIDIA GPU performance
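For context on how an images-per-second figure like those above is typically produced, here is a minimal, hedged sketch that times ResNet-50 inference with PyTorch and torchvision. It illustrates the metric only; it is not the benchmark harness behind Table 1, and batch size, precision and the input pipeline all materially affect the result.

```python
# Illustrative throughput measurement (not the benchmark used above): time
# ResNet-50 inference on random data and report images/sec. FP16 is used on
# GPU when available, which is what engages Tensor Cores on V100 and T4.
import time
import torch
import torchvision.models as models

device = "cuda" if torch.cuda.is_available() else "cpu"
model = models.resnet50().eval().to(device)
batch = torch.randn(64, 3, 224, 224, device=device)
if device == "cuda":
    model, batch = model.half(), batch.half()  # FP16 for Tensor Core math

with torch.no_grad():
    for _ in range(5):                 # warm-up iterations
        model(batch)
    if device == "cuda":
        torch.cuda.synchronize()
    start = time.time()
    iters = 50
    for _ in range(iters):
        model(batch)
    if device == "cuda":
        torch.cuda.synchronize()
    elapsed = time.time() - start

print(f"{iters * batch.shape[0] / elapsed:.0f} images/sec")
```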

Compelling Value of Tensor Core GPUs for Understanding Natural Language

AI has been moving at a frenetic pace. This rapid progress is fueled by teams of AI researchers and data scientists who continue to innovate and create highly accurate and exponentially more complex AI models.

Over four years ago, computer vision was among the first applications where AI from Microsoft was able to perform at superhuman accuracy using models like ResNet-50. Today’s advanced models perform even more complex tasks like understanding language and speech at superhuman accuracy. BERT, a highly complex AI model open-sourced by Google last year, can now understand prose and answer questions with superhuman accuracy.

A measure of the complexity of AI models is the number of parameters they have. Parameters in an AI model are the variables that store information the model has learned. While ResNet-50 has 25 million parameters, BERT has 340 million, a 13x increase.
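Those counts are straightforward to verify. The sketch below, which assumes PyTorch, torchvision and the Hugging Face transformers library are installed, instantiates both architectures and counts their parameters:

```python
# Sketch: count trainable parameters for ResNet-50 and a BERT-Large-sized model.
import torchvision.models as models
from transformers import BertConfig, BertModel

resnet50 = models.resnet50()
# BERT-Large configuration: 24 layers, hidden size 1024, 16 attention heads.
bert_large = BertModel(BertConfig(hidden_size=1024, num_hidden_layers=24,
                                  num_attention_heads=16, intermediate_size=4096))

def count_params(model):
    return sum(p.numel() for p in model.parameters())

print(f"ResNet-50:  {count_params(resnet50) / 1e6:.0f}M parameters")   # ~26M
print(f"BERT-Large: {count_params(bert_large) / 1e6:.0f}M parameters") # ~335M, commonly quoted as 340M
```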

On an advanced model like BERT, a single NVIDIA T4 GPU is 59x faster than a dual-socket CPU server and 240x more power-efficient.

Table 2: Inference on BERT. Workload: Fine-Tune Inference on BERT Large dataset.

Metric | Dual Intel Xeon Gold 6240 | NVIDIA T4 (Turing)
BERT Inference, Question-Answering (sentences/sec) | 2 | 118
Processor TDP | 300 W (150 W x 2) | 70 W
Energy Efficiency (using TDP) | 0.007 sentences/sec/W | 1.7 sentences/sec/W
GPU Performance Advantage | 1.0 (baseline) | 59x
GPU Energy-Efficiency Advantage | 1.0 (baseline) | 240x

CPU server: Dual-socket Xeon Gold 6240@2.6GHz; 384GB system RAM; FP32 precision; with Intel’s TF Docker container v. 1.13.1. Note: Batch-size 4 results yielded the best CPU score.

GPU results: T4: Dual-socket Xeon Gold 6240@2.6GHz; 384GB system RAM; mixed precision; CUDA 10.1.105; NCCL 2.4.3, cuDNN 7.5.0.56, cuBLAS 10.1.105; NVIDIA driver 418.67; on TensorFlow using automatic mixed precision and XLA compiler; batch-size 4 and sequence length 128 used for all platforms tested. 
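The question-answering workload itself can be reproduced in spirit with a few lines of Python. The sketch below uses the Hugging Face transformers pipeline with a BERT-Large model fine-tuned on SQuAD; it illustrates the task measured in Table 2, not the exact benchmark script or checkpoint used above.

```python
# Sketch: extractive question answering with a fine-tuned BERT-Large model.
from transformers import pipeline

qa = pipeline("question-answering",
              model="bert-large-uncased-whole-word-masking-finetuned-squad",
              device=0)   # device=0 selects the first GPU; use -1 for CPU

result = qa(question="Which GPU architecture is the T4 built on?",
            context="The NVIDIA T4 GPU is built on the Turing architecture "
                    "and is designed for efficient data center inference.")
print(result["answer"], result["score"])
```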

Compelling Value of Tensor Core GPUs for Recommender Systems

Another key usage of AI is in recommendation systems, which are used to provide relevant content recommendations on video sharing sites, news feeds on social sites and product recommendations on e-commerce sites.

Neural collaborative filtering, or NCF, is a recommender system that uses a user’s prior interactions with items to provide recommendations. When running inference on the NCF model that is part of the MLPerf 0.5 training benchmark, NVIDIA T4 delivers 10x more performance and 20x higher energy efficiency than CPUs.

Table 3: Inference on NCF.

Metric | Single Intel Xeon Gold 6140 | NVIDIA T4 (Turing)
Recommender Inference Throughput, MovieLens (thousands of samples/sec) | 2,860 | 27,800
Processor TDP | 150 W | 70 W
Energy Efficiency (using TDP, thousands of samples/sec/W) | 19 | 397
GPU Performance Advantage | 1.0 (baseline) | 10x
GPU Energy-Efficiency Advantage | 1.0 (baseline) | 20x

CPU server: Single-socket Xeon Gold 6240@2.6GHz; 384GB system RAM; Used Intel Benchmark for NCF on TensorFlow with Intel’s TF Docker container version 1.13.1; FP32 precision. Note: Single-socket CPU config used for CPU tests as it yielded a better score than dual-socket.

GPU results: T4: Single-socket Xeon Gold 6140@2.3GHz; 384GB system RAM; CUDA 10.1.105; NCCL 2.4.3, cuDNN 7.5.0.56, cuBLAS 10.1.105; NVIDIA driver 418.40.04; on TensorFlow using automatic mixed precision and XLA compiler; batch-size: 2,048 for CPU, 1,048,576 for T4; precision: FP32 for CPU, mixed precision for T4. 
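For readers unfamiliar with the model, the sketch below shows the general shape of a neural collaborative filtering network in PyTorch: user and item embeddings feed a small multilayer perceptron that predicts an interaction score, and inference ranks candidate items for a user. It is a simplified illustration, not the MLPerf reference implementation benchmarked above.

```python
# Simplified NCF-style model: user/item embeddings -> MLP -> interaction score.
import torch
import torch.nn as nn

class SimpleNCF(nn.Module):
    def __init__(self, num_users, num_items, dim=64):
        super().__init__()
        self.user_emb = nn.Embedding(num_users, dim)
        self.item_emb = nn.Embedding(num_items, dim)
        self.mlp = nn.Sequential(
            nn.Linear(2 * dim, 128), nn.ReLU(),
            nn.Linear(128, 64), nn.ReLU(),
            nn.Linear(64, 1),
        )

    def forward(self, users, items):
        x = torch.cat([self.user_emb(users), self.item_emb(items)], dim=-1)
        return torch.sigmoid(self.mlp(x)).squeeze(-1)  # predicted interaction probability

# Inference: score 1,000 candidate items for one user and keep the top 10.
model = SimpleNCF(num_users=138_000, num_items=27_000).eval()  # MovieLens-20M-scale sizes
with torch.no_grad():
    user = torch.zeros(1000, dtype=torch.long)   # the same user, repeated per candidate
    candidates = torch.arange(1000)              # candidate item IDs
    scores = model(user, candidates)
    top10 = scores.topk(10).indices              # items to recommend
```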

Unified Platform for AI Training and Inference

The use of AI models in applications is an iterative process designed to continuously improve their performance. Data scientist teams constantly update their models with new data and algorithms to improve accuracy. These models are then updated in applications by developers.

Updates can happen monthly, weekly and even on a daily basis. Having a single platform for both AI training and inference can dramatically simplify and accelerate this process of deploying and updating AI in applications.

NVIDIA’s data center GPU computing platform leads the industry in performance by a large margin for AI training, as demonstrated by the standard AI benchmark, MLPerf. And the NVIDIA platform provides compelling value for inference, as the data presented here attests. That value increases with the growing complexity and progress of modern AI.

To help fuel the rapid progress in AI, NVIDIA has deep engagements with the ecosystem and constantly optimizes software, including key frameworks like TensorFlow, PyTorch and MXNet, as well as inference software like TensorRT and TensorRT Inference Server.

NVIDIA also regularly publishes pre-trained AI models for inference and model scripts for training models using your own data. All of this software is freely made available as containers, ready to download and run from NGC, NVIDIA’s hub for GPU-accelerated software.

Get the full story about our comprehensive AI platform.


Plowing AI, Startup Retrofits Tractors with Autonomy

Colin Hurd, Mark Barglof and Quincy Milloy aren’t your conventional tech entrepreneurs. And that’s not just because their agriculture technology startup is based in Ames, Iowa.

Smart Ag is developing autonomy and robotics for tractors in a region more notable for its corn and soybeans than software or silicon.

Hurd, Barglof and Milloy, all Iowa natives, founded Smart Ag in 2015 and landed a total of $6 million in seed funding, $5 million of which came from Stine Seed Farm, an affiliate of Stine Seed Co. Other key investors included Ag Startup Engine, which backs Iowa State University startups.

The company is running widespread pilot tests of its tractor autonomy system and plans to commercialize the technology for row crops by 2020.

Smart Ag is a member of the NVIDIA Inception virtual accelerator, which provides marketing and technology support to AI startups.

A team of two dozen employees has been busy building its GPU-enabled autonomous software and robotic hardware system, which operates tractors that pull grain carts during harvest.

“We aspire to being a software company first, but we’ve had to make a lot of hardware in order to make a vehicle autonomous,” said Milloy.

Wheat from Chaff

Smart Ag primarily works today with traditional row crop (corn and soybean) producers and cereal grain (wheat) producers. During harvest, these farmers use a tractor to pull a grain cart in conjunction with the harvesting machine, or combine, which separates the wheat from the chaff or corn from the stalk. Once the combine’s storage bin is full, the tractor with the grain cart pulls alongside for the combine to unload into the cart.

That’s where autonomous tractors come in.

Farm labor is scarce. In California, 55 percent of farms surveyed said they had experienced labor shortages in 2017, according to a report from the California Farm Bureau Federation.

Smart Ag is developing its autonomous tractor to pull a grain cart, addressing the lack of drivers available for this job.

Harvest Tractor Autonomy

Farmers can retrofit a John Deere 8R Series tractor using the company’s AutoCart system. It provides controllers for steering, acceleration and braking, as well as cameras, radar and wireless connectivity. An NVIDIA Jetson Xavier powers its perception system, fusing Smart Ag’s custom agricultural object detection model with other sensor data to give the tractor awareness of its surroundings.

“The NVIDIA Jetson AGX Xavier has greatly increased our perception capabilities — from the ability to process more camera feeds to the fusion of additional sensors —  it has enabled the path to develop and rapidly deploy a robust safety system into the field,” Milloy said.

Customers can access the system from a mobile device or web browser to control their tractors.

Smart Ag’s team gathered more than 1 million images to train the image recognition system on AWS, tapping into NVIDIA GPUs. The startup’s custom image recognition algorithms allow its autonomous tractor to avoid people and other objects in the field, find the combine for unloading, and return to a semi truck, where the driverless grain cart unloads the grain for final transport to a grain storage facility.

Smart Ag has more than 12 pilot tests under its belt and uses those to gather more data to refine its algorithms. The company plans to expand its test base to roughly 20 systems operating during harvest in 2019 in preparation for its commercial launch in 2020.

“We’ve been training for the past year and a half. The system can get put out today in deployment, but we can always get higher accuracy,” Milloy said.


ACR AI-LAB and NVIDIA Make AI in Hospitals Easy on IT, Accessible to Every Radiologist

For radiology to benefit from AI, there needs to be easy, consistent and scalable ways for hospital IT departments to implement the technology. It’s a return to a service-oriented architecture, where logical components are separated and can each scale individually, and an efficient use of the additional compute power these tools require.

AI is coming from dozens of vendors as well as internal innovation groups, and needs a place within the hospital network to thrive. That’s why NVIDIA and the American College of Radiology (ACR) have published a Hospital AI Reference Architecture Framework. It helps hospitals easily get started with AI initiatives.

A Cookbook to Make AI Easy

The Hospital AI Reference Architecture Framework was published at yesterday’s annual ACR meeting for public comment. This follows the recent launch of the ACR AI-LAB, which aims to standardize and democratize AI in radiology. The ACR AI-LAB uses infrastructure such as NVIDIA GPUs and the NVIDIA Clara AI toolkit, as well as GE Healthcare’s Edison platform, which helps bring AI from research into FDA-cleared smart devices.

The Hospital AI Reference Architecture Framework outlines how hospitals and researchers can easily get started with AI initiatives. It includes descriptions of the steps required to build and deploy AI systems, and provides guidance on the infrastructure needed for each step.

Hospital AI Architecture Framework

To drive an effective AI program within a healthcare institution, there must first be an understanding of the workflows involved, the compute needs and the data required. It starts from a foundation of enabling better insights from patient data with easy-to-deploy compute at the edge.

Using a transfer client, seed models can be downloaded from a centralized model store. A clinical champion uses an annotation tool to locally create data that can be used for fine-tuning the seed model or training a new model. Then, using the training system with the annotated data, a localized model is instantiated. Finally, an inference engine is used to conduct validation and ultimately inference on data within the institution.

These four workflows sit atop AI compute infrastructure, which can be accelerated with NVIDIA GPU technology for best performance, alongside storage for models and annotated studies. These workflows tie back into other hospital systems such as PACS, where medical images are archived.

Three Magic Ingredients: Hospital Data, Clinical AI Workflows, AI Computing

Healthcare institutions don’t have to build the systems to deploy AI tools themselves.

This scalable architecture is designed to support and provide computing power to solutions from different sources. GE Healthcare’s Edison platform now uses NVIDIA’s TensorRT Inference Server (TRT-IS) capabilities to help AI run in an optimized way within GPU-powered software and medical devices. This integration makes it easier to deliver AI from multiple vendors into clinical workflows — and is the first example of the AI-LAB’s efforts to help hospitals adopt solutions from different vendors.

Together, Edison with TRT-IS offers a ready-made device inferencing platform that is optimized for GPU-compliant AI, so models built anywhere can be deployed in an existing healthcare workflow.

Hospitals and researchers are empowered to embrace AI technologies without building their own standalone technology or yielding their data to the cloud, which has privacy implications.


By the Book: AI Making Millions of Ancient Japanese Texts More Accessible

Natural disasters aren’t just threats to people and buildings; they can also erase history by destroying rare archival documents. As a safeguard, scholars in Japan are digitizing the country’s centuries-old paper records, typically by taking a scan or photo of each page.

But while this method preserves the content in digital form, it doesn’t mean researchers will be able to read it. Millions of physical books and documents were written in an obsolete script called Kuzushiji, legible to fewer than 10 percent of Japanese humanities professors.

“We end up with billions of images which will take researchers hundreds of years to look through,” said Tarin Clanuwat, researcher at Japan’s ROIS-DS Center for Open Data in the Humanities. “There is no easy way to access the information contained inside those images yet.”

Extracting the words on each page into machine-readable, searchable form takes an extra step: transcription, which can be done either by hand or through a computer vision method called optical character recognition, or OCR.

Clanuwat and her colleagues are developing a deep learning OCR system to transcribe Kuzushiji writing — used for most Japanese texts from the 8th century to the start of the 20th — into modern Kanji characters.

Clanuwat said GPUs are essential for both training and inference of the AI.

“Doing it without GPUs would have been inconceivable,” she said. “GPU not only helps speed up the work, but it makes this research possible.”

Parsing a Forgotten Script

Before the standardization of the Japanese language in 1900 and the advent of modern printing, Kuzushiji was widely used for books and other documents. Though millions of historical texts were written in the cursive script, just a few experts can read it today.

Only a tiny fraction of Kuzushiji texts have been converted to modern scripts — and it’s time-consuming and expensive for an expert to transcribe books by hand. With an AI-powered OCR system, Clanuwat hopes a larger body of work can be made readable and searchable by scholars.

She collaborated on the OCR system with Asanobu Kitamoto from her research organization and Japan’s National Institute of Informatics, and Alex Lamb of the Montreal Institute for Learning Algorithms. Their paper was accepted in 2018 to the Machine Learning for Creativity and Design workshop at the prestigious NeurIPS conference.

Using a labeled dataset of 17th to 19th century books from the National Institute of Japanese Literature, the researchers trained their deep learning model on NVIDIA GPUs, including the TITAN Xp. Training the model took about a week, Clanuwat said, but “would be impossible” to train on CPU.

Kuzushiji has thousands of characters, with many occurring so rarely in datasets that it is difficult for deep learning models to recognize them. Still, the average accuracy of the researchers’ KuroNet document recognition model is 85 percent — outperforming prior models.

The newest version of the neural network can recognize more than 2,000 characters. For easier documents with fewer than 300 character types, accuracy jumps to about 95 percent, Clanuwat said. “One of the hardest documents in our dataset is a dictionary, because it contains many rare and unusual words.”

One challenge the researchers faced was finding training data representative of the long history of Kuzushiji. The script changed over the hundreds of years it was used, while the training data came from the more recent Edo period.

Clanuwat hopes the deep learning model could expand access to Japanese classical literature, historical documents and climatology records to a wider audience.


Paige.AI Ramps Up Cancer Pathology Research Using NVIDIA Supercomputer

An accurate diagnosis is key to treating cancer — a disease that kills 600,000 people a year in the U.S. alone — and AI can help.

Common forms of the disease, like breast, lung and prostate cancer, can have good recovery rates when diagnosed early. But diagnosing the tumor, the work of pathologists, can be a very manual, challenging and time-consuming process.

Pathologists traditionally interpret dozens of slides per cancer case, searching for clues pointing to a cancer diagnosis. For example, there can be more than 60 slides for a single breast cancer case and, out of those, only a handful may contain important findings.

AI can help pathologists become more productive by accelerating and enhancing their workflow as they examine massive amounts of data. It gives the pathologists the tools to analyze images, provide insight based on previous cases and diagnose faster by pinpointing anomalies.

Paige.AI is applying AI to pathology to increase diagnostic accuracy and deliver better patient outcomes, starting with prostate and breast cancer. Earlier this year, Paige.AI was granted “Breakthrough Designation” by the U.S. Food and Drug Administration, the first such designation for AI in cancer diagnosis.

The FDA grants the designation for technologies that have the potential to provide for more effective diagnosis or treatment for life-threatening or irreversibly debilitating diseases, where timely availability is in the best interest of patients.

To find breakthroughs in cancer diagnosis, Paige.AI will access millions of pathology slides, providing the volume of data necessary to train and develop cutting-edge AI algorithms.

NVIDIA DGX-1 is proving to be an important research tool for many of the world’s leading AI researchers.

To make sense of all this data, Paige.AI uses an AI supercomputer made up of 10 interconnected NVIDIA DGX-1 systems. The supercomputer has the enormous computing power of over 10 petaflops necessary to develop a clinical-grade model for pathology and, for the first time, bridge the gap from research to a clinical setting that benefits future patients.

One example of how NVIDIA’s technology is already being used is a recent study by Paige.AI that used seven NVIDIA DGX-1 systems to train neural networks on a new dataset to detect prostate cancer. The dataset consisted of 12,160 slides, two orders of magnitude larger than previous datasets in pathology. The researchers achieved near perfect accuracy on a test set consisting of 1,824 real-world slides without any manual image-annotation.

By minimizing the time pathologists spend processing data, AI can help them focus their time on analyzing it. This is especially critical given the short supply of pathologists.

According to The Lancet medical journal, there is a single pathologist for every million people in sub-Saharan Africa and one for every 130,000 people in China. In the United States, there is one for roughly every 20,000 people; however, studies predict that number will shrink to one for about every 30,000 people by 2030.

AI gives a big boost to computational pathology by enabling quantitative analysis of the structures seen under a microscope and of cell biology. This advancement is made possible by combining novel image analysis, computer vision and machine learning techniques.

“With the help of NVIDIA technology, Paige.AI is able to train deep neural networks from hundreds of thousands of gigapixel images of whole slides. The result is clinical-grade artificial intelligence for pathology,” said Dr. Thomas Fuchs, co-founder and chief scientific officer at Paige.AI. “Our vision is to help pathologists improve the efficiency of their work, for researchers to generate new insights, and clinicians to improve patient care.”

 

Feature image credit: Dr. Cecil Fox, National Cancer Institute, via Wikimedia Commons.


Intel Drives Innovation across the Software Stack with Open Source for AI and Cloud

What’s New: Intel is hosting the annual Open Source Technology Summit (OSTS) May 14-16. What started as an internal conference in 2004 with a few dozen engineers now brings together 500 participants. This year is the most open yet, with leaders from Alibaba*, Amazon*, AT&T*, Google*, Huawei*, JD.com*, Microsoft*, MontaVista*, Red Hat*, SUSE* and Wind River* taking part in discussions of open source software that is optimized for Intel hardware and will drive the next generation of data-centric technology in areas such as containers, artificial intelligence (AI), machine learning and other cloud to edge to device workloads.

“OSTS is at its heart a technology conference, and it’s the depth of technical content, engineering engagement and community focus that make the summit so valuable. This year we’re open-sourcing our open source summit, inviting customers, partners and industry stakeholders for the first time. I’m excited by the opportunity to connect the community with the amazing people who are driving open source at Intel.”
–Imad Sousou, Intel corporate vice president and general manager of System Software Products

The Details: The latest contributions Intel is sharing at OSTS represent critical advances in:

  • Modernizing core infrastructure for uses well-suited to Intel architecture:
    • The ModernFW Initiative aims to remove legacy code and modularize design for scalability and security. By delivering just enough code to boot the kernel, this approach can help reduce exposure to security risks and make management easier for users.
    • rust-vmm offers a set of common hypervisor components, developed by Intel with industry leaders including Alibaba, Amazon, Google and Red Hat, to deliver use-case-specific hypervisors. Intel has released a special-purpose cloud hypervisor based on rust-vmm with partners to provide a more secure, higher-performance container technology designed for cloud-native environments.
    • Intel is also committing to advancing critical system infrastructure projects by assigning developers to contribute code, as well as incorporating our “0-day Continuous Integration” best practices into technologies beyond the Linux* kernel. Projects Intel plans to contribute to include (but are not limited to) bash*, chrony*, the Fuzzing Project*, GnuPG*, libffi*, the Linux Kernel Self Protection Project*, OpenSSH*, OpenSSL* and the R* programming language.
  • Enhancing Intel Linux-based solutions for developers and partners: Intel’s Clear Linux* Distribution is adding Clear Linux Developer Edition, which includes a new installer and store, bringing together toolkits to give developers an operating system with all Intel hardware features already enabled. Additionally, Clear Linux usages are expanding to provide end-to-end integration and optimization for Intel hardware features and key workloads supporting the Deep Learning and Data Analytics software stacks. The performance, security, ease-of-use and customization advantages make Clear Linux a great choice for Linux developers.
    • The Deep Learning Reference Stack is an integrated, highly-performant open source stack optimized for Intel® Xeon® Scalable Processors. This stack includes Intel® Deep Learning Boost (Intel DL Boost) and is designed to accelerate AI use cases such as image recognition, object detection, speech recognition and language translation.
    • The Data Analytics Reference Stack was developed to help enterprises analyze, classify, recognize and process large amounts of data built on Intel® Xeon® Scalable platforms using Apache Hadoop and Apache Spark*.
  • Enabling new usages across automotive and industrial automation: In a world where functional safety is increasingly important, workload consolidation is both complex and critical. And with the growing reliance on software-defined systems, virtualization has never been more important. Intel is working to transform the software-defined environment to support a mix of safety critical, non-safety critical and time critical workloads to help support automotive, industrial automation and robotics uses.
    • Fusion Edge Stacks support the consolidated workloads that today’s connected devices demand using the ACRN* device hypervisor, Clear Linux OS, Zephyr Project* and Android*.
    • The Intel Robot SDK brings together the best of Intel hardware and software in one resource, simplifying the process of creating AI-enabled robotics and automation solutions, with an optimized computer vision stack.

    Why It Matters: Open source powers the software-defined infrastructure that transformed the modern data center and ushered in the data-centric era. Today, the vast majority of the public cloud runs on open source software; new contributions by Intel are poised to drive a future where everything is software-defined, including new areas such as automotive, industrial and retail.

    With more than 15,000 software engineers, Intel invests in software and in standards initiatives to optimize workloads and unlock the performance of its processors. In addition to significant contributions to the Linux kernel, Chromium OS* and OpenStack*, Intel’s leadership in the open source community drives industry advancements that fuel new models for hardware and software interaction in emerging workloads.

    Intel is in a unique position to bring together key industry players to address the complexity of building for diverse architectures and workloads and enable faster deployments of new innovations at scale. Software is a key technology pillar for Intel to fully realize the advancements in architecture, process, memory, interconnect and security.


    Bird’s-AI View: Harnessing Drones to Improve Traffic Flow

    Traffic. It’s one of the most commonly cited frustrations across the globe.

    It consumed nearly 180 hours of productive time for the average U.K. driver last year. German drivers lost an average of 120 hours. U.S. drivers lost nearly 100 hours.

    Because time is too precious to waste, RCE Systems — a Brno, Czech Republic-based startup and member of the NVIDIA Inception program — is taking its tech to the air to improve traffic flow.

    Its DataFromSky platform combines trajectory analysis, computer vision and drones to ease congestion and improve road safety.

    AI in the Sky

    Traffic analysis has traditionally been based on video footage from fixed cameras, mounted at specific points along roads and highways.

    This can severely limit the analysis of traffic which is, by nature, constantly moving and changing.

    Capturing video from a bird’s-eye perspective via drones allows RCE Systems to gain deeper insights into traffic.

    Beyond monitoring objects captured on video, the DataFromSky platform interprets movements using AI to provide highly accurate telemetric data about every object in the traffic flow.

    RCE Systems trains its deep neural networks using thousands of hours of video footage from around the globe, shot in various weather conditions. The training takes place on NVIDIA GPUs using Caffe and TensorFlow.

    These specialized neural networks can then recognize objects of interest and continually track them in video footage.

    The data captured via this AI process is used in numerous research projects, enabling deeper analysis of object interaction and new behavioral models of drivers in specific traffic situations.

    Ultimately, this kind of data will also be crucial for the development of autonomous vehicles.

    Driving Impact

    The DataFromSky platform is still in its early days, but its impact is already widespread.

    RCE Systems is working on a system for analyzing safety at intersections, based on driver behavior. This includes detecting situations where accidents were narrowly avoided and then determining root causes.

    By understanding these situations better, their occurrence can be avoided — making traffic flow easier and preventing vehicle damage as well as potential loss of life.

    Toyota Europe used RCE Systems’ findings from the DataFromSky platform to create probabilistic models of driver behavior as well as deeper analysis of interactions with roundabouts.

    Leidos used insights gathered by RCE Systems to calibrate traffic simulation models as part of its projects to examine narrowing freeway lanes and shoulders in Dallas, Seattle, San Antonio and Honolulu.

    And the value of RCE Systems’ analysis is not limited to vehicles. The Technical University of Munich has used it to perform a behavioral study of cyclists and pedestrians.

    Moving On

    RCE Systems is looking to move to NVIDIA Jetson AGX Xavier in the future to accelerate its AI-at-the-edge solution. The company is currently developing a “monitoring drone” capable of evaluating image data in flight, in real time.

    It could one day replace a police helicopter during high-speed chases or act as a mobile surveillance system for property protection.


    NVIDIA and Red Hat Team to Accelerate Enterprise AI

    For enterprises looking to get their GPU-accelerated AI and data science projects up and running more quickly, life just got easier.

    At Red Hat Summit today, NVIDIA and Red Hat introduced the combination of NVIDIA’s GPU-accelerated computing platform and the just-announced Red Hat OpenShift 4 to speed on-premises Kubernetes deployments for AI and data science.

    The result: Kubernetes management tasks that used to take an IT administrator the better part of a day can now be completed in under an hour.

    More GPU Acceleration, Less Deployment Hassle

    This collaboration comes at a time when enterprises are relying on AI and data science to turn their vast amounts of data into actionable intelligence.

    But meaningful AI and data analytics work requires accelerating the full stack of enterprise IT software with GPU computing. Every layer of software — from NVIDIA drivers to container runtimes to application frameworks — needs to be optimized.

    Our CUDA parallel computing architecture and CUDA-X acceleration libraries have been embraced by a community of more than 1.2 million developers for accelerating applications across a broad set of domains — from AI to high-performance computing to VDI.

    And because NVIDIA’s common architecture runs on every computing device imaginable — from a laptop to the data center to the cloud — the investment in GPU-accelerated applications is easy to justify and just makes sense.

    Accelerating AI and data science workloads is only the first step, however. Getting the optimized software stack deployed the right way in large-scale, GPU-accelerated data centers can be frustrating and time consuming for IT organizations. That’s where our work with Red Hat comes in.

    Red Hat OpenShift is the leading enterprise-grade Kubernetes platform in the industry. Advancements in OpenShift 4 make it easier than ever to deploy Kubernetes across a cluster. Red Hat’s investment in Kubernetes Operators, in particular, reduces administrative complexity by automating many routine data center management and application lifecycle management tasks.

    NVIDIA has been working on its own GPU operator to automate much of the work IT managers previously did through shell scripts, such as installing device drivers, ensuring the proper GPU container runtimes are present on all nodes in the data center, and monitoring GPUs.

    Thanks to our work with Red Hat, once the cluster is set up, you simply run the GPU operator to add the necessary dependencies to the worker nodes in the cluster. It’s just that easy. This can make it as simple for an organization to get its GPU-powered data center clusters up and running with OpenShift 4 as it is to spin up new cloud resources.
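    To make that concrete: once the worker nodes have been prepared, a workload asks for GPUs through the standard nvidia.com/gpu extended resource. The sketch below uses the official Kubernetes Python client to launch a container with one GPU; it is a generic illustration rather than NVIDIA’s or Red Hat’s tooling, and the pod and image names are placeholders.

```python
# Sketch: request one GPU for a container via the nvidia.com/gpu resource,
# which the scheduler can satisfy once the nodes expose GPUs (e.g. after the
# GPU operator has installed drivers and the GPU container runtime).
from kubernetes import client, config

config.load_kube_config()  # or config.load_incluster_config() inside a pod

pod = client.V1Pod(
    metadata=client.V1ObjectMeta(name="gpu-smoke-test"),   # placeholder name
    spec=client.V1PodSpec(
        restart_policy="Never",
        containers=[
            client.V1Container(
                name="cuda",
                image="nvidia/cuda:10.1-base",              # placeholder image
                command=["nvidia-smi"],                     # prints the visible GPUs
                resources=client.V1ResourceRequirements(
                    limits={"nvidia.com/gpu": "1"}          # ask for one GPU
                ),
            )
        ],
    ),
)

client.CoreV1Api().create_namespaced_pod(namespace="default", body=pod)
```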

    Preview and Early Access Program

    At Red Hat Summit, in booth 1039, we’re showing a preview of how easy it is to set up bare-metal GPU clusters with OpenShift and GPU operators.

    Also, you won’t want to miss Red Hat Chief Technology Officer Chris Wright’s keynote on Thursday when NVIDIA Vice President of Compute Software Chris Lamb will join him on stage to demonstrate how our technologies work together and discuss our collaboration in further detail.

    Red Hat and NVIDIA are inviting our joint customers to take part in a white-glove early access program. Customers who want to learn more or participate can sign up at https://www.openshift.com/accelerated-ai.


    AI of the Storm: Deep Learning Analyzes Atmospheric Events on Saturn

    Saturn is many times the size of Earth, so it’s only natural that its storms are more massive — lasting months, covering thousands of miles, and producing lightning bolts thousands of times more powerful.

    While scientists have access to galaxies of data on these storms, its sheer volume leaves traditional methods inadequate for studying the planet’s weather systems in their entirety.

    Now AI is being used to launch into that trove of information. Researchers from University College London and the University of Arizona are working with data collected by NASA’s Cassini spacecraft, which spent 13 years studying Saturn before disintegrating in the planet’s atmosphere in 2017.

    A recently published Nature Astronomy paper describes how the scientists’ deep learning model can reveal previously undetected atmospheric features on Saturn, and provide a clearer view of the planet’s storm systems at a global level.

    In addition to providing new insights about Saturn, the AI can shed light on the behavior of planets both within and beyond our solar system.

    “We have these missions that go around planets for many years now, and much of this data basically sits in an archive and it’s not being looked at,” said Ingo Waldmann, deputy director of the University College London’s Centre for Space Exoplanet Data. “It’s been difficult so far to look at the bigger picture of this global dataset because people have been analyzing data by hand.”

    The researchers used an NVIDIA V100 GPU, the most advanced data center GPU, for both training and inference of their neural networks.

    Parting the Clouds

    Scientists studying the atmosphere of other planets take one of two strategies, Waldmann says. Either they conduct a detailed manual analysis of a small region of interest, which could take a doctoral student years — or they simplify the data, resulting in rough, low-resolution findings.

    “The physics is quite complicated, so the data analysis has been either quite old-fashioned or simplistic,” Waldmann said. “There’s a lot of science one can do by using big data approaches on old problems.”

    Thanks to the Cassini satellite, researchers have terabytes of data available to them. Primarily using unsupervised learning, Waldmann and Caitlin Griffith, his co-author from the University of Arizona, trained their deep learning model on data from the satellite’s mapping spectrometer.

    This data is commonly collected on planetary missions, Waldmann said, making it easy to apply their AI model to study other planets.

    The researchers saw speedups of 30x when training their deep learning models on a single V100 GPU compared to CPU. They’re now transitioning to using clusters of multiple GPUs. For inference, Waldmann said the GPU was around twice as fast as using a CPU.

    Using the AI model, the researchers were able to analyze a months-long electrical storm that churned through Saturn’s southern hemisphere in 2008. Scientists had previously detected a bright ammonia cloud from satellite images of the storm — a feature more commonly spotted on Jupiter, but rarely seen on Saturn.

    Waldmann and Griffith’s AI analyzed data from this months-long electrical storm on Saturn. The left image shows the planet in colors similar to how the human eye would see it, while the right image is color-enhanced, making the storm stand out more clearly. (Image credit: NASA/JPL/Space Science Institute)

    Waldmann and Griffith’s neural network found that the ammonia cloud visible by eye was just the tip of a “massive upwelling” of ammonia hidden under a thin layer of other clouds and gases.

    “What you can see by eye is just the strongest bit of that ammonia feature,” Waldmann said. “It’s just the tip of the iceberg, literally. The rest is not visible by eye — but it’s definitely there.”

    To Infinity and Beyond

    For researchers like Waldmann, these findings are just the first step. Deep learning can provide planetary scientists for the first time with depth and breadth at once, producing detailed analyses that also cover vast geographic regions.

    “It will tell you very quickly what the global picture is and how it all connects together,” said Waldmann. “Then researchers can go and look at individual spots that are interesting within a particular system, rather than blindly searching.”

    A better understanding of Saturn’s atmosphere can help scientists analyze how our solar system behaves, and provide insights that can be extrapolated to planets around other stars.

    Already, the researchers are extending their model to study features on Mars, Venus and Earth using transfer learning — which they were surprised to learn “works really well between planets.”

    While Venus and Earth are almost identical in size, Venus has no global plate tectonics. In collaboration with the Observatoire de Paris, the team is starting a project to analyze Venus’s cloud structure and planetary surface to understand why the planet lacks tectonic plates.

    Rather than atmospheric features, the researchers’ Mars project focuses on studying the planet’s surface. Data from the Mars Reconnaissance Orbiter can create a global analysis that scientists can use to deduce where ancient water was most likely present, and to determine where the next Mars rover should land.

    The underlying pattern recognition algorithm can be extended even further, Waldmann said. On Earth, it can be repurposed to spot rogue fishing vessels to preserve protected environments. And across the solar system on Jupiter, a transfer learning approach can train an AI model to analyze how the planet’s storms change over time.

    Waldmann says there’s relatively easy access to training data — creating an open field of opportunities for researchers.

    “This is the beautiful thing about planetary science,” he said. “All of the data for all of the planets is publicly available.”

    Main image, captured in 2011, shows the largest storm observed on Saturn by the Cassini spacecraft. (Image credit NASA/JPL-Caltech/SSI)


    Israel’s Holocaust Museum Embracing AI to Help Visitors Draw Insights from its Vast Archives

    Yad Vashem, the world’s preeminent Holocaust memorial center, is dedicated to keeping alive for future generations the memory of the 6 million Jews who perished at the hands of the German Nazis and their collaborators.

    But its World Holocaust Remembrance Center — a source for documentation used by scholars worldwide — is overwhelmed with difficult-to-find digital media documenting the lives of victims and survivors.

    The Jerusalem-based organization is turning to AI to help identify, organize and link photos and other historical documents amid its ocean of data, for easier discovery. That’s because the documentation, gathered over decades of submissions and discoveries, and now almost fully digitized, is a source for Holocaust scholars globally.

    A destination for a million visitors each year — six U.S. presidents have visited the site — Yad Vashem has archives that include unique, searing video testimonies, short films, photos, personal written accounts, Nazi documentation, and audio files. In addition to remembering Hitler’s victims, it pays tribute to the non-Jews who put their lives at risk trying to save them.

    People worldwide last week recognized Holocaust Remembrance Day.

    Twice the Data of Library of Congress

    Its 800 million digital assets — which comprise over 4 petabytes of data (more than twice that held by the U.S. Library of Congress) — make it a daunting challenge for the institution to keep up with indexing this history for researchers, let alone reach a younger generation.

    Using deep neural networks, Yad Vashem’s team can let image-recognition algorithms help index and categorize its digital history. This could lead to finding new connections and stories on Holocaust victims, according to Michael Lieber, chief information officer at Yad Vashem.

    Lieber is optimistic that AI will help better identify resources to tell stories of Holocaust victims and survivors on its social media accounts. That could help keep it in touch with younger audiences, he said.

    He’s also hopeful that researchers may use deep learning in ways to surface new historical information that couldn’t otherwise be discovered.

    “We are among the first institutions in the world dealing with cultural heritage that decided to have a digital copy of everything because that is the way to get to a much wider audience globally,” said Lieber.

    Improving Search for Family History

    Many individuals visit Yad Vashem to research what happened to grandparents and great grandparents and piece together their family history. The problem is that the collection of digitized data, which could double in years to come, is difficult to search.

    Yad Vashem’s technology team aims to change that by tapping into deep learning driven by high performance computing.

    It plans to harness the supercomputing power of the NVIDIA DGX-1 AI system to help organize and augment its history using deep learning. DGX-1 offers the power of hundreds of CPU-based servers in a single system capable of over a petaFLOP of AI computing power.

    The DGX-1 puts Yad Vashem alongside the world’s most innovative organizations deploying AI to address their challenges, said Yuval Mazor, senior solutions architect at NVIDIA.

    “They get tangible benefits from the application of AI,” he said. “For example, Yad Vashem can use video analytics for understanding and predicting museum traffic and the impact of individual exhibits, as well as for extracting deep insights from the wealth of historical data. These can help Yad Vashem in its primary mission, which is to reach and educate as many people as possible.”

    Unsupervised learning holds the promise for trained neural networks to create meta-tags for digital artifacts, allowing deep learning to connect the dots on all kinds of information, Lieber said.

    “If you manage to locate a prison card in the Mauthausen camp, the system will know that it is an inmate card,” he said. “It will direct you to the relevant data fields and documents, and you will be able to locate and identify types of documents and provide additional information without human intervention.”

    The alternative would be to have legions of people label hundreds of millions of digital media assets and continue to keep track and make updates on databases.

    NVIDIA research and development staff in Israel is partnering with Yad Vashem on the effort.

     

     
