Researchers Make Movies of the Brain with CUDA

When colleagues told Sally Epstein they sped up image processing by three orders of magnitude for a client’s brain-on-a-chip technology, she responded like any trained scientist. Go back and check your work, the biomedical engineering Ph.D. told them.

Yet it was true. The handful of researchers at Cambridge Consultants had devised a basket of techniques to process an image on GPUs in an NVIDIA DGX-1 system in 300 milliseconds, a 3,000x boost over the 18 minutes the task took on an Intel Core i9 CPU.

The achievement makes it possible for researchers to essentially watch movies of neurons firing in real time using the brain-on-a-chip technology from NETRI, a French startup.

“Animal studies revolutionized medicine. This is the next step in testing for areas like discovering new drugs,” said Epstein, head of Strategic Technology at Cambridge Consultants, which develops products and technologies for a wide variety of established companies and startups such as NETRI.

The startup designs chips that sport 3D microfluidic channels to host neural tissues and a CMOS camera sensor with polarizing filters to detect individual neurons firing. It hopes its precision imaging can speed the development of novel treatments for neurological disorders such as Alzheimer’s disease.

Facing a Computational Bottleneck

NETRI’s chips generate 100-megapixel images at up to 1,000 frames per second — the equivalent of a hundred 4K gaming systems running at 120fps. Besides spawning tons of data, they use highly complex math.

As a result, processing a single second of a recording took NETRI 12 days, an unacceptable delay. So, the startup turned to Cambridge Consultants to bust through the bottleneck.
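The figures quoted so far can be cross-checked with a few lines of arithmetic. The sketch below assumes the 18-minute CPU time and the 300-millisecond GPU time are both per frame, which the article implies but does not state outright; the constant names are illustrative.

```python
# Back-of-the-envelope check of the throughput figures quoted in the article.
# Assumption: the 18-minute (CPU) and 300 ms (GPU) timings are per frame.

FRAMES_PER_SECOND = 1_000        # frame rate quoted for NETRI's chip
MEGAPIXELS_PER_FRAME = 100       # image size quoted for NETRI's chip

CPU_SECONDS_PER_FRAME = 18 * 60  # 18 minutes on the CPU
GPU_SECONDS_PER_FRAME = 0.3      # 300 milliseconds on the GPUs
SECONDS_PER_DAY = 24 * 60 * 60

# Time to process one second of recording (1,000 frames).
cpu_days = FRAMES_PER_SECOND * CPU_SECONDS_PER_FRAME / SECONDS_PER_DAY
gpu_minutes = FRAMES_PER_SECOND * GPU_SECONDS_PER_FRAME / 60

# Ratio of the two per-frame timings.
speedup = CPU_SECONDS_PER_FRAME / GPU_SECONDS_PER_FRAME

# Raw pixel rate compared with one 4K (3840 x 2160) display at 120 fps.
sensor_pixels_per_second = MEGAPIXELS_PER_FRAME * 1_000_000 * FRAMES_PER_SECOND
gaming_pixels_per_second = 3840 * 2160 * 120
equivalent_4k_systems = sensor_pixels_per_second / gaming_pixels_per_second

print(f"CPU: {cpu_days:.1f} days per recorded second")
print(f"GPU: {gpu_minutes:.1f} minutes per recorded second")
print(f"Speedup: about {speedup:.0f}x")
print(f"Pixel rate: about {equivalent_4k_systems:.0f} 4K/120fps systems")
```

Under that per-frame assumption, the CPU figure works out to about 12.5 days per recorded second, in line with the 12 days quoted, and the ratio of the two timings is of the same order as the 3,000x cited.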

“Our track record in scientific and biological imaging turned out to be very relevant,” said Monty Barlow, Director of Strategic Technology at Cambridge Consultants. And when NETRI heard about the 3,000x boost, “they trusted us even though we didn’t trust ourselves at first,” he quipped.

Leveraging Math, Algorithms and GPUs

A handful of specialists at Cambridge Consultants delivered the 3,000x speedup using multiple techniques. For example, math and algorithm experts employed a mix of Gaussian filters, multivariate calculus and other tools to eliminate redundant tasks and reduce peak RAM requirements.

Software developers migrated NETRI’s Python code to CuPy to take advantage of the massive parallelism of NVIDIA’s CUDA software. And hardware specialists optimized the code to fit into GPU memory, eliminating unnecessary data transfers inside the DGX-1.
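As an illustration of what such a migration can look like, the sketch below writes a Gaussian blur against a generic array module `xp`, which is CuPy when a working GPU is available and NumPy otherwise. This is not NETRI's or Cambridge Consultants' code: the function names, the wrap-around edge handling and the kernel size are assumptions chosen to keep the example short.

```python
import numpy as np

try:
    import cupy as cp
    float(cp.zeros(1).sum())  # touch the GPU so a missing device fails here
    xp = cp
except Exception:
    xp = np  # no CuPy or no usable GPU: fall back to the CPU


def gaussian_kernel(sigma, radius):
    """Return a normalized 1-D Gaussian kernel of length 2 * radius + 1."""
    offsets = xp.arange(-radius, radius + 1, dtype=xp.float64)
    weights = xp.exp(-0.5 * (offsets / sigma) ** 2)
    return weights / weights.sum()


def gaussian_blur(image, sigma=2.0, radius=6):
    """Separable Gaussian blur with wrap-around edges (hypothetical helper)."""
    kernel = gaussian_kernel(sigma, radius)
    out = image
    for axis in (0, 1):  # blur rows, then columns
        acc = xp.zeros_like(out)
        for k in range(2 * radius + 1):
            # Each shifted copy is a whole-array operation on the device.
            acc += kernel[k] * xp.roll(out, k - radius, axis=axis)
        out = acc
    return out


# A single bright pixel spreads into a Gaussian spot.
image = xp.zeros((64, 64), dtype=xp.float64)
image[32, 32] = 1.0
blurred = gaussian_blur(image)
print(xp.__name__, float(blurred.sum()), float(blurred.max()))
```

Because every intermediate array lives in the same module as the input, the data stays on the GPU between steps when CuPy is in use; only the two scalars printed at the end are copied back to the host.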

The CUDA profiler helped find bottlenecks in NETRI’s code and alternatives to resolve them. “NVIDIA gave us the tools to execute this work efficiently — it happened within a week with a core team of four researchers and a few specialists,” said Epstein.

Looking ahead, Cambridge Consultants expects to find further speedups for the code using the DGX-1 that could enable real-time manipulation of neurons using a laser. Researchers also aim to explore NVIDIA IndeX software to help visualize neural activity.

The work with NETRI is one of several DGX-1 applications at the company. It also hosts a Digital Greenhouse for AI research. Last year, it used the DGX-1 to create a low-cost but highly accurate tool for monitoring tuberculosis.

The post Researchers Make Movies of the Brain with CUDA appeared first on The Official NVIDIA Blog.

NVIDIA Gives COVID-19 Researchers Free Access to Parabricks

When a crisis hits, we all pitch in with what we have. In response to the current pandemic, NVIDIA is sharing tools with researchers that can accelerate their race to understand the novel coronavirus and help inform a response.

Starting today, NVIDIA will provide a free 90-day license to Parabricks to any researcher in the worldwide effort to fight the novel coronavirus. Based on the well-known Genome Analysis Toolkit, Parabricks uses GPUs to accelerate the analysis of sequence data by as much as 50x.

We recognize this pandemic is evolving, so we’ll monitor the situation and extend the offer as needed.

If you have access to NVIDIA GPUs, fill out this form to request a Parabricks license.

For researchers working with Oxford Nanopore long-read data, a repository of GPU-accelerated tools is available on GitHub. In addition, the following applications already have NVIDIA GPU acceleration built in: Medaka, Racon, Raven, Reticulatus, Unicycler.

Researchers are sequencing both the novel coronavirus and the genomes of people afflicted with COVID-19 to understand, among other things, the spread of the disease and who is most affected. But analyzing genomic sequences takes time and computing muscle.

Accelerating science has long been part of NVIDIA’s core mission. The Parabricks team joined NVIDIA in December, providing the latest tool for that work. It can reduce the time for variant calling on a whole human genome from days to less than an hour on a single server.

Given the unprecedented spread of the pandemic, getting results in hours versus days could have an extraordinary impact on understanding the virus’s evolution and the development of vaccines.

NVIDIA is inviting our family of partners to join us in matching this urgent effort to assist the research community. We’re in discussions with cloud service providers and supercomputing centers to provide compute resources and access to Parabricks on their platforms.

We’ll update this blog with links to others who can provide cloud-based access to NVIDIA GPUs and this software as those sources become available.

The post NVIDIA Gives COVID-19 Researchers Free Access to Parabricks appeared first on The Official NVIDIA Blog.

Engage Warp Drive: Innovation Awaits at the HPC Summit

Every so often the scientists, researchers and engineers in high performance computing get a chance to take their work to the next level. 

The HPC Summit, taking place March 25-26 in San Jose, is the community’s latest opportunity for a strategic refresh.

Speakers from across the HPC community will share their work driving science and technology forward, including important updates on GPU-accelerated applications, networking architectures and the latest systems and developer tools for x86, Power and Arm environments. It’s also an opportunity to spark fresh ideas and spawn new collaborations across the diverse HPC landscape.

HPC is expanding in all directions. The Summit supercomputer recently simulated and rendered the first visualization of NASA’s massive dataset describing a space shuttle for the first manned mission to Mars. Back on Earth, IBM and government researchers just crafted a new HPC model that will deliver faster, more accurate weather forecasts to your smartphone.

At the HPC Summit, plenary speakers such as Richard Loft, a technology director at the National Center for Atmospheric Research, will talk about his team’s work harnessing emerging exascale systems and AI to create global climate models that span decades.

The Summit supercomputer cracked the exascale barrier in AI.

Jeff Nichols, associate laboratory director of Oak Ridge National Laboratory, will describe the impact on science as supercomputing evolves through exascale performance, AI and beyond. Oak Ridge is the home of Summit, the world’s most powerful supercomputer, which is already delivering 3.3 exaflops of mixed-precision horsepower on AI tasks.

Arming for the Exascale Era

In core technologies, the HPC Summit will provide a strategic update on Arm in high performance computing. The CPU architecture — ubiquitous in smartphones and IoT devices — is expanding the range of choices for the emerging exascale era.

Brent Gorda, senior director for HPC at Arm, will give a state of the union address on the Arm ecosystem in HPC. He’ll provide an update on the latest Arm supercomputer projects and take feedback on future directions. It’s an initiative NVIDIA officially joined when it announced in June support for CUDA on Arm.

The release in November of an NVIDIA reference design for GPU acceleration on Arm brought many new partners into the fold. It’s expected to be a hot topic of discussion at developer breakout sessions at the HPC Summit, where other Arm representatives will speak on its expanding HPC ecosystem.

The TOP500 list of the world’s fastest supercomputers already includes two Arm-based systems. Astra, deployed at Sandia National Laboratories, uses Marvell’s ThunderX2 processors linked on high-speed interconnects from Mellanox. Fujitsu’s A64FX prototype is a precursor to an exascale system set to be running in Japan in 2021.

Onramps to Supercomputers and Clouds

The HPC Summit hosts several plenary speakers who are leaders from both commercial and research fields. They include Ashok Belani, executive vice president of technology for Schlumberger, and Thomas Schulthess, a professor of computational physics at ETH Zurich and director of the Swiss National Supercomputing Center.

A data center track will offer best practices for tapping into hyperscale cloud services.

For example, Alan Chalker, director of strategic programs at the Ohio Supercomputer Center, will describe the Open OnDemand Project, a popular web-based gateway that gives researchers and engineers access to powerful HPC systems.

Nidhi Chappell, who heads up HPC and AI efforts at Microsoft Azure, will provide her insights. Separately, Mellanox Technologies will describe an emerging class of I/O processors that can help speed work in HPC and AI in the cloud or at the edge of the network.

In addition to more than a dozen sessions, the HPC Summit provides plenty of opportunities for human networking, too.

A morning panel of experts will field questions about the latest GPU technologies and tools and give the community a chance to offer feedback. They’ll cover a broad range of topics including libraries, hardware, platform support, developer tools and more.

The co-located GTC sports 600+ expert sessions.

In the afternoon, a developer’s forum also will include Q&A with experts. They’ll answer questions and take feedback on future directions in programming models and languages, math libraries, HPC+AI, data science, mixed precision, Arm, multi-GPU programming, message passing and other topics.

The event is co-located with the annual NVIDIA GPU Technology Conference (GTC), so you can bet there will be plenty of news to discuss.

An opportunity to take your work to the next level starts when you register for the HPC Summit at GTC. If you want to dive deeper into 600+ industry-focused talks, register for GTC and freely attend the HPC Summit as part of your pass.

The post Engage Warp Drive: Innovation Awaits at the HPC Summit appeared first on The Official NVIDIA Blog.

NVIDIA Awards $50,000 Fellowships to Ph.D. Students for GPU Computing Research

Our NVIDIA Graduate Fellowship Program recently awarded up to $50,000 each to five Ph.D. students involved in GPU computing research.

Now in its 19th year, the fellowship program supports graduate students doing GPU-based work. We selected this year’s fellows from more than 300 applicants from a host of countries.

The fellows’ work puts them at the forefront of GPU computing, including projects in deep learning, graphics, high performance computing and autonomous machines.

“Our fellowship recipients are among the most talented graduate students in the world,” said NVIDIA Chief Scientist Bill Dally. “They’re working on some of the most important problems in computer science, and we’re delighted to support their research.”

The NVIDIA Graduate Fellowship Program is open to applicants worldwide.

Our 2020-2021 fellows are:

  • Anqi Li, University of Washington — Bridging the gap between robotics research and applications by exploiting complementary tools from machine learning and control theory
  • Benedikt Bitterli, Dartmouth College — Principled forms of sample reuse that unlock more efficient ray-tracing techniques for offline and real-time rendering
  • Vinu Joseph, University of Utah — Optimizing deep neural networks for performance and scalability
  • Xueting Li, University of California, Merced — Self-supervised learning and relation learning between different visual elements
  • Yue Wang, Massachusetts Institute of Technology — Designing sensible deep learning modules that learn effective representations of 3D data

And our 2020-2021 finalists are:

  • Guandao Yang, Cornell University
  • Michael Lutter, Technical University of Darmstadt
  • Yuanming Hu, Massachusetts Institute of Technology
  • Yunzhu Li, Massachusetts Institute of Technology
  • Zackory Erickson, Georgia Institute of Technology

The post NVIDIA Awards $50,000 Fellowships to Ph.D. Students for GPU Computing Research appeared first on The Official NVIDIA Blog.

Spacing Out: How AI Provides Astronomers with Insights of Galactic Proportions

The gallery of galaxy images astronomers produce is multiplying faster than the number of selfies on a teen’s new smartphone.

Millions of these images have already been collected by astronomy surveys. But the volume is spiraling with projects like the recent Dark Energy Survey and upcoming Legacy Survey of Space and Time, which will capture billions more.

Volunteers flocked to a recent crowdsource project, Galaxy Zoo, to help classify over a million galaxy images from the Sloan Digital Sky Survey. But citizen science can carry astrophysics only so far.

“Galaxy Zoo was a very successful endeavor, but the rate at which next-generation surveys will gather data will make crowdsourcing methods no longer scalable,” said Asad Khan, a physics doctoral student at the University of Illinois at Urbana-Champaign. “This is where human-in-the-loop techniques present an approach to guide AI to data-driven discovery, including image classification.”

Using transfer learning from the popular image classification model Xception, Khan and his fellow researchers developed a neural network that categorizes galaxy images as elliptical or spiral with expert-level accuracy. Classifying galaxy shapes helps scientists determine how old they are. It can also help them understand more complex questions about dark energy and how fast the universe is expanding.

Automating elements of galaxy classification enables astrophysicists to spend less time on basic labeling and focus on more complex research questions.

The research — the first application of deep transfer learning for galaxy classification — was one of six projects featured at the Scientific Visualization and Data Analytics Showcase at SC19, the annual supercomputing trade show.

AI Wrinkle in Time

The researchers trained the deep learning network on around 35,000 galaxy images from the Sloan Digital Sky Survey. Using Argonne National Laboratory’s Cooley supercomputer, which is equipped with dozens of NVIDIA data center GPUs, the team accelerated neural network training from five hours to just eight minutes.

When tested on other images from the Sloan Digital Sky Survey, the AI achieved 99.8 percent accuracy for classifying images as either elliptical or spiral galaxies — an improvement compared to neural networks trained without transfer learning.

Using a single NVIDIA V100 Tensor Core GPU for inference, the team was able to classify 10,000 galaxies in under 30 seconds.

“We can already start using this network, or future versions of it, to start labeling the 300 million galaxies in the Dark Energy Survey,” Khan said. “With GPU-accelerated inference, we could classify all the images in no time at all.”
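The training and inference figures above translate into rough throughput estimates. The sketch below assumes the quoted single-GPU inference rate would hold across the whole survey and ignores data loading; the constant names are illustrative.

```python
# Rough throughput estimates from the figures quoted in the article.
# Assumption: the single-GPU inference rate holds for the full survey.

TRAIN_HOURS_BEFORE = 5.0       # training time before acceleration
TRAIN_MINUTES_AFTER = 8.0      # training time on the Cooley GPUs

GALAXIES_PER_RUN = 10_000      # images classified in the timed run
RUN_SECONDS = 30.0             # upper bound quoted for that run
SURVEY_GALAXIES = 300_000_000  # Dark Energy Survey size quoted by Khan
SECONDS_PER_DAY = 86_400

training_speedup = TRAIN_HOURS_BEFORE * 60 / TRAIN_MINUTES_AFTER

# "Under 30 seconds" makes this rate a lower bound.
galaxies_per_second = GALAXIES_PER_RUN / RUN_SECONDS

# Consequently this duration is an upper bound for one GPU.
survey_days_one_gpu = SURVEY_GALAXIES / galaxies_per_second / SECONDS_PER_DAY

print(f"Training speedup: {training_speedup:.1f}x")
print(f"Inference rate: at least {galaxies_per_second:.0f} galaxies per second")
print(f"Full survey on one GPU: at most {survey_days_one_gpu:.1f} days")
```

On these assumptions a single GPU would label the full survey in roughly ten days at most, and the time would shrink in proportion to the number of GPUs used.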

Khan and his team also developed a visualization to show how the neural network learned during training.

“Even if deep learning models can achieve impressive accuracy levels, when AI does make a mistake, we often don’t know why,” he said. “Visualizations like these can serve as a heuristic check on the network’s performance, providing more interpretability for science communities.”

The researchers next plan to study how the morphology of galaxies changes with redshift, a phenomenon caused by the expansion of the universe.

Main image from the Sloan Digital Sky Survey, licensed from Wikimedia Commons under Creative Commons (CC BY 4.0).

The post Spacing Out: How AI Provides Astronomers with Insights of Galactic Proportions appeared first on The Official NVIDIA Blog.

Italy Forges AI Future in Partnership with NVIDIA

Italy is well known for its architecture, culture and cuisine. Soon, its contributions to AI may be just as renowned.

Taking a step in that direction, a national research organization forged a three-year collaboration with NVIDIA. Together they aim to accelerate AI research and commercial adoption across the country.

Leading the charge for Italy is CINI, the National Inter-University Consortium for Informatics, which includes a faculty of more than 1,300 professors in various computing fields across 39 public universities.

CINI’s National Laboratory of Artificial Intelligence and Intelligence Systems (AIIS) is spearheading the effort as part of its goal to expand Italy’s ecosystem for both academic research and commercial applications of AI.

“Leveraging NVIDIA’s expertise to build systems specifically designed for the creation of AI will help secure Italy’s position as a top player in AI education, research and industry adoption,” said Rita Cucchiara, a professor of computer engineering and science and director of AIIS.

National effort begins in Modena

The joint initiative aims to train students, nurture startups and spread adoption of the latest AI technology throughout Italy. As a first step, the partners will create a local hub at the University of Modena and Reggio Emilia (Unimore) for the global NVIDIA AI Technology Center.

The partnership marks an important expansion of NVIDIA’s work with the university whose roots date back to the medieval period.

In December, the company supported research on a novel way to automate the process of describing actions in a video. A team of four researchers at Unimore and one from Milan-based AI startup Metaliquid developed an AI model that achieved up to a 16 percent relative improvement compared to prior solutions. In a final stage of the project, NVIDIA helped researchers analyze their network’s topology to optimize training it on an NVIDIA DGX-1 system.

In July, Unimore and NVIDIA collaborated on an event for AI startups. Unimore’s AImageLab hosted the event, which included representatives of NVIDIA’s Inception program, an initiative to nurture AI startups with access to the company’s technology and ecosystem.

The collaboration comes at a time when the AImageLab, host for the new NVIDIA hub, is already making its mark in areas such as machine vision and medical imaging.

Winning kudos in image recognition

In September, two world-class research events singled out the AImageLab for recognition. One team from the lab won a best paper award at the International Conference on Computer Analysis of Images and Patterns. Another came third out of 64 research groups in an international competition using AI to classify skin lesions.

The Modena hub becomes the latest of more than 12 collaborations with countries worldwide for the NVIDIA AI Technology Center. NVAITC maintains an open database of research and tools developed with and for its partners.

Overall, the new collaboration “will bring together NVIDIA and CINI in our shared mission to enable, support and inform Italy’s AI ecosystem for research, industry and society,” said Simon See, senior director of NVAITC.

The post Italy Forges AI Future in Partnership with NVIDIA appeared first on The Official NVIDIA Blog.

Blue Moon Over Dijon: French Hobbyist Taps GPU for Stellar Camera

By day, Alain Paillou is the head of water quality for the Bourgogne region of France. But when the stars come out, he indulges his other passions.

Paillou takes exquisitely crisp pictures of the moon, stars and planets — a hobby that combines his lifelong love of astronomy and technology.

Earlier this year, he chronicled on an NVIDIA forum his work building what he calls SkyNano, a GPU-powered camera to take detailed images of the night sky using NVIDIA’s Jetson Nano.

“I’ve been interested in astronomy from about eight or 10 years old, but I had to quit my studies for more than 30 years because of my job as an aerospace software engineer,” said Paillou in an interview from his home in Dijon.

Paillou went back to school in his early 30s to get a degree and eventually a job as a hydrogeologist. “I came back to astronomy after my career change 20 years ago when I lived in Paris, where I started taking photographs of the moon, Jupiter and Saturn,” he said.

“I really love technology and astronomy needs technical competence,” he said. “It lets me return to some of the skills of my first job — developing software to get the best results from my equipment — and it’s very interesting to me.”

Seeing Minerals on the Moon

Paillou loves to take color-enhanced pictures of the moon that show the diversity of its blue titanium and orange iron-oxide minerals. And he delights in capturing star-rich pictures of the night sky. Both require significant real-time filters, best run on a GPU.

Around his Dijon home, as in many places, “the sky is really bad with light pollution from cities that make images blurry,” he said. “I can see 10-12 stars with my eyes, but with my system I can see thousands of stars.”

Paillou in his home astronomy lab in Dijon.

“If you want to retrieve something beautiful, you need to apply real-time filtering with an A/V compensation system. I built my own system because I could not find anything I could buy that matched what I wanted,” Paillou said.

Building the SkyNano

His first prototype mounted a ZWO ASI178MC camera using a Sony IMX178 color sensor on a platform with a gyro/compass and a two-axis mount controlled by stepper motors. Initially he used a Raspberry Pi 3 B+ to run Python programs that controlled the mount and camera.

The board lacked the muscle to drive the real-time filters. After some more experiments, he asked NVIDIA for help in his first post on the Jetson Nano community projects forum on June 21. By July 5, he had a Jetson Nano in hand and started loading OpenCV filters on it using Python.

By the end of July, he had taught himself PyCUDA and posted significant results with it. He released his routines on GitHub and reported he was ready to start taking pictures.

On Aug. 2, he posted his camera’s first digitally enhanced picture of the Copernicus crater on the moon as well as a YouTube video showing a Jetson Nano-enhanced night sky. By October, he posted stunning color-enhanced pictures of the moon (see above), impressive night-vision capabilities and a feature for tracking satellites.

Paillou’s project became the most popular thread on the NVIDIA Jetson Project’s forum with more than 3,100 views to date. Along the way, he gave a handful of others tips for their own AI projects, many of which are available here.

Exploring Horizons in Space and Software

“Twenty years ago, computers were not powerful enough to do this work, but today a little computer like the Jetson Nano makes it really interesting and it’s not expensive,” said Paillou, whose laptop connected to the system also uses an NVIDIA GPU.

In fact, the $99 Jetson Nano is currently marked down to $89 in a holiday special on NVIDIA’s website. Hobbyists who want to use Jetson Nano for neural networking can pair the starter kit with a free AI for Beginners course from our Deep Learning Institute.

Paillou sees plenty of headroom for his project. He hopes to rewrite his Python code in C++ for further performance speed-ups, get a better camera, and further study the possibilities for using AI.

With a little help from friends in America, the sky’s the limit.

“I was not sure I would have the time to learn CUDA – at 52, I am not so young – but it turned out to be very powerful and not so complicated,” he said.

Follow Paillou’s work and many others contributed by fellow developers on the Jetson Community Projects page.

Paillou’s SkyNano (lower left) and SkyPC waiting for the dark.


The post Blue Moon Over Dijon: French Hobbyist Taps GPU for Stellar Camera appeared first on The Official NVIDIA Blog.

Intel Research Identifies Digital Skills Gap Slowing Industry 4.0

“Accelerate Industrial” was conducted and authored by Dr. Faith McCreary (right), a principal engineer, experience architect and researcher at Intel, in tandem with Dr. Irene Petrick, senior director of Industrial Innovation for Intel’s Industrial Solutions Division. (Credit: Intel Corporation)

What’s New: Intel released the results of a new study today, “Accelerate Industrial,” that represents the most comprehensive view of Industry 4.0, the digital transformation of the manufacturing sector. The research uncovered a serious skills gap that most Western industrial production training programs and government investment initiatives fail to address.

The study found that today’s leaders need to create tomorrow’s future-ready workforce. This requires the collaboration of universities, government and industry – including initiatives that focus on worker training for the transforming manufacturing sector.

Why It’s Important: A recent Deloitte/Manufacturing Institute study suggests that industries are entering a period of acute long-term labor shortages, with an expected 2.4 million manufacturing job openings left unfilled by 2028, resulting in a $2.5 trillion negative impact on the U.S. economy. Germany and Japan, two other developed economies, are expected to fare even worse in terms of this projected labor shortage.

What the Study Shows: With the increasing proliferation of data, connectivity and processing power at the edge, the industrial internet of things is becoming more accessible. However, successful adoption remains out of reach for many: two of three companies piloting digital manufacturing solutions fail to move into large-scale rollout.

The study uncovered the top five challenges cited by respondents that have the potential to derail investments in smart solutions in the future:

  • 36% cite “technical skill gaps” that prevent them from benefiting from their investment.
  • 27% cite “data sensitivity” from increasing concerns over data and IP privacy, ownership and management.
  • 23% say they lack interoperability between protocols, components, products and systems.
  • 22% cite security threats, both in terms of current and emerging vulnerabilities in the factory.
  • 18% reference handling data growth in amount and velocity, as well as sense-making.

What to Take From the Research: “Accelerate Industrial” points to the rising importance of the digital skills required to navigate and succeed in this new landscape.

The research found that while there is a big appetite for digital transformation – 83% of companies plan to make investments in smart factory technologies – the most important skills and characteristics cited for that transformation are not ones that are typically emphasized by most industry job training programs or relevant policymakers.

Future skills cited by respondents point to the need to go beyond the basics of programming to embrace a deep understanding of digital tools, from data collection to analytics and real-time feedback directly to the operating environment. The top five future skills required to support digital transformation in manufacturing are:

  • “Deep understanding” of modern programming or software engineering techniques
  • “Digital dexterity,” or the ability to leverage existing and emerging technologies for practical business outcomes
  • Data science
  • Connectivity
  • Cybersecurity

More Context: “Accelerate Industrial” was conducted and authored by Dr. Faith McCreary, a principal engineer, experience architect and researcher at Intel, in tandem with Dr. Irene Petrick, senior director of Industrial Innovation for Intel’s Industrial Solutions Division. The study encompasses mobile ethnographies and interviews with over 400 manufacturers and the ecosystem technologists that support them. The work is being released as a series of reports.

Even More Context: Intel’s Industrial Internet of Things Website | 2018 Intel Study: Intel Study Discovers Why So Many Factories are Still Operating in the 20th Century | Intel Innovators: Making Factories Better Places for Humans to Work | Internet of Things News

The post Intel Research Identifies Digital Skills Gap Slowing Industry 4.0 appeared first on Intel Newsroom.

Intel Research to Solve Real-World Challenges

What’s New: This week at the annual Neural Information Processing Systems (NeurIPS) conference in Vancouver, British Columbia, Intel is contributing almost three dozen conference, workshop and spotlight papers covering deep equilibrium models, imitation learning, machine programming and more.

“Intel continues to push the frontiers in fundamental and applied research as we work to infuse AI everywhere, from low-power devices to data center accelerators. This year at NeurIPS, Intel will present almost three dozen conference and workshop papers. We are fortunate to collaborate with excellent academic communities from around the world on this research, reflecting Intel’s commitment to collaboratively advance machine learning.”
–Hanlin Tang, senior director, AI Lab at Intel

The research spans the breadth of artificial intelligence (AI), from a fundamental understanding of neural networks to applying machine learning to software programming and particle physics. A few highlights are shown below:

Automating Software Testing

Title: A Zero-Positive Learning Approach for Diagnosing Software Performance Regression by Mejbah Alam (Intel Labs), Justin Gottschlich (Intel Labs), Nesime Tatbul (Intel Labs and MIT), Javier Turek (Intel Labs), Timothy Mattson (Intel), Abdullah Muzahid (Texas A&M University)

When: Thursday, Dec. 12, 2019, 5-7 p.m. PST @ East Exhibition Hall B+C #120

Software development automated with machine learning (ML) is an emerging field. The long-term vision is to augment programmers with ML-driven tools to test code, write new code and diagnose errors. The paper proposes AutoPerf, an approach to automate performance regression testing in high-performance computing code, that is, catching performance problems introduced by new code check-ins. Leveraging only nominal training data and utilizing hardware performance counters while running code, we illustrate that AutoPerf can detect some of the most complex performance bugs found in parallel programming.
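The idea of learning from nominal data only can be sketched in a few lines. The detector here is a plain per-counter z-score threshold, a stand-in chosen for brevity rather than the model described in the paper; the counter values, function names and threshold are all hypothetical.

```python
import statistics


def fit_nominal(samples):
    """Learn per-counter mean and standard deviation from nominal runs only."""
    columns = list(zip(*samples))
    means = [statistics.mean(col) for col in columns]
    stdevs = [statistics.stdev(col) for col in columns]
    return means, stdevs


def anomaly_score(sample, model):
    """Largest absolute z-score across counters for one run."""
    means, stdevs = model
    return max(abs(x - m) / s for x, m, s in zip(sample, means, stdevs))


def is_regression(sample, model, threshold=4.0):
    """Flag a run whose counters stray far from the nominal profile."""
    return anomaly_score(sample, model) > threshold


# Hypothetical readings per run: (cache misses, branch misses), in thousands.
nominal_runs = [
    (100.0, 20.0), (102.0, 21.0), (98.0, 19.0),
    (101.0, 20.5), (99.0, 19.5), (100.5, 20.2),
]
model = fit_nominal(nominal_runs)

run_ok = (100.8, 20.1)    # looks like the nominal runs
run_slow = (140.0, 20.3)  # far more cache misses than any nominal run

print(is_regression(run_ok, model), is_regression(run_slow, model))
```

No example of a regression is needed to build the model, which is the point of the zero-positive setting: anything that departs enough from the nominal profile is flagged.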

See the presentation: A Zero-Positive Learning Approach for Diagnosing Software Performance Regression

“Intel is making significant strides in advancing and scaling neural network technologies to handle increasingly complex and dynamic workloads – from tackling challenges with memory to researching new adaptive learning techniques,” said Dr. Rich Uhlig, Intel senior fellow and managing director of Intel Labs. “The developments we’re showcasing at NeurIPS will help reduce memory footprints, better measure how neural networks process information and reshape how machines learn in real time, opening up the potential for new deep learning applications that can change everything from manufacturing to healthcare.”

Teaching Robots Through Imitation Learning

Title: Goal-Conditioned Imitation Learning by Yiming Ding (University of California, Berkeley), Carlos Florensa (University of California, Berkeley), Pieter Abbeel (University of California, Berkeley) and Mariano Phielipp (Intel AI Lab)

When: Thursday, Dec. 12, 2019, 10:45 a.m.-12:45 p.m. PST @ East Exhibition Hall B+C #229

The long-term goal of this research effort is to build robotic algorithms that can learn quickly and easily from human demonstrations. Although learning by human demonstration is a well-studied topic in robotics, current work cannot surpass the human expert, is susceptible to non-perfect human teachers, and cannot adapt to unseen situations. The paper introduces a newly developed algorithm, goalGAIL. Using goalGAIL, the robot demonstrates the ability to learn better than the expert and can even perform in situations with non-expert actions. This will broaden robotic applications across practical robotics where the demonstrator need not be an expert; industrial settings where algorithms may need to adapt quickly to new parts; and personalized robotics where the algorithm must adapt through demonstration to personal preference.

See the presentation: Goal-Conditioned Imitation Learning

New Approach to Sequence Models

Title: Deep Equilibrium Models by Shaojie Bai (Carnegie Mellon), J. Zico Kolter (Carnegie Mellon), Vladlen Koltun (Intel Labs)

When: Tuesday, Dec. 10, 2019, 10:40-10:45 a.m. PST @ West Ballroom C

In this spotlight paper at NeurIPS (2% acceptance rate), we develop a radically different approach to machine learning on sequence data. We are able to replace deep recurrent layers with a single-layer model. Rather than iterating through a sequence of layers, we solve directly for the final representation via root-finding. This new type of model can match state-of-the-art performance on language benchmarks with a single layer, reducing the memory footprint by 88%. This opens the door to building larger and more powerful models.
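To make the root-finding idea concrete, here is a minimal sketch in Python. It is not the authors' implementation: the scalar layer, the weights W and U, and the choice of Newton's method as the solver are all assumptions made for illustration. It shows that stacking the same layer many times and solving directly for the point where the layer's output equals its input arrive at the same representation.

```python
import math

# Hypothetical weights for a toy, scalar, weight-tied layer (illustration only).
W = 0.5
U = 1.0

def layer(z, x):
    """One application of the shared layer: z_next = tanh(W*z + U*x)."""
    return math.tanh(W * z + U * x)

def deep_stack(x, depth):
    """Apply the same layer `depth` times, the way a deep weight-tied network would."""
    z = 0.0
    for _ in range(depth):
        z = layer(z, x)
    return z

def equilibrium(x, tol=1e-12, max_iter=100):
    """Solve g(z) = layer(z, x) - z = 0 directly with Newton's method."""
    z = 0.0
    for _ in range(max_iter):
        f = layer(z, x)
        g = f - z
        if abs(g) < tol:
            break
        # Derivative of g: d/dz tanh(W*z + U*x) - 1 = W * (1 - tanh^2) - 1
        dg = W * (1.0 - f * f) - 1.0
        z -= g / dg
    return z

z_stack = deep_stack(1.0, depth=60)   # many layers applied in sequence
z_star = equilibrium(1.0)             # a single layer plus a root solver
print(abs(z_stack - z_star) < 1e-9)
```

In this toy setting the solver only ever needs the current value of z, whereas the stacked version would have to keep every intermediate layer output around for training. That is the intuition behind the memory saving described above.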

See the presentation: Deep Equilibrium Models

4-bit Training Without Retraining

Title: Post-Training 4-bit Quantization of Convolutional Networks for Rapid-Deployment by Ron Banner (Intel AI Lab), Yury Nahshan (Intel AI Lab), Daniel Soudry (Technion)

When: Wednesday, Dec. 11, 2019, 10:45 a.m.-12:45 p.m. PST @ East Exhibition Hall B+C #105

A convolutional neural network is a class of deep neural network most commonly applied to analyzing visual imagery, and it requires substantial computing resources, memory bandwidth and storage capacity. To speed up analysis, the models are often quantized to lower bit widths. However, such methods often require full datasets and time-consuming fine-tuning to recover the accuracy lost after quantization. This paper introduces the first practical 4-bit post-training quantization approach that does not involve training the quantized model (fine-tuning) or require the availability of the full dataset. The approach achieves accuracy that is just a few percent less than the state-of-the-art baseline across a wide range of convolutional models.
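As a rough illustration of what quantizing to 4 bits without retraining means, the Python sketch below rounds already-trained weights onto 16 integer levels and maps them back. It is a plain uniform, symmetric scheme chosen for brevity, not the method from the paper, and the function names and sample weights are hypothetical.

```python
def quantize_4bit(values, clip=None):
    """Map floats to signed 4-bit integer codes in [-8, 7] with one shared scale.

    `clip` is the largest magnitude that should be representable; by default it
    is the largest absolute value in `values` (an assumption of this sketch).
    """
    if clip is None:
        clip = max(abs(v) for v in values)
    scale = clip / 7.0 if clip > 0 else 1.0
    # Round to the nearest level, then clamp into the 4-bit signed range.
    codes = [max(-8, min(7, round(v / scale))) for v in values]
    return codes, scale

def dequantize(codes, scale):
    """Recover approximate float values from the integer codes."""
    return [c * scale for c in codes]

# Hypothetical trained weights; no training data or fine-tuning is involved.
weights = [0.5, -1.75, 0.25, 1.0, 0.9]
codes, scale = quantize_4bit(weights)
restored = dequantize(codes, scale)
worst_error = max(abs(w - r) for w, r in zip(weights, restored))
print(codes, scale, worst_error)
```

Each weight is stored in 4 bits instead of 32, at the cost of a rounding error of at most half a quantization step. That rounding error is the source of the accuracy loss that post-training methods try to keep small.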

See the presentation: Post-Training 4-bit Quantization of Convolutional Networks for Rapid-Deployment | Presentation Slides

Understanding Neural Networks

Title: Untangling in Invariant Speech Recognition by Cory Stephenson (Intel AI Lab), Suchismita Padhy (Intel AI Lab), Hanlin Tang (Intel AI Lab), Oguz Elibol (Intel AI Lab), Jenelle Feather (MIT), Josh McDermott (MIT), SueYeon Chung (MIT)

When: Wednesday, Dec. 11, 2019, 5-7 p.m. PST @ East Exhibition Hall B+C #241

A neural network is often referred to as a “black box” because parts of its decision-making are famously opaque. There has been a plethora of approaches to try to peer into the box, but the challenge has been that many of the measures are not theoretically grounded. In collaboration with MIT, we’ve applied theoretically grounded measures of manifold capacity to better understand the geometry of speech recognition models. Theoretically grounded measurements are rare in deep learning, and the work seeks to provide a unique view on how neural networks process information.

See the presentation: Untangling in Invariant Speech Recognition

More context:  Intel at NeurIPS 2019 | Artificial Intelligence at Intel | Intel Labs

The post Intel Research to Solve Real-World Challenges appeared first on Intel Newsroom.

2D or Not 2D: NVIDIA Researchers Bring Images to Life with AI

Close your left eye as you look at this screen. Now close your right eye and open your left. You’ll notice that your field of vision shifts depending on which eye you’re using. That’s because while each eye sees in two dimensions, the images captured by your retinas are combined to provide depth and produce a sense of three-dimensionality.

Machine learning models need this same capability so that they can accurately understand image data. NVIDIA researchers have now made this possible by creating a rendering framework called DIB-R — a differentiable interpolation-based renderer — that produces 3D objects from 2D images.

The researchers will present their model this week at the annual Conference on Neural Information Processing Systems (NeurIPS), in Vancouver.

In traditional computer graphics, a pipeline renders a 3D model to a 2D screen. But there’s information to be gained from doing the opposite — a model that could infer a 3D object from a 2D image would be able to perform better object tracking, for example.

NVIDIA researchers wanted to build an architecture that could do this while integrating seamlessly with machine learning techniques. The result, DIB-R, produces high-fidelity rendering by using an encoder-decoder architecture, a type of neural network that transforms input into a feature map or vector that is used to predict specific information such as shape, color, texture and lighting of an image.

It’s especially useful when it comes to fields like robotics. For an autonomous robot to interact safely and efficiently with its environment, it must be able to sense and understand its surroundings. DIB-R could potentially improve those depth perception capabilities.

Training the model takes two days on a single NVIDIA V100 GPU; without NVIDIA GPUs, it would take several weeks. Once trained, DIB-R can produce a 3D object from a 2D image in less than 100 milliseconds. It does so by altering a polygon sphere, the traditional template that represents a 3D shape, to match the real object shape portrayed in the 2D images.
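The idea of starting from a polygon sphere and altering it can be pictured with a short Python sketch. This is an illustration under stated assumptions rather than DIB-R's code: the helper names are hypothetical, and the per-vertex offsets are written by hand here instead of being predicted by a trained network.

```python
import math

def uv_sphere(n_lat=4, n_lon=8, radius=1.0):
    """Vertices of a coarse polygon sphere, used as the template shape."""
    verts = []
    for i in range(1, n_lat):                      # rings between the two poles
        theta = math.pi * i / n_lat
        for j in range(n_lon):
            phi = 2.0 * math.pi * j / n_lon
            verts.append((radius * math.sin(theta) * math.cos(phi),
                          radius * math.sin(theta) * math.sin(phi),
                          radius * math.cos(theta)))
    verts.append((0.0, 0.0, radius))               # north pole
    verts.append((0.0, 0.0, -radius))              # south pole
    return verts

def deform(verts, offsets):
    """Move each template vertex by its own 3D offset."""
    return [(x + dx, y + dy, z + dz)
            for (x, y, z), (dx, dy, dz) in zip(verts, offsets)]

template = uv_sphere()
# Hand-written offsets that stretch the sphere along z into an ellipsoid;
# in a learned system these offsets would come from the model's prediction.
offsets = [(0.0, 0.0, 0.5 * z) for (_, _, z) in template]
shape = deform(template, offsets)
print(len(template), len(shape))
```

Because the template keeps the same vertices and connectivity throughout, the output only has to describe how far each vertex moves, not a whole new mesh.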

The team tested DIB-R on four 2D images of birds (far left). The first experiment used a picture of a yellow warbler (top left) and produced a 3D object (top two rows).

NVIDIA researchers trained their model on several datasets, including a collection of bird images. After training, DIB-R could take an image of a bird and produce a 3D portrayal with the proper shape and texture of a 3D bird.

“This is essentially the first time ever that you can take just about any 2D image and predict relevant 3D properties,” says Jun Gao, one of a team of researchers who collaborated on DIB-R.

DIB-R can transform 2D images of long-extinct animals like a Tyrannosaurus rex or a chubby Dodo bird into a lifelike 3D image in under a second.

Built on PyTorch, a machine learning framework, DIB-R is included as part of Kaolin, NVIDIA’s newest 3D deep learning PyTorch library that accelerates 3D deep learning research.

The entire NVIDIA research paper, “Learning to Predict 3D Objects with an Interpolation-Based Renderer,” can be found here. The NVIDIA Research team consists of more than 200 scientists around the globe, focusing on areas including AI, computer vision, self-driving cars, robotics and graphics.

The post 2D or Not 2D: NVIDIA Researchers Bring Images to Life with AI appeared first on The Official NVIDIA Blog.