Researchers Make Movies of the Brain with CUDA

When colleagues told Sally Epstein they sped up image processing by three orders of magnitude for a client’s brain-on-a-chip technology, she responded like any trained scientist. Go back and check your work, the biomedical engineering Ph.D. told them.

Yet it was true. The handful of researchers at Cambridge Consultants had devised a basket of techniques to process an image on GPUs in an NVIDIA DGX-1 system in 300 milliseconds, a 3,000x boost over the 18 minutes the task took on an Intel Core i9 CPU.

The achievement makes it possible for researchers to essentially watch movies of neurons firing in real time using the brain-on-a-chip technology from NETRI, a French startup.

“Animal studies revolutionized medicine. This is the next step in testing for areas like discovering new drugs,” said Epstein, head of Strategic Technology at Cambridge Consultants, which develops products and technologies for a wide variety of established companies and startups such as NETRI.

The startup designs chips that sport 3D microfluidic channels to host neural tissues and a CMOS camera sensor with polarizing filters to detect individual neurons firing. It hopes its precision imaging can speed the development of novel treatments for neurological disorders such as Alzheimer’s disease.

Facing a Computational Bottleneck

NETRI’s chips generate 100-megapixel images at up to 1,000 frames per second — the equivalent of a hundred 4K gaming systems running at 120fps. Besides spawning tons of data, they use highly complex math.

As a result, processing a single second of a recording took NETRI 12 days, an unacceptable delay. So, the startup turned to Cambridge Consultants to bust through the bottleneck.
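The figures quoted above can be cross-checked with a few lines of arithmetic. This is only a sketch: it assumes frames are processed one after another at the sensor's peak rate of 1,000 frames per second, which the article does not state explicitly, and the variable names are illustrative.

```python
# Figures quoted in the article, expressed in milliseconds to keep the math exact.
cpu_ms_per_frame = 18 * 60 * 1000   # 18 minutes per image on the CPU
gpu_ms_per_frame = 300              # 300 milliseconds per image on the DGX-1
frames_per_second = 1_000           # assumed: recording at the peak frame rate

MS_PER_DAY = 24 * 60 * 60 * 1000
MS_PER_MINUTE = 60 * 1000

# Ratio of CPU time to GPU time for a single frame.
speedup = cpu_ms_per_frame / gpu_ms_per_frame

# Time to process one second of recording, frame by frame.
cpu_days_per_recorded_second = cpu_ms_per_frame * frames_per_second / MS_PER_DAY
gpu_minutes_per_recorded_second = gpu_ms_per_frame * frames_per_second / MS_PER_MINUTE

print(f"speedup: {speedup:,.0f}x")
print(f"CPU: {cpu_days_per_recorded_second} days per recorded second")
print(f"GPU: {gpu_minutes_per_recorded_second} minutes per recorded second")
```

Under these assumptions the ratio works out to 3,600x, in line with the rounded 3,000x figure, and the CPU path comes to 12.5 days per recorded second, consistent with the 12 days cited.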

“Our track record in scientific and biological imaging turned out to be very relevant,” said Monty Barlow, Director of Strategic Technology at Cambridge Consultants. And when NETRI heard about the 3,000x boost, “they trusted us even though we didn’t trust ourselves at first,” he quipped.

Leveraging Math, Algorithms and GPUs

A handful of specialists at Cambridge Consultants delivered the 3,000x speedup using multiple techniques. For example, math and algorithm experts employed a mix of Gaussian filters, multivariate calculus and other tools to eliminate redundant tasks and reduce peak RAM requirements.

Software developers migrated NETRI’s Python code to CuPy to take advantage of the massive parallelism of NVIDIA’s CUDA software. And hardware specialists optimized the code to fit into GPU memory, eliminating unnecessary data transfers inside the DGX-1.
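To give a flavor of what such a migration can look like, here is a minimal sketch, not NETRI's or Cambridge Consultants' actual code. It assumes a simple row-wise Gaussian blur as a stand-in workload; the function names are hypothetical. CuPy mirrors much of the NumPy API, so the same array code can run on either backend, and the sketch falls back to NumPy when no usable GPU is found.

```python
import numpy as np

try:
    import cupy as cp
    cp.zeros(1)  # raises if no usable CUDA device is present
    xp = cp
except Exception:
    cp = None
    xp = np  # CPU fallback keeps the sketch runnable without a GPU


def gaussian_kernel_1d(sigma, radius):
    """Return a normalized 1-D Gaussian kernel on the active backend."""
    offsets = xp.arange(-radius, radius + 1, dtype=xp.float32)
    kernel = xp.exp(-0.5 * (offsets / sigma) ** 2)
    return kernel / kernel.sum()


def smooth_rows(frame, sigma=2.0, radius=6):
    """Blur each row of a 2-D float frame without leaving the device."""
    kernel = gaussian_kernel_1d(sigma, radius)
    width = frame.shape[1]
    padded = xp.pad(frame, ((0, 0), (radius, radius)), mode="edge")
    out = xp.zeros_like(frame)
    # Accumulate shifted, weighted copies; each step is one array-wide operation.
    for i in range(2 * radius + 1):
        out += kernel[i] * padded[:, i:i + width]
    return out


# Synthetic data stands in for a sensor frame.
frame_host = np.random.default_rng(0).random((256, 256), dtype=np.float32)
frame = xp.asarray(frame_host)       # one host-to-device copy (no-op on NumPy)
smoothed = smooth_rows(frame)        # intermediate arrays stay on the device
result = smoothed if cp is None else cp.asnumpy(smoothed)  # one copy back
```

The point of the design is that data crosses between host and device only once in each direction, which is the kind of transfer reduction the paragraph above describes.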

The CUDA profiler helped find bottlenecks in NETRI’s code and alternatives to resolve them. “NVIDIA gave us the tools to execute this work efficiently — it happened within a week with a core team of four researchers and a few specialists,” said Epstein.

Looking ahead, Cambridge Consultants expects to find further speedups for the code using the DGX-1 that could enable real-time manipulation of neurons using a laser. Researchers also aim to explore NVIDIA IndeX software to help visualize neural activity.

The work with NETRI is one of several DGX-1 applications at the company. It also hosts a Digital Greenhouse for AI research. Last year, it used the DGX-1 to create a low-cost but highly accurate tool for monitoring tuberculosis.

The post Researchers Make Movies of the Brain with CUDA appeared first on The Official NVIDIA Blog.

Smart into Art: NVIDIA SC19 Booth Turns Computer Scientists into Art at News-Filled Show

Back in the day, the annual SC supercomputing conference was filled with tabletops hung with research posters. Three decades on, the show’s Denver edition this week was a sea of sharp-angled booths, crowned with three-dimensional signage, promoting logos in a multitude of blues and reds.

But nowhere on the SC19 show floor drew more of the show’s 14,000 attendees than NVIDIA’s booth, built around a broad, floor-to-ceiling triangle with 2,500 square feet of ultra-high def LED screens. With a packed lecture hall on one side and HPC simulations playing on a second, it was the third wall that drew the most buzz.

Cycling through was a collection of AI-enhanced photos of several hundred GPU developers — grad students, CUDA pioneers, supercomputing rockstars — together with descriptions of their work.

Like accelerated computing’s answer to baseball cards, they were rendered into art using AI style transfer technology inspired by various painters — from the classicism of Vermeer to van Gogh’s impressionism to Paul Klee’s abstractions.

Meanwhile, NVIDIA sprinted through the show, kicking things off with a news-filled keynote by founder and CEO Jensen Huang, helping to power research behind the two finalists nominated for the Gordon Bell prize, and joining in to celebrate its partner Mellanox.

And in its booth, 200 engineers took advantage of free AI training through the Deep Learning Institute, and leading researchers gave dozens of tech talks to crowds packed in shoulder to shoulder.

Wall in the Family 

Piecing together the Developer Wall project took a dozen NVIDIANs scrambling for weeks in their spare time. The team of designers, technologists and marketers created an app where developers could enter some background, which would be paired with their photo once it’s run through style filters at, a German startup that’s part of NVIDIA’s Inception startup incubator.

“What we’re trying to do is showcase and celebrate the luminaries in our field. The amazing work they’ve done is the reason this show exists,” said Doug MacMillian, a developer evangelist who helped run the big wall initiative.

Behind him flashed an image of Jensen Huang, rendered as if painted by Cezanne. Alongside him was John Stone, the legendary HPC researcher at the University of Illinois, as if painted by Vincent van Gogh. Close by was Erik Lindahl, who heads the international GROMACS molecular simulation project, right out of a Joan Miró painting. Paresh Kharya, a data center specialist at NVIDIA, looked like an abstracted sepia-tone circuit board.

Enabling the Best and Brightest 

That theme — how NVIDIA’s working to accelerate the work of people in an ever growing array of industries — continued behind the scenes.

In a final rehearsal hours before Huang’s keynote, Ashley Korzun, a Ph.D. engineer who’s spent years working on the manned mission to Mars set for the 2030s, saw for the first time a demo visualizing her life’s work at NASA.

As she stood on stage, she witnessed an event she’s spent years simulating purely with data – the fiery path that the Mars lander, a capsule the size of a two-story condo, will take as it slows in seven dramatic minutes from 12,000 miles an hour to gently stick its landing on the Red Planet.

“This is amazing,” she quietly said through tears. “I never thought I’d be able to visualize this.”

Flurry of News

Huang later took the stage and, in a broad-sweeping two-hour keynote, set out a range of announcements that show how NVIDIA’s helping others do their life’s work.

Award-Winning Work

SC19 plays host to a series of awards throughout the show, and NVIDIA featured in a number of them.

Both finalists for the Gordon Bell Prize for outstanding achievement in high performance computing — the ultimate winner, ETH Zurich, as well as University of Michigan — ran their work on Oak Ridge National Laboratory’s Summit supercomputer, powered by nearly 28,000 V100 GPUs.

NVIDIA’s founding chief scientist, David Kirk, received this year’s Seymour Cray Computer Engineering Award, for innovative contributions to HPC systems. He was recognized for his path-breaking work around development of the GPU.

And a seminal paper that NVIDIA’s Vasily Volkov co-authored with UC Berkeley’s James Demmel 11 years ago was recognized with the Test of Time Award for a work of lasting impact. The paper, which has resulted in a new way of thinking about and modeling algorithms on GPUs, has had nearly 1,000 citations.

Looking Further Ahead

If the SC show is about powering the future, no corner of the show was more forward looking than the annual Supercomputing Conference Student Cluster Competition.

This year, China’s Tsinghua University captured the top crown. It beat out 15 other undergrad teams using NVIDIA V100 Tensor Core GPUs in an immersive HPC challenge demonstrating the breadth of skills, technologies and science that it takes to build, maintain and use supercomputers. Tsinghua also won the IO500 competition, while two other prizes were won by Singapore’s Nanyang Technological University.

The teams came from markets including Germany, Latvia, Poland and Taiwan, in addition to China and Singapore.

Up Next: More Performance for the World’s Data Centers

NVIDIA’s frenetic week at SC19 ended with a look at what’s next, with Jensen joining Mellanox CEO Eyal Waldman on stage at an evening event hosted by the networking company, which NVIDIA agreed to acquire earlier this year.

Jensen and Eyal discussed how their partnership will enable the future of computing, with Jensen detailing the synergies between the companies. “Mellanox has an incredible vision,” Huang said. “In a couple years we’re going to bring more compute performance to data centers than all of the compute since the beginning of time.”

Eni Doubles Up on GPUs for 52 Petaflops Supercomputer

Italian energy company Eni is upgrading its supercomputer with another helping of NVIDIA GPUs aimed at making it the most powerful industrial system in the world.

The news comes a little more than two weeks before SC19, the annual supercomputing event in North America. Growing adoption of GPUs as accelerators for the world’s toughest high performance computing and AI jobs will be among the hot topics at the event.

The new Eni system, dubbed HPC5, will use 7,280 NVIDIA V100 GPUs capable of delivering 52 petaflops of peak double-precision floating point performance. That’s nearly triple the performance of its previous 18 petaflops system that used 3,200 NVIDIA P100 GPUs.

When HPC5 is deployed in early 2020, Eni will have at its disposal 70 petaflops including existing systems also installed in its Green Data Center in Ferrera Erbognone, outside of Milan. The figure would put it head and shoulders above any other industrial company on the current TOP500 list of the world’s most powerful computers.

The new system will consist of 1,820 Dell EMC PowerEdge C4140 servers, each with four NVIDIA V100 GPUs and two Intel CPUs. A Mellanox InfiniBand HDR network running at 200 Gb/s will link the servers.
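The system figures above are consistent with one another, as a quick calculation shows. This sketch attributes the full 52 petaflops to the GPUs, which is a simplifying assumption (the CPUs also contribute), so the per-GPU number is only an implied average, not a hardware specification.

```python
# Figures quoted in the article.
servers = 1_820
gpus_per_server = 4
new_peak_petaflops = 52
previous_peak_petaflops = 18

total_gpus = servers * gpus_per_server

# How much faster the new system is than the previous one.
upgrade_ratio = new_peak_petaflops / previous_peak_petaflops

# Implied average per GPU, assuming (simplistically) that GPUs supply all of the peak.
implied_teraflops_per_gpu = new_peak_petaflops * 1_000 / total_gpus

print(f"total GPUs: {total_gpus:,}")
print(f"upgrade ratio: {upgrade_ratio:.2f}x")
print(f"implied peak per GPU: {implied_teraflops_per_gpu:.2f} teraflops")
```

The server count reproduces the 7,280 GPUs cited, and the ratio of roughly 2.9x matches the "nearly triple" description.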

Green Data Center Uses Solar Power

Eni will use its expanded computing muscle to gather and analyze data across its operations. It will enhance its monitoring of oil fields, subsurface imaging and reservoir simulation and accelerate R&D in non-fossil energy sources. The data center itself is designed to be energy efficient, powered in part by a nearby solar plant.

“Our investment to strengthen our supercomputer infrastructure and to develop proprietary technologies is a crucial part of the digital transformation of Eni,” said Chief Executive Officer Claudio Descalzi in a press statement. The new system’s advanced parallel architecture and hybrid programming model will allow Eni to process seismic imagery faster, using more sophisticated algorithms.

Eni was among the first movers to adopt GPUs as accelerators. NVIDIA GPUs are now used in 125 of the fastest systems worldwide, according to the latest TOP500 list. They include the world’s most powerful system, the Summit supercomputer, as well as four others in the top 10.

Over the last several years, designers have increasingly relied on NVIDIA GPU accelerators to propel these beasts to new performance heights.

The SC19 event will be host to three paper tracks, two panels and three invited talks that touch on AI or GPUs. In one invited talk, a director from the Pacific Northwest National Laboratory will describe six top research directions to increase the impact of machine learning on scientific problems.

In another, the assistant director for AI at the White House Office of Science and Technology Policy will share the administration’s priorities in AI and HPC. She’ll detail the American AI Initiative announced in February.

Now for the Soft Part: For AI, Hardware’s Just the Start, NVIDIA’s Ian Buck Says

Great processors — and great hardware — won’t be enough to propel the AI revolution forward, Ian Buck, vice president and general manager of NVIDIA’s accelerated computing business, said Wednesday at the AI Hardware Summit.

“We’re bringing AI computing way down in cost, way up in capability and I fully expect this trend to continue not just as we advance hardware, but as we advance AI algorithms, AI software and AI applications to help drive the innovation in the industry,” Buck told an audience of hundreds of press, analysts, investors and entrepreneurs in Mountain View, Calif.

Buck — known for creating the CUDA computing platform that puts GPUs to work powering everything from supercomputing to next-generation AI — spoke at a showcase for some of the most iconic computers of Silicon Valley’s past at the Computer History Museum.

AI Training Is a Supercomputing Challenge

The industry now has to think bigger — much bigger — than the boxes that defined the industry’s past, Buck explained, weaving together software, hardware, and infrastructure designed to create supercomputer-class systems with the muscle to harness huge amounts of data.

Training, or creating new AIs able to tackle new tasks, is the ultimate HPC challenge – exposing every bottleneck in compute, networking, and storage, Buck said.

“Scaling AI training poses some hard challenges, not only do you have to build the fast GPU, but optimize for the full data center as the computer,” Buck said. “You have to build system interconnections, memory optimizations, network topology, numerics.”

That’s why NVIDIA is investing in a growing portfolio of data center software and infrastructure, from interconnect technologies such as NVLink and NVSwitch to NVIDIA Collective Communications Library, or NCCL, which optimizes the way data moves across vast systems.

From ResNet-50 to BERT

Kicking off his brisk, half-hour talk, Buck explained that GPU computing has long served the most demanding users — scientists, designers, artists, gamers. More recently that’s included AI. Initial AI applications focused on understanding images, a capability measured by benchmarks such as ResNet-50.

“Fast forward to today, with models like BERT and Megatron that understand human language – this goes way beyond computer vision but actually intelligence,” Buck said. “When I said something, what did I mean? This is a much more challenging problem, it’s really true intelligence that we’re trying to capture in the neural network.”

To help tackle such problems, NVIDIA yesterday announced the latest version of NVIDIA’s inference platform, TensorRT 6. On the T4 GPU, it runs BERT-Large, a model with super-human accuracy for language understanding tasks, in only 5.8 milliseconds, nearly half the 10 ms threshold for smooth interaction with humans. It’s just one part of our ongoing effort to accelerate the end-to-end pipeline.

Accelerating the Full Workflow

Inference tasks, or putting trained AI models to work, are diverse, and usually part of a larger application that obeys Amdahl’s Law: if you accelerate only one piece of the pipeline, for example matrix multiplication, you’ll still be limited by the rest of the processing steps.
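Amdahl's Law puts a number on that limit: if a fraction p of the runtime is accelerated by a factor s, the overall speedup is 1 / ((1 - p) + p / s). The sketch below uses made-up fractions purely for illustration; they are not measurements from any NVIDIA pipeline.

```python
def amdahl_speedup(accelerated_fraction, stage_speedup):
    """Overall speedup when only part of a workload is accelerated."""
    untouched = 1.0 - accelerated_fraction
    return 1.0 / (untouched + accelerated_fraction / stage_speedup)


# Hypothetical case: one stage takes half the runtime and is made 10x faster.
half_pipeline = amdahl_speedup(0.5, 10)

# Hypothetical case: stages covering 90 percent of the runtime are made 10x faster.
most_pipeline = amdahl_speedup(0.9, 10)

# Upper bound for the first case, however fast the accelerated stage becomes.
ceiling_for_half = 1.0 / (1.0 - 0.5)

print(f"50% accelerated: {half_pipeline:.2f}x (ceiling {ceiling_for_half:.0f}x)")
print(f"90% accelerated: {most_pipeline:.2f}x")
```

Accelerating half the pipeline tenfold yields only about 1.8x overall and can never exceed 2x, while covering 90 percent of it yields about 5.3x.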

Making an AI that’s truly conversational will require a fully accelerated speech pipeline able to bounce from one crushingly compute-intensive task to another, Buck said.

Such a system could require 20 to 30 containers end to end, harnessing assorted convolutional neural networks and recurrent neural networks made up of multilayer perceptrons working at a mix of precisions, including INT8, FP16 and FP32. All at a latency of less than 300 milliseconds, leaving only 10 ms for a single model.
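The per-model figure follows from dividing the overall budget across the stages. The sketch below assumes the containers run strictly one after another with an equal share of the 300 milliseconds each, which is a simplification; the function name is illustrative.

```python
def per_stage_budget_ms(total_budget_ms, stages):
    """Equal share of an end-to-end latency budget for each sequential stage."""
    return total_budget_ms / stages


# 20 to 30 containers sharing a 300 ms end-to-end budget.
budgets = {stages: per_stage_budget_ms(300, stages) for stages in (20, 30)}

# BERT-Large latency on a T4 with TensorRT 6, as quoted earlier in the article.
bert_large_t4_ms = 5.8
fits_tightest_budget = bert_large_t4_ms <= min(budgets.values())

for stages, budget in budgets.items():
    print(f"{stages} stages: {budget:.0f} ms each")
print(f"BERT-Large at {bert_large_t4_ms} ms fits: {fits_tightest_budget}")
```

With 30 stages the share drops to 10 ms per model, the threshold the article cites, and the quoted 5.8 ms BERT-Large latency fits within it.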

Data Center TCO Is Driven by Its Utilization

Such performance is vital as investments in data centers will be judged by the amount of utility that can be wrung from their infrastructures, Buck explained. “Total cost of ownership for the hyperscalers is all about utilization,” Buck said. “NVIDIA having one architecture for all the AI powered use cases drives down the TCO.”

Performance — and flexibility — is why GPUs are already widely deployed in data centers today, Buck said. Consumer internet companies use GPUs to deliver voice search, image search, recommendations, home assistants, news feeds, translations and ecommerce.

Hyperscalers are adopting NVIDIA’s fast, power-efficient T4 GPUs — available on major cloud service providers such as Alibaba, Amazon, Baidu, Google Cloud and Tencent Cloud. And inference is now a double-digit percentage contributor to NVIDIA’s data center revenue.

Vertical Industries Require Vertical Platforms

In addition to delivering supercomputing-class computing power for training, and scalable systems for data centers serving hundreds of millions, AI platforms will need to grow increasingly specialized, Buck said.

Today AI research is concentrated in a handful of companies, but broader industry adoption needs verticalized platforms, he continued.

“Who is going to do the work,” of building out all those applications? Buck asked. “We need to build domain-specific, verticalized AI platforms, giving them an SDK that gives them a platform that is already tuned for their use cases,” Buck said.

Buck highlighted how NVIDIA is building verticalized platforms for industries such as automotive, healthcare, robotics, smart cities, and 3D rendering, among others.

Zooming in on the auto industry as an example, Buck touched on a half dozen of the major technologies NVIDIA is developing. They include the NVIDIA Xavier system on a chip, NVIDIA Constellation automotive simulation software, NVIDIA DRIVE IX software for in-cockpit AI and NVIDIA DRIVE AV software to help vehicles safely navigate streets and highways.

Wrapping up, Buck offered a simple takeaway: the combination of AI hardware, AI software, and AI infrastructure promises to make more powerful AI available to more industries and, ultimately, more people.

“We’re driving down the cost of computing AI, making it more accessible, allowing people to build powerful AI systems and I predict that cost reduction and improved capability will continue far into the future.”

GPU Computing 101: Why University Educators Are Pulling NVIDIA Teaching Kits into Their Classrooms

Along with the usual elements of university curriculums — lectures, assignments, lab exercises — there’s a new tool that educators are increasingly leaning into: NVIDIA Teaching Kits.

University educators around the world are tapping into these kits, which include downloadable teaching materials and online courses that provide the foundation to understand and build hands-on expertise in areas like deep learning, accelerated computing and robotics.

The kits are offered by the NVIDIA Deep Learning Institute, a hands-on training program in AI, accelerated computing, and data science to help technologists solve challenging problems.

Co-developed with university faculty, NVIDIA Teaching Kits provide content to enhance a university curriculum, including lecture slides, videos, hands-on labs, online DLI certificate courses, e-books and GPU cloud resources.

Accelerated Computing at University of California, Riverside

Daniel Wong, an assistant professor of electrical and computer engineering at the University of California, Riverside, used the Accelerated Computing Teaching Kit for two GPU-centric computer science courses — a graduate course and an undergrad course on “GPU Computing and Programming.”

“The teaching kit presented a very well structured way to teach GPU programming, especially given the way many of our students come from very diverse backgrounds,” Wong said.

Wong’s undergrad course took place over 10 weeks with an enrollment of about three dozen students and is currently in its second offering. The kit was central in teaching the basics of CUDA, such as CUDA threading models, parallel patterns, common optimizations and other important parallel programming primitives, Wong said.

“Students know that the material we present is state of the art and up to date so it gives them confidence in the material and drew a lot of excitement,” he said.

The course built up to a final project with students accelerating an application of their choice, such as implementations and performance comparison of CNNs in cuDNN, TensorFlow, Keras, facial recognition on NVIDIA Jetson boards, and fluid dynamics and visualization. In addition, several of Wong’s undergraduate students have gone on to pursue GPU-related undergraduate research.

Deep Learning at University Hospital Erlangen

At the Institute of Neuropathology of the University Hospital Erlangen in Germany, a deep learning morphology research group applies deep learning algorithms to various problems around histopathologic brain tumors.

The university’s medical students have little background in computer science, so principal investigator Samir Jabari uses the NVIDIA Teaching Kit as part of sessions he conducts every few weeks on the field of computer vision.

Through lecture slides on convolutional neural networks and lab assignments, the teaching kit helps provide insights into the field of computer vision and its specific challenges toward histopathology.

Robotics at Georgia State University

Georgia State University’s Computer Science department used the Robotics Teaching Kit in its “Introduction to Robotics” course, first introduced in spring 2018.

The course grouped two to three students per kit to engage them in learning basic sensor interaction and path-planning experiments. At the end of the class, students presented projects during the department’s biannual poster and demonstration day.

The course was a hit. When first taught, it registered 32 students. The upcoming fall course has already received 60 registration requests — nearly double the registration capacity.

Beyond the classroom, Georgia State faculty and students are using NVIDIA Teaching Kits to facilitate projects in the greater community in interdisciplinary areas such as environmental sensing and cybersecurity.

“This kind of in-class hardware kit-based teaching is new to the department,” said Ashwin Ashok, assistant professor of computer science at Georgia State. “These kits have really gained a lot of traction for potential uses in courses as well as research at Georgia State.”

Watch Teaching Kits in Action

At the University of Delaware, undergraduate students trained with NVIDIA Teaching Kits, assistant professor Sunita Chandrasekaran said. At the end of their training, the students took a serial code, which ran for 14 hours, and optimized its performance to run in two minutes using OpenACC on NVIDIA Volta GPU accelerators.

Cristina Nader Vasconcelos, assistant professor at Universidade Federal Fluminense in Rio de Janeiro, Brazil, said NVIDIA Teaching Kits help make sure her course aligns with the state of the art in industry research.

NVIDIA Delivers More Than 6,000x Speedup on Key Algorithm for Hedge Funds

NVIDIA’s AI platform is delivering more than 6,000x acceleration for running an algorithm that the hedge fund industry uses to benchmark backtesting of trading strategies.

This enormous GPU-accelerated speedup has big implications across the financial services industry.

Hedge funds — there are more than 10,000 of them — will be able to design more sophisticated models, stress test them harder, and still backtest them in just hours instead of days. And quants, data scientists and traders will be able to build smarter algorithms, get them into production more quickly and save millions on hardware.

Financial trading algorithms account for about 90 percent of public trading, according to the Global Algorithmic Trading Market 2016–2020 report. Quants, specifically, have grown to about a third of all trading on the U.S. stock markets today, according to the Wall Street Journal.

The breakthrough results have been validated by the Securities Technology Analysis Center (STAC), whose membership includes more than 390 of the world’s leading banks, hedge funds and financial services technology companies.

NVIDIA demonstrated its computing platform’s capability using STAC-A3, the financial services industry benchmark suite for backtesting trading algorithms to determine how strategies would have performed on historical data.

Using an NVIDIA DGX-2 system running accelerated Python libraries, NVIDIA shattered several previous STAC-A3 benchmark results, in one case running 20 million simulations on a basket of 50 instruments in the prescribed 60-minute test period versus the previous record of 3,200 simulations. This is the STAC-A3.β1.SWEEP.MAX60 benchmark; see the official STAC Report for details.

STAC-A3 parameter-sweep benchmarks use realistic volumes of data and backtest many variants of a simplified trading algorithm to determine profit and loss scores for each simulation. While the underlying algorithm is simple, testing many variants in parallel was designed to stress systems in realistic ways.
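A parameter sweep of this kind can be sketched in plain Python. This is not the STAC-A3 algorithm: it assumes a hypothetical long-only moving-average crossover strategy and a synthetic random-walk price series, and it runs on the CPU. It only illustrates the shape of the workload, many independent backtests that each yield a profit and loss score.

```python
import itertools
import random


def moving_average(prices, window, end):
    """Mean of the `window` prices ending at index `end` (inclusive)."""
    return sum(prices[end - window + 1:end + 1]) / window


def backtest(prices, fast, slow):
    """Profit and loss of a long-only moving-average crossover strategy."""
    pnl = 0.0
    for t in range(slow - 1, len(prices) - 1):
        # Hold one unit over the next step while the fast average leads the slow one.
        if moving_average(prices, fast, t) > moving_average(prices, slow, t):
            pnl += prices[t + 1] - prices[t]
    return pnl


def sweep(prices, fast_windows, slow_windows):
    """Backtest every (fast, slow) pair with fast < slow; runs are independent."""
    return {
        (fast, slow): backtest(prices, fast, slow)
        for fast, slow in itertools.product(fast_windows, slow_windows)
        if fast < slow
    }


# Synthetic random-walk prices stand in for historical data.
rng = random.Random(42)
prices = [100.0]
for _ in range(499):
    prices.append(prices[-1] + rng.gauss(0.0, 1.0))

results = sweep(prices, fast_windows=range(2, 12), slow_windows=range(12, 52, 4))
best_params = max(results, key=results.get)
print(f"{len(results)} variants tested; best (fast, slow) = {best_params}")
```

Because no variant depends on another, the runs can be spread across many parallel workers, which is what makes this kind of sweep a natural fit for GPUs.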

According to Michel Debiche, a former Wall Street quant who is now STAC’s director of analytics research, “The ability to run many simulations on a given set of historical data is often important to trading and investment firms. Exploring more combinations of parameters in an algorithm can lead to more optimized models and thus more profitable strategies.”

The benchmark results were achieved by harnessing the parallel processing power of 16 NVIDIA V100 GPUs in a DGX-2 server, driven by Python code that uses NVIDIA CUDA-X AI software along with the NVIDIA RAPIDS and Numba libraries.

RAPIDS is an evolving set of libraries that simplifies GPU acceleration of common Python data science tasks. Numba allows data scientists to write Python that is compiled into the GPU’s native CUDA, making it easy to extend the capabilities of RAPIDS.

RAPIDS and Numba software make it possible for data scientists and traders to replicate this performance without needing in-depth knowledge of GPU programming.


Feature image credit: Lorenzo Cafaro

NVIDIA CEO Ties AI-Driven Medical Advances to Data-Driven Leaps in Every Industry

Radiology. Autonomous vehicles. Supercomputing. The changes sweeping through all these fields are closely related. Just ask NVIDIA CEO Jensen Huang.

Speaking in Boston at the World Medical Innovation Forum to more than 1,800 of the world’s top medical professionals, Huang tied Monday’s news — that NVIDIA is collaborating with the American College of Radiology to bring AI to thousands of hospitals and imaging centers — to the changes sweeping through fields as diverse as autonomous vehicles and scientific research.

In a conversation with Keith Dreyer, vice chairman of radiology at Massachusetts General Hospital, Huang asserted that data science, driven by a torrent of data, new algorithms and advances in computing power, is becoming a fourth pillar of scientific discovery, alongside theoretical work, experimentation and simulation.

Putting data science to work, however, will require enterprises of all kinds to learn how to handle data in new ways. In the case of radiology, the privacy of the data is too important, and the expertise is local, Huang told the audience. “You want to put computing at the edge,” he said.

As a result, the collaboration between NVIDIA and the American College of Radiology promises to enable thousands of radiologists nationwide to use AI for diagnostic radiology in their own facilities, using their own data, to meet their own clinical needs.

Huang began the conversation by noting that the Turing Award, “the Nobel Prize of computing,” had just been given to the three researchers who kicked off today’s AI boom: Yoshua Bengio, Geoffrey Hinton and Yann LeCun.

“The takeaway from that is that this is probably not a fad, that deep learning and this data-driven approach where software and the computer is writing software by itself, that this form of AI is going to have a profound impact,” Huang said.

Huang drew parallels between radiology and other industries putting AI to work, such as automotive, where Huang sees an enormous need for computing power in autonomous vehicles that can put multiple intelligences to work, in real time, as they travel through the world.

Similarly, in medicine, putting one — or more — AI models to work will only enhance the capabilities of the humans guiding these models.

These models can also guide those doing cutting-edge work at the frontiers of science, Huang said, citing Monday’s announcement that the Accelerating Therapeutics for Opportunities in Medicine, or ATOM, consortium will collaborate with NVIDIA to scale ATOM’s AI-driven drug discovery program.

The big idea: to pair data science with more traditional scientific methods, using neural networks to help “filter” through the large combination of possible molecules to decide which ones to simulate to find candidates for in vitro testing, Huang explained.

Software Is Automation, AI Is the Automation of Automation

Huang sees such techniques being used in all fields of human endeavor, from science to front-line healthcare and even to running a technology company. As part of that process, NVIDIA has built one of the world’s largest supercomputers, SATURNV, to support its own efforts to train AI models with a broad array of capabilities. “We use this for designing chips, for improving our systems, for computer graphics,” Huang said.

Such techniques promise to revolutionize every field of human endeavor, Huang said, asserting that AI is “software that writes software,” and that software’s “fundamental purpose is automation.”

“AI therefore is the automation of automation,” Huang said. “And if we can harness the automation of automation, imagine what good we could do.”



Home Helper: Startup’s Robot Can Tidy Up a Messy House

The robot rolls up to a towel, drops an arm to grasp it and then scoots along to release it in a laundry bin. It zips up to pens scattered across the floor to grab and then place them into a box.

Take a break, Roomba. Preferred Networks has been developing this home cleaning robot since early last year, and it’s moving us closer to a reality of robots as home helpers.

The company’s goal is to create intuitive and interactive personal robots capable of multiple tasks. It aims to launch them for consumers in Japan by 2023 and would like to be in the U.S. after that.

Tokyo-based Preferred Networks — Japan’s largest startup by valuation — this week discussed its home cleaning robot aimed at consumers at the GPU Technology Conference.

And it’s got skills.

The Preferred Networks home robot can take cleaning commands, understand hand gestures and recognize more than 300 objects.

It can map locations of objects in a room using object detection from convolutional neural networks the company developed. Plus, it can place items back where they belong and even tell forgetful humans where objects are located.

“We’re focusing on personal robots, allowing the robots to work in a home environment or maybe an office environment or maybe some restaurants or bars,” said Jun Hatori, a software engineer at Preferred Networks.

The robots were built on Toyota’s HSR robotics platform and run on a computer with an NVIDIA GeForce GTX 1080 Ti GPU.


Beefy Vision Brains

Its robot packs powerful object detection. The developers used the Open Images Detection Dataset Version 4, which included 1.7 million annotated images with 12 million bounding boxes.

Its base convolutional neural network model was trained using 512 NVIDIA V100 GPUs and won second prize at the Google AI Open Images Challenge in the object detection track in 2018.

But it still has training to do.

For that, they use the same 512-GPU NVIDIA cluster used in the Google competition, whose nodes are interconnected by Mellanox technology. For object detection, they use the ImageNet dataset. They also collected domain-specific data for the robot’s room setting and the objects, which was used for tuning on top of the base network.

“We only support 300 objects so far, and that’s not enough. The system needs to be able to recognize almost everything at home,” said Hatori.

Chatty Plus Helpful

The Preferred Networks robot can speak a reply to many commands. It can also connect a human command to objects mapped out in a room. For example, the system can take a spoken question like “Where is the striped shirt?”, match it to the mapped object and tell the user where it is in the room.

Developers encode the spoken commands with a long short-term memory (LSTM) network and map them to real-world objects in the mapped room.

The system combines language interpretation with gesture. Users can point to a location in the room and give a command like “put that there.” Then the system can incorporate the user’s gesture with the spoken command.
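As a rough illustration of how a spoken question or a pointing gesture might be tied to a room map, here is a minimal Python sketch. It is not Preferred Networks’ code: the MappedObject class, the find_object, answer_where and resolve_put_that_there functions, and the sample coordinates are all hypothetical, and simple word overlap stands in for the LSTM encoding the developers describe.

```python
from __future__ import annotations

import math
from dataclasses import dataclass


@dataclass
class MappedObject:
    """One detected object in a hypothetical room map."""
    label: str                      # name a detector might assign, e.g. "striped shirt"
    position: tuple[float, float]   # assumed (x, y) location in metres
    place: str                      # human-readable spot, e.g. "on the sofa"


def find_object(query: str, room_map: list[MappedObject]) -> MappedObject | None:
    """Return the object whose label shares the most words with the query.

    Word overlap is a stand-in for a learned language encoder.
    """
    words = set(query.lower().strip("?.!").split())
    best, best_score = None, 0
    for obj in room_map:
        score = len(words & set(obj.label.split()))
        if score > best_score:
            best, best_score = obj, score
    return best


def answer_where(query: str, room_map: list[MappedObject]) -> str:
    """Answer a "Where is ...?" question from the room map."""
    obj = find_object(query, room_map)
    if obj is None:
        return "I have not seen that object."
    return f"The {obj.label} is {obj.place}."


def resolve_put_that_there(pointed_at, pointed_to, room_map):
    """Resolve "put that there" from two pointed locations.

    "that" becomes the mapped object nearest the first pointed spot;
    "there" is simply the second pointed spot.
    """
    target = min(room_map, key=lambda o: math.dist(o.position, pointed_at))
    return target.label, pointed_to


room = [
    MappedObject("striped shirt", (1.0, 2.0), "on the sofa"),
    MappedObject("blue pen", (3.0, 0.5), "under the desk"),
]
print(answer_where("Where is the striped shirt?", room))
print(resolve_put_that_there((2.9, 0.4), (0.0, 0.0), room))
```

The point of the sketch is the data flow: detections populate a map, language picks an entry from it, and a gesture supplies the coordinates that words like “that” and “there” leave out.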

It’s just the start.

“We’d like to do more than the simple tidying up, probably much more — some other kinds of home chores as well,” Hatori said.


The post Home Helper: Startup’s Robot Can Tidy Up a Messy House appeared first on The Official NVIDIA Blog.

NVIDIA Tesla T4 Powers Next Generation of Virtual Workstations

The NVIDIA Tesla T4 GPU now supports virtualized workloads with NVIDIA virtual GPU (vGPU) software.

The software, including NVIDIA GRID Virtual PC (GRID vPC) and NVIDIA Quadro Virtual Data Center Workstation (Quadro vDWS), provides virtual machines with the same breakthrough performance and versatility that the T4 offers to a physical environment. And it does so using the same NVIDIA graphics drivers that are deployed on non-virtualized systems.

NVIDIA launched T4 at GTC Japan as an AI data center platform for bare-metal servers. It’s designed to meet the needs of public and private cloud environments as their scalability requirements grow. It has seen rapid adoption, including its recent release on the Google Cloud Platform.

The Tesla T4 is the most universal GPU to date — capable of running any workload to drive greater data center efficiency. In a bare-metal environment, T4 accelerates diverse workloads, including deep learning training and inferencing as well as graphics. Support for virtual desktops with GRID vPC and Quadro vDWS software is the next level of workflow acceleration.

Roughly the size of a cell phone, the T4 has a low-profile, single-slot form factor. It draws a maximum of 70W power, so it requires no supplemental power connector.

Specifications for NVIDIA Tesla GPUs for virtualization workloads.

Its highly efficient design allows NVIDIA vGPU customers to reduce their operating costs considerably and offers the flexibility to scale their vGPU deployments by installing additional GPUs in a server. Two T4 GPUs can fit into the same space as a single Tesla M10 or M60 GPU, which could consume more than 3x the power.
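The power comparison can be checked with some quick arithmetic. The sketch below uses the 70W figure quoted above for the T4. The M10 and M60 wattages are not given in this post, so they are assumptions based on commonly published maximum board power for those cards, and the names used are illustrative only.

```python
T4_MAX_W = 70  # maximum power stated in the text

# Assumption: maximum board power in watts, not stated in this post.
ASSUMED_MAX_W = {"Tesla M10": 225, "Tesla M60": 300}


def power_ratio(older_card_w: float, t4_count: int = 1) -> float:
    """Ratio of an older card's power to the power of `t4_count` T4 GPUs."""
    return older_card_w / (T4_MAX_W * t4_count)


for name, watts in ASSUMED_MAX_W.items():
    print(f"{name}: {power_ratio(watts):.2f}x one T4, "
          f"{power_ratio(watts, 2):.2f}x two T4s")
```

Under these assumed figures, each older card draws more than three times the power of a single T4, while two T4s in the same space would still draw less than either one.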

The T4 is built on NVIDIA’s Turing architecture — the biggest architectural leap forward for GPUs in over a decade — enabling major advances in efficiency and performance.

Some of the key features provided by the Turing architecture include Tensor Cores for acceleration of deep learning inference workflows and new RT Cores for real-time ray tracing acceleration and batch rendering.

It’s also the first GPU architecture to support GDDR6 memory, which provides improved performance and power efficiency versus the previous generation GDDR5.

The Tesla T4 is an RTX-capable GPU, supporting the enhancements of the RTX platform, including:

  • Real-time ray-tracing performance
  • Accelerated batch rendering for faster time to market
  • AI-enhanced denoising to speed creative workflows
  • Photorealistic design with accurate shadows, reflections and refractions

The T4 is well-suited for a wide range of data center workloads, including:

  • Virtual desktops for knowledge workers using modern productivity applications
  • Virtual workstations for scientists, engineers and creative professionals
  • Deep learning inferencing and training

Read the full T4 Technical Brief for virtualization.

Check out what Cisco is saying about the Tesla T4, or find an NVIDIA vGPU partner to get started.

Learn more about GPU virtualization at GTC in Silicon Valley, March 17-21.

The post NVIDIA Tesla T4 Powers Next Generation of Virtual Workstations appeared first on The Official NVIDIA Blog.

Dazzling in Dallas: Tsinghua University Wins Student Cluster Competition at SC18

A 48-hour supercomputing battle pushes students to question decisions. GPUs aren’t one of them.

At the SC18 supercomputing show’s Student Cluster Competition, Tsinghua University, of China, running eight NVIDIA V100 GPUs, snagged the crown for overall winner.

The honor of highest Linpack score — a measurement of a system’s floating point computing horsepower — went to Nanyang Technological University, in Singapore, which achieved 56.51 teraflops.

“If you don’t have GPUs, best of luck. It’s essential,” said Bu-sung Lee, team leader and faculty adviser at Nanyang, which won the competition’s top honor last year.

The 16 teams of six members from around the globe came to compete in the world’s premier supercomputing event, held in Dallas this year.

The competition — which limits students to using 3,000 watts of power — was grueling. Teams hunkered down during battle, staring intensely into laptops. Take-out food boxes delivered by their coaches, who weren’t permitted to interact with the students, littered their booths.
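The power cap gives a sense of how efficient the winning Linpack run had to be. The short Python sketch below divides the 56.51 teraflops score by the 3,000-watt limit. It assumes the run stayed within the cap, and because the team’s actual power draw isn’t reported here, the result is only a lower bound on efficiency. The function and variable names are illustrative.

```python
def gflops_per_watt(teraflops: float, watts: float) -> float:
    """Convert a teraflops score and a power figure into gigaflops per watt."""
    return teraflops * 1000.0 / watts


LINPACK_TFLOPS = 56.51   # highest Linpack score, from the text
POWER_CAP_W = 3000.0     # competition power limit, from the text

# Assumption: the run used no more than the full cap, so this is a floor.
efficiency_floor = gflops_per_watt(LINPACK_TFLOPS, POWER_CAP_W)
print(f"at least {efficiency_floor:.1f} gigaflops per watt")
```

That works out to at least 18.8 gigaflops per watt for the whole cluster.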

In an endurance race that goes around the clock for two days straight, students tackle real-life computing workloads. “They are taking turns staying up through the night, three at a time,” said Lee.

The Student Cluster Competition gives students interested in high performance computing a chance to deploy their workloads on supercomputing systems and mix with others in the industry, encouraging the pursuit of careers in the field.

Students have duked it out in teams over these supercomputer cluster competitions since the Supercomputing Conference of 2007. That first SC07 battle, held in Reno, Nevada, has inspired student competitions in Germany, China and South Africa.

The clusters are designed to take on specific tasks already faced by those deploying high performance computing.

All teams were offered the chance to run their clusters on our NVIDIA V100 32GB GPUs, which we donate to contestants each year. “It’s rare to be able to play with the latest tech,” said Lee.

The post Dazzling in Dallas: Tsinghua University Wins Student Cluster Competition at SC18 appeared first on The Official NVIDIA Blog.