Path Math: How AI Can Find a Way Around Pathologist Shortage

Ever since a Dutch cloth merchant accidentally discovered bacteria in 1676, microscopes have been a critical tool for medicine. Today’s microscopes are 800,000 times more powerful than the human eye, but they still need a person to scrutinize what’s under the lens.

That person is usually a pathologist — and that’s a problem. Worldwide, there are too few of these doctors who interpret lab tests to diagnose, monitor and treat disease.

Now SigTuple, a member of our Inception startup incubator program, is testing an AI microscope that could help address the pathologist shortage. The GPU-powered device automatically scans and analyzes blood smears and other biological samples to detect problems.

Global workforce capacity in pathology and laboratory medicine. Image reprinted from The Lancet, Access to pathology and laboratory medicine services: a crucial gap. Copyright (2018), with permission from Elsevier.

One in a Million

The dearth of pathologists is a critical problem in the poorest countries, where patients lacking a proper diagnosis are often given inappropriate treatments, according to studies published this month in The Lancet medical journal. In sub-Saharan Africa, for example, there is a single pathologist for every million people, the journal reported.

But the problem isn’t confined to poor countries. In China, there’s one pathologist for every 130,000 people, The Lancet reported. That compares with 5.7 per 100,000 people in the U.S., according to the most recent figures available. And studies predict the number of U.S. pathologists will shrink to 3.7 per 100,000 people by 2030.

In India, there’s now one pathologist for every 65,000 people — a total of 20,000 pathologists to serve a nation of 1.3 billion people, said Tathagato Rai Dastidar, co-founder and chief technology officer of Bangalore-based SigTuple.

“There is a human cost here. In many places, where there is no pathologist, a half-trained technician will write out a report and cases will go undetected until it’s too late,” Dastidar said.

SigTuple’s automated microscope costs a fraction of what existing devices do, making it affordable for developing countries where pathologists are few. Image courtesy of SigTuple.

Low-Cost, High-Performance Microscope

SigTuple’s device isn’t the first automated microscope. Instruments known as digital slide scanners automatically convert glass slides to digital images and interpret the results. But SigTuple’s microscope sells for a fraction of the price of digital slide scanners, making it affordable for most labs, including those in the developing world.

The company’s AI microscope works by scanning slides under its lens and then using GPU-accelerated deep learning to analyze the digital images either on SigTuple’s AI platform in the cloud or on the microscope itself. It uses different deep learning models to analyze blood, urine and semen.

The microscope performs functions like identifying cells, classifying them into categories and subcategories, and calculating the numbers of different cell types.

For a blood smear, for example, Shonit — that’s Sanskrit for blood — identifies red and white blood cells and platelets, pinpoints their locations and calculates ratios of different types of white blood cells (commonly known as differential count). It also computes 3D information about cells from their 2D images using machine learning techniques.
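As a rough illustration of that last step, the sketch below tallies a differential count from per-cell labels such as a classifier might produce. The class names, proportions and helper function are hypothetical examples for illustration, not SigTuple’s actual software or label set.

```python
from collections import Counter

# Hypothetical white-blood-cell subtypes; SigTuple's real label set may differ.
WBC_TYPES = {"neutrophil", "lymphocyte", "monocyte", "eosinophil", "basophil"}

def differential_count(cell_labels):
    """Return each WBC subtype as a percentage of all white blood cells.

    cell_labels: per-cell class names emitted by a classifier, e.g.
    ["rbc", "neutrophil", "platelet", "lymphocyte", ...].
    """
    wbc = [label for label in cell_labels if label in WBC_TYPES]
    if not wbc:
        return {}
    counts = Counter(wbc)
    return {subtype: 100.0 * n / len(wbc) for subtype, n in counts.items()}

# Toy smear: 500 red cells, 98 white cells, 40 platelets.
labels = ["rbc"] * 500 + ["neutrophil"] * 60 + ["lymphocyte"] * 30 + \
         ["monocyte"] * 8 + ["platelet"] * 40
print(differential_count(labels))
# Approximately {'neutrophil': 61.2, 'lymphocyte': 30.6, 'monocyte': 8.2}
```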

In studies SigTuple conducted with some of India’s leading labs, Shonit’s accuracy matched that of other automated analyzers. It also successfully identified rare varieties of cells that both pathologists and automated tools usually miss.

Expert Review in the Cloud

In addition to providing a low-cost method for interpreting slides, Dastidar sees SigTuple’s AI platform as an ideal tool for providing expert review of tests when no expert is available. As well as automating analysis, it stores data in the cloud so any pathologist anywhere can interpret test results.

The company’s cloud platform also makes it far easier for pathologists to collaborate on difficult cases.

“Before, that would have meant shipping the slide from one lab to another,” Dastidar said.

SigTuple next plans a formal trial of Shonit and is beginning to roll it out commercially.

For more information about SigTuple and Shonit, watch Dastidar’s GTC talk or read SigTuple’s recent paper, Analyzing Microscopic Images of Peripheral Blood Smear Using Deep Learning.

Hidden Figures: How AI Could Spot a Silent Cancer in Time to Save Lives

It’s no wonder Dr. Elliot Fishman sounds frustrated when he talks about pancreatic cancer.

As a diagnostic radiologist at Johns Hopkins Hospital, one of the world’s largest centers for pancreatic cancer treatment, he has the grim task of examining pancreatic CT scans for signs of a disease that’s usually too advanced to treat.

Because symptoms seldom show up in the early stages of pancreatic cancer, most patients don’t get CT scans or other tests until the cancer has spread. By then, the odds of survival are low: Just 7 percent of patients live five years after diagnosis, the lowest rate for any cancer.

“Our goal is early detection of pancreatic cancer, and that would save lives,” Fishman said.

Fishman aims to spot pancreatic cancers far sooner than humans alone can by applying GPU-accelerated deep learning to the task. He helps spearhead Johns Hopkins’ Felix project, a multimillion-dollar effort supported by the Lustgarten Foundation to improve doctors’ ability to detect the disease.

This video depicts pancreatic cancer that has invaded the vessels — the branch-like structures at the center of the picture — surrounding the pancreas. That means the disease is too advanced to be treated with surgery. Video courtesy of Dr. Elliot Fishman, Johns Hopkins Hospital.

Deep Learning Aids Hunt for Silent Killer

The pancreas — a six-inch long organ located behind the stomach — plays an essential role in converting the food we eat into fuel for the body’s cells. It’s located deep in the abdomen, making it hard for doctors to feel during routine examinations, and making it difficult to detect tumors using imaging tests like CT scans.

Some radiologists, like Fishman, see thousands of cases a year. But others lack the experience to spot the cancer, especially when the lesions — abnormalities in organs and tissue — are at their smallest in the early stages of the disease.

“If people are getting scanned and diagnoses aren’t being made, what can we do differently?” Fishman asked in a recent talk at the GPU Technology Conference, in San Jose. “We believe deep learning will work for the pancreas.”

Johns Hopkins is ideally suited to developing a deep learning solution because it has the massive amounts of data on pancreatic cancer needed to teach a computer to detect the disease in a CT scan. Hospital researchers also have our DGX-1 AI supercomputer, an essential tool for deep learning research.

The pancreas, a fish-shaped organ, is pictured here in golden brown, above the kidneys and below the spleen. The dark circle at the center of the image is a tumor. Image courtesy of Dr. Elliot Fishman, Johns Hopkins Hospital.

Detecting Pancreatic Cancer with Greater Accuracy

Working with a team of computer scientists, oncologists, pathologists and other physicians, Fishman is helping train deep learning algorithms to spot minute textural changes to tissue of the pancreas and nearby organs. These changes are often the first indication of cancer.

The team trained its algorithms on about 2,000 CT scans, including 800 from patients with confirmed pancreatic cancer. It wasn’t easy. Although Johns Hopkins has ample data, the images must be labeled to point out key characteristics that are important in determining the state of the pancreas. At four hours per case, it’s a massive undertaking.

In the first year of the project, the team trained an algorithm to recognize the pancreas and the organs that surround it, achieving a 70 percent accuracy rate. In tests this year, the deep learning model has accurately detected pancreatic cancer about nine times out of 10.
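Results like these are usually scored by how well the predicted organ outline overlaps a radiologist’s hand-labeled mask. The snippet below is a minimal sketch of the Dice overlap coefficient, a standard metric for this kind of segmentation work; whether the figures above refer to exactly this metric isn’t stated in the post.

```python
import numpy as np

# Minimal Dice overlap between a predicted and a reference binary mask.
# 1.0 means perfect overlap, 0.0 means no overlap at all.
def dice(pred_mask, true_mask):
    pred = np.asarray(pred_mask, dtype=bool)
    true = np.asarray(true_mask, dtype=bool)
    intersection = np.logical_and(pred, true).sum()
    denom = pred.sum() + true.sum()
    return 2.0 * intersection / denom if denom else 1.0

# Toy 3-D volumes standing in for a predicted and a hand-labeled pancreas mask.
pred = np.zeros((4, 4, 4), dtype=bool); pred[1:3, 1:3, 1:3] = True
true = np.zeros((4, 4, 4), dtype=bool); true[1:4, 1:3, 1:3] = True
print(round(dice(pred, true), 3))  # 0.8
```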

Earlier Diagnosis Possible  

The team is now examining instances where cancer was missed to improve its algorithm. It’s also working to go beyond identifying tumor cells to predict likely survival rates and whether the patient is a candidate for surgery.

Finding an answer is urgent because even though pancreatic cancer is rare, it’s on the rise. Not long ago, it was the fourth-leading cause of cancer deaths in the U.S. Today it’s No. 3. And at the time of presentation, fewer than a fifth of patients are eligible for surgery, the primary treatment for the disease.

For Fishman, deep learning detection methods could mean earlier diagnosis. He estimates that nearly a third of the cases he sees could have been detected four to 12 months sooner.

“We want to train the computer to be the best radiologist in the world,” Fishman said. “We’re hopeful we can make a difference.”

To learn more about Fishman’s research, watch his GTC talk, The Early Detection of Pancreatic Cancer Using Deep Learning: Preliminary Observations.

Also, here are two of his recent papers:

* Main image for this story pictures a pancreatic cancer cell.

How Deep Learning Is Bringing Elusive, Light-Bending Gravitational Lenses Into Focus

Gravitational lenses have long been one of astronomy’s white whales, perplexing those who have devoted themselves to finding and studying them.

But by applying deep learning and computer vision to the abundant data generated by today’s powerful telescopes, scientists are on the verge of being able to use hundreds of thousands of gravitational lenses to expand our understanding of the universe.

Gravitational lenses occur when a galaxy, or a cluster of galaxies, sits directly in front of another galaxy “behind” it, and the gravity of the foreground object bends the light coming from the more distant one. This effectively makes the foreground galaxy a sort of magnifying glass for observing the galaxy behind it.

By first conclusively identifying gravitational lenses — which has proven to be a huge challenge — and then analyzing the telescope data, scientists not only can better observe those more distant galaxies, they also can gain understanding of the nature of dark matter, an unknown form of matter which seems to permeate our universe.

“There is lots of science to be learned from gravitational lenses,” said Yashar Hezaveh, a NASA Hubble postdoctoral fellow at Stanford University’s Kavli Institute for Particle Astrophysics and Cosmology. “We can use the data to look into the distribution of dark matter, and the formation of stars and galaxies.”

Delving Into Deep Learning

Until recently, scientists used large and sophisticated computer codes to analyze images. This required very large computations on superclusters and a significant amount of human intervention. But when Hezaveh and his team of researchers decided to apply computer vision and neural networks, everything changed.

“We had no expectations of how awesome it was going to be, or if it was going to work at all,” said Laurence Perreault Levasseur, a postdoctoral fellow at Stanford University and a coauthor of a paper on the topic.

Another way to think about gravitational lenses is as funhouse mirrors, where the challenge is to remove the effect of mirror distortions and find the true image of the object in front of it. Traditional methods compare the observations against a large dataset of simulated images of that same object viewed in different distorted mirrors to find which one is more similar to the data.

But neural networks can directly process the images and find the answers without the need for comparison against many simulations. This can, in principle, speed up the calculations. But training a deep learning model that can understand how the various undulations affect the behavior of matter, not to mention our view of it, also requires enormous computing power.

Once Hezaveh and his team adopted GPUs to analyze the data, they had the speed and accuracy needed to unlock new knowledge of the universe. Using Stanford’s Sherlock high performance computing cluster, which runs on a combination of NVIDIA Tesla and TITAN X GPUs, the team was able to train its models up to 100x faster than on CPUs.

The resulting understanding of gravitational lenses is expected to provide a lot of fodder for those trying to understand the universe better.

“A lot of scientific questions can be addressed with this tool,” said Perreault Levasseur.

Wanted: Gravitational Lenses

Of course, to analyze data on gravitational lenses, you first have to find them, and that’s where complementary research underway by scientists at three universities in Europe comes into play.

Researchers at the Universities of Groningen, Naples and Bonn have been using deep learning methods to identify new lenses as part of the Kilo-Degree Survey (KiDS), an astronomical survey intended to better understand dark matter and the distribution of mass in the universe.

Carlo Enrico Petrillo, coauthor of a paper detailing the deep learning effort, said as many as 2,500 gravitational lenses could be uncovered using AI in conjunction with KiDS, even though the survey is only observing a small sliver (about 4 percent) of the sky.

But there was one significant challenge to making this happen: The lack of the kind of significant training dataset deep learning applications typically require. Petrillo said his team countered this by simulating the arcs and rings that surround gravitational lenses and incorporating them into images of real galaxies.

“In this way we could simulate gravitational lenses with all the specific characteristics, such as resolution, wavelength and noise, of the images coming from the surveys,” said Petrillo.

In other words, the team treated the problem as one of binary classification: galaxies surrounded by arcs and rings that match the simulations are labeled as lenses, and those that don’t are labeled as non-lenses. As the network learns from each simulation, researchers can narrow down candidates. The group’s paper notes this method initially enabled them to whittle 761 candidates down to a list of 56 suspected gravitational lenses.
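A minimal sketch of that setup appears below: a small convolutional network is trained to score galaxy cutouts, with simulated arcs and rings marking the positive examples. The network, the 101x101 single-band cutout size and the PyTorch framework are assumptions for illustration, not the group’s actual KiDS pipeline.

```python
import torch
import torch.nn as nn

# Toy binary lens-vs-non-lens classifier for single-band galaxy cutouts.
class LensClassifier(nn.Module):
    def __init__(self):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(1, 16, 5, padding=2), nn.ReLU(), nn.MaxPool2d(2),
            nn.Conv2d(16, 32, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
            nn.Conv2d(32, 64, 3, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1),
        )
        self.head = nn.Linear(64, 1)

    def forward(self, x):
        return self.head(self.features(x).flatten(1))  # raw logit per cutout

model = LensClassifier()
criterion = nn.BCEWithLogitsLoss()
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)

# Training data in this scheme: real galaxy images with simulated arcs/rings
# pasted in are labeled 1; untouched galaxies are labeled 0 (placeholders here).
images = torch.randn(8, 1, 101, 101)           # stand-in for galaxy cutouts
labels = torch.randint(0, 2, (8, 1)).float()   # 1 = simulated lens, 0 = not

optimizer.zero_grad()
loss = criterion(model(images), labels)
loss.backward()
optimizer.step()

# At survey time, cutouts scoring above a chosen threshold become candidates.
scores = torch.sigmoid(model(images)).squeeze(1)
candidates = (scores > 0.5).nonzero().squeeze(1)
```

Thresholding the scores is what lets a survey whittle hundreds of raw detections down to a short list for human inspection.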

NVIDIA GPUs helped to make this possible by slashing the time it takes to run a batch of images against the simulations. Doing so on a CPU required 25 seconds per batch, but a GeForce GTX 1080 GPU provides a 50x increase in speed. (The paper details results on an older generation GeForce GPU, but Petrillo recently upgraded to the newer one.)

“Using the CPU would have made my job hell,” he said.

Data Deluge Coming

As the innovations in telescopic and deep learning technology continue, the amount of data on gravitational lenses figures to increase substantially. For instance, Petrillo said the European Space Agency’s Euclid telescope is expected to produce tens of petabytes of data, while the Large Synoptic Survey Telescope in Chile will generate 30 terabytes of data each night.

That means lots of data to crunch, many gravitational lenses to be discovered and new space frontiers to be grasped — so long as scientists can keep up.

“Having a lot of lenses means building an accurate picture of the formation and evolution of galaxies, having insights on the nature of dark matter and on the structure of the space-time continuum itself,” said Petrillo. “We need efficient and fast algorithms to analyze all this data and, surely, machine learning will be common business among astronomers.”

This Is Your Disease on Drugs: How an AI Startup Could Defeat Now Unbeatable Bugs

An aging population. Antibiotic-resistant infections. Afflictions by the hundreds that still lack a cure.

The need for new medications is higher than ever, but so is the cost and time to bring them to market. Developing a new drug can cost billions and take as long as 14 years, according to the U.S. Food and Drug Administration. Yet with all that effort, only 8 percent of drugs make it to market, the FDA said.

“We need to make smarter decisions about which potential medicines we develop and test,” said Abraham Heifets, co-founder of San Francisco-based startup Atomwise.

The six-year-old company, a member of our Inception startup incubator program, is working to make that happen by using GPU-accelerated deep learning to predict which molecules are most likely to lead to treatments. It’s already had some success, identifying possible medicines for multiple sclerosis and the deadly Ebola virus.

How Atomwise Finds Drug Candidates

To understand Atomwise, it helps to know a little about how drug discovery works.

Researchers first identify the biological cause of a disease — usually a protein — to target with a treatment. A protein may help a tumor grow or cause inflammation, for example. Next they search for a medicine that will hit that target, inhibiting or boosting its function.

The company’s AtomNet deep learning software sifts through millions of possible molecules for effective treatments. It then analyzes simulations that show how the potential medicine will behave in the human body.

The software predicts whether the treatment works against the target, how it affects other parts of the body, its toxicity and possible side effects. Atomwise uses our Tesla V100 and other NVIDIA GPUs for both training and inference on AtomNet.
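As a purely hypothetical illustration of this kind of model (AtomNet’s actual architecture, input features and training data aren’t described in the post), the sketch below scores voxelized protein-ligand poses with a small 3-D convolutional network and keeps the most promising candidates.

```python
import torch
import torch.nn as nn

# Illustrative stand-in only, not AtomNet: a small 3-D CNN that scores a
# voxelized protein-ligand pose for likely binding.
class BindingScorer(nn.Module):
    def __init__(self, channels=8):   # e.g. one channel per atom-type feature
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv3d(channels, 32, 3, padding=1), nn.ReLU(), nn.MaxPool3d(2),
            nn.Conv3d(32, 64, 3, padding=1), nn.ReLU(), nn.MaxPool3d(2),
            nn.AdaptiveAvgPool3d(1), nn.Flatten(),
            nn.Linear(64, 1),
        )

    def forward(self, grid):                    # grid: (batch, channels, x, y, z)
        return torch.sigmoid(self.net(grid))    # probability-like binding score

scorer = BindingScorer()

# Screening loop: score a library of candidate poses and keep the best few.
library = torch.randn(16, 8, 24, 24, 24)   # placeholder voxel grids
scores = scorer(library).squeeze(1)
top = torch.topk(scores, k=5).indices      # indices of most promising candidates
```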

After completing its evaluation, Atomwise delivers the drug candidates to customers, such as the pharmaceutical giant Merck and top research institutions like the Dana-Farber Cancer Institute at Harvard, Stanford University and the Baylor College of Medicine. These organizations conduct further studies to determine if a compound can be used as an approved treatment.

This simulation shows the Janus kinase 3 protein, which has been implicated in cancer and immune function. Atomwise aims to discover molecules that could be new medications for these and other diseases. Image courtesy of Atomwise.

Simulation Revelation

For every molecule that becomes a drug, millions might be physically tested and determined to be unsuitable, Heifets said. By using AI to analyze simulations, Atomwise reduces the time researchers spend building and testing new medications that ultimately won’t work out.

“Every other form of manufacturing simulates its prototypes before it builds them,” he added. “Our goal is to give the pharmaceutical industry the same advantage that other industries have.”

According to Heifets, the company’s method is about 100x faster than high-throughput screening, a commonly used technique to automate drug compound evaluation. It’s a million times faster than a medicinal chemist doing custom synthesis. And its hit rate is 10,000x better than wet lab experiments, he said.

Ebola virus particles (in blue) in a colorized image from a scanning electron microscope. Atomwise found what may turn into new medications for the deadly disease. Image courtesy of the National Institute of Allergy and Infectious Diseases.

Targeting Ebola, Multiple Sclerosis

Atomwise already has made progress on the Ebola virus and multiple sclerosis, which currently lack sufficient treatments. Ebola, with a death rate as high as 90 percent, has killed thousands of people since it appeared in 1976. Atomwise found a drug candidate that may block Ebola’s entry into healthy cells.

Multiple sclerosis, a potentially disabling disease of the brain and spinal cord, affects about 2.3 million people worldwide, according to the National Multiple Sclerosis Society. Devising a cure that can reach the brain is extremely difficult because of what’s known as the blood-brain barrier, which prevents most molecules from entering the brain. A potential cure would have to pass through this barrier.

Atomwise explored 8.2 million molecules to discover several candidates that could prove to be cures. These were effective in animal trials, and have been licensed to a pharmaceutical company in the U.K. for further exploration.

“I want to see that we can solve hard problems and find molecules that become treatments for disease,” Heifets said.

Learn more about NVIDIA technology to advance deep learning in healthcare.

* Main image for this story shows Atomwise’s simulated drug research in which a neural network learns to recognize chemical functional groups. Image courtesy of Atomwise.

How Do I Understand Deep Learning Performance?

There’s a lot of confusion out there about deep learning performance. How do you measure it? What should you measure?

The simple answer: PLASTER.

The not so simple reality: “Hyperscale data centers are the most complicated computers the world has ever made — how could it be simple?” NVIDIA CEO Jensen Huang explained at NVIDIA’s GPU Technology Conference earlier this year, before cramming each of the factors that drive this performance into that single acronym.

Here’s what PLASTER stands for:

  • Programmability
  • Latency
  • Accuracy
  • Size of Model
  • Throughput
  • Energy Efficiency
  • Rate of Learning

Read the white paper “PLASTER: A Framework for Deep Learning Performance,” from Tirias Research, which puts all of these factors into context.

AI Podcast: The 411 on NVIDIA Research

De-noising, semantic manipulation, and unsupervised text modeling. These are only some of the projects that our NVIDIA Research team has been tackling for the past several months.

In the latest AI Podcast episode, Bryan Catanzaro, vice president of applied deep learning research at NVIDIA, gives a full rundown of the group’s recent discoveries and shares what else is in store for NVIDIA Research.

“The goal of NVIDIA research is to figure out what things are going to change the future of the company, and then build prototypes that show the company how to do that,” Catanzaro said in a conversation with AI Podcast host Noah Kravitz. “And AI is a good example of that.”

Noise! Noise! Noise!

Developed by NVIDIA’s research team in Sweden and Finland, the Noise2Noise project discovered that a matching set of clean images is not necessary to solve de-noising problems.

“So people have been working on de-noising for a while,” Catanzaro said. “And the insight that led to this Noise2Noise de-noiser is that you don’t actually need the clean image in order to do this.”

The standard AI de-noising method requires paired training images: a noisy version of each picture and a clean version of the same picture to serve as the target.

In many cases, though, clean images simply aren’t available. Noise2Noise sidesteps that requirement: the network learns to reproduce clean images even though it is only ever shown noisy ones.

“As long as you have multiple copies of the same image, or a very similar image, where the noise is different, then you can train the model on all those noisy images and it will learn to remove the noise nonetheless,” Catanzaro said.
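A minimal sketch of that training recipe, under the assumption of a simple Gaussian-noise setup, looks like this: both the input and the regression target are independently corrupted copies of the same image, and no clean image ever appears in the loss. The tiny network here is a placeholder, not NVIDIA’s published Noise2Noise model.

```python
import torch
import torch.nn as nn

# Toy denoiser; the real Noise2Noise networks are far larger (e.g. U-Nets).
denoiser = nn.Sequential(
    nn.Conv2d(3, 32, 3, padding=1), nn.ReLU(),
    nn.Conv2d(32, 32, 3, padding=1), nn.ReLU(),
    nn.Conv2d(32, 3, 3, padding=1),
)
optimizer = torch.optim.Adam(denoiser.parameters(), lr=1e-4)
loss_fn = nn.MSELoss()

# A clean tensor is used here only to synthesize two noisy copies so the
# example runs; in practice two noisy captures of the same scene suffice.
clean = torch.rand(4, 3, 64, 64)
for step in range(100):
    noisy_input  = clean + 0.1 * torch.randn_like(clean)   # corruption no. 1
    noisy_target = clean + 0.1 * torch.randn_like(clean)   # corruption no. 2
    optimizer.zero_grad()
    loss = loss_fn(denoiser(noisy_input), noisy_target)    # no clean target used
    loss.backward()
    optimizer.step()
```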

That’s Just Semantics

Building whole new virtual worlds just got a lot easier with semantic manipulation.

According to Catanzaro, the technique relies on a trained generative model to create photorealistic outputs with only high-level semantic descriptions of a scene.

And the possibilities with this are endless.

Subscribe to the AI Podcast – iTunes // SoundCloud // Stitcher // Overcast // Google Play // RSS

“It opens the door to a lot of new techniques for rendering graphics, as well as manipulating images,” Catanzaro explained. “For example, in image editing, if I was able to change the type of object in an image, I could do that at a very broad scale. I could have a huge, big paintbrush and just sort of paint trees onto an image. And where I painted the trees it knows how to draw trees that fit there.”

Semantic manipulation is still in the research prototype stage, but the NVIDIA Research team open-sourced the project and discovered people were using the tool to create artificial satellite imagery.

AI and Beyond

Other projects NVIDIA Research has in store include unsupervised text modeling, in which the team is trying to develop a model that can indicate whether a section of text contains either positive or negative sentiment.

By training the model on an NVIDIA DGX system, “we can do things that, in the original paper that got published a few months ago took a month to do, we can do that in less than a day,” said Catanzaro.

Looking back, Catanzaro was surprised at how quickly AI has grown.

“Nowadays I think we’re just at the beginning,” said Catanzaro. “I see so many amazing opportunities just waiting for people to pick up, that I think we’re going to be finding really interesting, really valuable things to do with AI for quite some time to come.”

And when asked how NVIDIA managed to successfully adopt AI at the right time, he remarked:

“I think the part of the story that people sometimes miss out on is that NVIDIA prepared itself for this change by doing the research.”

For more information about how our researchers are revolutionizing graphics, see the papers (listed below) or read our related articles, “NVIDIA Research Brings AI to Graphics” and “NVIDIA Researchers Showcase Major Advances in Deep Learning at NIPS.”

Investing in Artificial Intelligence: A Path for US Leadership

I am fortunate today to be in Washington with leaders from three dozen companies, including many of our close partners, to discuss with administration officials how the U.S. can continue to lead the world in research, development, and the adoption of artificial intelligence. I’d like to thank the Office of Science and Technology Policy for convening this important meeting.

AI is becoming the world’s most important computational tool — applicable to a wide variety of industries including transportation, energy and healthcare. But AI is enormously demanding in terms of computation — it requires processing hundreds of millions of data points to extract insight. Therefore it’s important for us to discuss how to improve our nation’s computing infrastructure to support AI and maintain leadership in this space.

In our meetings, I hope to reiterate some of the themes I shared a few months ago with the House Subcommittee on Information Technology — that we need to increase funding for research, give university researchers access to GPU computing clusters, open access to data sets, and train more AI developers and data scientists.

There’s simply no replacement for the federal government significantly increasing support for fundamental research to bolster university research. Funding drives research. Research, in turn, drives innovation, from startups to multinationals.

Government also has a role in providing the infrastructure to support research. Universities need access to large-scale, state-of-the-art, GPU-accelerated computing systems to do cutting-edge research. But most lack the expertise to procure and run them. The government should provide universities better access to future computing systems — all of which need to support high performance computing and AI workloads.

Data is the lifeblood of AI. Developers and researchers need access to high-quality data. Federal agencies should disclose what data sets are available, including anonymized healthcare, weather, satellite and industrial data sets.

Simulation for Safe Autonomous Vehicles

Safety is essential for autonomous vehicles and it’s our highest priority. For the U.S. to lead, we need to ensure safety and time to market. Developing a safe vehicle requires traveling billions of miles, which is extraordinarily challenging. Computer simulation is an ideal methodology to test and validate AI for self-driving cars, enabling us to accelerate development and improve safety under a wide variety of road and weather conditions.

Simulation together with AI will greatly advance autonomous vehicle technology to achieve the highest levels of safety. Simulation should be part of the virtual “drivers test” of autonomous systems. This will help reduce the terrible toll of 37,000 American fatalities each year.

States often have different regulations for transportation infrastructure. The federal government should make recommendations for all 50 states to share unified autonomous vehicle guidelines and smart infrastructure, including street lights, sensors and construction zones.

The government should partner with industry to train more developers and data scientists. Academia can’t do this by itself. NVIDIA trains tens of thousands of developers and data scientists each year, partnering with educational leaders including Coursera and Udacity.

In my recent testimony before Congress, I said that AI represents the biggest technological and economic shift in our lifetime. The stakes are huge — trillions of dollars in opportunity for American companies, and life-saving breakthroughs. I look forward to continuing to work with our partners in Washington and throughout the country to strengthen our leadership, foster innovation and drive advances that will lead us to a brighter future.

Take Two Algorithms and Call Me in the Morning

Three, it turns out, is better than one. At least that’s how it worked for a trio of former rivals who teamed up to claim the just-announced top prize in this year’s Data Science Bowl.

The fourth annual event focused on one of healthcare’s most pressing problems — the soaring cost and time needed to discover new drugs. A record-setting 18,000 participants battled over 90 days to deliver a deep learning algorithm to accelerate a crucial step in the drug-discovery pipeline: identifying the nucleus of each cell.

This year’s Data Science Bowl was “driven by a very real need to develop new treatments faster and more accurately,” said Anne Carpenter, director of the imaging platform at the Broad Institute of MIT and Harvard, the nonprofit partner for the contest.

Data Science Bowl participants used images like this one supplied by the Broad Institute of MIT and Harvard to train deep learning algorithms to spot nuclei and speed drug discovery.

International Team Takes the Prize

The winners beat out nearly 4,000 teams to win the Data Science Bowl, presented by the consulting firm Booz Allen Hamilton and the Kaggle platform for data science competitions, with additional sponsorship from NVIDIA and the medical diagnostics company PerkinElmer. Creators of the top algorithms will split $170,000 in cash and prizes, including powerful NVIDIA GPU hardware for deep learning.

In addition to the difficulty of spotting cell nuclei in dense medical images, the winning threesome — Selim Seferbekov, Alexander Buslaev and Victor Durnov — faced the challenge of collaborating across six time zones and three countries, Germany, Belarus and Russia. Using our TITAN Xp and GeForce GTX 1080 Ti GPUs for both training and inference, the team toiled for some 300 hours to create and implement their algorithm.

Their efforts paid off: Together they’ll collect $50,000 in cash, plus an estimated $70,000 in the latest NVIDIA GPUs built on our new Volta architecture. Volta uses NVIDIA CUDA Tensor Cores to deliver unprecedented levels of deep learning performance in hardware like our DGX Station, one of the most powerful tools for researchers.

Record-Setting Data Science Bowl

Collectively, competition participants worked an estimated 288,000 hours and submitted 68,000 algorithms, nearly three times as many submissions as in last year’s Data Science Bowl.

Other teams in the top three were:

  • Second Place ($25,000): Minxi Jiang, chief data scientist at a Beijing-based startup, who finished in the top one percent in last year’s Data Science Bowl. She used our TITAN Xp GPUs to create her algorithm.
  • Third Place ($12,000): Angel Lopez-Urrutia, a marine biologist in Spain who uses machine learning to automatically classify images of plankton, a challenge that was central to the inaugural Data Science Bowl. He used our Jetson TK1 and Tesla K80 GPUs to develop his algorithm.
Researchers used images like this to train their deep learning algorithms to speed drug discovery in the Data Science Bowl. Image courtesy of the Broad Institute of MIT and Harvard.

Drug Discovery Bottleneck

Finding new drugs is a complex and laborious task that can cost billions and take a decade or more per treatment. Biochemists try thousands of chemical compounds to figure out which, if any, are effective against a particular virus or bacteria or which cause a desired reaction in the human body. They do that by measuring how diseased and healthy cells respond to various treatments.

Because nearly all human cells contain a nucleus, the most direct route to identifying each cell is to spot the nucleus. Existing methods require time-consuming researcher oversight. Sometimes biologists have no choice but to personally examine thousands of images to complete their experiments.
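To make the task concrete, here is a classical baseline for spotting nuclei in a stained image: global thresholding followed by connected-component labeling with scikit-image. It only illustrates what the competition automates; the winning Data Science Bowl entries relied on deep segmentation networks, not this recipe.

```python
import numpy as np
from skimage import filters, measure, morphology

def count_nuclei(image, min_area=30):
    """Count bright nuclei in a 2-D grayscale nuclear-stain image."""
    threshold = filters.threshold_otsu(image)           # global threshold
    mask = image > threshold                             # foreground = nuclei
    mask = morphology.remove_small_objects(mask, min_area)
    labeled = measure.label(mask)                        # one label per blob
    regions = measure.regionprops(labeled)
    centroids = [r.centroid for r in regions]            # (row, col) centers
    return len(regions), centroids

# Synthetic example: two bright spots on a dark background.
img = np.zeros((128, 128))
img[20:30, 20:30] = 1.0
img[80:95, 60:75] = 1.0
count, centers = count_nuclei(img)
print(count, centers)   # 2 nuclei and their centers
```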

“By identifying nuclei quickly and accurately, the algorithms developed in this competition can free up biologists to focus on other aspects of their research, shortening the approximately 10 years it takes for each new drug to come to market and, ultimately, improving quality of life,” said Ray Hensberger, a Booz Allen Hamilton principal.

Carpenter, of the Broad Institute, aims to use a winning algorithm to build deep learning software for drug discovery. The institute is now exploring the idea of creating user-friendly, open-source software that biomedical researchers can use in their day-to-day work.

Learn more about NVIDIA technology to advance deep learning in healthcare.

* Main image for this story shows human cell nuclei, which contain most of a cell’s genetic material. RNA-processing proteins are in red and chromosomes are in blue. Image courtesy of the National Cancer Institute.

The fourth Data Science Bowl challenged participants to use deep learning to speed drug discovery.

Lots of People Are Listening to the AI Podcast Right Now — and I Kind of Don’t Care

I’ve got a confession. You’re not the real audience for our AI Podcast. None of you are.

The real audience won’t arrive for 20 or 30 years. They’ll be people who will want to listen to the conversations we’ve been recording over the past 18 months with the people who we believe are defining the world in which coming generations will live.

That’s the thought, at least, that guides our conversations: we seek out people who are using deep learning as a lever to change the world. And we sit them down with our excellent host, Noah Kravitz, and ask them to tell us their story, one 20-minute session at a time.

We’re just letting you listen in.

It’s an experiment that’s gone better than we could’ve expected. Not only have episodes of the podcast been downloaded more than half a million times since launching in late 2016, but we’ve also found deep learning to be the thread we can follow to find some of the smartest people in the world today. These are the kind of people who are otherwise impossible to spot in the moment: the kind of people you’ll want to know more about decades from now.

And Noah, our host, asks just the kind of human, simple, egoless — sometimes sassy — questions you’d want to ask if you could go back to 1880 to question Thomas Edison’s one-time assistant, William Hammer, as he began mass production of electric lighting. Or to 1803 to talk to Richard Trevithick as he was harnessing the steam engine to power a new kind of transportation: the locomotive. And then Noah lets our guests talk.

Here’s a look at our latest three conversations.

Reinventing Every Retail Store — Steve Gu, Co-Founder and CEO of AiFi

Grab the goods and go. AiFi co-founder and CEO Steve Gu wants to give every store — from Mom and Pop bodegas to supermarket chains — the ability to let customers saunter out of the door without so much as a wave at a cashier.

The benefits involve more than just convenience: stores will have a better idea of how their customers behave and get a real-time bead on their inventory.

To do that, Gu and his team at startup AiFi rely on advanced sensor fusion, simulation and deep learning.

Unleashing Cheap, Clean Energy — William Tang, Principal Research Physicist, Princeton Plasma Physics Laboratory

Clean, cheap fusion energy would change everything for the better. William Tang has spent a career at the forefront of that field, currently as principal research physicist at the Princeton Plasma Physics Laboratory.

He’s also one of the world’s foremost experts on how the science of fusion energy and high performance computing intersect.

Now, he sees new tools — deep learning and artificial intelligence — being put to work to enable big-data-driven discovery in key scientific endeavors, such as the quest to deliver fusion energy.

Rebuilding the World, From the Molecules On Up — Olexandr Isayev, UNC Eshelman School of Pharmacy

Deep learning has helped machines understand how to move pieces around a board to master, and win at, Go, the most complicated game mankind has ever invented.

Now it’s helping a new generation of chemists better understand how to move molecules around to model new kinds of materials. Olexandr Isayev, an assistant professor at the UNC Eshelman School of Pharmacy, at the University of North Carolina at Chapel Hill, joined our show to explain how deep learning, Go, sci-fi, and computational chemistry intersect.

How to Tune in to the AI Podcast

We’ve got more conversations to come. So tune in next week. Or better still, tune in 20 years from now. Either way, we aim to make this worth your time, whatever time you happen to be tuning in from.

The AI Podcast is available through iTunes, DoggCatcher, Google Play Music, Overcast, PlayerFM, Podbay, Pocket Casts, PodCruncher, PodKicker, Stitcher and SoundCloud. If your favorite isn’t listed here, email us at aipodcast[at]nvidia.com.

NVAIL Partners Showcase Trailblazing Deep Learning Research at ICLR

The International Conference on Learning Representations isn’t a name that rolls off the tongue. But for researchers looking to stay on the cutting edge of deep learning, it’s the place to be.

Better known as ICLR, this year’s conference will bring experts from the world’s top AI research labs to Vancouver from April 30 to May 3. Three of our NVIDIA AI Labs (NVAIL) partners — the Swiss AI Lab (IDSIA), New York University and the University of Tokyo — are among those sharing their work.

IDSIA researchers aim to give robots the same kind of understanding of the physical world that comes naturally to people. A University of Tokyo team will discuss its innovative method for improved sound recognition. And researchers from NYU and the University of the Basque Country will explain how they’re improving machines’ ability to translate languages.

Our NVAIL program helps us keep these and other AI pioneers ahead of the curve with support for students, assistance from our researchers and engineers, and access to the industry’s most advanced GPU computing power.

What Goes Up Must Come Down

Humans innately understand the physical world. We can navigate rooms we’ve never visited. If a shoe drops, we know it’ll hit the floor. And we’re well aware we can’t walk through walls. Even infants possess some basic physical understanding.

Machines don’t have it so easy. Today, training a deep learning model to understand things like “what goes up, must come down” requires lots of data and human effort to label it, said Sjoerd van Steenkiste, a Ph.D. student at IDSIA.

He and a team of researchers from IDSIA and the University of California, Berkeley, are working to streamline that process by eliminating the need for massive data and human interaction.

In a paper for ICLR, the researchers describe how they trained a neural network without human input, a process known as unsupervised learning. Using our DGX-1 AI supercomputer, they trained a deep learning model to distinguish individual objects in a scene and predict the consequences of actions.

Eventually, this research could make it easier to train robots and other machines to interact with their environments, van Steenkiste said.

Sound Mix

Some things are just better mixed together. Peanut butter paired with chocolate is heavenly. Metals are stronger and harder when they’re combined. And planting two crops together can yield bigger harvests.

Yuji Tokozume is applying the same idea to deep learning. The doctoral student and two other University of Tokyo researchers are set on improving sound recognition by using what they call between-class sounds — two sounds mixed together — to train a deep learning model. The model, trained on our Tesla P100 GPU accelerators, identifies the two sounds and determines the ratio of one sound to the other.

In their ICLR paper, the researchers report that between-class learning not only delivered higher accuracy than existing techniques but also surpassed human performance on environmental recordings in a standard dataset known as ESC-50. The team has applied the same approach to improve AI image recognition performance.
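The core of the recipe can be sketched in a few lines: mix two training sounds from different classes at a random ratio, then train the network, with a KL-divergence-style loss, to recover that ratio as a distribution over classes. The simple linear mix and tiny 1-D network below are stand-ins for illustration; the paper’s mixing accounts for the sounds’ energy levels and its network is much deeper.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

NUM_CLASSES = 50                     # e.g. the ESC-50 label set

model = nn.Sequential(               # toy 1-D CNN stand-in for the real net
    nn.Conv1d(1, 16, 9, stride=4), nn.ReLU(),
    nn.Conv1d(16, 32, 9, stride=4), nn.ReLU(),
    nn.AdaptiveAvgPool1d(1), nn.Flatten(),
    nn.Linear(32, NUM_CLASSES),
)
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)

def bc_mix(x1, y1, x2, y2):
    """Mix two waveforms and build the soft label giving the mixing ratio."""
    r = torch.rand(1).item()
    mixed = r * x1 + (1 - r) * x2
    target = torch.zeros(NUM_CLASSES)
    target[y1] += r
    target[y2] += 1 - r
    return mixed, target

# Placeholder one-second waveforms from two different classes.
x1, y1 = torch.randn(1, 1, 16000), 3
x2, y2 = torch.randn(1, 1, 16000), 17
mixed, target = bc_mix(x1, y1, x2, y2)

optimizer.zero_grad()
logits = model(mixed)
loss = F.kl_div(F.log_softmax(logits, dim=1), target.unsqueeze(0),
                reduction="batchmean")   # match predicted ratio to the mix ratio
loss.backward()
optimizer.step()
```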

Learn more by viewing a talk on between-class learning for sound recognition at our recent GPU Technology Conference in Silicon Valley.

Lost in Translation

For all AI has achieved in automatic language translation, it doesn’t do much for less common tongues like Basque, Oromo and Quechua. That’s because training a deep learning model typically requires large datasets — in this case, vast amounts of text that’s been manually translated into other languages.

Ample data for widely spoken languages like Chinese, English and Spanish makes it possible to directly translate Chinese to English or Spanish to Chinese. Researchers at NYU and the University of the Basque Country aim to bring that capability to languages with smaller numbers of speakers.

Currently, languages like Basque — spoken by an estimated 700,000 people, mostly in a region that straddles Spain and France — must first be translated into English (or another major language) before they can be converted to anything else, according to Mikel Artetxe, a doctoral student at the University of the Basque Country.

The same holds true for languages such as Oromo, which is spoken by more than 30 million people in the Horn of Africa, or Quechua, which is spoken by as many as 11 million people in South America.

The research team used our TITAN Xp GPUs to train a neural network to perform these translations without any manually translated training data, relying on independent text of both languages instead. In their ICLR paper, researchers said that accuracy improved when they added a small amount of parallel data, although it was still far below that of a human translation.

“Our goal is to be able to translate more languages with better results,” said Artetxe.
