Backed by Google, Intel, Baidu, NVIDIA and dozens more technology leaders, the new MLPerf benchmark suite measures a wide range of deep learning workloads. Aiming to serve as the industry’s first objective AI benchmark suite, it covers such areas as computer vision, language translation, personalized recommendations and reinforcement learning tasks.
NVIDIA achieved the best performance in the six MLPerf benchmarks it submitted results for. These cover a variety of workloads and infrastructure scales, ranging from 16 GPUs on a single node to 640 GPUs across 80 nodes.
The six categories are image classification, object instance segmentation, object detection, non-recurrent translation, recurrent translation and recommendation systems. NVIDIA did not submit results for the seventh category, reinforcement learning, which does not yet take advantage of GPU acceleration.
A key benchmark on which NVIDIA technology performed particularly well was language translation, training the Transformer neural network in just 6.2 minutes. More details on all six submissions are available on the NVIDIA Developer news center.
NVIDIA is the only company to have entered as many as six benchmarks, demonstrating the versatility of V100 Tensor Core GPUs for the wide variety of AI workloads deployed today.
“The new MLPerf benchmarks demonstrate the unmatched performance and versatility of NVIDIA’s Tensor Core GPUs,” said Ian Buck, vice president and general manager of Accelerated Computing at NVIDIA. “Exceptionally affordable and available in every geography from every cloud service provider and every computer maker, our Tensor Core GPUs are helping developers around the world advance AI at every stage of development.”
State-of-the-Art AI Computing Requires Full Stack Innovation
Performance on complex and diverse computing workloads takes more than great chips. Accelerated computing is about more than an accelerator. It takes the full stack.
NVIDIA’s AI platform is also the most accessible and affordable. Tensor Core GPUs are available on every cloud and from every computer maker and in every geography.
The same power of Tensor Core GPUs is also available on the desktop: NVIDIA TITAN RTX, the most powerful desktop GPU, costs just $2,500. Amortized over three years of continuous use, that works out to roughly a dime per hour.
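That amortization is simple to sanity-check. A back-of-envelope sketch, assuming round-the-clock use over three years:

```python
# Back-of-envelope amortization of the TITAN RTX list price.
# Assumes 24/7 use over three years; actual utilization will vary.
price_usd = 2500
hours = 3 * 365 * 24              # 26,280 hours in three years

cost_per_hour = price_usd / hours
print(f"${cost_per_hour:.3f} per hour")  # prints "$0.095 per hour"
```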
NVIDIA’s Record-Setting Platform Available Now on NGC
The software innovations and optimizations used to achieve NVIDIA’s industry-leading MLPerf performance are available free of charge in our latest NGC deep learning containers. Download them from the NGC container registry.
The containers include the complete software stack and the top AI frameworks, optimized by NVIDIA. Our 18.11 release of the NGC deep learning containers includes the exact software used to achieve our MLPerf results.
Developers can use them everywhere, at every stage of development:
For enterprises, the containers accelerate the application of AI to their data in the cloud with NVIDIA GPU-accelerated instances from Alibaba Cloud, AWS, Baidu Cloud, Google Cloud Platform, IBM Cloud, Microsoft Azure, Oracle Cloud Infrastructure and Tencent Cloud.
For organizations building on-premise AI infrastructure, NVIDIA DGX systems and NGC-Ready systems from Atos, Cisco, Cray, Dell EMC, HP, HPE, Inspur, Lenovo, Sugon and Supermicro put AI to work.
To get started on your AI project, or to run your own MLPerf benchmark, download containers from the NGC container registry.
Sneaker aficionados invest hundreds of dollars into rare Nike Air Jordans and the hottest Kanye West Adidas Yeezys. But scoring an authentic pair amid a crush of counterfeits is no slam dunk.
Culver City, Calif., startup GOAT (a nod to the sports shorthand for “greatest of all time”) operates the world’s largest sneaker marketplace that uses AI to stomp out fakes. The company offers a seal of authenticity for shoes approved for sale on its site.
Counterfeit sneakers are rampant online for some of the most sought-after basketball brands.
“Yeezys and Jordans are now the most faked shoes in the world, and over 10 percent of all sneakers sold online are fake,” said Michael Hall, director of data at GOAT.
A pair of sought-after Kanye West Adidas Yeezys or Nike Air Jordans can easily set you back more than $300.
Pop culture interest in iconic shoes developed for sports stars and celebrity rappers is fueling instant sellouts in new releases. Meanwhile, there’s a heated aftermarket for the most popular footwear fashions as well as scarce vintage and retro models.
As a result, sneaker fans and novices alike are turning to a new wave of shoe sellers, such as GOAT, to ensure they’re getting an authentic pair of the most sought-after shoes.
GOAT pioneered the ship-to-verify model in the sneaker industry. This means that sellers can list any shoes on GOAT’s marketplace, but shoes that sell are first sent to the company for authentication by its image detection AI. If the shoes are found to be replicas or not as described, they don’t ship and buyers are given a refund.
Founded in 2015, GOAT is booming. The startup, which has expanded to more than 500 employees, attracts more than 10 million users and has the largest catalog of sneakers in the world, at 35,000 SKUs. This year, the company merged with Flight Club, a sneaker consignment store with locations in Los Angeles, New York City and Miami.
GOAT, whose popular app and website have helped some users sell more than $10 million in sneakers, has secured nearly $100 million in venture capital funding. The company is a member of NVIDIA’s Inception program, which offers technical guidance to promising AI startups.
AI to Kick Out Counterfeits
When you’re offering 35,000 unique styles, tracking down counterfeit sneakers is no small challenge. GOAT has teams of sneaker experts trained in the art of spotting replicas without AI. “They can spot a fake in like 10 seconds,” said Emmanuelle Fuentes, lead data scientist at GOAT.
Image recognition helps GOAT’s teams of authenticators and quality assurance representatives identify and authenticate shoes in the warehouse. And the more helpful metadata GOAT’s experts provide to train the AI as they work, the better it assists everyone vetting sneakers.
There’s a long list of data signals that are fed into a cloud instance of GPUs for the identification process and for training the network. GOAT’s convolutional neural networks are trained for anomaly and fraud detection.
GOAT, which has multiple neural networks dedicated to brands of sneakers, provides proprietary tools to help its authenticators upload data to train its identification networks.
GPUs Slam Dunk
Tracking and sharing expertise on so many models of high-end sneakers requires logging a ton of photos of authentic sneakers to help assist team members in handling shoes sent in for verification.
“The resolution that we are capturing things at and the scale that we are capturing the images — it’s a high-resolution, massive computational challenge requiring GPUs,” said Fuentes.
GOAT turned to NVIDIA TITAN Xp GPUs and NVIDIA Tesla GPUs on P2 instances of AWS running the cuDNN-accelerated PyTorch deep learning framework to initially train their neural networks on 75,000 images of authentic sneakers.
The company relies on the power of GPUs for identification of all of its sneaker models, Hall added. “For some of the most-coveted sneakers, there are more inauthentic pairs than real ones out there. Previously there wasn’t a way for sneakerheads to purchase with confidence,” he said.
A deluge of data is fueling AI innovation. It’s a trend that shows no sign of slowing.
As organizations in every industry around the globe attempt to streamline their data pipelines and maximize data science productivity, one challenge looms large: implementing AI initiatives effectively.
This is where IBM SpectrumAI with NVIDIA DGX comes in.
At the core of data science productivity is the infrastructure and software used for building and training machine learning and deep learning workloads.
With IBM and NVIDIA’s new converged infrastructure offering, organizations can take advantage of integrated compute, storage and networking. It’s the latest systems and software to support the complete lifecycle of AI — from data preparation to training to inference.
IBM SpectrumAI with NVIDIA DGX is built on:
IBM Spectrum Scale v5: Software-defined to streamline data movement through the AI data pipeline
NVIDIA DGX-1 servers: Purpose-built for AI and machine learning
NVIDIA DGX software stack: Optimized for maximum GPU training performance
Proven data performance: over 100 GB/s of throughput, supporting up to nine DGX-1 servers in a single rack
IBM SpectrumAI with NVIDIA DGX helps businesses deploy AI infrastructure quickly, efficiently and with top-tier performance — and it’s easier for IT teams to manage.
It’s the latest addition to our NVIDIA DGX-1 Reference Architecture lineup, which includes data center solutions from select storage technology partners. The solutions help enterprises and their data science teams:
Focus on innovation, research and transforming the world through their AI initiatives.
Minimize the design complexities behind architectures optimized for AI workloads.
Effortlessly scale AI workloads with predictable performance that’s also cost effective.
IBM software-defined storage offers performance, flexibility and extensibility for the AI data pipeline. NVIDIA DGX-1 provides the fastest path to machine learning and deep learning. Pairing the two results in an integrated, turnkey AI infrastructure solution with proven productivity, agility and scalability.
Register for our joint webinar on Jan. 29, or check out these resources to learn more:
It started as friends hacking a radio-controlled boat over a weekend for fun. What happened next is classic Silicon Valley: Three childhood buddies parlay robotics and autonomous vehicle skills into an autonomous ship startup and cold-call the world’s third-largest cargo shipper.
San Francisco-based Shone — founded by Parisian classmates Ugo Vollmer, Clement Renault and Antoine de Maleprade — has made a splash in maritime and startup circles. The company landed a pilot deal with shipping giant CMA CGM and recently won industry recognition.
Shone aims to modernize shipping. The startup applies NVIDIA GPUs to a flood of traditional cargo ship data such as sonar, radar, GPS and AIS, a ship-to-ship tracking system. This has enabled it to quickly process terabytes of training data on its custom algorithms to develop perception, navigation and control for ocean freighters. The company has added cameras to offer better seafaring object detection as well.
“The first part is packaging all of the perception so the crew can make better decisions,” said Vollmer, the company’s CEO, previously an autonomous vehicle maps engineer at Mapbox. “But there’s tons of value of connecting communications from the ship to the shore.”
Cargo shipping is part of a wave of industries, including rail, joining the autonomous vehicle revolution.
Shone is a member of NVIDIA Inception, a virtual accelerator program that helps startups get to market faster.
GPUs Set Sail
Founded in 2017, Shone has expanded to eight employees to navigate seafaring AI. Its NVIDIA GPU-powered software is now deployed on several CMA CGM cargo ships in pilot tests to help with perception for ship captains.
“What is particularly interesting for CMA CGM is what artificial intelligence can bring to systems on board container ships in terms of safety. AI will facilitate the work of crews on board, whether in decision support, maritime safety or piloting assistance,” said Jean-Baptiste Boutillier, deputy vice president at CMA CGM.
It didn’t come easy. The trio of scrappy entrepreneurs had to hustle. After hacking the radio-controlled boat and realizing they had a model for an autonomous boat, they raised $200,000 in friends-and-family money to start the company.
Next they got accepted into Y Combinator. Shone’s partnership with CMA CGM, which operates 500 ships worldwide, came as the team was working as a member of the winter 2018 batch at the accelerator and urgently seeking a pilot to prove their startup’s potential.
Renault and Vollmer recall cold-calling CMA CGM, the surprise of getting an appointment, and going in jeans and T-shirts and underprepared to meet with executives — who were in suits. Despite that, the execs were impressed with the team’s technical knowledge and encouraged them to come back — in six months.
“There is this industry that is moving 90 percent of the products in the world, and the tech is basically from the ‘80s — we were like, ‘What? How is that possible?’” said Renault, who previously worked on autonomous trucks at Starsky Robotics.
Prototype AI Boat
Undeterred by the first meeting, the team decided to buy a real boat and outfit it with AI to show CMA CGM just how serious they were. Renault spent $10,000 on a 30-foot boat he found on Craigslist, but the trailer blew a tire on the way back from the Sacramento River delta. It was just one more obstacle to overcome; he got help and hauled the boat back to the office.
De Maleprade’s robotics skills came into play next. (“He was building rockets when he was 14 and blowing things up,” Vollmer said.) De Maleprade fitted the boat with a robotic hydraulic steering mechanism. The other two went to work on the software with him, installing a GPU on the boat, as well.
Development of the boat was accelerated by training their algorithms on GPU workstations, while the boat’s on-board GPUs handled inferencing, said de Maleprade.
Three months later, the trio had the autonomous boat prototype ready. They showed off its capabilities to CMA CGM, which was even more impressed. The executives invited them to develop the platform on several of their freighters, which span four football fields in length and transport 14,000 containers from Southern California to China.
“CMA CGM has built its success on strong entrepreneurial values and constant innovations. That is why we decided to allow the Shone team to board some of our ships to develop their ideas and their potential,” said Boutillier.
The founders often sleep on the cargo ships, departing from the Port of Long Beach to arrive in the Port of Oakland, to observe the technology in action and study CMA CGM’s needs.
Synthesizing all the maritime data with various home-brewed algorithms and other techniques to develop Shone’s perception platform was a unique challenge, said de Maleprade, who worked on robotics before becoming CTO at Shone.
“We took a video with a drone to show CMA CGM our prototype boat could offer autonomous navigation assistance for crews. They liked what we had,” said de Maleprade. He says they’re now working on simulation to speed their development timeline.
The Shone team recently finished up at Y Combinator, leaving a strong impression at its Demo Day presentation and with $4 million in seed funding from investors.
Imagine robots straining to grip door handles. Or lifting plastic bananas and dropping them into dog bowls. Or struggling to push Lego pieces around a metal bin.
If you visited the laboratory of professor Sergey Levine at the University of California at Berkeley, you might see some puzzling scenes.
“We want to put our robots into an environment where they can explore, where they can essentially play,” says Levine, an assistant professor who runs the Robotic Artificial Intelligence and Learning Lab within the Artificial Intelligence Research Lab at Berkeley, which is a participant in our NVIDIA AI Labs initiative.
Why would robots be playing? Because a key to intelligence may be the way that creatures learn about their physical environment by poking at things, pushing things and observing what happens.
“The only proof of the existence of intelligence is in humans, and humans exist in a physical world, they’re embodied,” explains Levine. “In fact, all intelligent creatures we know are embodied. Perhaps they don’t have to be, but we don’t know that.”
Hence, “I see the robotics as actually a lens on artificial intelligence” more broadly, he says.
From the Bottom Up
One of the biggest lessons of robotics over many years, says Levine, is that it confirms “Moravec’s Paradox.”
Hans Moravec, a Carnegie Mellon University professor of robotics, wrote about a dichotomy in AI in his 1988 book “Mind Children: The Future of Robot and Human Intelligence.”
Machines can be taught to do well “things humans find hard,” such as mastering the game of chess. But machines do poorly at “what is easy for us,” such as basic motor skills.
“If you want a machine to play chess, it is actually comparatively quite easy,” observes Levine. “If you want a machine to pick up the chess pieces, that is incredibly difficult.”
Moravec viewed that dichotomy as a “giant clue” to the problem of constructing machines that think. He argued for building up intelligence by following the path of Darwinian evolution. That is, the gradual development, from the bottom up, of basic sensorimotor systems and then, much later, higher reasoning.
There’s a parallel to deep learning, such as the breakthroughs in image recognition with convolutional neural networks (CNNs). Neural networks that automatically learn the most basic features in data — “edge detectors” and “corner detectors,” say — can then assemble hierarchies of representation.
“What we saw was that a method that can figure out those low-level features can then also figure out the higher-level features,” says Levine.
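The “edge detector” idea is concrete enough to sketch. Below is a hand-coded 3x3 vertical-edge kernel of the kind a CNN’s first layer tends to learn on its own; the tiny image and the plain-Python convolution are invented for illustration (a real network learns these weights from data):

```python
# Minimal sketch: the kind of low-level "edge detector" a CNN's first
# convolutional layer tends to learn, here hand-coded as a 3x3 kernel.

def conv2d_valid(image, kernel):
    """2D cross-correlation with no padding ('valid' mode)."""
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    out = []
    for i in range(out_h):
        row = []
        for j in range(out_w):
            s = sum(image[i + di][j + dj] * kernel[di][dj]
                    for di in range(kh) for dj in range(kw))
            row.append(s)
        out.append(row)
    return out

# A 4x4 "image": dark on the left, bright on the right.
image = [
    [0, 0, 9, 9],
    [0, 0, 9, 9],
    [0, 0, 9, 9],
    [0, 0, 9, 9],
]

# Sobel-style vertical-edge kernel: responds where intensity jumps left to right.
sobel_x = [
    [-1, 0, 1],
    [-2, 0, 2],
    [-1, 0, 1],
]

print(conv2d_valid(image, sobel_x))  # prints [[36, 36], [36, 36]]
```

Every output value is large because the kernel fires along the vertical boundary between the dark and bright halves; on a flat region it would output zero.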
Unlike pictures of cats on the internet, however, there isn’t a ready supply of data for robots to learn from. Hence his lab’s focus on having machines explore the environment “for weeks, autonomously pushing things around, manipulating objects and then learning something about their world.”
Levine uses a variety of machine learning techniques to train robots, including CNNs but also, especially, reinforcement learning, where a route to a destination is planned by inferring from a current state to a goal state. The policy is then used by the robot at test time to carry out new instances of those tasks.
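As a toy illustration of that plan-from-current-state-to-goal-state idea (not the lab’s actual algorithms, which learn from raw sensor data), here is value iteration on an invented five-state corridor, followed by reading off the greedy policy the agent would use at test time:

```python
# Minimal sketch: plan a route to a goal state with value iteration on a
# toy 1-D world, then read off a policy. Illustrative only; the world,
# rewards and discount factor are all invented for this example.

N_STATES, GOAL, GAMMA = 5, 4, 0.9
ACTIONS = (-1, +1)  # move left, move right

def step(s, a):
    return min(max(s + a, 0), N_STATES - 1)  # walls at both ends

V = [0.0] * N_STATES
for _ in range(50):  # sweep until values converge
    for s in range(N_STATES):
        if s == GOAL:
            continue
        V[s] = max(1.0 if step(s, a) == GOAL else GAMMA * V[step(s, a)]
                   for a in ACTIONS)

# Greedy policy: at each non-goal state, pick the action leading to the
# highest value. This is what the "robot" would execute at test time.
policy = [max(ACTIONS, key=lambda a: 1.0 if step(s, a) == GOAL
              else GAMMA * V[step(s, a)])
          for s in range(N_STATES) if s != GOAL]
print(policy)  # prints [1, 1, 1, 1]: always move right, toward the goal
```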
During the training phase, the play with objects is “unsupervised.” There is no hand-engineering by humans of the precise movements the robot should make to carry out a task. Nor is the goal even specified.
The neural network figures out what goal it should accomplish, and then figures out what policies, including angles of movement of its appendages, can lead to that goal.
‘Learning to Learn’
Training makes use of clusters of NVIDIA GPUs in an offsite facility. During test time, a single GPU is attached to each robot, which is used to run the policies that have been learned. In some more ambitious tests, such as learning a new policy from watching a video demonstration by a human, a more powerful NVIDIA DGX-1 is attached to each machine.
Says Levine, GPU compute power has brought two benefits to AI. By speeding up training, it “allows us to do science faster.” Second, during inference, the power of GPUs allows for real-time response, something that is “a really big deal for robots.”
“When the robot is actually in the physical world, if it’s doing something dynamic, such as flying at a closed door,” as in the case of a drone, “it needs to figure out the door’s closed before it hits it.”
The work of Levine and his staff with reinforcement learning has evolved to greater levels of sophistication. It’s one thing to teach a robot to perform a task at test time similar to what it learned in training. More ambitious is for the robot to learn new policies for problem solving at test time on tasks that are novel. The machine is “learning to learn,” says Levine.
The latter, called meta-learning, is an increasing focus of his lab. In a recent paper, “One-shot Hierarchical Imitation Learning of Compound Visuomotor Tasks,” a robot first watches a human demonstrate a simple, “primitive” task, such as dropping an object into a bowl. It develops a policy to imitate that action.
At test time, the robot is shown a “compound” task, such as dropping the object in the bowl and then moving the bowl along the table. The robot uses its prior experience with the simple tasks to compose a “sequence” of policies by which to perform actions in succession.
Levine’s robots were able to imitate the human’s demonstration of the compound task after seeing it demonstrated just one time, what’s known as “single-shot” learning.
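The compose-primitives idea can be sketched in a few lines. Here each primitive policy is stood in for by a plain function on a toy world state; the function names and world model are invented for illustration:

```python
# Illustrative sketch of composing primitive policies into a compound task.
# Each "policy" is a hypothetical function acting on a toy world state.

def pick_up(state):
    return dict(state, holding="object")

def drop_in_bowl(state):
    if state.get("holding") == "object":
        state = dict(state, holding=None, bowl_contents="object")
    return state

def push_bowl(state):
    return dict(state, bowl_position=state["bowl_position"] + 1)

# A "compound" task is an ordered sequence of primitive policies.
compound_task = [pick_up, drop_in_bowl, push_bowl]

state = {"holding": None, "bowl_contents": None, "bowl_position": 0}
for primitive in compound_task:
    state = primitive(state)

print(state)  # the object ends up in the bowl, and the bowl has been moved
```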
What will humans learn about intelligence from all this? Some of what robots arrive at may be a rather alien form of intelligence. Robots developed to, say, work in a power plant, may build “some kind of internal representation of the world through learning that is kind of unique, that they are setting for the job they’ve been tasked with doing,” says Levine.
“It might be very different from our [representations],” he says, “and very different from how we might design them if we were to do it manually.”
Bot So Fast
Levine is very mindful of skeptics of AI such as NYU professor Gary Marcus, with whom he agrees that deep learning today doesn’t lead to higher reasoning.
“People transfer past knowledge from past experience, and most of what we call AI systems that are deployed today don’t do that, and they should.”
Development of higher reasoning may be a process over the lifetime of a robot, not a single neural network.
“I think it would be fantastic if, in the future, robots have a kind of a childhood, the same way that we do,” he says, one in which they make progress through various developmental stages.
“Except that with robots, the nice thing is that you can copy and paste their brain,” perhaps speeding that development.
In an eventual adulthood, muses Levine, robots’ mental development would continue.
“If you have a robot that has to perform some kind of task, maybe it has to do construction, for example, in its off-time, the robot doesn’t just sit there in the closet collecting dust, it actually practices things the same way a person would.”
Coming to a Real World Near You
There’s a tremendous amount of systems engineering that has to be coupled to deep learning to make robots viable. But Levine is confident that “over the span of the next five years or so, we’ll see that these things will actually make their way into the real world.”
“It may start with industrial robots, things like robots in warehouses, grocery stores and so on, but I think we’ll see more and more robots in our daily lives.”
For centuries, scientists have marveled at telescopic imagery, theorized about much of what they see and drawn conclusions from their observations.
More recently, astronomers and astrophysicists are using the computing performance of GPUs and AI to glean more from that imagery than ever.
A research team at the University of California, Santa Cruz, and Princeton University has been pushing these limits. Led by Brant Robertson of UC Santa Cruz and NASA Hubble Fellow Evan Schneider, the team has been optimizing its use of NVIDIA GPUs and deep learning tools to accommodate larger calculations.
Their goal: expand their ability to do more accurate hydrodynamic simulations, and thereby gain a better understanding of how galaxies are formed.
The team started by simply moving its efforts from CPUs to GPUs. Using GPUs to measure the matter passing in and out of the cell faces of a 3D grid mesh was akin to suddenly being able to solve many Rubik’s Cubes simultaneously.
With CUDA in the mix, the team could transfer an array of grids onto the GPU to do the necessary calculations, resulting in more detailed simulations.
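That face-flux bookkeeping is the heart of a finite-volume method. A minimal 1-D sketch in plain Python (illustrative only; CHOLLA solves the full 3-D hydrodynamics equations, on GPUs rather than on a Python list):

```python
# Minimal 1-D finite-volume sketch: update each cell's density from the
# flux of matter crossing its faces. All values are invented for illustration.

N = 8
density = [0.0] * N
density[2] = 1.0            # a blob of matter in one cell
velocity = 1.0              # uniform rightward advection speed
dx, dt = 1.0, 0.5           # cell size and time step (CFL = v*dt/dx = 0.5)

def advect_step(rho):
    # Upwind flux through the left face of each cell (periodic boundary).
    flux = [velocity * rho[(i - 1) % N] for i in range(N)]
    # Conservative update: inflow through the left face, outflow through the right.
    return [rho[i] + dt / dx * (flux[i] - flux[(i + 1) % N]) for i in range(N)]

for _ in range(4):
    density = advect_step(density)

print(sum(density))  # prints 1.0: total mass is conserved exactly
```

Because each face’s flux is added to one cell and subtracted from its neighbor, mass is conserved by construction, which is exactly why finite-volume schemes are favored for hydrodynamics.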
Once they’d squeezed the most out of that setup, the team’s ambitions shifted to a more powerful cluster of NVIDIA GPUs: the Titan supercomputer at the U.S. Department of Energy’s Oak Ridge National Laboratory. But to perform higher-resolution simulations, the team needed powerful code to harness Titan’s 16,000-plus Tesla GPUs.
[Read how researchers from the University of Western Australia have trained a GPU-powered AI system to recognize new galaxies.]
Schneider, Robertson’s former graduate student and now a postdoctoral fellow at Princeton, was up to the task. She wrote a GPU-accelerated piece of hydrodynamic code called CHOLLA, which stands for Computational Hydrodynamics On paraLLel Architectures.
Robertson believes CHOLLA will help scientists answer previously unanswerable questions. For instance, applying the code to M82, a galaxy revered by astronomers for its prodigious star-formation rate and powerful galactic winds, could provide new levels of understanding of how stars are formed.
“How does that wind get there? What sets the properties in the wind? How does the wind control the mass of the galaxy? These are all questions we’d like to understand, but it’s a very difficult computational problem,” Robertson said. “Evan is the first person to solve this with any fidelity.”
CHOLLA, which was written a few years ago, has enabled Schneider and Robertson to leverage 100 million core hours on Titan. The code is unique in that it performs all calculations on GPUs, enabling the team to do sophisticated simulations on their NVIDIA DGX and DGX-1 deep learning systems in the lab and then transfer them to Titan, where they can be expanded.
“You want to take advantage of the floating-point operation power of the GPUs. You don’t want to spend your time waiting for information to go back and forth between the GPUs if you can avoid it,” said Robertson. “Spending as much time as possible computing on the GPU is where you want to be.”
Pushing to the Summit
CHOLLA’s ability to scale across vast numbers of GPUs has enabled the team to perform a test calculation of 550 billion cells that Robertson called “one of the largest hydrosimulations ever done in astrophysics.”
Another student, Ryan Hausen, has paved the way to even more ambitious work by developing a deep learning framework called Morpheus that uses raw telescope data to classify galaxies. That opens the door to potentially processing giant surveys with billions of galaxies on a DGX system.
“That’s something I didn’t think was possible just a few years ago,” said Robertson.
Yet another huge leap may be coming soon, as Robertson is hoping to obtain time on Summit, the world’s most powerful supercomputer — powered by NVIDIA Volta GPUs. He believes CHOLLA will enable the team to do even more with Summit’s expansive GPU memory than it has with Titan.
“The computational power of NVIDIA GPUs enabled us to perform numerical simulations that were not possible before,” said Robertson. “And we plan to use NVIDIA GPUs to push what is possible.”
With “fake news” embedding itself into, well, our news, it’s become more important than ever to distinguish between content that is fake or authentic.
That’s why Vagelis Papalexakis, a professor of computer science at the University of California, Riverside, developed an algorithm that, so far, detects fake news with 75 percent accuracy.
“I want [the algorithm] to be a tool that helps educate folks about what it is they’re about to read,” said Papalexakis in a conversation with AI Podcast host Noah Kravitz.
While the algorithm is starting with text, Papalexakis hopes to expand to videos and images.
Since fake news is a “big umbrella term,” as Papalexakis calls it, his team didn’t come up with their own definition of fake for the algorithm.
Instead, they rely on “well-known definitions that people use and have been using in the past.” Such resources include a crowd-sourced tool called B.S. Detector.
The Greek bailout referendum of 2015 inspired Papalexakis, who is from Greece, to start working on a fake news detector.
“I sort of observed [the referendum] from a distance because I was in California, and I was not sure what was going on on either side. And so I wasn’t able to trust anything that I read from either side because there was so much conflicting information that I just gave up,” said Papalexakis.
“About a year ago, we decided to try to help from a technological point of view and at least try to provide citizens with tools so that they can help then decide about what they’re reading,” he said.
With his research, Papalexakis dove into why and how people fall for fake news.
“I really hope that [this research] can actually have a tangible impact, especially if you try to turn this into guidelines and sort of education that starts from a young age,” said Papalexakis.
But Papalexakis cautions the solution to fake news won’t just come from technology, but also from policy and education.
“There needs to be a holistic approach to this,” Papalexakis said. “We’re merely a part of the puzzle.”
Listen to the complete discussion on the AI Podcast.
That’s what an audience of 500 leading deep learning researchers were asking themselves Tuesday night when we announced we’re giving away TITAN RTX GPUs to 10 lucky attendees.
The randomly chosen developers will be among the world’s first to get their hands on the first TITAN GPUs built with NVIDIA’s Turing architecture. Nicknamed T-Rex, TITAN RTX is the most powerful GPU for the PC ever, delivering faster training and inferencing speeds, plus twice the memory of previous-generation TITAN GPUs. It provides tremendous performance for AI research and data science.
Held in Montreal at the Conference on Neural Information Processing Systems (known as NeurIPS), the gathering also featured an original AI-composed tune — inspired by Queen and performed live by a tribute band.
Celebrating AI Pioneers at NeurIPS
The reception took place at the former Dalhousie Train Station in the heart of Old Montreal, now home to contemporary circus company Cirque Éloize. Attendees picked up plates of québécois specialties like poutine and smoked meat, taking in the green-and-black lit room.
Researchers mingled with some of AI’s leading lights, including NVIDIA Chief Scientist Bill Dally, Google AI head Jeff Dean, Montreal-based deep learning pioneer Yoshua Bengio and Google Brain’s Ian Goodfellow, the inventor of GANs.
The evening kicked off with Bryan Catanzaro, NVIDIA’s vice president of applied deep learning research, presenting Pioneer Awards to seven members of our NVIDIA AI Labs program.
The NVAIL program supports AI innovation at leading universities and research institutes worldwide — many of which are presenting at NeurIPS this week.
“It’s a growing community,” Catanzaro said. “We’re happy to celebrate all of that with you today.”
Spanning research areas from reinforcement learning to adversarial example detection, the award recipients hailed from UC Berkeley, University of Washington, Carnegie Mellon University, MIT, NYU, the Montreal Institute for Learning Algorithms and the Swiss AI lab IDSIA.
“It feels really great,” said award recipient Kurtland Chua, a student researcher at the Berkeley Artificial Intelligence Research lab advised by Assistant Professor Sergey Levine. “As an undergrad, I wouldn’t expect ever getting here.”
“We’re honored that NVIDIA supports our research,” added his collaborator Roberto Calandra.
Then came a titanic surprise.
TITAN Meets Turing
“Have you guys heard we launched a new GPU yesterday?” Catanzaro teased. “It’s gold and it’s heavy and it’s awesome. We’re going to give away a bunch of these tonight.”
Anticipation built as attendees checked their wristbands under black lights to see if they had a golden ticket: a glowing letter “T.” The 10 winners will be sent our just-announced TITAN RTX GPUs.
T-Rex transforms a PC into a supercomputer, offering 130 teraflops of deep learning performance and 11 GigaRays of ray-tracing performance.
It’s designed for a broad range of demanding applications — from computationally intensive AI and data science workloads to real-time ray tracing, VR and high performance computing.
Caltech Ph.D. student Guanya Shi was excited to use the new GPU for his robotics and machine learning research. Xavier Bouthillier, Ph.D. student at the Université de Montréal, said he looks forward to using T-Rex for his optimization work using deep neural networks.
Don’t Stop Me Now: AI-Composed Music Caps Off Evening
To top it all off, a Canadian rock group called Simply Queen played an original song composed by AI in the style of Queen.
Those who listened closely could catch lines like “Let the GeForce pull you in,” which were written by the band’s lead singer and drummer.
“There are some little touches in there, definitely,” said Rick Rock, the band’s frontman, who took to the stage in a Freddie Mercury-style yellow leather jacket. “Ultimately, it’s a collaboration with the AI.”
The neural networks behind the tune were developed by Luxembourg-based startup AIVA and trained on NVIDIA GPUs. After initial training on a database of hundreds of rock songs, the models had a second round of training on 50 Queen tracks to learn the musical style of the legendary band.
Rock music has more repetition in its rhythm than classical music, presenting a challenge for training the algorithm to capture Queen’s unique sound without overfitting to any particular song, said Pierre Barreau, AIVA’s CEO. “We had to make sure that it’s making something original while also respecting the style.”
“Me and my sister grew up watching my dad’s old videos of Queen concerts, so it was great — I really enjoyed it,” said Sarah Poole, data scientist at Verily Life Sciences. “You could definitely tell it was Queen’s style.”
As the night continued, the band serenaded the audience with Queen hits, including “Bohemian Rhapsody,” “Another One Bites the Dust” and, of course, “We Will Rock You.”
The crowd gathered by the stage, singing along, cheering and jumping up and down to the music.
“It was like the real Queen,” said Ahmed Touati, a Ph.D. student at the Montreal Institute for Learning Algorithms.
The band concluded their set with a rousing rendition of “We Are the Champions” — a fitting tribute for this gathering of deep learning champions.
Grab the steering wheel. Step on the accelerator. Take a joyride through a 3D urban neighborhood that looks like Tokyo, or New York, or maybe Rio de Janeiro — all imagined by AI.
At this week’s NeurIPS conference, we introduced AI research that allows developers to render fully synthetic, interactive 3D worlds. While still at an early stage, this work shows promise for a variety of applications, including VR, autonomous vehicle development and architecture.
The tech is among several NVIDIA projects on display here in Montreal. Attendees huddled around a green and black racing chair in our booth have been wowed by the demo, which lets drivers navigate around an eight-block world rendered by the neural network.
Visitors to the booth hopped into the driver’s seat to tour the virtual environment. Azin Nazari, a University of Waterloo grad student, was impressed with the AI-painted scene, which can switch between the streets of Boston, Germany, or even the Grand Theft Auto game environment at sunset.
The demo uses Unreal Engine 4 to generate semantic layouts of scenes. A deep neural network trained on real-world videos fills in the features — depicting an urban scene filled with buildings, cars, streets and other objects.
This is the first time neural networks have been used with a computer graphics engine to render new, fully synthetic worlds, say NVIDIA researchers Ting-Chun Wang and Ming-Yu Liu.
“With this ability, developers will be able to rapidly create interactive graphics at a much lower cost than traditional virtual modeling,” Wang said.
Called vid2vid, the AI model behind this demo uses a deep learning method known as generative adversarial networks (GANs) to render photorealistic videos from high-level representations like semantic layouts, edge maps and poses. As the network trains, it becomes better at making videos that are smooth and visually coherent, with minimal flickering between frames.
The researchers’ model achieves a new state-of-the-art result, synthesizing 30-second street-scene videos at 2K resolution. Trained on different video sequences, it can paint scenes that look like different cities around the world.
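The pipeline described above — a game engine emitting per-frame semantic layouts, and a neural network converting each layout into an RGB frame while conditioning on earlier frames for temporal smoothness — can be sketched in miniature. This is purely illustrative: the generator here is a hand-written stub, not the vid2vid GAN, and the frame sizes and class palette are invented for the example.

```python
import numpy as np

H, W = 4, 8          # tiny frame size for illustration
NUM_CLASSES = 3      # e.g. road, building, car

rng = np.random.default_rng(0)

def generator(layout, prev_frames):
    """Stub for the generator: maps a semantic layout plus previously
    generated frames to the next RGB frame. The real model is a deep
    conditional GAN; here we just blend a per-class color with the last
    frame to mimic temporal coherence."""
    palette = np.array([[0.2, 0.2, 0.2],   # road     -> gray
                        [0.6, 0.4, 0.2],   # building -> brown
                        [0.8, 0.1, 0.1]])  # car      -> red
    appearance = palette[layout]           # (H, W, 3)
    if prev_frames:
        # Blending with the last output keeps consecutive frames
        # smooth, reducing flicker between them.
        return 0.7 * appearance + 0.3 * prev_frames[-1]
    return appearance

def render_video(layouts):
    """Sequentially render frames from a sequence of semantic layouts."""
    frames = []
    for layout in layouts:
        frames.append(generator(layout, frames))
    return frames

# A short "drive": the game engine would emit one layout per frame.
layouts = [rng.integers(0, NUM_CLASSES, size=(H, W)) for _ in range(5)]
video = render_video(layouts)
print(len(video), video[0].shape)   # 5 frames of shape (4, 8, 3)
```

In the actual demo, Unreal Engine 4 plays the role of the layout source and the trained network fills in photorealistic detail; the structure of the loop — layout in, frame out, conditioned on history — is the part this sketch preserves.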
For those in Montreal this week, stop by our booth — No. 209 — to sit behind the wheel and try it out for yourself.
Across from the driving demo, attendees are flocking to a table stacked with an odd assortment of items: cans of tomato soup and Spam, a box of crackers, a mustard bottle. It may not sound like much, but this demo is DOPE. Literally.
DOPE, or Deep Object Pose Estimation, is an algorithm that detects the pose of known objects using a single RGB camera. It’s an ability that’s essential for robots to grasp these objects.
Giving new meaning to “hands-on demos,” booth visitors can pick up the cracker box and cans, moving them across the table and changing their orientation. A screen above displays the neural network’s inferences, tracking the objects’ edges as they shift around the scene.
“It’s a $30 camera, very cheap, very accessible for anyone interested in robotics,” said NVIDIA researcher Jonathan Tremblay. The tool, trained entirely on computer-generated image data, is publicly available on GitHub.
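As described in the researchers’ paper, DOPE’s network predicts a belief map for each projected corner of an object’s 3D bounding box (plus its centroid), and a perspective-n-point (PnP) solver then recovers the 6-DoF pose from those 2D-3D correspondences. The peak-extraction step can be sketched with synthetic Gaussian belief maps standing in for real network output (the map sizes and bump shapes below are invented for the example):

```python
import numpy as np

def keypoints_from_belief_maps(belief_maps):
    """Take the peak of each belief map as a 2D keypoint.
    In DOPE, one map is predicted per projected corner of the object's
    3D bounding box (plus its centroid); a PnP solver then recovers the
    6-DoF pose from these 2D-3D correspondences."""
    keypoints = []
    for bmap in belief_maps:
        y, x = np.unravel_index(np.argmax(bmap), bmap.shape)
        keypoints.append((int(x), int(y)))
    return keypoints

# Toy input: 9 maps (8 cuboid corners + centroid), each a synthetic
# Gaussian bump at a known location instead of real network output.
H, W = 60, 80
rng = np.random.default_rng(1)
true_peaks = [(int(rng.integers(0, W)), int(rng.integers(0, H)))
              for _ in range(9)]
ys, xs = np.mgrid[0:H, 0:W]
maps = [np.exp(-((xs - px) ** 2 + (ys - py) ** 2) / 20.0)
        for px, py in true_peaks]

pred = keypoints_from_belief_maps(maps)
print(pred == true_peaks)   # True
```

The real system would hand these 2D keypoints, together with the object’s known 3D dimensions, to a PnP solver to get the full pose used for grasping.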
Booth visitors can also feast their eyes on stunning demos of real-time ray tracing. Running on a single Quadro RTX 6000 GPU, our Star Wars demo features beautiful, cinema-quality reflections enabled by NVIDIA RTX technology.
And while a few conspiracy theorists still question whether the Apollo 11 mission actually landed on the moon, a ray-traced recreation of one iconic lunar landing image shows that the photo looks just as it would if it had been taken on the moon.
Data scientists exploring the booth will see the new TITAN RTX in action with the RAPIDS data analytics software, quickly manipulating a dataset of all movie ratings by IMDb users. Other demos showcase the computing power NVIDIA TensorRT software provides for inference, both in data centers and at the edge.
The NVIDIA booth at NeurIPS is open all week from 10am to 7pm. For more, see our full schedule of activities.
When Danny Weissberg’s grandmother suffered a stroke 10 years ago, she lost the ability to speak intelligibly, and the family lost its primary way of communicating with its matriarch.
In the wake of his grandmother’s sudden impairment, the young engineer resolved he’d find a way to help.
The result is Voiceitt, a Tel Aviv-based startup that combines deep learning, signal processing and customizable speech recognition to give a clear, synthesized voice to people whose speech is impaired.
The beneficiaries won’t just be Weissberg’s grandmother, of course. Each year, millions of people have their speech impaired due to strokes and aneurysms, diseases such as cerebral palsy and Parkinson’s, brain injuries from accidents, and other medical conditions.
Earlier this month, the 16-employee company was one of eight finalists in NVIDIA’s GTC Israel Inception awards competition to find the country’s best AI startup. The same week, it won a $1.5 million grant from VentureClash, a global venture competition backed by the state of Connecticut.
Voiceitt’s first product is a mobile app that converts non-standard language into readily understood speech. It’s already in beta testing with more than 200 users in four different languages.
The company is also working to integrate its software with voice-driven technology such as Amazon’s voice assistant Alexa. The goal is to allow individuals who normally require assistance to take control of their own needs — turning on lights, controlling the TV and asking for basic information, for example.
“As the world becomes more and more voice enabled, speech recognition becomes such an important thing in our life,” said Weissberg, a software developer with a master’s degree in philosophy. “It will be a basic human right to have accessibility to this technology.”
At the heart of Voiceitt’s offerings are deep learning algorithms trained on a limited vocabulary of up to 80 calibrated words or phrases. Adding to the challenge is the scarcity of data, since speech samples from impaired speakers are difficult to obtain. The model is then fine-tuned for each user, because speech patterns are unique to each individual and their condition.
“Everyone’s impairment is different, but there are certain similarities within a particular group of speech impairments,” said Stas Tiomkin, Voiceitt’s co-founder and CTO, who holds a Ph.D. in speech recognition from Hebrew University. “We work very hard to collect speech samples and build a generic acoustic model that then gets customized.”
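Voiceitt’s actual model is a deep acoustic network, but the generic-then-customized training scheme Tiomkin describes can be sketched with a toy nearest-centroid recognizer. Everything here — the phrase set, feature size, and update rule — is invented for illustration, not taken from Voiceitt:

```python
import numpy as np

VOCAB = ["lights on", "lights off", "help"]    # illustrative phrase set
DIM = 8                                        # toy acoustic feature size

# "Generic" model: one acoustic centroid per phrase, standing in for a
# model trained on pooled samples from many speakers.
generic = {w: 5.0 * np.eye(DIM)[i] for i, w in enumerate(VOCAB)}

def personalize(model, user_samples, alpha=0.8):
    """Shift each phrase centroid toward a user's own recordings.
    A real system fine-tunes a deep acoustic model; this centroid
    update just illustrates generic-then-customized training."""
    custom = dict(model)
    for phrase, feats in user_samples.items():
        custom[phrase] = (1 - alpha) * model[phrase] + alpha * np.mean(feats, axis=0)
    return custom

def recognize(model, feat):
    """Return the phrase whose centroid is nearest to the input features."""
    return min(model, key=lambda w: float(np.linalg.norm(model[w] - feat)))

# Simulate a speaker whose pronunciations sit far from the generic model,
# plus a few noisy calibration recordings per phrase.
rng = np.random.default_rng(0)
user_voice = {w: generic[w] + 3.0 for w in VOCAB}
samples = {w: [user_voice[w] + rng.normal(scale=0.1, size=DIM) for _ in range(5)]
           for w in VOCAB}

custom = personalize(generic, samples)
test_feat = user_voice["lights on"] + rng.normal(scale=0.1, size=DIM)
print(recognize(custom, test_feat))
```

The design choice the sketch preserves is the one Tiomkin describes: a shared model captures what a group of impairments has in common, and a small number of per-user calibration samples adapts it to one speaker.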
Among those Voiceitt is helping is a 17-year-old New York woman who, following a harrowing auto accident, is wheelchair-bound and has severely impaired speech. Another user’s disabilities confine him to a wheelchair, but he plans to use the app at a fast food restaurant, taking drive-through orders. A middle-aged Israeli man with cerebral palsy uses the technology to help control his television.
Voiceitt has begun licensing its app to institutions and is working on deals with manufacturers that could integrate the tool into smart speakers.
While other assistive communication devices require users to point at a screen or rely on gaze detection, Voiceitt offers a more natural way to communicate. Tiomkin points out that, according to speech and language pathologists, there is evidence that engaging in conversation can help develop brain function and improve speech.
“In that sense, this isn’t just an application for communication, it’s a therapy,” Tiomkin says.
Voiceitt’s next frontier is enabling speech detection with a far broader range of vocabulary, so that it could render free-form conversation fully intelligible.
The company estimates that 100 million individuals worldwide can’t communicate and be understood with their voices. But it sees the market as potentially far larger: at least 5 percent of individuals can’t use voice-enabled AI devices due to disabilities, strong accents or age, including an estimated 8 percent of those over the age of 65.