From ER Visit to AI Startup, CloudMedx Pursues Predictive Healthcare Models

Twice Sahar Arshad’s father-in-law went to an emergency room in Pakistan complaining of frequent headaches. Twice doctors sent him home with a diagnosis of allergies.

Turns out he was suffering from a subdural hematoma — bleeding inside the head. Following the second misdiagnosis, he went into a coma and required emergency brain surgery (he later made a full recovery).

Arshad and her husband, Tashfeen Suleman — both computer scientists living in Bellevue, Wash., at the time — afterwards tried to get to the root of the inaccurate diagnoses. The hematoma turned out to be a side effect of a new medication Suleman’s father had been prescribed a couple weeks prior. And he lacked physical symptoms like slurred speech and difficulty walking, which would have prompted doctors to order a CT scan and detect the bleeding earlier.

Too Much Data, Too Little Time

It’s a common problem, Arshad and Suleman found. Physicians often have to rely on limited information, either because there’s insufficient data on a patient or because there’s not enough time to analyze large datasets.

The couple thought AI could help address this challenge. In late 2014, they together founded CloudMedx, a Palo Alto-based startup that develops predictive healthcare models for health providers, insurers and patients.

A member of the NVIDIA Inception virtual accelerator program, CloudMedx is working with the University of California, San Francisco; Barrow Neurological Institute, a member of Dignity Health, a nonprofit healthcare organization; and some of the largest health insurers in the country.

Its AI models, trained using NVIDIA V100 Tensor Core GPUs through Amazon Web Services, can help automate medical coding, predict disease progression and determine the likelihood a patient may have a complication and need to be readmitted to the hospital within 30 days.

“What we’ve built is a natural language model that understands how different diseases, symptoms and medications are related to each other,” said Arshad, chief operating officer at CloudMedx. “If we’d had this tool in Tashfeen’s father’s case, it would have flagged the risk of internal head hemorrhaging and recommended obtaining a CT scan.”

Putting AI to Work on Risk Assessment

The CloudMedx team has developed a deep neural network that can process medical data to provide risk assessment scores, saving clinicians time and providing personalized insight for patients. It’s trained on a dataset of 54 million patient encounters.

In a study to evaluate its deep learning model, the clinical AI tool took a mock medical exam — and outperformed human doctors by 10 percent, on average. On their own, physicians scored between 68 and 81 percent. When taking the exam along with CloudMedx AI, they achieved a high score of 91 percent.

The startup’s AI models are used in multiple tools, including a coding analyzer that converts doctor’s notes into a series of medical codes that inform the billing process, as well as a clinical analyzer that evaluates a patient’s health records to generate risk assessments.
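The core idea of a coding analyzer — mapping clinical language to billing codes — can be sketched with a toy keyword lookup. CloudMedx's product uses a trained language model; the keyword-to-code table below is a hand-picked illustration, not its actual logic:

```python
# Toy ICD-10 keyword lookup -- a hand-picked illustration, not CloudMedx's model.
ICD10_KEYWORDS = {
    "subdural hematoma": "I62.00",
    "congestive heart failure": "I50.9",
    "migraine": "G43.909",
}

def extract_codes(note):
    """Return the sorted ICD-10 codes whose keywords appear in a doctor's note."""
    text = note.lower()
    return sorted(code for kw, code in ICD10_KEYWORDS.items() if kw in text)

codes = extract_codes("Patient presents with congestive heart failure; history of migraine.")
```

A real coding analyzer also has to handle negation ("no sign of migraine"), abbreviations and synonyms — which is exactly where the NLP model earns its keep.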

CloudMedx is collaborating with UCSF’s Division of Gastroenterology to stratify patients awaiting liver transplants based on risk, so that patients can be matched with donors before the tumor progresses too far for a transplant.

The company is also working with one of the largest health insurers in the U.S. to better identify congestive heart failure patients with a high risk of readmission to the hospital. With these insights, health providers can follow up more often with at-risk patients, reducing readmissions and potentially saving billions of dollars in treatment costs.
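A 30-day readmission risk score of the kind described can be sketched as a logistic model over a handful of encounter features. The features and weights below are invented for illustration — CloudMedx's deep neural network learns far richer ones from its 54 million encounters:

```python
import math

# Hypothetical features and weights -- invented for illustration only.
WEIGHTS = {
    "prior_admissions": 0.8,   # admissions in the past year
    "num_medications": 0.15,   # active prescriptions
    "has_chf": 1.2,            # congestive heart failure diagnosis (0/1)
    "age_over_65": 0.6,        # indicator (0/1)
}
BIAS = -3.0

def readmission_risk(patient):
    """Map a patient feature dict to a 0-1 readmission probability (logistic model)."""
    z = BIAS + sum(w * patient.get(k, 0) for k, w in WEIGHTS.items())
    return 1.0 / (1.0 + math.exp(-z))

low = readmission_risk({"prior_admissions": 0, "num_medications": 2})
high = readmission_risk({"prior_admissions": 3, "num_medications": 8,
                         "has_chf": 1, "age_over_65": 1})
```

Patients whose score crosses a chosen threshold get flagged for the extra follow-up described above.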

Predictive Analytics for Every Healthcare Player

Predictive analytics can even improve the operational side of healthcare, giving hospitals a heads-up when they might need additional beds or staff members to meet rising patient demand.

“It’s an expensive manual process to find additional resources and bring on extra nurses at the last minute,” Arshad said. “If hospitals are able to use AI tools for surge prediction, they can better plan resources ahead of time.”
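Surge prediction can be illustrated with the simplest possible forecaster — a moving average plus a recent trend over daily admissions. Real surge models fold in seasonality, local events and much more; this sketch only shows the shape of the problem:

```python
# Minimal bed-demand forecaster: moving average plus recent trend.
# Purely illustrative -- real surge models use far richer signals.
def forecast_next(daily_admissions, window=3):
    """Predict tomorrow's admissions from the last `window` days."""
    recent = daily_admissions[-window:]
    avg = sum(recent) / len(recent)
    trend = (recent[-1] - recent[0]) / (len(recent) - 1)
    return avg + trend

pred = forecast_next([40, 42, 45, 50, 58])  # rising admissions
```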

In addition to providing new insights for health providers and payers, these tools save time by processing large amounts of medical data in a fraction of the time it would take humans.

CloudMedx has also developed an AI tool for patients. Available on the Medicare website to its 53 million patient beneficiaries, the system helps users access their own claims data, correlates a person’s medical history with symptoms, and will soon also estimate treatment costs.

NVIDIA Inception Program

As a member of the NVIDIA Inception program, the CloudMedx team was able to reach out to NVIDIA developers and the company’s healthcare team for help with some of the challenges they faced when scaling up for cloud deployment.

Inception helps startups during critical stages of product development, prototyping and deployment with tools and expertise to help early-stage companies grow.

Both Suleman and Arshad have spoken at NVIDIA’s annual GPU Technology Conference, with Arshad participating in a Women@GTC healthcare panel last year. The conference has helped the team meet some of their customers, said Arshad, who’s also a finalist for Entrepreneur of the Year at the 2020 Women in IT Awards New York.

Check out the healthcare track for GTC, taking place in San Jose, March 22-26.

The post From ER Visit to AI Startup, CloudMedx Pursues Predictive Healthcare Models appeared first on The Official NVIDIA Blog.

AWS Outposts Station a GPU Garrison in Your Datacenter

All the goodness of GPU acceleration on Amazon Web Services can now also run inside your own data center.

AWS Outposts powered by NVIDIA T4 Tensor Core GPUs are generally available starting today. They bring cloud-based Amazon EC2 G4 instances inside your data center to meet user requirements for security and latency in a wide variety of AI and graphics applications.

With this new offering, AI is no longer a research project.

Most companies still keep their data inside their own walls because they see it as their core intellectual property. But for deep learning to transition from research into production, enterprises need the flexibility and ease of development the cloud offers — right beside their data. That’s a big part of what AWS Outposts with T4 GPUs now enables.

With this new offering, enterprises can install a fully managed rack-scale appliance next to the large data lakes stored securely in their data centers.

AI Acceleration Across the Enterprise

To train neural networks, every layer of software needs to be optimized, from NVIDIA drivers to container runtimes and application frameworks. AWS services like SageMaker, Elastic MapReduce and many others, designed on custom-built Amazon Machine Images, support model development that starts with training on large datasets. With the introduction of NVIDIA-powered AWS Outposts, those services can now run securely in enterprise data centers.

The GPUs in Outposts accelerate deep learning as well as high performance computing and other GPU applications. They all can access software in NGC, NVIDIA’s hub for GPU-optimized software, which is stocked with applications, frameworks, libraries and SDKs that include pre-trained models.

For AI inference, the NVIDIA EGX edge-computing platform also runs on AWS Outposts and works with the AWS Elastic Kubernetes Service. Backed by the power of NVIDIA T4 GPUs, these services are capable of processing orders of magnitude more information than CPUs alone. They can quickly derive insights from vast amounts of data streamed in real time from sensors in an Internet of Things deployment, whether it’s in manufacturing, healthcare, financial services, retail or any other industry.

On top of EGX, the NVIDIA Metropolis application framework provides building blocks for vision AI, geared for use in smart cities, retail, logistics and industrial inspection, as well as other AI and IoT use cases, now easily delivered on AWS Outposts.

Alternatively, the NVIDIA Clara application framework is tuned to bring AI to healthcare providers, whether for medical imaging, federated learning or AI-assisted data labeling.

The T4 GPU’s Turing architecture uses TensorRT to accelerate the industry’s widest set of AI models. Its Tensor Cores support multi-precision computing that delivers up to 40x more inference performance than CPUs.

Remote Graphics, Locally Hosted

Users of high-end graphics have choices, too. Remote designers, artists and technical professionals who need to access large datasets and models can now get both cloud convenience and GPU performance.

Graphics professionals can benefit from the same NVIDIA Quadro technology that powers most of the world’s professional workstations not only on the public AWS cloud, but on their own internal cloud now with AWS Outposts packing T4 GPUs.

Whether they’re working locally or in the cloud, Quadro users can access the same set of hundreds of graphics-intensive, GPU-accelerated third-party applications.

The Quadro Virtual Workstation AMI, available in AWS Marketplace, includes the same Quadro driver found on physical workstations. It supports hundreds of Quadro-certified applications such as Dassault Systèmes SOLIDWORKS and CATIA; Siemens NX; Autodesk AutoCAD and Maya; ESRI ArcGIS Pro; and ANSYS Fluent, Mechanical and Discovery Live.

Learn more about AWS and NVIDIA offerings and check out our booth 1237 and session talks at AWS re:Invent.

The post AWS Outposts Station a GPU Garrison in Your Datacenter appeared first on The Official NVIDIA Blog.

Smart into Art: NVIDIA SC19 Booth Turns Computer Scientists into Art at News-Filled Show

Back in the day, the annual SC supercomputing conference was filled with tabletops hung with research posters. Three decades on, the show’s Denver edition this week was a sea of sharp-angled booths, crowned with three-dimensional signage, promoting logos in a multitude of blues and reds.

But no spot on the SC19 show floor drew more of the show’s 14,000 attendees than NVIDIA’s booth, built around a broad, floor-to-ceiling triangle with 2,500 square feet of ultra-high def LED screens. With a packed lecture hall on one side and HPC simulations playing on a second, it was the third wall that drew the most buzz.

Cycling through was a collection of AI-enhanced photos of several hundred GPU developers — grad students, CUDA pioneers, supercomputing rockstars — together with descriptions of their work.

Like accelerated computing’s answer to baseball cards, they were rendered into art using AI style transfer technology inspired by various painters — from the classicism of Vermeer to van Gogh’s impressionism to Paul Klee’s abstractions.

Meanwhile, NVIDIA sprinted through the show, kicking things off with a news-filled keynote by founder and CEO Jensen Huang, helping to power research behind the two finalists nominated for the Gordon Bell prize, and joining in to celebrate its partner Mellanox.

And in its booth, 200 engineers took advantage of free AI training through the Deep Learning Institute, while attendees packed in shoulder to shoulder for dozens of tech talks from leading researchers.

Wall in the Family 

Piecing together the Developer Wall project took a dozen NVIDIANs scrambling for weeks in their spare time. The team of designers, technologists and marketers created an app where developers could enter some background, which would be paired with their photo once it was run through style filters from a German startup that’s part of NVIDIA’s Inception startup incubator.

“What we’re trying to do is showcase and celebrate the luminaries in our field. The amazing work they’ve done is the reason this show exists,” said Doug MacMillian, a developer evangelist who helped run the big wall initiative.

Behind him flashed an image of Jensen Huang, rendered as if painted by Cezanne. Alongside him was John Stone, the legendary HPC researcher at the University of Illinois, as if painted by Vincent van Gogh. Close by was Erik Lindahl, who heads the international GROMACS molecular simulation project, right out of a Joan Miró painting. Paresh Kharya, a data center specialist at NVIDIA, looked like an abstracted sepia-tone circuit board.

Enabling the Best and Brightest 

That theme — how NVIDIA’s working to accelerate the work of people in an ever growing array of industries — continued behind the scenes.

In a final rehearsal hours before Huang’s keynote, Ashley Korzun — a Ph.D. engineer who’s spent years working on the manned mission to Mars set for the 2030s — saw for the first time a demo visualizing her life’s work at the space agency.

As she stood on stage, she witnessed an event she’s spent years simulating purely with data — the fiery path that the Mars lander, a capsule the size of a two-story condo, will take as it slows in seven dramatic minutes from 12,000 miles an hour to gently stick its landing on the Red Planet.

“This is amazing,” she quietly said through tears. “I never thought I’d be able to visualize this.”

Flurry of News

Huang later took the stage and, in a sweeping two-hour keynote, set out a range of announcements that show how NVIDIA is helping others do their life’s work.

Award-Winning Work

SC19 plays host to a series of awards throughout the show, and NVIDIA featured in a number of them.

Both finalists for the Gordon Bell Prize for outstanding achievement in high performance computing — the ultimate winner, ETH Zurich, as well as University of Michigan — ran their work on Oak Ridge National Laboratory’s Summit supercomputer, powered by nearly 28,000 V100 GPUs.

NVIDIA’s founding chief scientist, David Kirk, received this year’s Seymour Cray Computer Engineering Award, for innovative contributions to HPC systems. He was recognized for his path-breaking work around development of the GPU.

And NVIDIA’s Vasily Volkov co-authored with UC Berkeley’s James Demmel a seminal paper 11 years ago that was recognized with the Test of Time Award for its lasting impact. The paper, which has led to a new way of thinking about and modeling algorithms on GPUs, has had nearly 1,000 citations.

Looking Further Ahead

If the SC show is about powering the future, no corner of the show was more forward-looking than the annual Supercomputing Conference Student Cluster Competition.

This year, China’s Tsinghua University captured the top crown. It beat out 15 other undergrad teams using NVIDIA V100 Tensor Core GPUs in an immersive HPC challenge demonstrating the breadth of skills, technologies and science that it takes to build, maintain and use supercomputers. Tsinghua also won the IO500 competition, while two other prizes were won by Singapore’s Nanyang Technological University.

The teams came from markets including Germany, Latvia, Poland and Taiwan, in addition to China and Singapore.

Up Next: More Performance for the World’s Data Centers

NVIDIA’s frenetic week at SC19 ended with a look at what’s next, with Jensen joining Mellanox CEO Eyal Waldman on stage at an evening event hosted by the networking company, which NVIDIA agreed to acquire earlier this year.

Jensen and Eyal discussed how their partnership will enable the future of computing, with Jensen detailing the synergies between the companies. “Mellanox has an incredible vision,” Huang said. “In a couple of years we’re going to bring more compute performance to data centers than all of the compute since the beginning of time.”

The post Smart into Art: NVIDIA SC19 Booth Turns Computer Scientists into Art at News-Filled Show appeared first on The Official NVIDIA Blog.

NVIDIA CloudXR Delivers Low-Latency AR/VR Streaming Over 5G Networks to Any Device

Enterprises can now deliver virtual and augmented reality experiences across 5G networks to any device with the introduction of NVIDIA CloudXR.

Built on NVIDIA GPU technology, the new NVIDIA CloudXR software development kit helps businesses create and deliver high-quality, wireless AR and VR experiences from any application based on OpenVR, the broadly used VR hardware and software interface.

With NVIDIA CloudXR, users don’t need to be physically tethered to a high-performance computer to drive rich, immersive environments. The SDK runs on NVIDIA servers located in the cloud or on-premises, delivering the advanced graphics performance needed for wireless virtual, augmented or mixed reality environments — which collectively are known as XR.

Companies that have their own 5G networks can use NVIDIA CloudXR to stream immersive environments from their on-prem data centers. Telcos, software makers and device manufacturers can use the high bandwidth and low latency of 5G signals to provide high framerate, low-latency immersive XR experiences to millions of customers in more locations than previously possible.

From product designers who review 3D models at scale to first responders who practice rescue scenarios through simulations, anyone can benefit from CloudXR using Windows and Android devices, including handheld tablets, VR headsets and AR glasses.

NVIDIA CloudXR Takes Wireless Streaming to the Edge

NVIDIA CloudXR leverages VR-ready NVIDIA GPUs to provide enterprises with huge computational power, so they can deliver high-quality immersive environments for even the most graphics-intensive XR configurations. The SDK includes:

  • A server driver that runs in the data center
  • An easy-to-use client library that enables VR/AR streaming from a multitude of OpenVR applications to Android and Windows devices
  • An SDK for portable client devices that lets application developers easily stream rendered content from the cloud

These components work in tandem to dynamically optimize streaming parameters and maximize image quality and frame rates, so XR experiences can maintain optimal quality under any network condition.
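“Dynamically optimize streaming parameters” boils down to a feedback loop: measure network conditions, then pick the richest encoding the link can sustain. The controller below is a toy sketch of that idea — the thresholds and bitrate ladder are invented, and the actual CloudXR SDK logic is not public:

```python
# Toy adaptive-streaming controller -- thresholds and ladder are invented,
# not the CloudXR SDK's actual algorithm.
def choose_bitrate_mbps(measured_mbps, rtt_ms, ladder=(10, 25, 50, 100)):
    """Pick the highest bitrate rung that fits the measured link, with headroom."""
    budget = measured_mbps * 0.8       # keep 20% headroom for jitter
    if rtt_ms > 50:                    # long round trips: back off harder
        budget *= 0.5
    usable = [b for b in ladder if b <= budget]
    return usable[-1] if usable else ladder[0]
```

A real controller would also adapt resolution and react to dropped frames, but the measure-then-clamp loop is the same shape.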

To learn more, sign up for early access to the NVIDIA CloudXR SDK.

The post NVIDIA CloudXR Delivers Low-Latency AR/VR Streaming Over 5G Networks to Any Device appeared first on The Official NVIDIA Blog.

Put AI Label On It: Startup Aids Annotators of Healthcare Training Data

Deep learning applications are data hungry. The more high-quality labeled data a developer feeds an AI model, the more accurate its inferences.

But creating robust datasets is the biggest obstacle for data scientists and developers building machine learning models, says Gaurav Gupta, CEO of the startup, a member of the NVIDIA Inception virtual accelerator program.

The startup has created a web platform to help researchers and companies manage their data labeling workflow and use AI-assisted segmentation tools to improve the quality of their training datasets.

“When the labels are accurate, then the AI models learn faster and they reach higher accuracy faster,” said Gupta.

The company’s web interface, which runs on NVIDIA T4 GPUs for inference in Google Cloud, helped one healthcare radiology customer speed up labeling by 10x and decrease its labeling error rate by more than 15 percent.

The Devil Is in the Details 

The higher the data quality, the less data needed to achieve accurate results. A machine learning model can produce the same results after training on a million images with low-accuracy labels, Gupta says, or just 100,000 images with high-accuracy labels.

Getting data labeling right the first time is no easy task. Many developers outsource data labeling to companies or crowdsourced workers. It may take weeks to get back the annotated datasets, and the quality of the labels is often poor.

A roughly annotated image of a car on the street, for example, may have a segmentation polygon around it that also includes part of the pavement, or doesn’t reach all the way to the roof of the car. Since neural networks parse images pixel by pixel, every mislabeled pixel makes the model less precise.

That margin of error is unacceptable for training a neural network that will eventually interact with people and objects in the real world — for example, identifying tumors from an MRI scan of the brain or controlling an autonomous vehicle.

Developers can manage their data labeling through the startup’s web interface, while administrators can assign image labeling tasks to annotators, view metrics about individual data labelers’ performance and review the actual image annotations.

Using AI to Train Better AI 

When a data scientist first runs a machine learning model, it may only be 60 percent accurate. The developer then iterates several times to improve the performance of the neural network, each time adding new training data.

The startup is helping AI developers across industries use their early-stage machine learning models to ease the process of labeling new training data for future versions of the neural networks — a process known as active learning.

With this technique, the developer’s initial machine learning model can take the first pass at annotating the next set of training data. Instead of starting from scratch, annotators can just go through and tweak the AI-generated labels, saving valuable time and resources.
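The active-learning loop described above — model pre-annotates, humans correct — can be sketched in a few lines. The “model” here is a stand-in that labels by filename, purely to show the flow:

```python
# Sketch of the pre-annotate-then-correct loop; the "model" is a stand-in.
def preannotate(model, images):
    """First pass: let the current model draft a label for every image."""
    return {img: model(img) for img in images}

def human_review(draft_labels, corrections):
    """Second pass: annotator corrections override the model's drafts."""
    return {**draft_labels, **corrections}

# Stand-in model: labels by filename prefix (an assumption for this demo).
model = lambda name: "tumor" if name.startswith("t") else "healthy"
draft = preannotate(model, ["t01.png", "h02.png", "t03.png"])
final = human_review(draft, {"t03.png": "healthy"})  # annotator fixes one draft
```

Annotators only touch the drafts the model got wrong, which is where the time savings come from.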

The startup offers active learning for data labeling across multiple industries. For healthcare data labeling, its platform integrates with the NVIDIA Clara Deploy SDK, allowing customers to use the software toolkit for AI-assisted segmentation of healthcare datasets.

Choose Your Own Annotation Adventure

The startup chose to deploy its platform on cloud-based GPUs to easily scale usage up and down based on customer demand. Researchers and companies using the tool can choose whether to use the interface online, connected to the cloud backend, or instead run a containerized application on their own on-premises GPU system.

“It’s important for AI teams in healthcare to be able to protect patient information,” Gupta said. “Sometimes it’s necessary for them to manage the workflow of annotating data and training their machine learning models within the security of their private network. That’s why we provide Docker images to support on-premises annotation on local datasets.”

Balzano, a Swiss startup building deep learning models for radiologists, is using the platform linked to an on-premises server of NVIDIA V100 Tensor Core GPUs. To develop training datasets for its musculoskeletal orthopedics AI tools, the company labels a few hundred radiology images each month. Adopting the interface saved the company a year’s worth of engineering effort compared to building a similar solution from scratch.

“[The platform’s] features allow us to annotate and segment anatomical features of the knee and cartilage more efficiently,” said Stefan Voser, chief operating officer and product manager at Balzano, which is also an Inception program member. “As we ramp up the annotation process, this platform will allow us to leverage AI capabilities and ensure the segmented images are high quality.”

Balzano and the startup will showcase their latest demos in the NVIDIA booth (#10939) at the annual meeting of the Radiological Society of North America, Dec. 1-6 in Chicago.

The post Put AI Label On It: Startup Aids Annotators of Healthcare Training Data appeared first on The Official NVIDIA Blog.

Intel Xeon Scalable Processors Accelerate Big Data Computing in Alibaba Cloud

What’s New: Intel and Alibaba Cloud, the data intelligence backbone of Alibaba Group, jointly announced that the two companies have optimized big data performance and decision support within Alibaba Cloud based on Intel® Xeon® Scalable processors. Through their close collaboration, Alibaba Cloud published the industry’s first benchmark result at a 100,000 scale factor on an Alibaba Cloud MaxCompute cluster running on Intel Xeon Scalable processors, demonstrating Alibaba Cloud’s ability to deliver high-performance, cost-effective cloud services.

“This achievement is a clear example of Intel delivering on its promise to provide customers with what they need to drive innovations in the industry. Our close partnership enables us to deliver Alibaba Cloud MaxCompute the ability to quickly and seamlessly process large amounts of data to deliver actionable insights. We look forward to more such breakthroughs based on Intel’s data-centric product portfolio and optimized solutions.”
–Jason Grebe, Intel corporate vice president and general manager of the Cloud Platforms and Technology Group.

Why It Matters: Alibaba Cloud is the first cloud service provider to release benchmark results on 100TB of data; previous records were set on 10TB and 30TB datasets. Benchmark results released by the Transaction Processing Performance Council (TPC) are an important reference standard, enabling customers to choose the best software and hardware platforms. TPC benchmarks comprehensively measure system performance through the most commonly used big data application scenarios.

  • On the TPCx-BigBench benchmark tests, Alibaba Cloud MaxCompute reached 25,641 BBQpm (Big Bench Queries per minute) with a price/performance ratio of $224.49/BBQpm. TPCx-BB measures the performance of hardware and software components of Hadoop-based big data systems.
  • On TPC-DS benchmark tests, Alibaba Cloud’s Elastic MapReduce (EMR) reached 14,861,137 QphDS (queries per hour) with a price/performance ratio of $0.18/QphDS.

What Alibaba Says: “Alibaba Cloud has been investing in Big Data analytic technology for about 10 years, and we’re very excited to see our core products, MaxCompute and PAI, represented so well in the international big data performance benchmark, TPCx-BigBench,” said Yangqing Jia, VP of Alibaba Group and President and Senior Fellow of Data Platform, Alibaba Cloud Intelligence. “One of the keys to this achievement is the close collaboration between Alibaba and Intel, which allowed both sides to give full attention to the technology and ecosystem advantages to make this happen. Together, we will continue to deepen innovation and cooperation, to make leading products and technologies, to create more value for the industry, and to drive the development of the digital economy.”

More context: TPCx-BB Advanced Sort Result List | TPC-DS Advanced Sort Result List

More Customer Stories: Intel Customer Spotlight | Customer Stories on Intel Newsroom

The post Intel Xeon Scalable Processors Accelerate Big Data Computing in Alibaba Cloud appeared first on Intel Newsroom.

Answering the Call: NVIDIA CEO to Detail How AI Will Revolutionize 5G, IoT

Highlighting the growing excitement at the intersection of AI, 5G and IoT, NVIDIA CEO Jensen Huang kicks off Mobile World Congress Los Angeles 2019 on Monday, Oct. 21.

The keynote, NVIDIA’s debut at the wireless industry’s highest-profile gathering in the U.S., will be the first of a slate of talks and training sessions from NVIDIA and its partners.

The AI revolution is spurring a wave of progress across the mobile technology industry that’s unleashing unprecedented capabilities and new opportunities.

NVIDIA is at the center of this, thanks to AI and accelerated computing capabilities that have been adopted by industries across the globe.

Jensen Huang to Deliver Agenda-Setting Keynote

Huang will detail how the latest AI and accelerated computing innovations will transform the wireless industry in a keynote that’s open to all on Monday, Oct. 21, at the Los Angeles Convention Center’s Petree Hall.

If you’re not registered for MWC-LA, RSVP for our keynote.

Get Trained with DLI

Our Deep Learning Institute — one of the largest training programs in the world for AI and accelerated computing — has partnered with the show’s sponsor, the GSMA.

Together, we’re offering hands-on training to the show’s attendees in the South Hall, booth 1743.

The training is on a first-come, first-served basis. No need to sign up in advance.

Get Inspired at NVIDIA Booth 1745

If you’re attending the event, our booth will serve as a hub for the innovations we’re bringing to the show.

At the booth, you’ll find NVIDIA Inception partners using our Metropolis platform to showcase a variety of real-world applications that demand GPUs at the edge.

Get Oriented at the NVIDIA Theater

Want to dig into the nitty-gritty of delivering services such as these? Stop by the NVIDIA Theater to hear speakers from NVIDIA, our partners and our customers.

Among the highlights, Saurabh Jain, director of products and strategic partnerships at NVIDIA, will detail how edge computing brings compute and storage closer to the point of action.

That’s critical for smart cities, and it’s opening up new business and service revenue opportunities for the telecom industry.

Visit NVIDIA booth 1745 at 1:30 pm on Oct. 23 to hear his talk, and stick around for others from key industry leaders.

The post Answering the Call: NVIDIA CEO to Detail How AI Will Revolutionize 5G, IoT appeared first on The Official NVIDIA Blog.

Amazon Brings AI Performance to the Cloud with NVIDIA T4 GPUs

Automated yet human-like customer service. Professional workstation performance on any connected device. Cinematic-quality PC gaming.

These are a few of the diverse capabilities coming to cloud users with NVIDIA T4 Tensor Core GPUs now in general availability on AWS in North America, Europe and Asia via new Amazon EC2 G4 instances.

NVIDIA T4 GPUs, supported by an extensive software stack, provide G4 instance users with performance, versatility and efficiency.

The software platform is optimized for a rich set of applications, including NVIDIA cuDNN for deep learning, NVIDIA RAPIDS for data analytics and machine learning, NVIDIA Quadro Virtual Workstation for cloud workstation graphics, and NVIDIA GeForce for cloud gaming. The software stack also includes a wide selection of APIs, CUDA and domain-specific CUDA-X libraries such as TensorRT, NCCL, OptiX and Video Codec SDK.

AWS users can leverage a single instance to accelerate multiple types of production workloads seamlessly and cost-efficiently.

“We focus on solving the toughest challenges that hold our customers back from taking advantage of compute-intensive applications,” said Matt Garman, vice president of Compute Services at AWS. “AWS offers the most comprehensive portfolio to build, train and deploy machine learning models powered by Amazon EC2’s broad selection of instance types optimized for different machine learning use cases. With new G4 instances powered by T4 GPUs, we’re making it more affordable to put machine learning in the hands of every developer.”

Do More AI for Less

NVIDIA T4 is a second-generation Tensor Core GPU, a reinvention of the GPU that achieves the highest performance for AI applications while maintaining the programmability of CUDA.

With up to 130 TOPS of INT8 performance, NVIDIA T4 features mixed-precision tensor processing required to accelerate the constantly evolving innovation, diversity and complexity of AI-based applications like image classification, object detection, natural language understanding, automated speech recognition and recommender systems.
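The INT8 inference those TOPS figures describe rests on linear quantization: scale 32-bit values into the -128..127 integer range, compute in integers, then scale back. A toy symmetric quantizer — illustrating the arithmetic, not TensorRT's actual calibration procedure — looks like this:

```python
# Toy symmetric INT8 quantization -- illustrates the arithmetic,
# not TensorRT's actual calibration procedure.
def quantize_int8(values):
    """Map floats to int8 codes plus a scale factor."""
    scale = max(abs(v) for v in values) / 127.0
    codes = [max(-128, min(127, round(v / scale))) for v in values]
    return codes, scale

def dequantize(codes, scale):
    """Recover approximate float values from int8 codes."""
    return [c * scale for c in codes]

weights = [0.5, -1.27, 0.02, 1.0]
codes, scale = quantize_int8(weights)
restored = dequantize(codes, scale)
```

Integer math is far cheaper per operation than FP32, which is why dedicated INT8 Tensor Cores can push so many more inferences per second.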

Amazon has been one of the fastest hyperscalers in the industry to provision NVIDIA GPUs with support for ready-to-use NVIDIA NGC containers for training and inference. EC2 P3 instances feature NVIDIA V100 Tensor Core GPUs, allowing customers to reduce machine learning training from days to hours using the Automatic Mixed Precision feature. With EC2 G4, customers can deploy AI services at scale while significantly reducing operational costs.

And through our recently announced partnership with VMware, VMware Cloud on AWS customers will soon gain access to a new, highly scalable and secure cloud service consisting of Amazon EC2 bare metal instances to be accelerated by NVIDIA T4 GPUs and our new NVIDIA Virtual Compute Server (vComputeServer) software.

Businesses will be able to use this enterprise-grade hybrid cloud platform to accelerate application modernization. They’ll be able to unify deployment, migration and operations across a consistent VMware infrastructure from data center to the AWS cloud in support of the most compute-intensive workloads, including AI, machine learning and data analytics.

Real-Time Ray Tracing and AI-Enhanced Graphics Anywhere, Anytime

The long-sought holy grail of computer graphics, real-time ray tracing delivers the most life-like scenes. Designers and artists can create content in a new way with real-time photorealistic rendering, AI-enhanced graphics, and video and image processing.

NVIDIA T4 is the first NVIDIA RTX ray tracing GPU in the cloud. T4 GPUs offer RT Cores, dedicated compute resources that perform ray-tracing operations with extraordinary efficiency, eliminating the expensive ray-tracing approaches of the past.

The new G4 instances, combined with NVIDIA Quadro Virtual Workstation (Quadro vWS) Amazon Machine Images, support the latest ray-tracing APIs, including Microsoft DXR, NVIDIA OptiX and Vulkan. Technical and creative professionals in industries like media and entertainment, architecture, manufacturing, and oil and gas can run the latest graphics software applications from the AWS cloud.

Deploying a virtual workstation with AWS is easy and fast, taking less than five minutes. Just visit the AWS Marketplace and select the NVIDIA Quadro vWS machine image and the G4 instance, which is available on Windows Server 2016 and Windows Server 2019.

GPU-Powered Cloud Gaming

The Turing architecture that powers the T4 also brings NVIDIA’s gaming prowess to AWS, enabling the most demanding games to be rendered and streamed using the GPU’s hardware encoder engine, which is programmable with the Video Codec SDK.

Game publishers can build their own cloud-gaming instances based on the latest NVIDIA technology and make their entire catalog of PC titles available to gamers on nearly any device.

Gamers can enjoy all the latest titles at fast, fluid frame rates at high resolutions — without ever needing to worry about hardware upgrades or updating drivers or game patches.

The NVIDIA driver powering this capability is available in the AWS Marketplace and runs on the AWS G4 instance on Windows Server 2016, Windows Server 2019 and Linux OS.

Get Started with AWS EC2 G4 Instances

Clarifai, Electronic Arts, GumGum and PurWeb are among the initial customers using Amazon EC2 G4 instances to take advantage of the compute versatility and performance of NVIDIA T4 for running a wide diversity of compute-intensive workloads at scale. As a result, these companies are delivering powerful services while reducing the costs of building and deploying those services for their own customers.

In coming weeks, G4 instances will also support Amazon Elastic Inference, which allows users to add GPU acceleration to any Amazon EC2 or Amazon SageMaker instance for faster inference at a much lower cost — up to 75 percent savings.

Visit the AWS G4 instances page to learn more and try out the NVIDIA T4 today.

The post Amazon Brings AI Performance to the Cloud with NVIDIA T4 GPUs appeared first on The Official NVIDIA Blog.

Cure for the Common Code: San Francisco Startup Uses AI to Automate Medical Coding

Doctors’ handwriting is notoriously difficult to read. Even more cryptic is medical coding — the process of turning a clinician’s notes into a set of alphanumeric codes representing every diagnosis and procedure.

Although this system is used in over 100 countries worldwide, accurate coding is of particular significance in the U.S., where medical codes form the basis for the bills doctors, clinics and hospitals issue to insurance providers and patients.

More than 150,000 codes are used in the U.S.’s adaptation of the International Classification of Diseases, a cataloging standard developed by the World Health Organization.

The diagnostic code for a pedestrian hit by a pickup truck? V03.10XA. Type 2 diabetes diagnosis? E11.9. There are also a set of procedural codes for everything a doctor might do, like put a cast on a patient’s broken right forearm (2W3CX2Z) or insert a pacemaker into a coronary vein (02H40NZ).
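For illustration only, the codes cited above could be held in a simple lookup table. The sketch below is a toy, not part of any real coding system: the descriptions are paraphrased from this article, and `describe` is a hypothetical helper.

```python
# Toy lookup tables keyed by the ICD-10 codes cited in the article.
# Descriptions are paraphrased; real code sets contain 150,000+ entries.
ICD10_DIAGNOSIS = {
    "V03.10XA": "Pedestrian hit by a pickup truck, initial encounter",
    "E11.9": "Type 2 diabetes mellitus without complications",
}

ICD10_PROCEDURE = {
    "2W3CX2Z": "Cast placed on patient's broken right forearm",
    "02H40NZ": "Insertion of pacemaker lead into coronary vein",
}

def describe(code: str) -> str:
    """Return a human-readable description for a known code."""
    return (
        ICD10_DIAGNOSIS.get(code)
        or ICD10_PROCEDURE.get(code)
        or "unknown code"
    )

print(describe("E11.9"))  # Type 2 diabetes mellitus without complications
```

In practice, coders work against the full ICD-10-CM (diagnoses) and ICD-10-PCS (procedures) code sets rather than a hand-built table like this one.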

After every doctor’s appointment or procedure, a clinician’s summary of the interaction is converted into these codes. When done by humans, the turnaround time for medical chart coding — within a healthcare organization or at a private firm — is often two days or more. Natural language processing AI, accelerated by GPUs, can shrink that time to minutes or seconds.

San Francisco-based Fathom is developing deep learning tools to automate the painstaking medical coding process while increasing accuracy. The startup’s tools can help address the shortage of trained clinical coders, improve the speed and precision of billing, and allow human coders to focus on complex cases and follow-up queries.

“Sometimes you have to go back to the doctor to ask for clarification,” said Christopher Bockman, co-founder and chief technology officer of Fathom, a member of the NVIDIA Inception virtual accelerator program. “The longer that process takes, the harder it is for the doctor to remember what happened.”

Fathom uses NVIDIA P100 and V100 Tensor Core GPUs in Google Cloud for both training and inference of its deep learning algorithms. Founded in 2016, the company now works with several of the largest medical coding operations in the U.S., representing more than 200 million annual patient encounters. Its tools can reduce human time spent on medical coding by as much as 90 percent.

Deciphering the Doctor

At any doctor’s appointment, emergency room visit or surgical procedure, healthcare providers type up notes describing the interaction. While there are some standardized formats, these medical records differ by hospital, by type of appointment or procedure, and by whether the note is written during the patient interaction or after.

Medical coders make sense of this unstructured text, categorizing every test, treatment and procedure into a list of codes. Once coded, a healthcare provider’s billing department turns the reports into an invoice to collect payments from insurance providers and patients.

It’s a messy process — for a human or an AI. Human coders agree with each other less than two-thirds of the time in key scenarios, studies show. And research has found that half or more of medical charts contain coding errors.

“The challenge for us is these notes can vary quite a bit,” Bockman said. “There’s a push to standardize, but that tends to make the doctor’s job a lot harder. Human health is complex, so it’s hard to come up with a format that works for every case.”

Coding an AI that Codes

As a machine learning problem, medical coding shares elements of two kinds of tasks: multilabel classification and sequence-to-sequence NLP. An effective AI must understand the text in a doctor’s note and accurately tag it with a list of diagnoses and procedures organized in the right order for billing.
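As a rough sketch of the multilabel half of that task, suppose a model emits a probability for each candidate code; turning those probabilities into an ordered code list might look like the toy function below. This is an illustration under that assumption, not Fathom's actual pipeline, and the function and variable names are hypothetical.

```python
# Toy multilabel step: convert per-code probabilities from a
# (hypothetical) model into a ranked code list, keeping codes above a
# confidence threshold and ordering them by confidence — a stand-in for
# the billing-order logic a real system would apply.

def codes_from_probs(probs: dict, threshold: float = 0.5) -> list:
    """Select and rank candidate codes from per-code probabilities."""
    kept = [(code, p) for code, p in probs.items() if p >= threshold]
    kept.sort(key=lambda cp: cp[1], reverse=True)  # most confident first
    return [code for code, _ in kept]

# Example: three candidate codes scored by the model for one chart.
model_output = {"E11.9": 0.92, "I10": 0.71, "V03.10XA": 0.08}
print(codes_from_probs(model_output))  # ['E11.9', 'I10']
```

A real system must also handle the sequence-to-sequence side — reading the free-text note itself — which is where large language models like BERT come in.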

Fathom is tackling this challenge, aided by tools such as NVIDIA’s GPU-optimized version of BERT, a leading natural language understanding model. The team uses the TensorFlow deep learning framework and relies on the mixed-precision training provided by Tensor Cores to accelerate the large-scale processing of medical documents that vary widely in size.

Using NVIDIA GPUs for inference allows Fathom to easily scale up and process millions of healthcare encounters per hour.

“While lowering costs matters, the ability to instantly add the capacity of thousands of medical coders to their operations has been the game-changer for our clients,” said Andrew Lockhart, Fathom’s co-founder and CEO.

Relying on NVIDIA GPUs on Google Cloud helps the team ramp its usage up and down based on demand.

“We have very bursty needs,” Bockman said, referring to the team’s fluctuating computational workload. “Sometimes we might be trying to retrain different variants of the same large model, while other times we’re doing a lot of experimentation or just doing inference. We might need a single GPU or many dozens of them.”

The startup chose Google Cloud, Bockman said, in part because the data is encrypted by default — one of the requirements for compliance with HIPAA and SOC 2 privacy requirements.

While medical coding is the main activity done today with doctor’s notes, unlocking the information contained in these health records could enable a wide range of use cases beyond billing and reimbursement, Bockman said.

AI that quickly and accurately analyzes medical charts and appointment records at scale can help doctors spot patient illnesses that may otherwise have been missed, predict likely patient outcomes, suggest treatment options — and even identify promising patient candidates for clinical trials.

The post Cure for the Common Code: San Francisco Startup Uses AI to Automate Medical Coding appeared first on The Official NVIDIA Blog.

Intel Optane DC Persistent Memory Improves Search, Reduces Costs in Baidu’s Feed Stream Services



What’s New: Intel today announced Baidu is architecting the in-memory database of its Feed Stream services to harness the high-capacity and high-performance capabilities of Intel® Optane™ DC persistent memory. Building a new memory platform based on Intel Optane DC persistent memory, paired with 2nd Gen Intel® Xeon® Scalable processors, allows Baidu to lower its total cost of ownership (TCO) while delivering more personalized search results to users. Intel and Baidu disclosed details of this deployment and other joint collaborations on Thursday at the 2019 Baidu ABC Summit in Beijing.

“For over 10 years, Intel and Baidu have worked closely together to accelerate Baidu’s core businesses, from search to AI to autonomous driving to cloud services. Our deep collaboration enables us to rapidly deploy the latest Intel technologies and improve the way customers experience Baidu’s services.”
–Jason Grebe, Intel corporate vice president and general manager of the Cloud Platforms and Technology Group

Why It’s Important: As companies like Baidu manage the explosive growth of data, the need to quickly and efficiently access and store data is imperative. With today’s news, Baidu is advancing its Feed Stream services to deliver more personalized content to its customers.

Why It’s Different: Baidu uses an advanced in-memory database called Feed-Cube to support data storage and information retrieval in its cloud-based Feed Stream services. Deploying Intel Optane DC persistent memory and 2nd Gen Intel Xeon Scalable processors enables Baidu to ensure high concurrency, large capacity and high performance for Feed-Cube, while reducing TCO1.

Through close collaboration, Intel and Baidu architected a hybrid memory configuration that includes both Intel Optane DC persistent memory and DRAM within the Baidu Feed Stream services. With this approach, Feed-Cube delivered faster search result response times under the pressure of large-scale concurrent access1. At the same time, single-server DRAM usage dropped by more than half, reducing costs across the petabyte-level storage capacity of Feed-Cube1. Intel and Baidu have published a detailed case study of this work, including examples of other applications using Intel Optane DC persistent memory technology, such as Redis, Spark and function-as-a-service offerings.

“Using Intel Optane DC persistent memory within the Feed-Cube database enables Baidu to cost-effectively scale memory capacity to stay on top of the continuously expanding demands placed on our Feed Stream services,” said Tao Wang, chief architect, Recommendation Technology Architecture at Baidu.

What’s Next: Today’s news comes on the heels of Intel and Baidu recently signing a new memorandum of understanding (MoU) aimed at increasing the collaboration between the two companies in Baidu’s core business areas. Baidu and Intel will continue to work together to enable new products and technologies that play an increasingly important role in growing core Internet business scenarios as well as critical applications and services. The deeper collaboration between Baidu and Intel will help Baidu provide a more diverse and engaging user experience to its customers.

What Else Was Disclosed at the 2019 Baidu ABC Summit:

  • New HPC Solution for Baidu ABC Storage: Intel and Baidu unveiled a new storage solution to accelerate machine learning performance in high-performance computing workloads. The new HPC solution is offered in Baidu’s cloud environment and provides users with end-to-end HPC support covering a range of capabilities, from data pre-processing, model training and evaluation to inferencing and result publishing. The Baidu Cloud ABC Storage service uses Intel® Optane™ DC SSDs and Intel® QLC 3D NAND SSDs.
  • Confidential Computing Consortium: Intel and Baidu recently joined the Confidential Computing Consortium under the Linux Foundation. As part of the consortium, Intel and Baidu will work with industry partners to deploy private, trusted computing services in the Baidu cloud based on Intel® Software Guard Extensions (Intel® SGX) technology.

More Context: Data Center News | Storage and Memory News

The Small Print:

1Data cited from Baidu’s internal verification and testing based on 2nd Gen Intel Xeon Scalable processors and Intel Optane DC persistent memory. For more details on these tests, please contact Baidu. Intel does not control or audit third-party data. You should review this content, consult other sources, and confirm whether referenced data are accurate.

The post Intel Optane DC Persistent Memory Improves Search, Reduces Costs in Baidu’s Feed Stream Services appeared first on Intel Newsroom.