What’s New: At Intel FPGA Technology Day, Intel announced a new, customizable solution to help accelerate application performance across 5G, artificial intelligence, cloud and edge workloads. The new Intel® eASIC N5X is the first structured eASIC family with an Intel® FPGA compatible hard processor system. The Intel eASIC N5X helps customers migrate both their custom logic and designs — using the embedded hard processor in the FPGA — to structured ASICs, bringing benefits like lower unit cost, faster performance and reduced power consumption.
“The potential for data to transform industries and business has never been greater. The announcement of the new Intel eASIC N5X uniquely positions our customers to more fully benefit from the flexibility and time-to-market advantages of Intel FPGAs with the performance benefits and lower operating power of structured ASICs. The combination of FPGA, eASIC and ASIC products from Intel enables customers to take advantage of this potential by providing what we call the ‘custom logic continuum,’ and is a capability not available from any other vendor in the market.”
–Dave Moore, Intel corporate vice president and general manager of the Programmable Solutions Group
Why It Matters: FPGAs offer the best time-to-market advantage and highest flexibility for customer designs, while ASICs and structured ASIC devices provide the best hardware-optimized performance at the lowest power and cost. FPGAs are ideal for enabling agile innovation and are the fastest path to next-generation technology exploration. The programmability of FPGAs helps customers quickly develop hardware for their specific workloads and adapt to changing standards over time – as happened during the early phases of the 5G rollout and the migration toward open RAN implementations.
The new innovative Intel eASIC N5X devices deliver up to 50% lower core power and lower cost compared to FPGAs, while providing faster time to market and lower non-recurring engineering costs compared to ASICs. This allows customers to create power-optimized, high-performance and highly differentiated solutions. Intel eASIC N5X devices also help customers to meet the key security needs of many applications by incorporating a secure device manager adapted from the Intel® Agilex™ FPGA family, including secure boot, authentication and anti-tamper features.
Intel’s Unique Approach: Intel is the world’s only semiconductor company to offer the complete custom logic continuum of FPGAs, such as Intel Agilex and Intel® Stratix™ 10; structured ASICs, such as Intel eASIC N5X; and ASICs to its customers. This comprehensive data-processing portfolio of customizable logic devices helps enable Intel customers to truly optimize unit cost, performance, power consumption and time to market for their market-specific solutions in a manner that’s unique in the industry.
About Intel FPGA Technology Day: This is a one-day virtual event on Nov. 18, 2020, that brings together Intel executives, partners and customers to showcase the latest Intel programmable products and solutions through a series of keynotes, webinars and demonstrations.
Amazon Web Services’ first GPU instance debuted 10 years ago, with the NVIDIA M2050. At that time, CUDA-based applications were focused primarily on accelerating scientific simulations, with the rise of AI and deep learning still a ways off.
Since then, AWS has added to its stable of cloud GPU instances, which has included the K80 (p2), K520 (g2), M60 (g3), V100 (p3/p3dn) and T4 (g4).
The P4d instance delivers AWS’s highest performance, most cost-effective GPU-based platform for machine learning training and high performance computing applications. The instances reduce the time to train machine learning models by up to 3x with FP16 and up to 6x with TF32 compared to the default FP32 precision.
Each P4d instance features eight NVIDIA A100 GPUs and, with AWS UltraClusters, customers can get on-demand and scalable access to over 4,000 GPUs at a time using AWS’s Elastic Fabric Adapter (EFA) and scalable, high-performance storage with Amazon FSx. P4d offers 400Gbps networking and uses NVIDIA technologies such as NVLink, NVSwitch, NCCL and GPUDirect RDMA to further accelerate deep learning training workloads. NVIDIA GPUDirect RDMA on EFA ensures low-latency networking by passing data from GPU to GPU between servers without having to pass through the CPU and system memory.
In addition, the P4d instance is supported in many AWS services, including Amazon Elastic Container Service, Amazon Elastic Kubernetes Service, AWS ParallelCluster and Amazon SageMaker. P4d can also leverage all the optimized, containerized software available from NGC, including HPC applications, AI frameworks, pre-trained models, Helm charts and inference software like TensorRT and Triton Inference Server.
P4d instances are now available in US East and West, and coming to additional regions soon. The instances can be purchased as On-Demand, with Savings Plans, with Reserved Instances, or as Spot Instances.
The first decade of GPU cloud computing has brought over 100 exaflops of AI compute to the market. With the arrival of the Amazon EC2 P4d instance powered by NVIDIA A100 GPUs, the next decade of GPU cloud computing is off to a great start.
NVIDIA and AWS are making it possible for customers to keep pushing the boundaries of AI across a wide array of applications. We can’t wait to see what they will do with it.
The popularity of public cloud offerings is evident — just look at how top cloud service providers report double-digit growth year over year.
However, application performance requirements and regulatory compliance issues, to name two examples, often require data to be stored locally to reduce distance and latency and to place data entirely within a company’s control. In these cases, standard private clouds also may offer less flexibility, agility or on-demand capacity.
To help resolve these issues, Lenovo, Microsoft and NVIDIA have engineered a hyperconverged hybrid cloud that enables Azure cloud services within an organization’s data center.
By integrating Lenovo ThinkAgile SX, Microsoft Azure Stack Hub and NVIDIA Mellanox networking, organizations can deploy a turnkey, rack-scale cloud that’s optimized with a resilient, highly performant and secure software-defined infrastructure.
Fully Integrated Azure Stack Hub Solution
Lenovo ThinkAgile SX for Microsoft Azure Stack Hub satisfies regulatory compliance and removes performance concerns. Because all data is kept on secure servers in a customer’s data center, it’s much simpler for customers to comply with a country’s governance laws and to implement their own policies and practices.
Similarly, by reducing the distance that data must travel, latency is reduced and application performance goals can be more easily achieved. At the same time, customers can cloud-burst some workloads to the Microsoft Azure public cloud, if desired.
Lenovo, Microsoft and NVIDIA worked together to make sure everything performs right out of the box. There’s no need to worry about configuring and adjusting settings for virtual or physical infrastructure.
The power and automation of Azure Stack Hub software, the convenience and reliability of Lenovo’s advanced servers, and the high performance of NVIDIA networking combine to enable an optimized hybrid cloud. Offering the automation and flexibility of Microsoft Azure Cloud with the security and performance of on-premises infrastructure, it’s an ideal platform to:
deliver Azure cloud services from the security of your own data center,
enable rapid development and iteration of applications with on-premises deployment tools,
unify application development across entire hybrid cloud environments, and
easily move applications and data across private and public clouds.
Agility of a Hybrid Cloud
Azure Stack Hub also seamlessly operates with Azure, delivering an orchestration layer that enables the movement of data and applications to the public cloud. This hybrid cloud protects the data and applications that need protection and offers lower latencies for accessing data. And it still provides the public cloud benefits organizations may need, such as reduced costs, increased infrastructure scalability and flexibility, and protection from data loss.
A hybrid approach to cloud computing keeps all sensitive information onsite and often includes centrally used applications that may have some of this data tied to them. With a hybrid cloud infrastructure in place, IT personnel can focus on building proficiencies in deploying and operating cloud services — such as IaaS, PaaS and SaaS — and less on managing infrastructure.
A hybrid cloud requires a network that can handle all data communication between clients, servers and storage. The Ethernet fabric used for networking in the Lenovo ThinkAgile SX for Microsoft Azure Stack Hub leverages NVIDIA Mellanox Spectrum Ethernet switches — powered by the industry’s highest-performing ASICs — along with NVIDIA Cumulus Linux, the most advanced open network operating system.
At 25Gb/s data rates, these switches provide cloud-optimized delivery of data at line-rate. Using a fully shared buffer, they support fair bandwidth allocation and provide predictably low latency, as well as traffic flow prioritization and optimization technology to deliver data without delays, while the hot-swappable redundant power supplies and fans help provide resiliency for business-sensitive traffic.
Modern networks require advanced offload capabilities, including remote direct memory access (RDMA), TCP, overlay networks (for example, VXLAN and Geneve) and software-defined storage acceleration. Implementing these at the network layer frees expensive CPU cycles for user applications while improving the user experience.
To handle the high-speed communications demands of Azure Stack Hub, Lenovo configured compute nodes with dual-port 10/25/100GbE NVIDIA Mellanox ConnectX-4 Lx, ConnectX-5 or ConnectX-6 Dx NICs. The ConnectX NICs are designed to address cloud, virtualized infrastructure, security and network storage challenges. They use native hardware support for RoCE, offer stateless TCP offloads, accelerate overlay networks and support NVIDIA GPUDirect technology to maximize performance of AI and machine learning workloads. All of this results in much-needed higher infrastructure efficiency.
RoCE for Improved Efficiency
Microsoft Azure Stack Hub leverages Storage Spaces Direct (S2D) and Microsoft’s Server Message Block Direct 3.0. SMB Direct uses high-speed RoCE to transfer large amounts of data with little CPU intervention. SMB Multichannel allows servers to simultaneously use multiple network connections and provide fault tolerance through the automatic discovery of network paths.
The addition of these two features allows NVIDIA RoCE-enabled ConnectX Ethernet NICs to deliver line-rate performance and optimize data transfer between server and storage over standard Ethernet. Customers with Lenovo ThinkAgile SX servers or the Lenovo ThinkAgile SX Azure Hub can deploy storage on secure file servers while delivering the highest performance. As a result, S2D is extremely fast with disaggregated file server performance, almost equaling that of locally attached storage.
Run More Workloads
By using intelligent hardware accelerators and offloads, the NVIDIA RoCE-enabled NICs offload I/O tasks from the CPU, freeing up resources to accelerate application performance instead of making data wait for the attention of a busy CPU.
The result is lower latencies and an improvement in CPU efficiencies. This maximizes the performance in Microsoft Azure Stack deployments by leaving the CPU available to run other application processes. Efficiency gets a boost since users can host more VMs per physical server, support more VDI instances and complete SQL Server queries more quickly.
A Transformative Experience with a ThinkAgile Advantage
Lenovo ThinkAgile solutions include a comprehensive portfolio of software and services that supports the full lifecycle of infrastructure. At every stage — planning, deploying, supporting, optimizing and end-of-life — Lenovo provides the expertise and services needed to get the most from technology investments.
This includes single-point-of-contact support for all the hardware and software used in the solution, including Microsoft’s Azure Stack Hub and the ConnectX NICs. Customers never have to worry about who to call — Lenovo takes calls and drives them to resolution.
Hate hunting and pecking away at your keyboard every time you have a quick question? You’ll love this.
Microsoft’s Bing search engine has turned to Turing-NLG and NVIDIA GPUs to suggest full sentences for you as you type.
Turing-NLG is a cutting-edge, large-scale unsupervised language model that has achieved strong performance on language modeling benchmarks.
It’s just the latest example of an AI technique called unsupervised learning, which makes sense of vast quantities of data by extracting features and patterns without the need for humans to provide any pre-labeled data.
Microsoft calls this Next Phrase Prediction, and it can feel like magic, making full-phrase suggestions in real time for long search queries.
Turing-NLG is among several innovations — from model compression to state caching and hardware acceleration — that Bing has harnessed with Next Phrase Prediction.
Over the summer, Microsoft worked with engineers at NVIDIA to optimize Turing-NLG to their needs, accelerating the model on NVIDIA GPUs to power the feature for users worldwide.
A key part of this optimization was running this massive AI model fast enough to power a real-time search experience. With a combination of hardware and model optimization, Microsoft and NVIDIA achieved an average latency below 10 milliseconds.
By contrast, it takes more than 100 milliseconds to blink your eye.
Before the introduction of Next Phrase Prediction, the approach for handling query suggestions for longer queries was limited to completing the current word being typed by the user.
Now type in “The best way to replace,” and you’ll immediately see three suggestions for completing the phrase: wood, plastic and metal. Type in “how can I replace a battery for,” and you’ll see “iphone, samsung, ipad and kindle” all suggested.
With Next Phrase Prediction, Bing can now present users with full-phrase suggestions.
The more characters you type, the closer Bing gets to what you probably want to ask.
And because these suggestions are generated instantly, they’re not limited to previously seen data or just the current word being typed.
So, for some queries, Bing won’t just save you a few keystrokes — but multiple words.
As a result of this work, the coverage of autosuggestion completions increases considerably, Microsoft reports, improving the overall user experience “significantly.”
Ming-Yu Liu and Arun Mallya were on a video call when one of them started to break up, then freeze.
It’s an irksome reality of life in the pandemic that most of us have shared. But unlike most of us, Liu and Mallya could do something about it.
They are AI researchers at NVIDIA and specialists in computer vision. Working with colleague Ting-Chun Wang, they realized they could use a neural network in place of the software called a video codec typically used to compress and decompress video for transmission over the net.
Their work enables a video call with one-tenth the network bandwidth users typically need. It promises to reduce bandwidth consumption by orders of magnitude in the future.
“We want to provide a better experience for video communications with AI so even people who only have access to extremely low bandwidth can still upgrade from voice to video calls,” said Mallya.
Better Connections Thanks to GANs
The technique works even when callers are wearing a hat, glasses, headphones or a mask. And just for fun, they spiced up their demo with a couple of bells and whistles so users can change their hair styles or clothes digitally or create an avatar.
A more serious feature in the works (shown at top) uses the neural network to align the position of users’ faces for a more natural experience. Callers watch their video feeds, but they appear to be looking directly at their cameras, enhancing the feeling of a face-to-face connection.
“With computer vision techniques, we can locate a person’s head over a wide range of angles, and we think this will help people have more natural conversations,” said Wang.
Say hello to the latest way AI is making virtual life more real.
How AI-Assisted Video Calls Work
The mechanism behind AI-assisted video calls is simple.
A sender first transmits a reference image of the caller, just like today’s systems that typically use a compressed video stream. Then, rather than sending a fat stream of pixel-packed images, it sends data on the locations of a few key points around the user’s eyes, nose and mouth.
A generative adversarial network on the receiver’s side uses the initial image and the facial key points to reconstruct subsequent images on a local GPU. As a result, much less data is sent over the network.
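To see why key point data is so much lighter than pixels, here is a rough back-of-the-envelope sketch in Python. The key point count, coordinate size, frame rate and resolution are illustrative assumptions, not figures from the researchers' work.

```python
# All constants below are illustrative assumptions, not measured values.
NUM_KEYPOINTS = 10       # hypothetical number of facial key points per frame
COORDS_PER_POINT = 2     # an (x, y) position for each key point
BYTES_PER_COORD = 4      # one 32-bit float per coordinate
FPS = 30                 # assumed frame rate


def keypoint_bitrate_kbps(num_keypoints=NUM_KEYPOINTS, fps=FPS):
    """Bitrate of a stream that carries only key point positions."""
    bytes_per_frame = num_keypoints * COORDS_PER_POINT * BYTES_PER_COORD
    return bytes_per_frame * 8 * fps / 1000


def raw_frame_bitrate_kbps(width, height, fps=FPS, bytes_per_pixel=3):
    """Bitrate of sending every pixel of every frame with no compression."""
    return width * height * bytes_per_pixel * 8 * fps / 1000


if __name__ == "__main__":
    print(f"key points only: {keypoint_bitrate_kbps():.1f} kbps")
    print(f"raw 640x480 RGB: {raw_frame_bitrate_kbps(640, 480):.1f} kbps")
```

Real video codecs compress far below the raw-pixel figure, so this sketch does not reproduce the one-tenth bandwidth result. It only illustrates how little data a handful of key points needs per frame.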
Liu’s work in GANs hit the spotlight last year with GauGAN, an AI tool that turns anyone’s doodles into photorealistic works of art. GauGAN has already been used to create more than a million images and is available at the AI Playground.
“The pandemic motivated us because everyone is doing video conferencing now, so we explored how we can ease the bandwidth bottlenecks so providers can serve more people at the same time,” said Liu.
GPUs Bust Bandwidth Bottlenecks
The approach is part of an industry trend of shifting network bottlenecks into computational tasks that can be more easily tackled with local or cloud resources.
“These days lots of companies want to turn bandwidth problems into compute problems because it’s often hard to add more bandwidth and easier to add more compute,” said Andrew Page, a director of advanced products in NVIDIA’s media group.
AI Instruments Tune Video Services
GAN video compression is one of several capabilities coming to NVIDIA Maxine, a cloud-AI video-streaming platform to enhance video conferencing and calls. It packs audio, video and conversational AI features in a single toolkit that supports a broad range of devices.
Announced this week at GTC, Maxine lets service providers deliver video at super resolution with real-time translation, background noise removal and context-aware closed captioning. Users can enjoy features such as face alignment, support for virtual assistants and realistic animation of avatars.
“Video conferencing is going through a renaissance,” said Page. “Through the pandemic, we’ve all lived through its warts, but video is here to stay now as a part of our lives going forward because we are visual creatures.”
Maxine harnesses the power of NVIDIA GPUs with Tensor Cores running software such as NVIDIA Jarvis, an SDK for conversational AI that delivers a suite of speech and text capabilities. Together, they deliver AI capabilities that are useful today and serve as building blocks for tomorrow’s video products and services.
NVIDIA and AWS are bringing the future of XR streaming to the cloud.
Announced today, the NVIDIA CloudXR platform will be available on Amazon EC2 P3 and G4 instances, which support NVIDIA V100 and T4 GPUs, allowing cloud users to stream high-quality immersive experiences to remote VR and AR devices.
With the ability to stream from the cloud, professionals can now easily set up, scale and access immersive experiences from anywhere — they no longer need to be tethered to expensive workstations or external VR tracking systems.
The growing availability of advanced tools like CloudXR is paving the way for enhanced collaboration, streamlined workflows and high fidelity virtual environments. XR solutions are also introducing new possibilities for adding AI features and functionality.
With the CloudXR platform, many early access customers and partners across industries like manufacturing, media and entertainment, healthcare and others are enhancing immersive experiences by combining photorealistic graphics with the mobility of wireless head-mounted displays.
Lucid Motors recently announced the new Lucid Air, a powerful and efficient electric vehicle that users can experience through a custom implementation of the ZeroLight platform. Lucid Motors is developing a virtual design showroom using the CloudXR platform. By streaming the experience from AWS, shoppers can enter the virtual environment and see the advanced features of Lucid Air.
“NVIDIA CloudXR allows people all over the world to experience an incredibly immersive, personalized design with the new Lucid Air,” said Thomas Orenz, director of digital interactive marketing at Lucid Motors. “By using the AWS cloud, we can save on infrastructure costs by removing the need for onsite servers, while also dynamically scaling the VR configuration experiences for our customers.”
Another early adopter of CloudXR on AWS is The Gettys Group, a hospitality design, branding and development company based in Chicago. Gettys frequently partners with visualization company Theia Interactive to turn the design process into interactive Unreal Engine VR experiences.
“This is a game changer — by streaming collaborative experiences from AWS, we can digitally bring project stakeholders together on short notice for quick VR design alignment meetings,” said Ron Swidler, chief innovation officer at The Gettys Group. “This is going to save a ton of time and money, but more importantly it’s going to increase client engagement, understanding and satisfaction.”
Next-Level Streaming from the Cloud
CloudXR is built on NVIDIA RTX GPUs to allow streaming of immersive AR, VR or mixed reality experiences from anywhere.
The platform includes:
NVIDIA CloudXR SDK, which provides support for all OpenVR apps and includes broad client support for phones, tablets and HMDs. Its adaptive streaming protocol delivers the richest experiences with the lowest perceived latency by constantly adapting to network conditions.
NVIDIA Virtual Workstations to deliver the most immersive, highest quality graphics at the fastest frame rates. It’s available from cloud providers such as AWS, or can be deployed from an enterprise data center.
NVIDIA AI SDKs to accelerate performance and enhance immersive presence.
With the NVIDIA CloudXR platform on Amazon EC2 G4 and P3 instances supporting NVIDIA T4 and V100 GPUs, companies can deliver high-quality virtual experiences to any user, anywhere in the world.
Availability Coming Soon
NVIDIA CloudXR on AWS will be generally available early next year, with a private beta available in the coming months. Sign up now to get the latest news and updates on upcoming CloudXR releases, including the private beta.
With increasingly hybrid computing environments, dispersed users accessing networks around the clock, and the Internet of Things creating more data than security teams have ever seen, organizations are throwing more security tools than ever at the problem.
In fact, Jonathan Flack, principal systems architect at BroadBridge Networks, said it’s not unusual for a company with a big network and a large volume of intellectual property to have 50 to 75 vendor solutions deployed within their networks.
“That’s insanity to me,” said Flack. “How can you converge all the information in a single space in order to effectively act upon it in context?”
That’s precisely the problem BroadBridge, based in Fremont, Calif., is looking to solve. The three-year-old company is a member of NVIDIA Inception, a program that provides AI startups go-to-market support, expertise and technology.
It’s applying AI, powered by NVIDIA GPUs, to security data such that varying data sources can be aligned temporally, essentially connecting all the dots for any moment in time.
A company might have active directory logs, Windows event logs and firewall logs, with events occurring within microseconds of each other. Overworked security staff don’t have time to fish through all those logs trying to align events.
Instead, BroadBridge does it for them, automatically collecting the data, correlating it and presenting it as a single slice of time, with precision down to the millisecond.
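The idea of lining up events from several logs into one slice of time can be sketched in a few lines of Python. This is a minimal illustration, not BroadBridge's implementation: the function name, the event tuple layout and the sample data are all hypothetical.

```python
import heapq
from collections import defaultdict


def align_events(*sources):
    """Merge per-source event lists into time slices keyed by millisecond.

    Each source must already be sorted by timestamp. Each event is a
    (timestamp_in_microseconds, source_name, message) tuple.
    """
    slices = defaultdict(list)
    # heapq.merge lazily interleaves the sorted inputs in timestamp order.
    for ts_us, source, message in heapq.merge(*sources):
        slices[ts_us // 1000].append((source, message))
    return dict(slices)


# Hypothetical sample events; timestamps are microseconds from an arbitrary start.
directory_log = [(1_000_120, "active_directory", "login attempt: alice")]
windows_log = [
    (1_000_480, "windows_event", "process started"),
    (1_002_010, "windows_event", "service stopped"),
]
firewall_log = [(1_000_900, "firewall", "outbound connection allowed")]

timeline = align_events(directory_log, windows_log, firewall_log)
```

Here the first three events fall in the same millisecond and appear together under one key, in the order they occurred, while the later event gets its own slice.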
The company’s software effectively pinpoints the causes of events and suggests potential actions to be taken. And given that most security teams are understaffed amid a global shortage of qualified cybersecurity employees, they can use all the help they can get.
“Our objective is to lighten the workload so those people can go home after an eight-hour shift, spend time with their families and have some down time,” said Flack. “If you find an intrusion six months ago, you shouldn’t have to go mine through logs from all the affected systems to reassemble a picture of what happened. With all that data properly aggregated, aligned, and archived you simply run a BlazingSQL query against all of your network data for that specific timeframe.”
Organic Approach to Data
While BroadBridge’s original models were trained on open-source data from the security community, the company’s AI approach differs from those of other companies in that it doesn’t need to ship a more mature model out of the gate. Instead, BroadBridge’s system is designed to be trained by each customer’s network.
“GM is going to have a different threat environment than some DoD office inside the Pentagon,” said Flack. “We provide a good initial starting point, and then we retrain the model using the customer’s own network data over time. The system is 100 percent self-reinforcing.”
The initial AI model provides security analysts with the ability to work through events that need to be investigated. They can triage and tag events as nominal or deserving of more investigation.
That metadata then gets stored, providing a record of what the inference server identified, what the analyst looked at, and what other events are worthy of analysis. All of that is then funneled into a deep learning pipeline that improves the model.
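One way to picture that feedback loop is as a stored record per triaged event that later becomes a labeled training example. The Python sketch below is an assumption-laden illustration: the record fields, tag names and helper functions are hypothetical and are not taken from BroadBridge's software.

```python
from dataclasses import dataclass


@dataclass
class TriageRecord:
    event_id: str
    model_flagged: bool   # what the inference step reported for this event
    analyst_tag: str      # hypothetical tags: "nominal" or "investigate"


def to_training_examples(records):
    """Turn analyst decisions into (event_id, label) pairs for retraining."""
    return [
        (r.event_id, 1 if r.analyst_tag == "investigate" else 0)
        for r in records
    ]


def disagreements(records):
    """Return IDs of events where the analyst's tag differs from the model's flag."""
    return [
        r.event_id
        for r in records
        if r.model_flagged != (r.analyst_tag == "investigate")
    ]


records = [
    TriageRecord("evt-1", model_flagged=True, analyst_tag="nominal"),
    TriageRecord("evt-2", model_flagged=True, analyst_tag="investigate"),
    TriageRecord("evt-3", model_flagged=False, analyst_tag="nominal"),
]
```

In this sketch, the analyst's tag is treated as the ground-truth label, and the disagreement list shows where the current model and the analyst diverged.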
BroadBridge uses Kubernetes and Docker to provide dynamic scaling. Flack said the software can run real-time analytics on a 100GB network. The customer’s deep learning process is uploaded to an NVIDIA GPU instance on AWS, Azure, Google or Oracle clouds, where the AI is trained on the specifics of the customer’s network.
The company’s internal development has unfolded on NVIDIA DGX systems, which are purpose-built for the unique demands of AI. The first wave of development was conducted on DGX-1, and more recently on DGX A100, which Flack said has improved performance significantly.
“Four or five years ago, none of what we’re doing was at all possible,” he said. “Now we have a way to run multiple concurrent GPU-based workloads on systems that are as affordable as some 1U appliances.”
More to Come
Down the line, Flack said he envisions exposing an API to third-party vendors so they can use BroadBridge’s data to dynamically reconfigure device security postures. He also foresees the arrival of 5G as boosting the need for a tool that can parse through the increased data flows.
More immediately, Flack said the company has been looking to address the limitations of virtual private networks in the wake of the huge increase in working from home due to the COVID-19 pandemic.
Flack was careful to note that BroadBridge has no interest in replacing any of the sensors, logs or assessment tools companies are deploying in their security operations centers, or SOCs. Rather, it’s simply trying to create a platform to help security analysts make sense of all the data coming from all of these sources.
“Most of what you’re paying your SOC analysts for is herding cats,” he said. “Our objective is to stop them from herding cats so they can perform actual analysis.”
Promising to bring AI to every enterprise, VMware CEO Pat Gelsinger and NVIDIA CEO Jensen Huang kicked off VMworld 2020 Tuesday with a conversation detailing the companies’ broad new partnership.
VMware and NVIDIA announced that, together, they will deliver an end-to-end enterprise platform for AI as well as a new architecture for data center, cloud and edge that uses NVIDIA DPUs to support existing and next-generation applications.
“We’re going to bring the power of AI to every enterprise. We’re going to bring the NVIDIA AI computing platform and our AI application frameworks onto VMware,” Huang said.
View today’s VMworld 2020 CEO discussion featuring Pat Gelsinger and Jensen Huang, and join us at GTC 2020 on October 5 to learn more.
“For every virtual infrastructure admin, we have millions of people that know how to run the vSphere stack,” Gelsinger said. “They’re running it every day, all day long, it’s now the same tools, the same processes, the same networks, the same security, is now fully being made available on the GPU infrastructure.”
This will help accelerate AI adoption, enabling enterprises to extend existing infrastructure for AI, manage all applications with a single set of operations, and deploy AI-ready infrastructure where the data resides, across the data center, cloud and edge.
Additionally, as part of VMware’s Project Monterey, also announced Tuesday, the companies will partner to deliver an architecture for the hybrid cloud based on SmartNIC technology, including NVIDIA’s programmable BlueField-2 DPU.
“The characteristics, the pillars of Project Monterey of offloading the operating system, the data center operating system, onto the SmartNIC, isolating the applications from the control plane and the data plane, and accelerating the data processing and the security processing to line speed is going to make the data center so much more powerful, so much more performant,” Huang said.
“I can’t imagine a more impactful use of AI than healthcare,” Huang said. “The intersection of people, disease and treatments is one of the greatest challenges of humanity, and one where AI will be needed to move the needle.”
A leader in the development of AI and analysis tools in medical imaging, the center uses the NVIDIA Clara healthcare application framework for AI-powered imaging and VMware Cloud Foundation to support a broad range of mission-critical workloads.
“This way of doing computing is going to be the way that the future data centers are built. It’s going to allow us to essentially turn every enterprise into an AI,” Huang said. “Every company will become AI-driven.”
“Our audience is so excited to see how we’re coming together, to see how everything they’ve done for the past two decades with VMware now it’s going to be even further expanded,” Gelsinger said.
NVIDIA founder and CEO Jensen Huang, speaking during the Oracle Live digital launch of the new instance, said: “Oracle is where companies store their enterprise data. We’re going to be able to take this data with no friction at all, run it on Oracle Cloud Infrastructure, conduct data analytics and create data frames that are used for machine learning to learn how to create a predictive model. That model will recommend actions to help companies go faster and make smarter decisions at an unparalleled scale.”
Watch Jensen Huang and Oracle Cloud Infrastructure Executive Vice President Clay Magouyrk discuss AI in the enterprise at Oracle Live.
Hundreds of thousands of enterprises across a broad range of industries store their data in Oracle databases. All of that raw data is ripe for AI analysis with A100 instances running on Oracle Cloud Infrastructure to help companies uncover new business opportunities, understand customer sentiment and create products.
The new Oracle Cloud Infrastructure bare-metal BM.GPU4.8 instance offers eight 40GB NVIDIA A100 GPUs linked via high-speed NVIDIA NVLink direct GPU-to-GPU interconnects. With A100, the world’s most powerful GPU, the Oracle Cloud Infrastructure instance delivers performance gains of up to 6x for customers running diverse AI workloads across training, inference and data science. To power the most demanding applications, the new instance can also scale up with NVIDIA Mellanox networking to provide more than 500 A100 GPUs in a single instance.
NVIDIA Software Accelerates AI and HPC for Oracle Enterprises
Accelerated computing starts with a powerful processor, but software, libraries and algorithms are all essential to an AI ecosystem. Whether it’s computer graphics, simulations like fluid dynamics, genomics processing, or deep learning and data analytics, every field requires its own domain-specific software stack. Oracle is providing NVIDIA’s extensive domain-specific software through the NVIDIA NGC hub of cloud-native, GPU-optimized containers, models and industry-specific software development kits.
“The costs of machine learning are not just on the hardware side,” said Clay Magouyrk, executive vice president of Oracle Cloud Infrastructure. “It’s also about how quickly someone can get spun up with the right tools, how quickly they can get access to the right software. Everything is pre-tuned on these instances so that anybody can show up, rent these GPUs by the hour and get quickly started running machine learning on Oracle Cloud.”
Oracle will also be adding A100 to the Oracle Cloud Infrastructure Data Science platform and providing NVIDIA Deep Neural Network libraries through Oracle Cloud Marketplace to help data scientists run common machine learning and deep learning frameworks, Jupyter Notebooks and Python/R integrated development environments in minutes.
On-Demand Access to the World’s Leading AI Performance
The new Oracle instances make it possible for every enterprise to have access to the world’s most powerful computing in the cloud. A100 delivers up to 20x more peak AI performance than its predecessors with TF32 operations and sparsity technology running on third-generation Tensor Cores. The world’s largest 7nm processor, A100 is incredibly elastic and cost-effective.
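For readers unfamiliar with TF32, the format keeps the 8-bit exponent of a standard 32-bit float but only 10 of its 23 mantissa bits. The following is a minimal sketch of that precision trade-off, not NVIDIA's implementation: it assumes simple truncation of the low mantissa bits (Tensor Core hardware may round differently), and the helper name `truncate_to_tf32` is hypothetical.

```python
import struct


def truncate_to_tf32(x: float) -> float:
    """Approximate TF32 precision: keep 10 of float32's 23 mantissa bits.

    Assumption: plain truncation toward zero, chosen for illustration only.
    """
    # Reinterpret the value as a 32-bit IEEE 754 single-precision bit pattern.
    (bits,) = struct.unpack("<I", struct.pack("<f", x))
    # Clear the 13 least-significant mantissa bits; sign and exponent are kept.
    bits &= 0xFFFFE000
    # Convert the masked bit pattern back to a Python float.
    (y,) = struct.unpack("<f", struct.pack("<I", bits))
    return y


if __name__ == "__main__":
    for value in (1.0, 1.0 + 2.0 ** -10, 1.0 + 2.0 ** -11, 3.14159265):
        print(value, "->", truncate_to_tf32(value))
```

Because the exponent width is unchanged, the representable range matches float32; only the fine-grained precision is reduced, which is why the smallest increment above 1.0 that survives in this sketch is 2^-10.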
The flexible performance of A100 and Mellanox RDMA over Converged Ethernet networking makes the new Oracle Cloud Infrastructure instance ideal for workloads such as critical drug discovery research, improving customer service through conversational AI, and enabling designers to model and build safer products.
AI Acceleration for Workloads of All Sizes, Companies in All Stages
New businesses can access the power of A100 performance through the NVIDIA Inception and Oracle for Startups accelerator programs, which provide free Oracle Cloud credits for NVIDIA A100 and V100 GPU instances, special pricing, invaluable networking and expertise, marketing opportunities and more.
Oracle will soon introduce virtual machine instances providing one, two or four A100 GPUs per VM, and provide heterogeneous cluster networks of up to 512 A100 GPUs featuring bare-metal A100 GPU instances blended with Intel CPUs. Enterprises interested in accelerating their workloads with Oracle’s new A100 instance can get started with Oracle Cloud Infrastructure on Sept. 30.
To learn more about accelerating AI on Oracle Cloud Infrastructure, join Oracle at GTC, Oct. 5-9.
What’s New: At Baidu World 2020, Intel announced a series of collaborations with Baidu in artificial intelligence (AI), 5G, data center and cloud computing infrastructure. Intel and Baidu executives discussed the trends of intelligent infrastructure and intelligent computing, and shared details on the two companies’ strategic vision to jointly drive the industry’s intelligent transformation within the cloud, network and edge computing environments.
“In China, the most important thing for developing an industry ecosystem is to truly take root in the local market and its users’ needs. With the full-speed development of ‘new infrastructure’ and 5G, China has entered the stage of accelerated development of the industrial internet. Intel and Baidu will collaborate comprehensively to create infinite possibilities for the future through continuous innovation, so that technology can enrich the lives of everyone.”
– Rui Wang, Intel vice president in the Sales & Marketing Group and PRC country manager
Why It Matters: Zhenyu Hou, corporate vice president of Baidu, said that Baidu and Intel are both extremely focused on technology innovation and have always been committed to promoting intelligent transformation through innovative technology exploration. In the wave of new infrastructure, Baidu continues to deepen its collaboration with Intel to seize opportunities in the AI industry and bring more value to the industry, society and individuals.
A Series of Recent Collaborations:
AI in the Cloud: Intel and Baidu have delivered technological innovations over the past decade, from search and AI to autonomous driving, 5G and cloud services. Recently Baidu and Intel worked on customizing Intel® Xeon® Scalable processors to deliver optimized performance, thermal design power (TDP), temperature and feature sets within Baidu’s cloud infrastructure. With the latest 3rd generation Intel Xeon Scalable processors and their built-in BFloat16 instruction set, Intel supports Baidu’s optimization of the PaddlePaddle framework to provide enhanced speech prediction services and multimedia processing support within the Baidu cloud, delivering highly optimized, highly efficient cloud management, operation and maintenance.
Next-Gen server architecture: Intel and Baidu have designed and commercially deployed next-generation 48V rack servers based on Intel Xeon Scalable processors to achieve higher rack power density, reduce power consumption and improve energy efficiency. The two companies are working to drive ecosystem maturity of 48V and promote the full adoption of 48V in the future based on the next-generation Xeon® Scalable processor (code-named Sapphire Rapids).
Networking: In an effort to improve virtualization and workload performance, while accelerating data processing speeds and reducing total cost of ownership (TCO) within Baidu infrastructure, Intel and Baidu are deploying Smart NIC (network interface card) innovations based on Intel® SoC FPGAs and Intel® Ethernet 800 Series adapter with Application Device Queues (ADQ) technology. Smart NICs greatly increase port speed, optimize network load, realize large-scale data processing, and create an efficient and scalable bare metal and virtualization environment for the Baidu AICloud.
Baidu Smart NICs are built on the latest Intel Ethernet 800 Series, Intel Xeon-D processor and Intel Arria® 10-based FPGAs. On the memory and storage side, Intel and Baidu built a high-performance, ultra-scalable, unified user-space single-node storage engine using Intel® Optane™ persistent memory and Intel Optane NVMe SSDs, enabling Baidu to configure multiple storage scenarios through one set of software.
5G and edge computing: Intel and Baidu have collaborated on a joint innovation that combines the OpenNESS (Open Network Edge Services Software) toolkit developed by Intel with Baidu IME (Intelligent Mobile Edge) to help achieve a highly reliable edge compute solution with AI capabilities for low-latency applications.
What’s Next: Looking forward, Intel will continue to leverage its comprehensive data center portfolio to collaborate with Baidu on a variety of developments including:
Developing a future autonomous vehicle architecture platform and intelligent transportation vehicle-road computing architecture.
Exploring mobile edge computing to provide users with edge resources to connect Baidu AI operator services.
Expanding Baidu Smart Cloud in infrastructure technology.
Improving the optimization of Xeon Scalable processors and PaddlePaddle.
Bringing increased benefits to Baidu online AI businesses, thus creating world-changing technology that truly enriches lives.