Google Cloud TPU machine learning accelerators now available in beta

13 Feb 18

Google has made its Cloud TPUs available in beta on Google Cloud Platform (GCP) to help machine learning experts train and run their ML models faster.

Google defines its Cloud TPUs (tensor processing units) as hardware accelerators that are optimised to speed up and scale up specific ML workloads programmed with TensorFlow.

Each Cloud TPU is built with four custom ASICs and packs up to 180 teraflops of floating-point performance and 64 GB of high-bandwidth memory onto a single board.

The boards can be used alone or connected via an ultra-fast, dedicated network to form multi-petaflop ML supercomputers called “TPU pods”, Google explained in a blog post yesterday.
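
For readers who want to put the quoted board figures in context, the short Python sketch below works through the arithmetic. It assumes the 180 teraflops and 64 GB are shared evenly across the four ASICs, and it treats a pod's peak performance as a simple sum of its boards; neither assumption comes from Google's announcement, and the function names are illustrative only.

```python
import math

# Figures quoted in the article for a single Cloud TPU board.
BOARD_TERAFLOPS = 180   # peak ("up to") floating-point performance
BOARD_MEMORY_GB = 64    # high-bandwidth memory
ASICS_PER_BOARD = 4


def per_asic_share(board_total, asics=ASICS_PER_BOARD):
    """Split a board-level figure evenly across its ASICs.

    The even split is an assumption made for illustration.
    """
    return board_total / asics


def boards_for_petaflops(target_petaflops, board_teraflops=BOARD_TERAFLOPS):
    """Smallest number of boards whose summed peak reaches the target.

    Assumes peak performance adds up linearly across boards, which
    ignores any networking or scaling overhead.
    """
    return math.ceil(target_petaflops * 1000 / board_teraflops)


if __name__ == "__main__":
    print("Teraflops per ASIC:", per_asic_share(BOARD_TERAFLOPS))
    print("Memory per ASIC (GB):", per_asic_share(BOARD_MEMORY_GB))
    print("Boards for 1 petaflop:", boards_for_petaflops(1))
```

Under those assumptions, each ASIC accounts for 45 teraflops and 16 GB of memory, and at least six boards would be needed to pass one petaflop of peak performance.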

Google stated that it will offer these larger supercomputers on GCP later in the year.

“We designed Cloud TPUs to deliver differentiated performance per dollar for targeted TensorFlow workloads and to enable ML engineers and researchers to iterate more quickly,” Google said on its blog. The company elaborated on this with three examples:

  • Instead of waiting for a job to schedule on a shared compute cluster, you can have interactive, exclusive access to a network-attached Cloud TPU via a Google Compute Engine VM that you control and can customise
  • Rather than waiting days or weeks to train a business-critical ML model, you can train several variants of the same model overnight on a fleet of Cloud TPUs and deploy the most accurate trained model in production the next day
  • Using a single Cloud TPU and following Google’s tutorial, you can train ResNet-50 to the expected accuracy on the ImageNet benchmark challenge in less than a day, all for well under $200

ML model training

Google’s Cloud TPUs can be programmed with high-level TensorFlow APIs, and the company has open-sourced a set of reference high-performance Cloud TPU model implementations.

Google plans to open-source additional model implementations over time.

“Adventurous ML experts may be able to optimise other TensorFlow models for Cloud TPUs on their own using the documentation and tools we provide,” Google added.

Google will introduce TPU pods later this year, which it says will further shorten the time-to-accuracy of models trained on Cloud TPUs.

“Both ResNet-50 and Transformer training times drop from the better part of a day to under 30 minutes on a full TPU pod, no code changes required,” the blog detailed.

Two Sigma chief technology officer and former senior Google engineer Alfred Spector comments, “We made a decision to focus our deep learning research on the cloud for many reasons, but mostly to gain access to the latest machine learning infrastructure.”

“Google Cloud TPUs are an example of innovative, rapidly evolving technology to support deep learning, and we found that moving TensorFlow workloads to TPUs has boosted our productivity by greatly reducing both the complexity of programming new models and the time required to train them.”

Spector concludes, “Using Cloud TPUs instead of clusters of other accelerators has allowed us to focus on building our models without being distracted by the need to manage the complexity of cluster communication patterns.”
