Implicit Risks of Using the Cloud
December 13, 2020
Risk in cloud computing is often framed in terms of uptime and failures. Expressed as a number of 9s, cloud provider guarantees include 99.9% (three 9s) service uptime and 99.99999999999999% (16 9s) durability for globally redundant storage. The cloud is designed to be fault tolerant, and its global portfolio shines in disaster recovery situations.
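To make the "number of 9s" concrete, here is a minimal sketch that converts an availability percentage into the downtime it permits. The function name and the 365-day year are assumptions made for illustration and are not taken from any provider's SLA terms.

```python
HOURS_PER_YEAR = 365 * 24  # assumption: a non-leap year


def allowed_downtime_hours(availability_pct: float,
                           period_hours: float = HOURS_PER_YEAR) -> float:
    """Return the hours of downtime a given availability percentage permits."""
    if not 0 <= availability_pct <= 100:
        raise ValueError("availability_pct must be between 0 and 100")
    return period_hours * (1 - availability_pct / 100)


if __name__ == "__main__":
    for pct in (99.9, 99.99, 99.999):
        hours = allowed_downtime_hours(pct)
        print(f"{pct}% uptime allows about {hours:.3f} hours of downtime per year")
```

At three 9s, that works out to roughly 8.76 hours of permitted downtime per year. Note that nothing in this arithmetic says anything about whether new capacity will be available when you ask for it.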
However, what isn’t discussed in the guarantees is whether there will be capacity when customers need it in the future. The service from cloud providers is a lease on existing cloud resources. Customers aren’t directly buying a promise of future capacity; they are getting a soft promise that the provider will enable them to scale limitlessly.
And so, with the adoption of the cloud, the risks of the cloud supply chain become implicit risks for cloud consumers. Furthermore, these risks are quite opaque, as the details of over $100B of capital expenditure are highly sensitive. And in many cases, as we’ll see below, the supply chain is not in the control of cloud providers.
Cloud is dependent on the global supply chain
While it might be easy to use Covid-19 as a proof point, which we’ll come back to later, a more normal supply chain risk was observed in 2016-2018 with the random-access memory (RAM) shortages.
This one had little to do with cloud as a business, but rather was the fallout of mobile demand, with drastically increased flash storage on phones. Phone manufacturers drained the inventory of RAM in the ecosystem. It eventually impacted server manufacturing and consumer PC prices. In about a year, RAM prices doubled under the high demand, when prices usually fall slowly in line with Moore’s law.
Even if cloud providers had the money to spend, there just weren’t enough components on the market to keep up with demand. The memory manufacturers had to tool up further to meet the market need. And so by 2018, cloud demand had caught up with the constrained server supply chain. Cloud resource quota increases weren’t being approved, and providers were triaging requests to determine who got to scale.
If you were looking to scale up your cloud services during that time period, you experienced the whiplash of the global supply chain, where it may have taken you months to get your cloud resource quota increased.
Impact of Covid-19
Fast forwarding to the present, Covid-19 has ravaged the global supply chain. The cloud is built from thousands of components, from items like RAM to heavy equipment like diesel generators. And while some companies have found ways to be resilient to the pandemic, the supply chain is only as strong as its weakest link.
As we saw with the cloud capacity shortage in 2018, it can take a few years for a shock to the supply chain to show up in the cloud. The cloud capacity risk from the Covid-19 shock is likely to manifest in 2021 and 2022.
Furthermore, with Covid-19, the world has shifted to a much higher adoption of digital work. Demand for the cloud has dramatically increased. Microsoft posted some of its best earnings despite the pandemic due to these broad behavior shifts.
The capacity crunch we saw in 2018 will pale in comparison to what we’ll see over the next few years.
Making it worse with Cloud Cost Control
One of the trends of 2020 has been focusing on reducing costs through optimizing usage. Flexera published a 2020 cloud study where 73% of the respondents listed “Optimize existing use of cloud (cost savings)” as a top initiative.
While a good practice in general, it positions companies poorly for the next few years because it assumes the cloud is limitless. It’s the equivalent of checking out of some of your hotel rooms and expecting the hotel to have vacancies for you in the future. The hotel has no obligation to keep a room for you once you check out.
And it’s pretty likely the hotel won’t have many vacancies in the upcoming years.
Using commitments to mitigate capacity risk
Previously, we discussed how commitments were the economic solution to the forecast uncertainty problem, creating value for both the provider and the consumer. Planned well, they will save cloud consumers significant money. However, the stronger benefit is the risk reduction, especially with what’s ahead.
Furthermore, properly planned commitments mean cloud customers will likely hold resources that won’t get consumed immediately, which causes anxiety for the cost control mindset. However, the surplus that is secured provides significant risk reduction for business continuity in an increasingly digital world. Enterprises need mechanisms both to figure out the plan and to tell the narrative in an economically tight time.
All of this has made founding Optrilo even more impactful, despite the additional hurdles of launching during a global pandemic. We’re here to help customers get ahead of the cloud curve before it gets flattened.