Open Datacenter Hardware - What is OCP?

Let's say that you are a system administrator in a decently sized company. You're responsible for selecting new servers for a fairly large upgrade round in your data centers, or maybe you're building a new datacenter. It's not that long since your company last did this, but we live in an ever-changing world, so things are of course complicated. What do you do? Let's look at your options.

  1. Go with the same vendor and same models as you did previously. Safe bet, will not get you fired even if it's the least cost effective solution. Probably will not get you promoted either though, and the pains of today will be the pains of tomorrow.
  2. Look around for new vendors. Maybe you're finally looking at Supermicro instead of only doing HPE, or the other way around. Risks are higher, but all those issues with iDRAC/iLO/IPMI are surely fixed on the other vendor - right?

This is the world I remember living in when I worked for a small ISP in a small city in Sweden. It was loads of fun evaluating servers to see what we wanted to use and what made financial sense. Effectively what we ended up with was some of X, some of Y, some of Z in order to always pressure the vendors for lower prices. It worked for us, but we had to pay the price in terms of having to maintain support contracts with all vendors, and our automation, monitoring, and documentation needed to be updated for all these models and servers. Even within the same vendor things could change drastically.

This is still the case for our fictional system administrator, except that they have way more machines than I did back then. So why is this even an issue? Let's look at what you need from a server when hosting it in a datacenter, and what parts you can assume something about.
  • How is it mounted?
    You can probably assume a 19" rack here, but how high is it? 1U? 2U? 1.5U?
    Which rails does it use? Are they included? How deep does the rack need to be?
  • What power does it require?
    You can probably assume 230/110V, but does it have 1, 2, 4, or some other number of PSUs? How should they be connected for redundancy?
  • How are components repaired? What components can you use?
    Some servers have disks mounted inside the box, which is a huge problem when they break.
    Some hardware vendors require you to use their own parts, which might be hard to get in time, while others allow you to run down to the local computer store and pick up a new hard drive.
  • How do you manage your server?
    Do you use iDRAC? iLO? IPMI? OpenBMC? Do you need any extra licenses to do what you need with your server?

We could extend the list above by several more items, but hopefully the point is clear: servers vary along many dimensions from one model to the next. About all you can assume is that your server is going to boot an OS from some storage medium, and that's it.
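
To make the point concrete, here is a small Python sketch of how these dimensions could be written down per server model. Everything in it is made up for illustration: the ServerModel class, its field names, and the example fleet are assumptions of mine, not data about real products.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ServerModel:
    """Hypothetical record of the per-model facts listed above."""
    vendor: str
    height_u: float           # rack units, e.g. 1, 2, or 1.5
    rails_included: bool
    psu_count: int
    hot_swap_disks: bool      # False if the disks sit inside the box
    management: str           # e.g. "iDRAC", "iLO", "IPMI", "OpenBMC"
    needs_mgmt_license: bool


def distinct_profiles(fleet):
    """Return the set of (height, PSU count, management) combinations
    that racking plans and automation would have to handle separately."""
    return {(m.height_u, m.psu_count, m.management) for m in fleet}


# Made-up fleet using the placeholder vendors X, Y, and Z.
fleet = [
    ServerModel("X", 1, True, 2, True, "iDRAC", True),
    ServerModel("Y", 2, False, 2, True, "iLO", True),
    ServerModel("Z", 1, True, 1, False, "IPMI", False),
    ServerModel("X", 1, True, 2, True, "iDRAC", True),
]

profiles = distinct_profiles(fleet)
print(len(profiles))
```

Even this tiny fleet of four machines yields three different profiles, and each one is a separate case for racking, cabling, and management tooling.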

So, what can we do?

One way some vendors have handled this is by creating chassis solutions, often called blade servers. The chassis has a specific physical footprint and offers a number of blade units for servers. Maybe a high performance blade occupies two blade units, while a normal server only needs one. This is great, because it allows us to easily replace and upgrade servers as needed, and a lot of decisions have been made for us already. So what's the problem?

These blade centers are proprietary, creating vendor lock-in. They are also quite bloated, in that some try to move features like management into the blade-center chassis controller, which makes them prone to becoming outdated as technology advances.

What we need is something that specifies how a server should behave, physically as well as logically, for more aspects than what we have today.

Enter Open Compute Project (OCP). This is how OCP describes itself.
"The Open Compute Project's mission is to design and enable the delivery of the most efficient server, storage and data center hardware designs for scalable computing."
Wikipedia, Open Compute Project
"We believe that openly sharing ideas and specifications is the key to maximizing innovation and reducing operational complexity in the scalable computing space."
Mission and Principles, Open Compute Project

With big companies like Facebook, Microsoft, and Google on the requirements-producing end, the organization is already a heavy player in the server space and has published hundreds of hardware specifications. Facebook, for example, has full datacenters built using OCP technologies.

Essentially, what OCP provides is a forum for companies to discuss what they need and come up with a common way of solving that problem. The results are open specifications for how something, e.g. a server, should be built, and then it's up to any hardware manufacturer to pick up that specification and produce a product that matches it. Since OCP focuses on reducing operational complexity, parts strive towards being fully interchangeable.

Let's go back to our friend the system administrator and add a third alternative: invest in converting to OpenRack, the OCP 21" rack. For example, the company could elect to deploy OpenRack in their new datacenter, and use 19" converter bays in their existing datacenters as servers are being replaced. The result is that moving capacity to a datacenter is as easy as taking a box not much bigger than a folded kickbike and pushing it into a free slot in the OCP rack. Whether the vendor is A, B, or even Z does not matter: the rack solves the problem of providing redundant power and a physical enclosure for the server and nothing else, because that's all a rack needs to do.

It should also be noted that OCP provides more than server and rack designs; networking is a notable example. At the time of this writing there are also early working groups starting up that focus on standardizing software concepts, e.g. firmware like BIOS/UEFI or BMC/management APIs. This means that while OCP today enforces mostly hardware requirements for compliance, there are also efforts to make sure the software interfaces are equally portable between vendors.

Hopefully you, the reader, now have an idea of what OCP tries to do. I know that I'm excited about this endeavour, and maybe you are now as well. One of the great things about having open specifications for products like these is that you can learn so much more about them than would be possible with, for example, the ThinkPad I'm writing this on. Thanks to that, this post will be followed by some other posts consisting of technical deep dives into some OCP products that I have had the luck to get my hands on.

Thanks for reading!
