Saturday, 19 January, 2019 UTC


Summary

With an internet population quickly closing in on 1 billion, China is one of the most challenging, and exciting, places to be when it comes to massively scaling digital platforms.
Last November, Alibaba alone set a new Singles Day record with more than USD 30.8 billion (with a “b”) in sales in only 24 hours.
This market demands elastic infrastructure able to cope with massive traffic surges. We’ve been building exactly that for a few years now, leveraging cloud providers in China (Qingcloud, Tencent Cloud, Aliyun, AWS) and containers with Kubernetes.
Let me share a few things we learnt along the way.
Picking a way to run Kubernetes
First things first: Kubernetes is just a tool.
It’s not magic. You’ll need to set it up and dedicate resources to maintain it.
You have a few options to run it:
  • Hosted: few clicks - ready to go - you don’t manage anything apart from your pods/services.
  • Turnkey: few clicks - many providers supported - wide control over the k8s cluster.
  • On-premise turnkey: few clicks - enhanced security with your own private network.
  • Custom: you’re on your own - spawn your servers, install everything yourself.
Hosted doesn’t offer enough customization, and custom is too hard if you don’t have a dedicated SRE team.
The turnkey approach is a good middle ground that offers the flexibility many companies need.
Outside of China, there’s a plethora of solutions like GCE (Google) or EKS (Amazon).
In China the landscape is a bit narrower, with the leading providers being Alicloud and Tencent.
Infrastructure as code
Web interfaces of cloud providers are neat… but:
  • They often suffer from cluttered UIs with hundreds of options that will confuse even the most seasoned administrator.
  • They don’t encourage repeatability and reuse - “click here, and there” isn’t automation.
Capturing everything in code is a must for us at Wiredcraft, whether it’s software or the infrastructure we need to run it.
We were early adopters of (and contributors to) Ansible; so why would we need another tool?
Ansible or Terraform?
These two tools actually complement each other pretty nicely:
  • Ansible:
    • Procedural approach: describing the steps to take to reach the eventual state.
    • Limited cloud provider support: fewer than 10, actually.
    • Best fit for configuration management: managing services, managing configurations, managing automation.
    • We use it down the road to tune our setups and complete the configuration; log management, user management, backup…
  • Terraform:
    • Declarative approach: describing the eventual state of the platform rather than the steps to take to reach it.
    • Extensive cloud support: over 50 providers, including Chinese clouds.
    • Great state management & plan commands: they show the exact list of changes that would be performed on the platform before anything is applied.
    • We use it for the provisioning and orchestration of the infrastructure.
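To make the declarative idea a bit more concrete, here is a minimal Python sketch of what a "plan" step boils down to: compare the state you have with the state you want, and list the changes before touching anything. This is not how Terraform is actually implemented; the plan and apply helpers, the resource names and their attributes are all made up for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Change:
    action: str  # "create", "update" or "delete"
    name: str    # resource name


def plan(current: dict, desired: dict) -> list:
    """List the changes needed to go from current to desired state.

    Nothing is applied here; this only reports the difference.
    """
    changes = []
    for name in sorted(desired):
        if name not in current:
            changes.append(Change("create", name))
        elif current[name] != desired[name]:
            changes.append(Change("update", name))
    for name in sorted(current):
        if name not in desired:
            changes.append(Change("delete", name))
    return changes


def apply(current: dict, desired: dict) -> dict:
    """Apply the planned changes and return the resulting state."""
    new_state = dict(current)
    for change in plan(current, desired):
        if change.action == "delete":
            del new_state[change.name]
        else:
            new_state[change.name] = desired[change.name]
    return new_state


# Hypothetical resources, for illustration only.
current_state = {
    "vpc": {"cidr": "10.0.0.0/8"},
    "worker-1": {"size": "small"},
    "old-bastion": {"size": "small"},
}
desired_state = {
    "vpc": {"cidr": "10.0.0.0/8"},
    "worker-1": {"size": "large"},
    "worker-2": {"size": "large"},
}

for change in plan(current_state, desired_state):
    print(change.action, change.name)
```

With a procedural tool you would write out the steps yourself (resize worker-1, add worker-2, remove the bastion). Here you only describe the end result, and running the plan again after applying it comes back empty, which is what makes the approach repeatable.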
Example: Kubernetes setup on Alicloud
We’ve put together a very straightforward Terraform script to set up Kubernetes on Alicloud.
It’s easy to customize and runs across multiple availability zones.
And voilà! You have your own k8s cluster running on Alicloud.
You can start deploying containers using whichever method works for you: Ansible, Helm, …
Terraform as part of our DevOps tool belt
Our DevOps team has fully adopted Terraform, adding it to our list of tools & technologies that help us safely run and scale products for tens of millions of users in China.
It helps us keep Ansible (with our very own Pipelines) focused on automation while Terraform takes care of orchestration.
We do a lot more to automate ourselves out of our jobs, from statistical anomaly detection on our monitoring data to blue-green deployments.
Drop us a line if you’d like to understand how we help Nike, Burberry, Starbucks or Hilton run their digital infrastructure in China, or come to one of our meetups.