This repository has been archived by the owner on Jul 22, 2018. It is now read-only.

Why You Need to Stop Worrying About prodweb001 and Start Loving i-98fb9856

Benjamin Oakes edited this page Nov 13, 2013 · 2 revisions

Speaker: Chris Munns, Amazon Web Services

Server hugging

  • We named servers because we had to find them
  • We fixed servers by hand because a dead server directly affected you (lost money)
  • Like pets (or babies), we named them (after Greek gods, etc.)
  • Better: Prodweb01, Prodapi01, etc... but...
  • If the server has a problem, how does it affect you?
  • These are dated habits for the cloud

The new way

  • Sleep through infrastructure recovery
  • You shouldn't get alerted on high CPU, high memory, etc.
  • Bigger picture: latency (it's what actually affects users)
  • Get rid of high CPU alerts!
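
As a rough illustration of alerting on latency instead of CPU, the sketch below pages only when a high percentile of request latency crosses a threshold. The 500 ms threshold, the 99th percentile, and the function names are assumptions made for this example; they are not from the talk.

```python
import math

def percentile(samples, pct):
    """Nearest-rank percentile of a non-empty list of numbers."""
    if not samples:
        raise ValueError("need at least one sample")
    ordered = sorted(samples)
    rank = max(1, math.ceil(pct * len(ordered) / 100))
    return ordered[rank - 1]

def should_page(latencies_ms, threshold_ms=500, pct=99):
    """Page only when user-facing latency at the given percentile is too high.

    threshold_ms and pct are hypothetical defaults chosen for illustration.
    """
    return percentile(latencies_ms, pct) > threshold_ms

# CPU might be high in both cases; only the second one hurts users.
healthy = [120] * 99 + [480]
degraded = [120] * 90 + [900] * 10

print(should_page(healthy))   # False
print(should_page(degraded))  # True
```

The point of the design is that the alert is tied to what users experience, so a busy but responsive fleet stays quiet.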

When there's a problem

  • Looking at logs and graphs means looking at old news

  • Why would you reboot manually?

  • Auto-scale for your needed capacity

  • STONITH (Shoot The Other Node In The Head) is fine with auto-scaling

  • Auto-scale a single instance (min = max = 1)

  • Auto-scaling is free

  • If you have 2 for redundancy, you can maybe use auto-scaling instead (TODO)
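
The min = max = 1 idea can be sketched as a small reconciliation loop: whenever the healthy fleet is smaller than the minimum, a replacement is launched, so nobody has to reboot anything by hand. This is a toy model written for these notes, not the AWS API; `launch_instance` and `reconcile` are hypothetical names and the instance IDs are made up.

```python
import itertools

_ids = itertools.count(1)

def launch_instance():
    """Stand-in for launching a replacement; returns a made-up instance id."""
    return f"i-{next(_ids):08x}"

def reconcile(healthy_instances, min_size, max_size):
    """One pass of the loop: trim any excess, then refill up to min_size."""
    fleet = list(healthy_instances)[:max_size]
    while len(fleet) < min_size:
        fleet.append(launch_instance())
    return fleet

fleet = reconcile([], min_size=1, max_size=1)      # first launch
first = fleet[0]

survivors = [i for i in fleet if i != first]       # the only instance dies
fleet = reconcile(survivors, min_size=1, max_size=1)

print(len(fleet), fleet[0] != first)  # 1 True
```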

ENI/EIP

  • ENI: adds an additional network interface (secondary private IP), free

  • EIP

  • ENI allows network and security appliances in your VPC

(See diagrams)

Use tags for a source of truth

  • DNS only tells you a single thing (the hostname) and has caching issues, so it may not be a good source of truth (especially when things are transient or changing)

  • Tags are user defined, key-value, queryable

  • DNS: web-03.example.com

  • Tags: i-933f81a4

    • Name: Web
    • Env: Prod
    • Project: Blog
    • Owner: Ben
    • ...
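
To show why queryable key-value tags work better than hostnames as a source of truth, here is a minimal sketch that filters an in-memory inventory by tag. The first record mirrors the example above; the other two instance IDs and their tag values are invented for illustration, and `find_instances` is a hypothetical helper. In a real deployment the same question would be put to the EC2 API rather than a local list.

```python
# Hypothetical inventory; only the first record comes from the notes above.
INSTANCES = [
    {"InstanceId": "i-933f81a4",
     "Tags": {"Name": "Web", "Env": "Prod", "Project": "Blog", "Owner": "Ben"}},
    {"InstanceId": "i-0000aaaa",
     "Tags": {"Name": "Web", "Env": "Staging", "Project": "Blog", "Owner": "Ben"}},
    {"InstanceId": "i-0000bbbb",
     "Tags": {"Name": "Api", "Env": "Prod", "Project": "Blog", "Owner": "Ann"}},
]

def find_instances(instances, **wanted):
    """Return the ids of instances whose tags match every key=value given."""
    return [
        inst["InstanceId"]
        for inst in instances
        if all(inst["Tags"].get(key) == value for key, value in wanted.items())
    ]

print(find_instances(INSTANCES, Env="Prod"))              # ['i-933f81a4', 'i-0000bbbb']
print(find_instances(INSTANCES, Env="Prod", Name="Web"))  # ['i-933f81a4']
```

A hostname such as web-03.example.com answers only one question; the tag query can answer any combination of role, environment, project, and owner.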

Stop hand-crafting servers

  • Automation

  • Higher-level: Elastic Beanstalk, OpsWorks

  • DIY: CloudFormation

  • Host-based configuration management (HBCM): use Chef etc. to automate provisioning

  • vs. shell scripting: you're going to redo work that's really common

Service registries

  • Key part of SOA (Service-Oriented Architecture)
  1. Boot
  2. Register with the registry
  3. Registration may kick off other changes
  4. Tell other instances
  5. Deregistered if it goes away
  • Zookeeper: OSS service registry
  • Minimum of 3 Zookeepers recommended
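
The register / notify / deregister lifecycle above can be sketched with a small in-memory registry. This is a toy stand-in for a real registry such as Zookeeper, not its API; the class, method names, and addresses are assumptions made for this example.

```python
class ServiceRegistry:
    """Toy in-memory registry (hypothetical; a real one would be replicated)."""

    def __init__(self):
        self._services = {}   # service name -> set of "host:port" addresses
        self._watchers = []   # callbacks told about every membership change

    def watch(self, callback):
        self._watchers.append(callback)

    def _notify(self, service):
        members = self.lookup(service)
        for callback in self._watchers:
            callback(service, members)

    def register(self, service, address):
        """Steps 2-4: record the instance and tell everyone watching."""
        self._services.setdefault(service, set()).add(address)
        self._notify(service)

    def deregister(self, service, address):
        """Step 5: the instance went away, so drop it and tell everyone."""
        self._services.get(service, set()).discard(address)
        self._notify(service)

    def lookup(self, service):
        return sorted(self._services.get(service, set()))

events = []
registry = ServiceRegistry()
registry.watch(lambda service, members: events.append((service, members)))

registry.register("web", "10.0.0.5:80")
registry.register("web", "10.0.0.6:80")
registry.deregister("web", "10.0.0.5:80")

print(registry.lookup("web"))  # ['10.0.0.6:80']
print(len(events))             # 3
```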

Airbnb

  • Customer story: SmartStack

  • Martin Rhoads

  • Service discovery, help you build SOAs

  • Uses a Zookeeper registry

4 things

  • Service

  • Zookeeper

  • Nerve (checks health and updates Zookeeper)

  • Synapse routes between services (uses haproxy)

  • Solution is transparent

  • Automatic failure handling

  • No DNS

  • Distributed by design

  • Traffic flows directly between boxes (no central load balancer or enterprise service bus)

  • haproxy keeps traffic flowing
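
The division of labor between the pieces can be sketched roughly as follows: one function plays Nerve's role (report health into the registry) and another plays Synapse's role (turn the registry contents into a local haproxy backend). This is not the real Nerve or Synapse code or configuration format; the function names, the plain dict standing in for Zookeeper, and the addresses are all assumptions for illustration.

```python
def nerve_report(registry, service, address, is_healthy):
    """Nerve's role: keep a healthy instance registered, remove an unhealthy one."""
    members = registry.setdefault(service, set())
    if is_healthy:
        members.add(address)
    else:
        members.discard(address)

def synapse_render(registry, service):
    """Synapse's role: render the registered instances as a haproxy backend."""
    lines = [f"backend {service}"]
    addresses = sorted(registry.get(service, set()))
    for n, address in enumerate(addresses, start=1):
        lines.append(f"    server {service}-{n} {address} check")
    return "\n".join(lines)

registry = {}  # stand-in for Zookeeper
nerve_report(registry, "api", "10.0.0.5:8080", True)
nerve_report(registry, "api", "10.0.0.6:8080", True)
nerve_report(registry, "api", "10.0.0.5:8080", False)  # health check failed

print(synapse_render(registry, "api"))
# backend api
#     server api-1 10.0.0.6:8080 check
```

Because each box renders its own backend list, callers talk to a local haproxy and failed instances drop out without any DNS change.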

OSS

  • Install Vagrant
  • git clone ...airbnb/smartstack-cookbook.git
  • vagrant up

Code:

  • airbnb/nerve
  • airbnb/synapse
