@coreypurcell
Last active August 29, 2015 14:17

Revisions

  1. coreypurcell revised this gist Mar 23, 2015. 1 changed file with 1 addition and 1 deletion.
    2 changes: 1 addition & 1 deletion deployment-proposal.md
    @@ -16,7 +16,7 @@ in the infastructure project, we still need to upload SSH Keys to bamboo and
    configure the build plan to run the scripts. This also means that any change to
    SSH keys or adding/removing scripts will require us to touch every build plan.
    That's something we can do with a few environments, but would be terrible if we
    -had 100+ customers.
    +had 100+ customers. NO FIDDLING WITH GUIs.

    ### Proposal - Deployment Amazon CodeDeploy

  2. coreypurcell created this gist Mar 23, 2015.
    143 changes: 143 additions & 0 deletions deployment-proposal.md
    @@ -0,0 +1,143 @@
    # Devops Proposal

    ## Deployment

    * scaling to many production environments
    * metrics about deployments
    * dashboards
    * automating new env creation, no GUIs required

    Deployments are currently initiated and managed using Bamboo's deployment plans.
    This works well when you only have a couple of environments, but will become
    increasingly difficult to manage once we have many customers. We currently need
    to create a new build plan for each production environment that we want Bamboo
    to automatically deploy. Although the deployment process is isolated to scripts
    in the infrastructure project, we still need to upload SSH keys to Bamboo and
    configure the build plan to run the scripts. This also means that any change to
    SSH keys, or adding or removing scripts, will require us to touch every build
    plan. That's something we can do with a few environments, but it would be
    terrible if we had 100+ customers.

    ### Proposal - Deployment with Amazon CodeDeploy

    Amazon's CodeDeploy project looks really interesting and should tie in well
    with our infrastructure project. We can add the new project to an existing
    deployment group or create a new group. CodeDeploy can be integrated with GitHub
    and can use GitHub's branch CI statuses to determine when to deploy, so it can
    wait for master or staging to be green. We will need to spend some time
    investigating how CodeDeploy works and possibly adjust our current deployment
    strategy to fit in well with CodeDeploy.

    ### Alternative - Deployment Server

    Use something like [Heaven](https://github.com/atmos/heaven) to deploy our apps.
    The deployment server can receive a single call/API request from Bamboo and then
    start deploying to all environments that are set to track that branch. Bamboo
    only needs to run the CI and record the result, and can trigger the
    deploy directly with a call to the deployer. Using a deployment server, we can
    track deployments much more easily and with far more metrics. This gives us a
    single place to make changes for deployments, and we never need to touch a GUI.
    It allows us to fully script the creation of an environment for a customer,
    which we cannot do today because Bamboo does not currently have an API for
    creating build plans. We can also make it easier to do custom things for
    "special" customers. Temporarily freezing customers to a specific revision can
    be done in our own dashboard instead of having a developer log into Bamboo and
    disable a script.
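
The fan-out described above can be sketched roughly as follows. This is a minimal illustration, not how Heaven works internally: the `Environment` struct, the `plan_deploys` helper, and the sample environment names and revisions are all hypothetical. It assumes each customer environment tracks one branch and can be frozen to a revision, in which case it is skipped.

```ruby
# Hypothetical data model (an assumption, not part of Heaven or Bamboo):
# each environment tracks one branch and may be frozen to a revision.
Environment = Struct.new(:name, :branch, :frozen_revision) do
  def frozen?
    !frozen_revision.nil?
  end
end

# Turns a single "this branch is green at this revision" call into one deploy
# job per environment tracking that branch. Frozen environments are skipped.
def plan_deploys(environments, branch, revision)
  environments
    .select { |env| env.branch == branch && !env.frozen? }
    .map { |env| { environment: env.name, revision: revision } }
end

# Illustrative sample data only.
ENVIRONMENTS = [
  Environment.new("customer-a", "master", nil),
  Environment.new("customer-b", "master", "abc123"), # frozen, will be skipped
  Environment.new("staging", "staging", nil)
].freeze

plan_deploys(ENVIRONMENTS, "master", "def456").each do |job|
  puts "deploy #{job[:revision]} to #{job[:environment]}"
end
```

Keeping the branch and freeze state in data rather than in per-environment build plans is what would let a new customer environment be added by a script.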

    ## Monitoring

    * server level monitoring
    * services level monitoring (DB, Work Queues)
    * application level monitoring
    * errors
    * alerting

    It's not in production until it's monitored!!!!

    ### Server Level Monitoring

    We have lots of options. New Relic supports server monitoring. AWS has
    CloudWatch. Nagios or something more modern like Sensu can also do this quite
    well. Whatever we use must be hooked up to something like PagerDuty to allow us
    to easily change the on-call person. Sending an email to a distribution list
    leads to alert fatigue and often to everyone ignoring the problem.

    ### Services Level Monitoring (DB, Redis, Sidekiq)

    New Relic has plugins to monitor most of these and can monitor back end workers
    like Sidekiq. Another option is Nagios/Sensu, which can monitor almost anything.
    It is important that any tool we have can alert us to queue length and unusual
    errors in the queues.
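
As a rough sketch of what a queue length check could look like, the snippet below compares queue sizes against thresholds. The `QueueAlert` struct, the `queue_alerts` helper, the queue names, and the threshold values are all assumptions for illustration. Sizes are passed in as a plain hash here; in a real setup they would come from whichever monitoring tool or queue API we settle on.

```ruby
# Hypothetical alert record: which queue, how long it is, and the limit it
# exceeded.
QueueAlert = Struct.new(:queue, :size, :threshold)

# Returns an alert for every queue whose size exceeds its threshold.
# Queues without an explicit threshold use default_threshold (an assumed value).
def queue_alerts(sizes, thresholds, default_threshold: 1_000)
  sizes.each_with_object([]) do |(queue, size), alerts|
    limit = thresholds.fetch(queue, default_threshold)
    alerts << QueueAlert.new(queue, size, limit) if size > limit
  end
end

# Illustrative numbers only.
SAMPLE_SIZES = { "default" => 50, "mailers" => 5_000 }.freeze
SAMPLE_THRESHOLDS = { "mailers" => 2_000 }.freeze

queue_alerts(SAMPLE_SIZES, SAMPLE_THRESHOLDS).each do |alert|
  puts "#{alert.queue}: #{alert.size} jobs (limit #{alert.threshold})"
end
```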

    ### Application Level Monitoring

    Far and away the easiest and best Rails app monitoring is New Relic. You can get
    a lot of performance timings and similar stats from your own setup built on
    StatsD/Graphite/Batsd, but you will lose things like slow transaction tracing.
    New Relic is not good at collecting many custom metrics, though, so you can't
    add as many stats as you would with a StatsD/Batsd implementation. New Relic
    also has SQL query performance timing and will suggest indexes, etc. There are
    alternative providers we could look into, such as Skylight or AppSignal, which
    both have more favorable pricing. Honeybadger has its own lightweight form of
    this in their error tracking product.

    ### Errors

    I recommend we use a service for this; Honeybadger is my current favorite.
    Sentry and other options are pretty solid for the price. Honeybadger actually
    has performance metrics for your app and, with its transaction tracing, could
    almost replace New Relic, but I've never tried using it for that. New Relic
    also has error capturing and reporting.

    #### Proposal - Just Use New Relic

    Continue using New Relic until it becomes cost prohibitive. At that time we can
    investigate other services or consider building our own. New Relic is very
    opinionated and supports Rails apps out of the box. It covers server,
    application and error monitoring on its own. Yes, it's expensive, but we can
    reevaluate once the cost is very high or we understand our own needs better.

    #### Proposal - Build our own now or mix and match

    It is possible to build your own version of New Relic. You can utilize
    ActiveSupport::Notifications to record everything you want to know. You can use
    StatsD or even Splunk to create the statistics that you want, like p95, average
    or median measurements. You can use Rails' own auto-explain logging to record
    slow queries.
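
To make the statistics mentioned above concrete, here is a small sketch of computing average, median and p95 from a list of request durations. It is plain Ruby with hard-coded sample data; in a real setup the durations would come from instrumentation such as ActiveSupport::Notifications, and the aggregation would more likely live in StatsD or a similar tool. The `percentile` and `summarize` helpers are hypothetical names, and the nearest-rank method used here is one of several common percentile definitions.

```ruby
# Nearest-rank percentile: the smallest sample such that at least pct percent
# of all samples are less than or equal to it. Expects pct in 1..100.
def percentile(samples, pct)
  raise ArgumentError, "samples must not be empty" if samples.empty?

  sorted = samples.sort
  rank = (pct * sorted.length / 100.0).ceil
  sorted[[rank, 1].max - 1]
end

# Summary statistics for a batch of durations in milliseconds.
# Note: the median here is the nearest-rank p50, so for an even number of
# samples it is the lower of the two middle values.
def summarize(samples)
  {
    avg: samples.sum / samples.length.to_f,
    median: percentile(samples, 50),
    p95: percentile(samples, 95)
  }
end

# Illustrative durations only: mostly fast requests plus one slow outlier.
DURATIONS_MS = [12, 15, 11, 240, 14, 13, 18, 16, 17, 19].freeze

puts summarize(DURATIONS_MS)
```

With data like this, the single slow request pulls the average well above the median, which is why percentile measurements are worth having alongside the average.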

    We can also mix in services with our own monitoring such as Honeybadger for
    errors etc.

    ### Alerting

    Please, for the love of everything holy, do not send alerts to an email
    distribution list. Instead, send them to a service like PagerDuty. If it's too
    expensive, we can use an open source tool that does a similar thing, like
    Flapjack, but other services such as New Relic and Honeybadger tend to be well
    integrated with PagerDuty.

    This is vital for establishing an on-call rotation and for keeping emails from
    interrupting developers or, worse, from being ignored by developers altogether.

    ## ChatOps

    * making it so anyone can deploy anything/anywhere
    * automating everything

    The goal of ChatOps is to make everything automated and open to everyone. If a
    PM wants to deploy code to an environment, they don't need to know anything but
    the command to our chatbot or how to use our dashboard UI. No logging into
    Bamboo, installing the correct SSH key, or setting up a project on your local
    machine. If a sales rep wants to set up a demo environment, BOOM! Done. No more
    tickets for the developers to implement. This ties in pretty tightly with a
    deployment server; the current Bamboo method isn't going to work.
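
A chatbot command for this could be as simple as one line of text that gets parsed and handed to the deployment server. The sketch below assumes a hypothetical command syntax of `deploy <app>/<branch> to <environment>`, with the branch defaulting to master. The syntax, the `DEPLOY_COMMAND` pattern, and the `parse_deploy_command` helper are illustrative assumptions, not an existing bot's API.

```ruby
# Hypothetical command grammar: "deploy <app>[/<branch>] to <environment>".
DEPLOY_COMMAND = %r{
  \Adeploy\s+
  (?<app>[\w-]+)              # application name
  (?:/(?<branch>[\w./-]+))?   # optional branch, defaults to master
  \s+to\s+
  (?<environment>[\w-]+)\z    # target environment
}x

# Returns a hash describing the requested deploy, or nil when the text is not
# a deploy command.
def parse_deploy_command(text)
  match = DEPLOY_COMMAND.match(text.strip)
  return nil unless match

  {
    app: match[:app],
    branch: match[:branch] || "master",
    environment: match[:environment]
  }
end

p parse_deploy_command("deploy api/staging to customer-a")
p parse_deploy_command("deploy api to demo")
p parse_deploy_command("restart everything")
```

Anything the parser accepts would then be passed to the deployment server, so the person typing the command never needs SSH keys or Bamboo access.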

    ## The Downfall of New Relic

    New Relic is expensive and it charges per server. With one environment per
    customer, we will have lots of servers. Something like Honeybadger charges per
    application, and we currently only have 4 applications.

    As part of ChatOps we'll need to figure out the best way to configure everything
    on New Relic as well. When we change or create projects, the New Relic
    configuration might need to change too, for example to add a new type of
    service.