DevOps Discipline
Detailed & Complete
Please be seated.
Make sure to ctrl + tab to GitHub Catapult README.md
devopsgroup.io Team
Seth Reeser and Steve Britton
Our related projects
vagrant-digitalocean
vagrant-hostmanager
Catapult
a website and workflow management platform
[R] Welcome back to the conference... We are the devopsgroup.io Catapult team... Hi, my name is Seth Reeser, and I'd like to introduce Steve Britton.
[R] With over 40 years of combined hardware and software experience, we've founded technology ventures and traveled to 5 of the 7 continents.
[R] With over a quarter-million downloads, we're very proud of our open-source projects.
[R] We're here today to speak about our newest project, Catapult, a website and workflow management platform.
[CUT]
DevOps Discipline: Detailed & Complete
Code samples and multimedia
Fast-paced, time-boxed topics
Source Code Management
Configuration Management
Development and Virtualization
Environment Management
DNS Management
Continuous Integration
Monitoring & Insights
[B] Our session today is ... DEVOPS DISCIPLINE: ... DETAILED & COMPLETE
[R] This session leverages code samples and multimedia to deliver our story
[B] Today we'll talk about...
Source Code Management
Configuration Management
Development and Virtualization
Environment Management
DNS Management
Continuous Integration
Monitoring & Insights
[R] This session will be fast-paced, with time-boxed topics and automatically transitioning slides, but don't worry, we'll provide links and time for questions at the end
[B] ...and there is also a chime before slides transition
[CUT]
What is Catapult ?
Catapult is a website and workflow management platform
Catapult is an open source, complete, and distributed architecture
Catapult features a Gitflow workflow while enforcing exactly matching, branch-driven environments
[R] So... What is Catapult?
[R] Catapult is a website and workflow management platform
[R] Catapult is an open source, complete, and distributed architecture
[R] Catapult features a Gitflow workflow while enforcing exactly matching, branch-driven environments
[B] Everything in this presentation is fully automated by Catapult
[B] Today, we'll be taking a look at partial code samples from Catapult, let's dive in
[CUT]
Source Code Management
Verify repository integrity
$(ls -afq .git/refs/heads | wc -l ) == "2"
$(git rev-list HEAD | tail -n 1) != $(git rev-list origin | tail -n 1)
SSH key scans for remote Git
i=1
until [ $i -ge 10 ]; do
  sudo ssh-keyscan bitbucket.org > ~/.ssh/known_hosts
  if grep -q "bitbucket\.org" ~/.ssh/known_hosts; then
    echo "ssh-keyscan for bitbucket.org successful"
    break
  else
    echo "ssh-keyscan for bitbucket.org failed, retrying!"
  fi
  i=$((i+1))
done
[B] Source Code Management
[B] Properly configured repositories lay the foundation for the platform and websites.
[B] To begin, verify that the repository is not empty. This is reflected in the first code sample where we check for an empty repository, then verify integrity based on commit hash.
[B] This verification by hash protects against pitfalls such as a corrupt local copy or a name conflict.
[B] Name conflict in this case would mean that the repository has the same name as another repository but doesn't actually have the same contents.
[R] The second code sample is also shell script. It handles some issues we encountered during automation, mostly with Bitbucket, around failing SSH key scans.
[R] As a result, we built in a redundant key scan, which you can see on screen; this runs up to 10 attempts.
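(For illustration only, not Catapult's verbatim source: the two fragments above might be combined into a check roughly like the following, assuming the local clone and its origin remote are already configured.)
# fail when the repository has no branches yet (ls -afq also counts "." and "..", so 2 means empty)
if [ "$(ls -afq .git/refs/heads | wc -l)" == "2" ]; then
  echo "the repository appears to be empty" && exit 1
fi
# fail when the local root commit does not match origin (corrupt copy or repository name conflict)
if [ "$(git rev-list HEAD | tail -n 1)" != "$(git rev-list origin | tail -n 1)" ]; then
  echo "the local repository does not match origin" && exit 1
fi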
Source Code Management
Confirm repository write access
if "#{repo_split_2[0]}" == "github.com"
uri = URI("https://api.github.com/repos/#{repo_split_3[0]}/collaborators/#{@configuration["company"]["github_username"]}")
Net::HTTP.start(uri.host, uri.port, :use_ssl => uri.scheme == 'https') do |http|
request = Net::HTTP::Get.new uri.request_uri
request.basic_auth "#{@configuration["company"]["github_username"]}", "#{@configuration["company"]["github_password"]}"
response = http.request request # Net::HTTPResponse object
if response.code.to_f == 404
catapult_exception("The GitHub repo #{instance["repo"]} does not exist")
elsif response.code.to_f.between?(399,600)
puts " - The GitHub API seems to be down, skipping... (this may impact provisioning and automated deployments)".color(Colors::RED)
else
if response.code.to_f == 204
puts " - Verified your GitHub user #{@configuration["company"]["github_username"]} has write access."
else
catapult_exception("Your GitHub user #{@configuration["company"]["github_username"]} does not have write access to this repository.")
end
end
end
end
[R] The next automation we're going to look at is confirming repository write access.
[R] This Ruby code sample comes from catapult.rb and reflects how we handle this check with GitHub.
[R] The Bitbucket sample is a little too long to display here; it takes three API calls versus the one for GitHub.
[B] While we're here, take note that it is best practice to introduce a machine user for automated processes. A machine user is an account used exclusively for automation and never by a human, hence the name.
[B] A benefit of using a machine user is decoupling access from any individual developer or team member. Take a look at the Catapult documentation, in the Services Setup section, outlining how to set up the machine user for Bitbucket and GitHub, the two supported Git repository services.
[B] Also, Bitbucket offers free private repositories, while both provide free public repositories.
[R] This sanity check ensures that revoked permissions, broken machine users, or API downtime can be accounted for.
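For a feel of what that Ruby check does on the wire, here is a rough curl equivalent of the same GitHub collaborators endpoint; the repository and machine-user credentials below are placeholders, not values from Catapult:
# 204 means the user is a collaborator with access; 404 means no access or no such repository
# (repository name and credentials are placeholders for illustration)
status=$(curl --silent --output /dev/null --write-out "%{http_code}" \
  --user "machine-user:password" \
  "https://api.github.com/repos/acmeco/wiley.coyote/collaborators/machine-user")
if [ "${status}" == "204" ]; then
  echo "verified write access for machine-user"
else
  echo "no write access, missing repository, or GitHub API issue (HTTP ${status})"
fi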
Configuration Management
Single repository for platform configuration
company:
  name: ACME CO.
  newrelic_api_key: xxxxxxxxxxxxxxx
environments:
  dev:
    branch: develop
  test:
    branch: develop
  qc:
    branch: release
  production:
    branch: master
websites:
  apache:
    - domain: acmeco.com
      repo: git@github.com:acmeco/wiley.coyote.git
[B] Configuration management.
[B] Catapult uses a dedicated repository for configuration of the entire platform. Much of this is represented in a single YAML file, which is sampled here in the abridged snapshot.
[B] As you can see, YAML is structured but still straightforward and easy to work with.
[B] The first section is the company section. Relevant here are locale, timezone, and service API keys, which relate to the organization itself.
[R] Next in the hierarchy comes the environments section. Environments are keyed as dev, test, qc, and production.
[B] The branch key you're seeing is a hint to the Gitflow-driven approach to managing environments, where each environment is tied to a given branch.
[R] The environments section includes details such as server IP addresses and software passwords. Again, in the spirit of Catapult, these are automatically managed on your behalf.
[R] The next section, websites, is where you will likely spend most of your time. This is where websites are added or removed. Here we see a basic addition of acmeco.com with its respective repository.
[R] Note that repository support in Catapult is implemented to strictly use SSH and not HTTPS. This is based on our experience with performance, reliability, and security.
[B] Other options are available for a website entry. Some of those include software type, software workflow direction, basic HTTP authentication, and the specification of the web root within the repository.
[R] Let's touch on that software workflow you mentioned a little later.
Configuration Management
Secrets are GPG AES 256 encrypted
Verification of encrypted Catapult configuration files:
* GPG Edit Mode is enabled at secrets/configuration-user.yml["settings"]["gpg_edit"], if there are changes to secrets/configuration.yml, secrets/id_rsa, or secrets/id_rsa.pub, they will be re-encrypted.
gpg: original file name='configuration.yml'
* There were no changes to secrets/configuration.yml, no need to encrypt as this would create a new cipher to commit.
gpg: original file name='id_rsa'
* There were no changes to secrets/id_rsa, no need to encrypt as this would create a new cipher to commit.
gpg: original file name='id_rsa.pub'
* There were no changes to secrets/id_rsa.pub, no need to encrypt as this would create a new cipher to commit.
[B] Having just looked at Catapult's platform configuration file, you can see why it's important to keep all of your secrets encrypted.
[B] Catapult's secrets include the configuration.yml file and your SSH key pair.
[B] Keeping your secrets centralized allows you to know exactly what needs to be encrypted. Limiting the total number of secrets makes this responsibility more straightforward. Catapult handles this by relying on a single GPG key.
[R] Catapult handles those centralized secrets responsibly with AES 256 encryption.
[R] At the same time, please remember to avoid a false sense of security. Effective security is 99% process.
[B] AES-256 is an industry standard for protecting confidential data and is approved for protecting military top-secret information.
[B] A favorite approach of ours has been to keep a physical QR code of the GPG key in a safe place. Regardless of your chosen approach, emailing passwords or secrets is never a good idea.
[B] For the credentials tied to each of the distributed services, we strongly recommend the use of a password manager. We're happy users of Dashlane for this purpose.
[B] By the way, this sample on the screen is actually console output, not source code. What's reflected is the output from Catapult handling the encryption and decryption of secrets.
[R] Being open source, Catapult uses many traditional approaches to DevOps and you can look under the hood to see for yourself. This is in contrast to competing platforms, such as Pantheon, Acquia, or Azure.
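As a minimal sketch of the mechanism Catapult wraps for you, assuming the single GPG key is a passphrase held in an environment variable (exact flags vary slightly between GPG versions):
# encrypt the configuration with AES 256 using the single shared passphrase (GPG_KEY is a placeholder)
gpg --batch --yes --symmetric --cipher-algo AES256 \
  --passphrase "${GPG_KEY}" \
  --output secrets/configuration.yml.gpg secrets/configuration.yml
# decrypt it again when provisioning or editing
gpg --batch --yes \
  --passphrase "${GPG_KEY}" \
  --output secrets/configuration.yml --decrypt secrets/configuration.yml.gpg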
Development & Virtualization
Exactly matching local development
# sync the repositories folder for local access from the host
config.vm.synced_folder "repositories", "/var/www/repositories", type: "nfs"
# configure the provisioner
config.vm.provision "shell", path: "provisioners/redhat/provision.sh", args: ["dev","#{Catapult::Command.repo}","#{Catapult::Command.configuration_user["settings"]["gpg_key"]}","mysql","#{Catapult::Command.configuration_user["settings"]["software_validation"]}"]
Local separation and platform contribution
File.write('.git/hooks/pre-commit',
'#!/usr/bin/env ruby
# staged is assumed to hold the list of files staged for this commit
if staged.include?("secrets/configuration.yml.gpg")
  puts "Please commit secrets/configuration.yml.gpg on the develop branch.
  You are on the develop-catapult branch, which is meant for contribution
  back to Catapult and should not contain your configuration files."
  exit 1
end'
[R] Development and Virtualization
[R] Because of Catapult's exactly matching environments, you'll find that Local Dev mimics upstream targets precisely.
[R] Here we have a code snippet from Catapult's Vagrantfile. The first configuration you'll find is the VirtualBox synced folder, allowing for live file system local development.
[B] The next configuration line specifies the provisioner - notably, the same provisioner used upstream.
[B] As virtualized and automatically provisioned environments, Local Dev all the way through production can quickly and easily be destroyed and rebuilt using Vagrant (see the command sketch after these notes).
[B] Because Catapult's provisioners use native shell, anyone who has been tasked with server administration will find the commands familiar.
[B] It's also worth noting, since this is a local environment, you are free to do as you please without affecting any other environments.
[R] This is also where you can directly develop and contribute to the Catapult platform. The source of Catapult is in the same folder structure as all of your website repositories. Feel free to copy any of the automations which catch your eye, as well.
[R] With open source in mind, the Ruby code sample demonstrates the automatic creation of a Git pre-commit hook which separates Catapult contribution from your Catapult configuration.
[R] When setting up Catapult, a develop-catapult branch is created for you within your forked repository with the git remote upstream set to the Catapult project so that you can easily create a pull request.
[R] Guidelines for contributing to Catapult are in our documentation.
[R] Releases are driven by the devopsgroup.io Catapult team and occur when accepting new pull requests from contributors like yourself.
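To make the rebuild claim from these notes concrete, the day-to-day Vagrant commands look roughly like this; the machine name follows the company-environment-redhat convention shown on the next slide, with acmeco as a placeholder:
# build and provision the Local Dev virtual machine from scratch
vagrant up acmeco-dev-redhat
# re-run the provisioner after a configuration change
vagrant provision acmeco-dev-redhat
# throw the environment away and rebuild it whenever you like
vagrant destroy -f acmeco-dev-redhat
vagrant up acmeco-dev-redhat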
Environment Management
Vagrant driven by configuration
# dev => redhat
config.vm.define "#{Catapult::Command.configuration["company"]["name"].downcase}-dev-redhat" do |config|
  config.vm.provider :virtualbox do |provider|
# test => redhat
config.vm.define "#{Catapult::Command.configuration["company"]["name"].downcase}-test-redhat" do |config|
  config.vm.provider :digitalocean do |provider|
# qc => redhat
config.vm.define "#{Catapult::Command.configuration["company"]["name"].downcase}-qc-redhat" do |config|
  config.vm.provider :digitalocean do |provider|
# production => redhat
config.vm.define "#{Catapult::Command.configuration["company"]["name"].downcase}-production-redhat" do |config|
  config.vm.provider :digitalocean do |provider|
[B] Environment Management
[B] We just hit on Local Dev Vagrant configuration. Here we have environment configuration. As you can see, it's consistent throughout each environment.
[B] Catapult's virtual machine approach provides the capability for exactly matching virtual machines, regardless of provider.
[R] What this means is we can run a unique provider in Local Dev without violating the principle of exactly matching configuration, because the provider only determines where the virtual machine runs, not what runs inside it.
[R] This is a partial sample of how Catapult builds a dynamic Vagrantfile, driven by configuration. Not shown here are Vagrantfile conventions such as virtual machine resource sizing, network configuration, and operating system, all of which are also dynamically driven by Catapult's configuration.
[R] Local Dev utilizes the VirtualBox virtual machine provider, which is widely adopted and cross-platform. Another Vagrant virtual machine provider is VMware.
[B] Looking further upstream, we automatically integrate DigitalOcean as our virtual machine provider. Using the DigitalOcean API and sticking to our discipline, all virtual machine operations are handled by Catapult's automation.
[B] This is made possible by using our popular Vagrant provider plugin, vagrant-digitalocean. Again, this is another of our freely available, open source projects. Join the hundreds of thousands of users who have already benefited from our related open source projects.
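As a hedged sketch of bringing an upstream machine up by hand (Catapult's automation normally drives this), the plugin is installed once and the DigitalOcean provider is requested explicitly; the machine name is again a placeholder:
# install our open source DigitalOcean provider plugin
vagrant plugin install vagrant-digitalocean
# create or rebuild the Test server at DigitalOcean (placeholder machine name)
vagrant up acmeco-test-redhat --provider=digitalocean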
Environment Management
Branch driven with Git Flow
LocalDev => develop
Test => develop
QC => release
Production => master
[R] Let's get back to the Git Flow approach that we touched on earlier.
[R] Git Flow is leveraged to drive environments per git branch.
[B] In the practice of full Git Flow, developers branch from develop to begin working on a feature branch. Once ready, the feature branch is merged via pull request into the develop branch.
[B] At that point, the continuous integration discipline is enforced by kicking off an automated deployment. This is true for every environment except production, where automated deployments are typically run on a schedule.
[R] Here we have an animated representation of the workflow. Again, take note that each environment is branch-driven. Both the platform configuration and website code follow this approach, while still allowing each website to have its own repository and release plan.
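From a developer's keyboard, that workflow is nothing exotic; here is a minimal sketch (the feature branch name is a placeholder):
# start a feature from the develop branch
git checkout develop
git pull origin develop
git checkout -b feature/contact-form
# publish the feature branch, then open a pull request into develop
git push --set-upstream origin feature/contact-form
# once merged, the develop-driven environments deploy automatically;
# promotion further upstream is another pull request: develop -> release -> master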
Environment Management
Software workflow downstream and upstream
Replacing database website URLs
sed -r --expression="s/:\/\/(www\.)?(dev\.|test\.|qc\.)?(${domain_url_replace})/:\/\/\1${domain_url}/g" \
"/var/www/repositories/apache/${domain}/_sql/$(basename "$file")" > \
"/var/www/repositories/apache/${domain}/_sql/${1}.$(basename "$file")"
Updating the database from latest code base
php index.php migrate
drush updatedb -y
php wp-cli.phar core update-db
[B] Catapult features a unique software workflow including downstream and upstream capabilities. The choice of which to use depends on whether you are building a new website or maintaining an existing website.
[B] There are certain operations necessary when moving database and code together from one environment to another. Catapult handles these for you automatically.
[B] The first example of these operations depicts how Catapult replaces environment-specific URLs within the database. This bash sample is one of the methods used for replacing these URLs, specifically when the software in use doesn't ship a tool such as Drush for Drupal or WP-CLI for WordPress.
[R] Another one of these operations is executing database updates. These are introduced by merges to the branch-driven environments. Catapult automatically invokes the appropriate tool for the given software. This is presented in the second example where CodeIgniter, Drupal, and WordPress have relevant tools.
[R] Setting software files folder permissions is another notable automation. An example of software files folders is Drupal's default target for user-provided content, sites/default/files.
[B] Harnessing the simplicity of Catapult, all of the powerful automation we've covered can be configured as either downstream or upstream with a single website configuration option.
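The per-software dispatch described above can be pictured as a simple case statement; this is an illustrative sketch rather than Catapult's verbatim source, and the software variable is a placeholder:
# run the database update tool that matches the website's software type
case "${software}" in
  codeigniter*)
    php index.php migrate
    ;;
  drupal*)
    drush updatedb -y
    ;;
  wordpress*)
    php wp-cli.phar core update-db
    ;;
  *)
    echo "no database update tool configured for ${software}"
    ;;
esac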
DNS Management
Domain name driven
* domain:
* example: example.com
* dev.example.com, test.example.com, qc.example.com, example.com
* domain_tld_override:
* example: mycompany.com
* a domain name under your name server authority to append to the top-level-domain (e.g. .com)
* appends the domain_tld_override for Environments
* dev.example.com.mycompany.com, test.example.com.mycompany.com,
qc.example.com.mycompany.com, example.com.mycompany.com
Local DNS management
# configure hosts file on both the host and guest
config.vm.provision :hostmanager
config.hostmanager.aliases = Catapult::Command.dev_redhat_hosts
[B] DNS Management.
[B] Catapult's automation relies on your domain names as the canonical, unique identifiers for each website.
[B] While this approach seems obvious, we've all seen example.dev - or worse yet, example.devcloud.acquia-sites.com.
[R] The first snippet comes from the documentation and reflects how to set the domain name, as well as how to configure a unique Catapult feature, domain TLD override.
[R] The domain TLD override can be used when you do not yet have control over the root domain, or are still in the pre-release stage with your client. Take a look at the resulting domain names reflected in the example and you'll get a better feel for what this provides.
[B] We take advantage of another of our open source Vagrant plugins, vagrant-hostmanager, to fully automate the local DNS hosts files. This allows for seamless integration with your local development workstation. Paired with the synced VirtualBox folder, this completes the local development experience. The result is a quick cycle time: make a change and see it reflected immediately.
DNS Management
Cloud DNS management
$(curl --silent --show-error --connect-timeout 5 --max-time 5 --write-out "HTTPSTATUS:%{http_code}" \
--request POST "https://api.cloudflare.com/client/v4/zones" \
--header "X-Auth-Email: $(catapult company.cloudflare_email)" \
--header "X-Auth-Key: $(catapult company.cloudflare_api_key)" \
--data "{\"name\":\"${domain_levels[-2]}.${domain_levels[-1]}\",\"jump_start\":false}")
[B] Meanwhile, upstream environments beyond Local Dev leverage the CloudFlare API to automate DNS management. The DNS zones, DNS records, and front-end caching are all handled by Catapult's automation.
[B] Along with DNS, CloudFlare affords you free SSL, DDoS protection, threat detection, and an always-on capability that serves your website's most popular pages from cache in the event of an outage.
[R] In this code sample, we leverage curl and the CloudFlare Zones API to create a new DNS root-level zone. Note how the API parameters are driven by Catapult configuration.
[R] Because of Catapult's distributed services approach, specific services can still be accessed individually to get fine-grained metrics, such as the CloudFlare screenshot we have here.
[R] In this screenshot, 30 days' worth of data from a live production website are reflected: the number of threats stopped, the share of traffic served over SSL, and the types of threats mitigated.
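Once a zone exists, records are driven the same way; here is a hedged sketch against the CloudFlare v4 DNS records endpoint, with the zone id, record name, and server IP as placeholders:
# create an A record for the test environment inside an existing zone (placeholder zone_id and server_ip)
curl --silent --show-error \
  --request POST "https://api.cloudflare.com/client/v4/zones/${zone_id}/dns_records" \
  --header "X-Auth-Email: $(catapult company.cloudflare_email)" \
  --header "X-Auth-Key: $(catapult company.cloudflare_api_key)" \
  --data "{\"type\":\"A\",\"name\":\"test.acmeco.com\",\"content\":\"${server_ip}\",\"proxied\":true}"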
Continuous Integration
Commits to environment branches trigger deployments
[B] Continuous Integration.
[B] Changesets are introduced into each environment by pull requests from each branch to the next.
[R] Deployments are holistic and complete. This means deployments include not only the website release, but also environment configuration, server software, DNS, and more.
[R] The first screenshot shows a Bitbucket pull request from 'develop' into 'release.' Because environments are branch-driven, this approach also manages who can promote changes to the next upstream environment.
[R] Once the pull request is merged, continuous integration is put into practice by running an automated deployment. Here we have a screenshot from Bamboo, which logs all build output, records build results, and provides detailed metrics on deployments.
[B] This process accommodates an abridged or full Git Flow workflow. If you are the sole developer, you could commit directly to the develop branch and enjoy the same automation without a pull request to develop.
[B] However, pull requests are absolutely necessary upstream as a quality control measure.
[R] Pull requests provide a visual representation of what would be introduced to the environment, an opportunity for online code review, and control over who has access to merge code.
[R] Also, any impending merge conflicts will be made clear as part of this workflow.
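As a small sketch of promotion from the command line (the same effect as merging the pull request in the web UI):
# promote the current state of Test (develop) to QC (release)
git checkout release
git pull origin release
git merge --no-ff develop
git push origin release    # this push triggers the automated QC deployment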
Monitoring & Insights
Managed server monitoring
sudo rpm --hash --upgrade --verbose https://download.newrelic.com/pub/newrelic/el5/i386/newrelic-repo-5-3.noarch.rpm
sudo yum install -y newrelic-sysmond
[R] Monitoring & Insights
[R] New Relic is integrated for Monitoring & Insights, with both free and paid subscriptions available. All of the following monitors reflect capabilities included in the free subscription.
[R] Automated integration of server monitoring begins by adding the New Relic yum repository, followed by installing the server monitor package, as reflected in this bash code snippet.
[B] As a result, server metrics are reported to New Relic, including CPU, memory, disk, and network resource details. Process profiling is included as well.
[B] Again, as services are approached from a distributed perspective, access to New Relic is available for a deeper dive.
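After the package install above, the daemon still needs the account's license key before it reports; this is a hedged sketch based on New Relic's documented commands of that era, with the license key variable as a placeholder:
# register the license key and start the server monitoring daemon (placeholder license key variable)
sudo nrsysmond-config --set license_key="${newrelic_license_key}"
sudo /etc/init.d/newrelic-sysmond start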
Monitoring & Insights
Managed application performance monitoring
<IfModule php5_module>
php_value newrelic.appname "${domain_environment};$(catapult company.name | tr '[:upper:]' '[:lower:]')-${1}-redhat"
</IfModule>
[B] Now that we're monitoring the system level, we move into the next layer of monitoring which is application performance.
[B] Catapult automatically installs the New Relic APM PHP package. The necessary Apache configuration is also written automatically and is reflected in this code snippet.
[R] An APM monitor is added for each website and is also grouped at the system level. The resulting capability tracks errors, requests, and the most time-consuming transactions, which are used to calculate your broader Apdex (Application Performance Index) score.
[R] Apdex is an established, open standard for measuring the performance of software applications. In the screenshot, the Apdex is the top right graph. Within this graph, the blue line is the application Apdex and the yellow line is the browser Apdex. We'll cover the browser Apdex next.
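For context, the agent install that Catapult automates ahead of that Apache configuration is roughly the following on RedHat; the package and installer names follow New Relic's documentation of the time, so treat this as a sketch:
# install the New Relic PHP agent from the same yum repository
sudo yum install -y newrelic-php5
# wire the agent into the system's PHP (an interactive installer; silent modes exist)
sudo newrelic-install install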
Monitoring & Insights
Managed browser performance monitoring
[B] The browser Apdex is also automatically integrated by Catapult. Browser performance monitoring relates directly to your users' experiences and is the next layer of Monitoring & Insights.
[B] Each New Relic APM instance automatically introduces a browser monitor, which injects JavaScript into the HTML HEAD element and gathers data similar to Google Analytics, including page views and demographics.
[R] Depicted here is the user experience broken down by web application load time, network timing, DOM processing, and page rendering speed.
[R] Tying back to individual user experiences, the browser Apdex provides real-world insight into whether users are frustrated, tolerating, or satisfied. The yellow line in the upper right graph is where this is reflected.
Monitoring & Insights
Managed synthetic monitoring
$(curl --silent --show-error --connect-timeout 10 --max-time 20 --write-out "HTTPSTATUS:%{http_code}" \
--request GET "https://synthetics.newrelic.com/synthetics/api/v1/monitors" \
--header "X-Api-Key: $(catapult company.newrelic_admin_api_key)")
[B] In the world of monitoring, availability is king.
[B] Peace of mind can be had by monitoring your websites from around the world.
[R] Discover how DNS lookups, SSL negotiations, and network I/O comprise your total response times.
[R] Catapult automatically integrates a select set of global synthetic monitors via the Synthetics API.
[R] We should probably stop for a quick shout out and thank you to New Relic for including us in their private beta program along with the likes of IBM and Rackspace.
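Creating one of those monitors looks roughly like the GET shown above turned into a POST; the payload fields follow New Relic's Synthetics API documentation and should be verified against your API version, with the monitored site as a placeholder:
# create a simple availability monitor checked every 10 minutes from a single location (placeholder site)
curl --silent --show-error \
  --request POST "https://synthetics.newrelic.com/synthetics/api/v1/monitors" \
  --header "X-Api-Key: $(catapult company.newrelic_admin_api_key)" \
  --header "Content-Type: application/json" \
  --data "{\"name\":\"acmeco.com\",\"type\":\"SIMPLE\",\"frequency\":10,\"uri\":\"https://acmeco.com\",\"locations\":[\"AWS_US_EAST_1\"],\"status\":\"ENABLED\"}"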
Catapult Dashboard
[R] We'd like to introduce the Catapult dashboard. The Catapult dashboard is meant to be a holistic insight into your entire infrastructure. Because Catapult uses a distributed service model, there are many APIs we can use to gather data and display it in an actionable way.
[R] The Catapult dashboard aligns with the main configuration structure used throughout, which is: company, environments, websites. As you can see, in this screenshot, at the top of the dashboard, we have the global status of your Catapult instance split up by these three main sections.
[R] This gives you a quick indication of your infrastructure's health. Let's first take a detailed look at the company section.
[B] At this highest level, two components are monitored: APIs and Locale. The monitoring of APIs reflects successful integration with each of the distributed services. A failure of service, or the use of incorrect credentials, for example, would be captured here. Locale has your company name, your verified company email address, and your company's timezone information.
Catapult Dashboard
[B] The next section in the Catapult dashboard covers environments. Here you will find Local Dev, Test, QC, and Production and several metrics for each. Let's focus in on the QC column.
The first metric is the version of Catapult running in each environment's branch.
The second metric is a snapshot of current requests per minute and page views per minute, with a color-coded icon indicating the real-time application and browser Apdex.
The third covers deployment metrics: the number of builds deployed, the average time to build, and how recently a build finished. The colored icon here represents the success or failure of the most recent build.
The fourth and fifth metrics represent virtual machine health: CPU, memory, and disk utilization. This icon is colored to represent the system Apdex. It's worth noting here that Local Dev is an accumulation of all of your Local Dev instances; the least favorable value for any of these tests is reported in the Local Dev section.
Catapult Dashboard
[R] The final section of the dashboard, websites, provides a listing of websites with useful information such as the Force Auth value for basic authentication, the type of software in use, and whether the workflow is configured as downstream or upstream. Additionally, the web root location is displayed.
[R] Real-time HTTP response status codes are provided for each environment, split up by HTTP and HTTPS responses and by whether or not the environment is configured with the Force Auth option.
[R] Wrapping up on the Catapult dashboard, we hope the value of having all of this information clearly displayed in one place is now apparent.
[R] Speaking of value, here's what happens when we get hyped up about a new project.
[B] I don't know about value, but here goes...
Video playing...
Immediately at the end of the video ctrl + tab to GitHub Catapult README.md
Scroll to Supported Software, then to Setup and scroll from there (need about 8 minutes, check time)
Thank You
Catapult is a complete website and workflow management platform built from leading and affordable technologies.
Our mission is to create a lean platform which orchestrates DevOps for website lifecycles with familiar technologies.
Our vision is to afford organizations reduced risk and improved performance while lowering barriers to entry.
Presentation: devopsgroup.io/conference-session
Get started: github.com/devopsgroup-io/catapult
[B] In closing, Catapult is a complete website and workflow management platform built from leading and affordable technologies.
[B] Our mission is to create a lean platform which orchestrates DevOps for website lifecycles with familiar technologies.
[R] Our vision is to afford organizations reduced risk and improved performance while lowering barriers to entry.
[R] Please note the links to this presentation as well as GitHub, which is the best place to get started with Catapult.
[R] For more information on Catapult and the plans offered, please visit us at devopsgroup.io. To thank you for attending, we're also offering a subscription promo code, which we can provide after the session closes.
[R] It is now time for Q&A. We'll take the first question please.