CI/CD pipeline for DataStructures101 gem

CI/CD, which in this post stands for Continuous Integration/Continuous Delivery, is a very hot topic in the software industry nowadays. There is ample literature on the web explaining the benefits it brings to organizations and software projects.

What is CI/CD?

In essence, CI/CD can be seen as two closely related activities working together towards a common goal: reducing the risks of the overall software development cycle. For a more extensive discussion, I recommend this article by Martin Fowler.

Continuous Integration

Continuous Integration, per the Wikipedia definition, is the practice of merging all developers' working copies into a shared mainline of the codebase. How often those copies are integrated varies widely with the context (in 2011, Amazon was integrating and shipping production code every 11.6 seconds), but the recommended minimum is once per day.

Continuous Delivery

The idea behind Continuous Delivery is that your software should be ready to deploy at any point of its life cycle. This requires extensive automation of every part of the delivery process. As a result, deployment risks are minimized, progress can be tracked better by using a production-like environment, and the quality and speed of feedback increase.

Tools for CI/CD

Since I began to work on DataStructures101, I have pushed myself to follow best practices for the software development life cycle. As I have iterated on the implementation, I have kept adding new tools to help improve the overall quality of the codebase.

In each of the next sections, I discuss the tools I am using and what their main benefits are. This is not a comprehensive list of the available tools; instead, it is what I have found suitable for my development workflow in Ruby.

Rubocop for static analysis

Rubocop is a static analyzer for Ruby projects. It follows the best practices established by the community and enforces them on your codebase.

Yard for automated docs generation

Yard is a documentation generation tool. It comes with multiple features that ease the documentation process, including templates, tag manipulation, live documentation, etc.
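To make the idea of tags concrete, here is a minimal sketch of a class documented with Yard tags. ExampleStack is a hypothetical class written only for this illustration, not code from the DataStructures101 gem; the @param, @return and @example tags are standard Yard tags.

```ruby
# frozen_string_literal: true

# A tiny stack used only to illustrate Yard tags.
# (Hypothetical class, not taken from the DataStructures101 gem.)
class ExampleStack
  # Creates an empty stack.
  def initialize
    @items = []
  end

  # Pushes a value onto the top of the stack.
  #
  # @param value [Object] the element to store
  # @return [ExampleStack] the stack itself, so calls can be chained
  def push(value)
    @items.push(value)
    self
  end

  # Removes and returns the element on top of the stack.
  #
  # @return [Object, nil] the removed element, or nil when the stack is empty
  # @example
  #   stack = ExampleStack.new
  #   stack.push(1).push(2)
  #   stack.pop #=> 2
  def pop
    @items.pop
  end

  # @return [Integer] the number of stored elements
  def size
    @items.size
  end
end
```

Running yard doc from the project root turns comments like these into HTML pages. By default the output goes to a doc folder; the -o option selects a different one, such as docs.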

CodeClimate for code quality and coverage

CodeClimate is a quality control tool that automates the process of checking the health of your codebase. It supports a multitude of metrics to validate the maintainability of the software and to fight the technical debt that accumulates over time. At the same time, CodeClimate doubles as a code coverage tool that can be plugged into our CI process.

Travis as CI/CD pipeline

Travis is a CI server that is free for open source projects, and it is the default choice when creating a new gem with Bundler on Linux. It supports multiple languages and different workflows suitable for different projects.

Workflow

So far, we have mentioned several tools that address different problems in the development of the software. But using them independently would miss the whole point: having a trusted process in place that is predictable and repeatable.

GitHub webhooks

Webhooks are the magic components that send notifications and make everything work together. Since we are using two external services, namely Travis and CodeClimate, we need to allow them to access the git repository and receive event notifications from GitHub. These settings can be changed in the Settings section of your GitHub repo, specifically in the Webhooks and Integrations & services tabs.

Travis config

This is where all the goodies come together. Using a .travis.yml file, Travis can learn and execute the build process for your CI/CD pipeline. The build process consists of three main steps:

install: install any dependencies required
script: run the build and/or test script
deploy: deploy to defined continuous deployment providers (optional)

The aforementioned steps are highly configurable and can include both before and after hooks as shown below.

sudo: false
language: ruby
rvm:
- 2.4.2
env:
  global:
  - CI=true
  - NOKOGIRI_USE_SYSTEM_LIBRARIES=true
  - CC_TEST_REPORTER_ID=TEST_REPORTER_TOKEN_VALUE
before_script:
- curl -L https://codeclimate.com/downloads/test-reporter/test-reporter-latest-linux-amd64 > ./cc-test-reporter
- chmod +x ./cc-test-reporter
- ./cc-test-reporter before-build
script:
- bundle exec rspec
after_script:
- ./cc-test-reporter after-build --exit-code $TRAVIS_TEST_RESULT
before_deploy:
- cat ./README.md >> ./docs/index.md
deploy:
- provider: pages
  skip_cleanup: true
  github_token: "$GITHUB_TOKEN"
  local_dir: "./docs"
  target_branch: gh-pages
  on:
    branch:
      - develop
      - master
- provider: rubygems
  api_key:
    secure: SECURE_KEY_VALUE
  gem: data_structures_101
  on:
    tags: true
    repo: renehernandez/data_structures_101
    branch: master

In the above configuration, notice that we don't specify the install step, so Travis uses the default one, which is bundle install --jobs=3 --retry=3. In the script step, Travis runs our spec suite, and the before_script and after_script hooks report the code coverage result to CodeClimate using their test reporter. The configuration has two possible deploy steps. First, it can deploy an update of the live documentation using GitHub Pages. This only happens if Travis is building either the master or develop branch, and it uses the docs folder as the root of the docs website. Second, if the commit is tagged, it proceeds to deploy a new gem version to RubyGems. The rubygems section also lists branch: master, but the Travis documentation states that tags: true causes the branch condition to be ignored, so in practice the release is driven by the tag.
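The deploy conditions described above can be sketched in plain Ruby. This is only a simplified model under stated assumptions, not how Travis is implemented: Build, DEPLOY_RULES, rule_matches? and deploy_targets are hypothetical names, the repo condition is left out, and the tag rule follows the Travis documentation, which says that tags: true makes the branch condition be ignored.

```ruby
# frozen_string_literal: true

# Hypothetical model of one build. Travis exposes similar data through
# the TRAVIS_BRANCH and TRAVIS_TAG environment variables.
Build = Struct.new(:branch, :tag)

# Simplified stand-ins for the two `on:` sections of the config above.
DEPLOY_RULES = {
  pages: { branches: %w[develop master] },
  rubygems: { tags: true, branches: %w[master] }
}.freeze

# Decides whether a single rule applies to the given build.
def rule_matches?(rule, build)
  # Assumption from the Travis docs: with `tags: true` only the tag matters.
  return !build.tag.nil? if rule[:tags]

  rule[:branches].include?(build.branch)
end

# Returns the names of the providers that would deploy for the build.
def deploy_targets(build)
  matching = DEPLOY_RULES.select { |_provider, rule| rule_matches?(rule, build) }
  matching.keys
end
```

For instance, deploy_targets(Build.new('develop', nil)) selects only the pages provider, while a tagged build such as Build.new('v1.0.0', 'v1.0.0') selects only rubygems, because on tag builds Travis sets the branch variable to the tag name.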

CodeClimate config

CodeClimate serves as the code coverage tool within the CI pipeline, reporting the coverage results of the specs. For the code analysis, CodeClimate uses a .codeclimate.yml configuration file, shown below, that specifies which types of analysis are enabled and which sections of the gem they apply to.

engines:
  rubocop:
    enabled: true
ratings:
  paths:
  - lib/**
  - "**.rb"
exclude_paths:
- "*.gemspec"
- "*.md"
- Rakefile
- Gemfile
- docs/**/*
- task/**/*
- coverage/**/*

The available analyses are called engines, and for our gem we have currently enabled the rubocop engine to perform static analysis of the Ruby code. If there is no rubocop configuration in the same folder, CodeClimate falls back to its default configuration, plus any tweaks present in the .codeclimate.yml file. Nevertheless, our gem has its own detailed configuration for the cops.

AllCops:
  Exclude:
    - bin/**/*
    - tasks/**/*
  TargetRubyVersion: 2.4
Metrics/MethodLength:
  Max: 30
  CountComments: false
Metrics/AbcSize:
  Exclude:
    - spec/**/*
    - perf/**/*
  Max: 20
Metrics/CyclomaticComplexity:
  Max: 10
Metrics/BlockLength:
  Exclude:
    - spec/**/*
    - perf/**/*
Metrics/LineLength:
  Exclude:
    - spec/**/*
    - perf/**/*
Style/FrozenStringLiteralComment:
  EnforcedStyle: always
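As an illustration of the last cop in this configuration, Style/FrozenStringLiteralComment with EnforcedStyle: always expects every Ruby file to start with the magic comment shown in the sketch below. Greeter is a hypothetical module written for this example, not code from the gem.

```ruby
# frozen_string_literal: true

# Hypothetical module, not part of the DataStructures101 gem.
module Greeter
  # With the magic comment on the first line, this literal is frozen.
  DEFAULT = 'hello'

  # Builds a new string instead of mutating the frozen literal in place.
  def self.shout(text = DEFAULT)
    "#{text.upcase}!"
  end
end
```

With the comment in place, an in-place mutation such as Greeter::DEFAULT << '!' raises a FrozenError, which is why shout builds a new string through interpolation.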

Conclusion

In this post, we have seen a relatively simple setup for a Ruby gem that includes running tests, reporting code coverage on those tests, and deploying to different providers when certain conditions are met. Although specific to a Ruby gem, it shows general ideas and practices that can be translated to different scenarios. This setup is likely to expand in the future with more analysis tools to validate the code.

Having a solid CI/CD process is always a work in progress. Project needs are very likely to change over time, and new requirements will appear. However, the gains and peace of mind we get from a well-tuned build and deploy process greatly outweigh all the work it requires.

Well, this is the end of this post. Thanks a lot for reading it all the way through. I hope you found something useful in this iteration of my CI/CD pipeline for the DataStructures101 gem. For me, it has been a rewarding experience that allowed me to apply best practices from a DevOps perspective.

See you soon and stay tuned for more!!