It’s common to have a repository dedicated to storing the `Dockerfile` definitions and associated files for the base Docker images used across different projects in an organization. At work, we have such a repo to store a variety of these images, e.g. `ruby`, `node` and `golang` images. Over time, the number of images has grown significantly and, as a result, so has the size of the GitLab CI configuration (`.gitlab-ci.yml`), with very similar job definitions (definitely not DRY).
Problem at a glance
To better illustrate the situation and its issues, let’s look at the structure of the repo and a simplified version of the `.gitlab-ci.yml`:
Repo structure
The repo was structured with multiple directories, each one potentially containing multiple subdirectories, with `Dockerfile`s potentially present at different directory levels.
...
├── docker_1
│ ├── Dockerfile
│ └── entrypoint.sh
├── docker_2
│ ├── current.Dockerfile
│ ├── next.Dockerfile
│ └── site
│ └── index.html
├── docker_3
│ ├── api
│ │ ├── default.conf
│ │ └── Dockerfile
│ └── site
│ ├── Dockerfile
│ └── index.html
...
GitLab CI
For each of the `Dockerfile`s, the pipeline defined an associated job that builds the image and pushes it to our internal container registry.
# .gitlab-ci.yml
.build: &build
  before_script:
    - docker login -u $DOCKERHUB_USERNAME -p $DOCKERHUB_PASSWORD
    - docker login -u gitlab-ci-token -p $CI_BUILD_TOKEN <registry_url>
  tags:
    - pool
  stage: build
  only:
    variables:
      - $CI_COMMIT_REF_NAME == $CI_DEFAULT_BRANCH

docker_1:
  <<: *build
  script:
    - docker build -t <registry_url>/docker_1:0.1.0 -f docker_1/Dockerfile docker_1/
    - docker push <registry_url>/docker_1:0.1.0

docker_2:
  <<: *build
  script:
    - docker build -t <registry_url>/docker_2:current -f docker_2/current.Dockerfile docker_2/
    - docker push <registry_url>/docker_2:current
    - docker build -t <registry_url>/docker_2:next -f docker_2/next.Dockerfile docker_2/
    - docker push <registry_url>/docker_2:next

docker_3_api:
  <<: *build
  script:
    - docker build -t <registry_url>/docker_3/api:0.0.1 -f docker_3/api/Dockerfile docker_3/api
    - docker push <registry_url>/docker_3/api:0.0.1

docker_3_site:
  <<: *build
  script:
    - docker build -t <registry_url>/docker_3/site:0.0.1 -f docker_3/site/Dockerfile docker_3/site
    - docker push <registry_url>/docker_3/site:0.0.1
...
Issues
Let’s discuss the main issues stemming from the current setup:
- All images are built on every default branch pipeline, even when they haven’t changed, consuming extra build resources unnecessarily, increasing the pipeline execution time, and potentially failing the pipeline for reasons unrelated to the changes.
- Images aren’t built on branches, so developers and DevOps engineers cannot get automated feedback about their proposed changes until they have merged them into the default branch.
- For the majority of the Docker image definitions, the corresponding `.gitlab-ci.yml` job definition follows the same structure:
  - Build the image
  - Push the image to the internal GitLab registry
Solution
After exploring some alternatives, I decided to leverage built-in GitLab capabilities to automate and simplify the procedure of adding or modifying the Docker images.
- Dynamic child pipelines: GitLab offers the capability of dynamically triggering child pipelines from a running pipeline, using a YAML file generated by a job and stored as an artifact. See the docs for more details.
- Container Registry API: Provides API access to query and modify the GitLab Container Registry, including finding image names, tags, etc. See the docs for more details.
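To make this concrete, these are the kinds of Registry API queries the generator script below relies on, sketched with curl; `<GITLAB_API_URL>`, `<repository_id>` and `<image_tag>` are placeholders:
# List the image repositories defined in a project
curl --header "PRIVATE-TOKEN: $CI_TOKEN" "<GITLAB_API_URL>/projects/$CI_PROJECT_ID/registry/repositories"
# Check whether a specific tag exists for a repository (a 404 means it hasn't been published yet)
curl --header "PRIVATE-TOKEN: $CI_TOKEN" "<GITLAB_API_URL>/projects/$CI_PROJECT_ID/registry/repositories/<repository_id>/tags/<image_tag>"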
Implementation
The implementation consists of several components:
- Child pipeline generator script: Accesses the Registry API to verify whether there is a new Docker image or tag defined in the repository that is not present in the Registry yet, and creates the child pipeline from the collected Docker image data.
- Child pipeline template: Used by the above script to define the child pipeline that will be run by GitLab. It contains the job definitions to run in the default branch, as well as in non-default ones.
- Dockerfile & Gemfile: Specify the dependencies and packaging needed to run the script as part of the repo pipeline.
- Repository pipeline: Uses the above components to build the Docker image that runs the script, and triggers the child pipeline.
If you want to run the code below as part of a GitLab repo, you will need to make the following changes:
- `<registry_url>`: replace with the URL of your registry (either internal or DockerHub).
- `<GITLAB_API_URL>`: replace with the URL of the GitLab API.
- `CI_TOKEN`: when I originally implemented this, I had to use a personal token to be able to interact with the Registry API. In more recent versions, the CI/CD job token may have enough access to fulfill these requirements.
- Define at least one `Dockerfile`.
- Define at least one `build.yml` at the same folder level as the above `Dockerfile`.
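For example, reusing `docker_1` from the repo structure above, a minimal setup would add a `build.yml` next to its `Dockerfile` (its contents are covered in the next section):
...
├── docker_1
│   ├── Dockerfile
│   ├── build.yml
│   └── entrypoint.sh
...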
Child pipeline generator script
The script is the main building block. It works as follows:
- Gets the list of all image repositories defined in the GitLab project.
- Parses all existing `build.yml` and `*.build.yml` files in the repository and prepares the metadata to generate the jobs in the child pipeline. A `build.yml` file provides the following information:
name: <image_name>
tag: <image_tag>
- Checks if any of the Docker images found in the metadata is not already present in the Registry (either the image name, if it is new, or the tag, if updating an existing image).
- Applies the data to the template to generate the final child pipeline artifact.
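As a hypothetical example, a `docker_1/build.yml` matching the first job of the original pipeline would contain:
# docker_1/build.yml
name: docker_1
tag: 0.1.0
A prefixed variant such as `next.build.yml` would instead pair with `next.Dockerfile` in the same directory, mirroring the `docker_2` layout.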
#! /usr/bin/env ruby
# frozen_string_literal: true

# generate_custom_images_pipeline.rb

require 'erb'
require 'yaml'
require 'ostruct'
require 'pathname'
require 'gitlab'

# Unique, job-friendly identifier for an image/tag pair
def sanitize_name(internal_name, tag)
  "#{internal_name}_#{tag}"
end

def full_internal_name(name)
  "<registry_url>/#{name}"
end

# Derive the Dockerfile name from the build file name:
# build.yml -> Dockerfile, <name>.build.yml -> <name>.Dockerfile
def prepare_dockerfile_name(build_filename)
  prefix = build_filename.to_s.delete_suffix('build.yml')
  dockerfile_name = 'Dockerfile'
  if prefix != '' # this means that we have a prefixed build.yml, e.g. <name>.build.yml
    # prefix includes the last dot
    dockerfile_name = "#{prefix}Dockerfile"
  end
  dockerfile_name
end

Gitlab.configure do |config|
  config.endpoint = '<GITLAB_API_URL>'
  config.private_token = ENV.fetch('CI_TOKEN')
end

REPO_DIR = Pathname(ENV.fetch('CI_PROJECT_DIR'))
BASE_DIR = Pathname(File.expand_path(File.dirname(__FILE__)))
CHILD_PIPELINE_TEMPLATE_PATH = BASE_DIR.join('pipeline_custom_images.yaml.erb')
CHILD_PIPELINE_OUTPUT_PATH = REPO_DIR.join('pipeline_custom_images.yaml')
PROJECT_ID = ENV.fetch('CI_PROJECT_ID')

puts 'Getting list of existing images'
repositories = Gitlab.registry_repositories(PROJECT_ID)
                     .auto_paginate
                     .to_h { |repo| [repo.name, repo] }

custom_images = []
puts 'Detect custom images with automated build process'
REPO_DIR.glob('**/*build.yml').each do |build_path|
  puts "Processing #{build_path}"
  dockerfile_name = prepare_dockerfile_name(build_path.basename)
  context_path = build_path.parent
  dockerfile_path = context_path.join(dockerfile_name)
  raise "Expected Dockerfile at #{dockerfile_path} not found" unless dockerfile_path.exist?

  build_config = YAML.safe_load(build_path.read)
  unless build_config.key?('name') && build_config.key?('tag')
    raise "Invalid build config found at #{context_path}. It needs both name and tag"
  end

  full_internal_name = full_internal_name(build_config['name'])
  tag = build_config['tag']
  custom_images << OpenStruct.new(
    internal_name: build_config['name'],
    full_internal_name: full_internal_name,
    tag: tag,
    sanitized_name: sanitize_name(build_config['name'], tag),
    dockerfile_path: dockerfile_path.relative_path_from(REPO_DIR).to_s,
    context: context_path.relative_path_from(REPO_DIR).to_s
  )
end

non_published_images = []
custom_images.each do |image|
  unless repositories.key?(image.internal_name)
    puts "#{image.internal_name} image | Not found in container registry. Adding tag"
    non_published_images << image
    next
  end

  puts "#{image.internal_name} image | Getting tags from container registry"
  repo = repositories[image.internal_name]
  begin
    Gitlab.registry_repository_tag(repo.project_id, repo.id, image.tag)
  rescue Gitlab::Error::NotFound
    puts "#{image.internal_name} image | Tag #{image.tag} not found. Adding tag"
    non_published_images << image
    next
  end
end

if non_published_images.empty?
  puts 'No new custom images or tags to build'
else
  puts "Custom images to build: #{non_published_images.map(&:sanitized_name)}"
end

puts "Loading pipeline template from #{CHILD_PIPELINE_TEMPLATE_PATH}"
template = ERB.new(CHILD_PIPELINE_TEMPLATE_PATH.read, trim_mode: '<>')

puts "Generate child pipeline at #{CHILD_PIPELINE_OUTPUT_PATH}"
# Evaluate the pipeline_custom_images.yaml.erb template
# with the processed data from the different build.yml
CHILD_PIPELINE_OUTPUT_PATH.open('w') do |f|
  f << template.result_with_hash(
    non_published_images: non_published_images
  )
end
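If you want to dry-run the script outside of CI, you can export the variables that GitLab would otherwise inject; the values below are hypothetical:
export CI_PROJECT_DIR="$(pwd)"
export CI_PROJECT_ID="12345"
export CI_TOKEN="<your_token>"
bundle exec ruby generate_custom_images_pipeline.rb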
Child pipeline template
The child pipeline declares the jobs to run for new images or for images that have changed. It has three types of jobs:
- `new_<%= image.sanitized_name %>_branch`: Templatized job to run when the child pipeline is triggered on non-default branches. It builds the Dockerfile, but doesn’t push the image to the registry. We use it to validate that the Dockerfile can be built successfully before merging to the default branch.
- `new_<%= image.sanitized_name %>_default`: Templatized job to run when the child pipeline is triggered on the default branch. It builds the Dockerfile, tags the image with the tag value extracted from the `build.yml`, and pushes it to the registry.
- `no_changes`: Fallback job to be used if there are no other jobs to execute in the pipeline (i.e. no `build.yml` was changed and no new Dockerfile definition was added). This is required because a valid pipeline needs at least one job to run.
# pipeline_custom_images.yaml.erb
stages:
  - build

.build_keys: &build_keys
  before_script:
    - docker login -u $DOCKERHUB_USERNAME -p $DOCKERHUB_PASSWORD
    - docker login -u gitlab-ci-token -p $CI_BUILD_TOKEN <registry_url>
  stage: build
  tags:
    - pool

.image_in_branch:
  extends: .build_keys
  only:
    refs:
      - branches
  except:
    variables:
      - $CI_COMMIT_REF_NAME == $CI_DEFAULT_BRANCH

.image_in_default:
  extends: .build_keys
  only:
    variables:
      - $CI_COMMIT_REF_NAME == $CI_DEFAULT_BRANCH

<% non_published_images.each do |image| %>
new_<%= image.sanitized_name %>_branch:
  extends: .image_in_branch
  script:
    - docker build -t <%= "#{image.full_internal_name}:#{image.tag}" %> -f <%= image.dockerfile_path %> <%= image.context %>
<% end %>

<% non_published_images.each do |image| %>
new_<%= image.sanitized_name %>_default:
  extends: .image_in_default
  script:
    - docker build -t <%= "#{image.full_internal_name}:#{image.tag}" %> -f <%= image.dockerfile_path %> <%= image.context %>
    - docker tag <%= "#{image.full_internal_name}:#{image.tag}" %> <%= image.full_internal_name %>:$CI_COMMIT_SHA
    - docker push <%= "#{image.full_internal_name}:#{image.tag}" %>
    - docker push <%= image.full_internal_name %>:$CI_COMMIT_SHA
<% end %>

<% if non_published_images.length == 0 %>
no_changes:
  stage: build
  tags:
    - pool
  script:
    - echo "No new custom images or tags are being added/changed"
<% end %>
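For illustration, if the hypothetical `docker_1/build.yml` from earlier were the only pending image, the rendered `pipeline_custom_images.yaml` would contain jobs along these lines (the hidden base jobs are omitted here):
new_docker_1_0.1.0_branch:
  extends: .image_in_branch
  script:
    - docker build -t <registry_url>/docker_1:0.1.0 -f docker_1/Dockerfile docker_1

new_docker_1_0.1.0_default:
  extends: .image_in_default
  script:
    - docker build -t <registry_url>/docker_1:0.1.0 -f docker_1/Dockerfile docker_1
    - docker tag <registry_url>/docker_1:0.1.0 <registry_url>/docker_1:$CI_COMMIT_SHA
    - docker push <registry_url>/docker_1:0.1.0
    - docker push <registry_url>/docker_1:$CI_COMMIT_SHA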
Dockerfile & Gemfile
This 2 files define how the software is going to be packaged (Dockerfile
) and its dependencies (Gemfile
). For this automation, the only library used is the gitlab gem.
# child_images_pipeline/Dockerfile
FROM <registry_url>/ruby:2.7.2-alpine
WORKDIR /pipelines
COPY . .
RUN bundle check || bundle install
# frozen_string_literal: true
source 'https://rubygems.org'
# BUNDLE_GEMFILE=Gemfile bundle install
gem 'gitlab'
Repository pipeline
Finally, the repo pipeline ties everything together as follows:
- `build_child_pipelines_image`: Builds the Docker image from the `Dockerfile` under the `child_images_pipeline` folder and tags it as `latest` (only if there are changes; otherwise the job is skipped).
- `ci_yaml_for_custom_images`: Builds the `gitlab-ci.yml` definition for the dynamic child pipeline by running the script against the `.erb` template. The generated pipeline file (`pipeline_custom_images.yaml`) is then saved as an artifact to be consumed by the subsequent job.
- `trigger_pipeline_for_custom_images`: Triggers the child pipeline using `pipeline_custom_images.yaml` as its definition.
# .gitlab-ci.yml
stages:
  - build_pipeline_image
  - build_child_pipelines
  - build_downstream

build_child_pipelines_image:
  before_script:
    - docker login -u $DOCKERHUB_USERNAME -p $DOCKERHUB_PASSWORD
    - docker login -u gitlab-ci-token -p $CI_BUILD_TOKEN <registry_url>
  tags:
    - pool
  stage: build_pipeline_image
  only:
    variables:
      - $CI_COMMIT_REF_NAME == $CI_DEFAULT_BRANCH
    changes:
      - child_images_pipeline/*
  script:
    - docker build -t <registry_url>/child-images-pipeline:latest -f child_images_pipeline/Dockerfile child_images_pipeline/
    - docker push <registry_url>/child-images-pipeline:latest

ci_yaml_for_custom_images:
  tags:
    - pool
  stage: build_child_pipelines
  artifacts:
    paths:
      - pipeline_custom_images.yaml
  script:
    - cd /pipelines && bundle exec ruby ./generate_custom_images_pipeline.rb
  image: <registry_url>/child-images-pipeline:latest

trigger_pipeline_for_custom_images:
  stage: build_downstream
  needs:
    - ci_yaml_for_custom_images
  trigger:
    include:
      - artifact: pipeline_custom_images.yaml
        job: ci_yaml_for_custom_images
    strategy: depend
Next steps
With this solution in place, let’s briefly mention some areas where it could be improved in the future:
- Automatic cleanup of images from the registry if the directory or the `Dockerfile` is removed from the repository. This would require instrumentation to evaluate the usage of the images in the registry; a sketch of the relevant API call is shown below.
- Break down the Ruby logic in the script and add tests to verify behavior and prevent regressions. This would require changes to the repo pipeline to include at least a new testing job.
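As a starting point for the cleanup idea, the Container Registry API also exposes deletion endpoints; a hypothetical call, using the same placeholders as before, could look like:
# Delete an image repository that no longer has a matching definition in the repo
curl --request DELETE --header "PRIVATE-TOKEN: $CI_TOKEN" "<GITLAB_API_URL>/projects/$CI_PROJECT_ID/registry/repositories/<repository_id>"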