Automating GitLab pipelines for base docker images

It’s common to have a repository dedicated to storing the Dockerfile definitions and associated files for the base Docker images used across different projects in an organization. At work, we have such a repo to store a variety of these images, e.g. Ruby, Node, and Golang images. Over time, the number of images has grown significantly and, as a result, so has the size of the GitLab CI configuration (.gitlab-ci.yml), filled with very similar job definitions (definitely not DRY).

Problem at a glance

Let’s look at the structure of the repo and a simplified version of the .gitlab-ci.yml to better illustrate the situation and its issues:

Repo structure

The repo was structured with multiple directories, each one potentially containing multiple subdirectories, with Dockerfiles at different directory levels.

├── docker_1
│   ├── Dockerfile
│   └──
├── docker_2
│   ├── current.Dockerfile
│   ├── next.Dockerfile
│   └── site
│       └── index.html
├── docker_3
│   ├── api
│   │   ├── default.conf
│   │   └── Dockerfile
│   └── site
│       ├── Dockerfile
│       └── index.html

GitLab CI

For each of the Dockerfiles, the pipeline defined an associated job that builds the image and pushes it to our internal container registry.

# .gitlab-ci.yml

stages:
  - build

.build: &build
  stage: build
  tags:
    - pool
  before_script:
    - docker login -u gitlab-ci-token -p $CI_BUILD_TOKEN <registry_url>

docker_1:
  <<: *build
  script:
    - docker build -t <registry_url>/docker_1:0.1.0 -f docker_1/Dockerfile docker_1/
    - docker push <registry_url>/docker_1:0.1.0

docker_2:
  <<: *build
  script:
    - docker build -t <registry_url>/docker_2:current -f docker_2/current.Dockerfile docker_2/
    - docker push <registry_url>/docker_2:current
    - docker build -t <registry_url>/docker_2:next -f docker_2/next.Dockerfile docker_2/
    - docker push <registry_url>/docker_2:next

docker_3_api:
  <<: *build
  script:
    - docker build -t <registry_url>/docker_3/api:0.0.1 -f docker_3/api/Dockerfile docker_3/api
    - docker push <registry_url>/docker_3/api:0.0.1

docker_3_site:
  <<: *build
  script:
    - docker build -t <registry_url>/docker_3/site:0.0.1 -f docker_3/site/Dockerfile docker_3/site
    - docker push <registry_url>/docker_3/site:0.0.1


Let’s discuss the main issues stemming from the current setup:

  • All images are built on every default branch pipeline run, even when they haven’t changed. This consumes build resources unnecessarily, increases pipeline execution time, and can fail the pipeline for reasons unrelated to the changes.

  • Images aren’t built on branches, so developers and DevOps engineers cannot get automated feedback about their proposed changes until they have merged them into the default branch.

  • For the majority of the Docker image definitions, the corresponding .gitlab-ci.yml job follows the same structure:

    1. Build image
    2. Push image to internal GitLab registry


After exploring some alternatives, I decided to leverage built-in GitLab capabilities to automate and simplify the procedure of adding and modifying the Docker images.

  • Dynamic child pipelines: GitLab offers the capability of dynamically triggering child pipelines from a running pipeline, using a YAML file generated by a job and stored as an artifact. See the docs for more details.
  • Container Registry API: Provides API access to query and modify the GitLab Container Registry, including finding image names, tags, etc. See the docs for more details.


The implementation is formed by several components:

  • Child pipeline generator script: Accesses the Registry API to verify whether there is a new Docker image or tag defined in the repository that is not present in the Registry, and creates the child pipeline from the collected image data.
  • Child pipeline template: Used by the above script to define the child pipeline that will be run by GitLab. It contains the job definitions to run in the default branch, as well as in non-default ones.
  • Dockerfile & Gemfile: Specify the dependencies and packaging needed to build the script image used as part of the repo pipeline.
  • Repository pipeline: Uses the above components to build the Docker image that runs the script and to trigger the child pipeline.

If you want to run the code below as part of a GitLab repo, you will need to make the following changes:

  • <registry_url>: replace it with the URL of your registry (either internal or DockerHub)
  • <GITLAB_API_URL>: replace it with your GitLab API URL
  • CI_TOKEN: When I originally implemented this, I had to use a personal token to be able to interact with the Registry API. In more recent versions, the CI/CD job token may have enough access to fulfill these requirements.
  • Define at least one Dockerfile.
  • Define at least one build.yml at the same folder level as the above Dockerfile.
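For reference, a build.yml is a small YAML file pairing an image name with a tag. A hypothetical example sitting next to a Dockerfile could look like this (the name and tag values below are placeholders to adapt to your image):

```yaml
# docker_2/build.yml — hypothetical example; adjust name and tag to your image
name: docker_2
tag: 0.1.0
```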

Child pipeline generator script

The script is the main building block. It works as follows:

  1. Gets the list of all image repositories defined in the GitLab project.
  2. Parses all existing build.yml and *.build.yml files in the repository and prepares the metadata to generate the jobs in the child pipeline. A build.yml file provides the following information:

     name: <image_name>
     tag: <image_tag>

  3. Checks whether any of the Docker images found in the metadata are not already present in the Registry (either the image name if it is new, or the tag if updating an existing image).
  4. Applies the data to the template to generate the final child pipeline artifact.

#! /usr/bin/env ruby
# frozen_string_literal: true
# generate_custom_images_pipeline.rb

require 'erb'
require 'yaml'
require 'ostruct'
require 'pathname'
require 'gitlab'

# Build a job-name-safe identifier from the image name and tag
# (body not shown in the original; any job-name-safe transformation works).
def sanitize_name(internal_name, tag)
  "#{internal_name}_#{tag}".gsub(/[^a-zA-Z0-9]/, '_')
end

# Full image name in the internal registry
# (body not shown in the original; adjust to your registry layout).
def full_internal_name(name)
  "<registry_url>/#{name}"
end

def prepare_dockerfile_name(build_filename)
  prefix = build_filename.to_s.delete_suffix('build.yml')
  dockerfile_name = 'Dockerfile'
  if prefix != "" # this means that we have a prefixed build.yml, e.g. <name>.build.yml
    # prefix includes the last dot
    dockerfile_name = "#{prefix}Dockerfile"
  end
  dockerfile_name
end

Gitlab.configure do |config|
  config.endpoint       = '<GITLAB_API_URL>'
  config.private_token  = ENV.fetch('CI_TOKEN')
end

PROJECT_ID = ENV.fetch('CI_PROJECT_ID')
REPO_DIR = Pathname(ENV.fetch('CI_PROJECT_DIR'))
BASE_DIR = Pathname(File.expand_path(File.dirname(__FILE__)))
CHILD_PIPELINE_TEMPLATE_PATH = BASE_DIR.join('pipeline_custom_images.yaml.erb')
CHILD_PIPELINE_OUTPUT_PATH = REPO_DIR.join('pipeline_custom_images.yaml')

puts "Getting list of existing images"
repositories = Gitlab.registry_repositories(PROJECT_ID)
  .to_h { |repo| [repo.name, repo] }

custom_images = []

puts "Detect custom images with automated build process"
REPO_DIR.glob('**/*build.yml').each do |build_path|
  puts "Processing #{build_path}"
  dockerfile_name = prepare_dockerfile_name(build_path.basename)
  context_path = build_path.parent
  dockerfile_path = context_path.join(dockerfile_name)

  raise "Expected Dockerfile at #{dockerfile_path} not found" unless dockerfile_path.exist?

  build_config = YAML.safe_load(build_path.read)

  unless build_config.key?('name') && build_config.key?('tag')
    raise "Invalid build config found at #{context_path}. It needs both name and tag"
  end

  full_internal_name = full_internal_name(build_config['name'])
  tag = build_config['tag']
  custom_images << OpenStruct.new(
    internal_name: build_config['name'],
    full_internal_name: full_internal_name,
    tag: tag,
    sanitized_name: sanitize_name(build_config['name'], tag),
    dockerfile_path: dockerfile_path.relative_path_from(REPO_DIR).to_s,
    context: context_path.relative_path_from(REPO_DIR).to_s
  )
end

non_published_images = []

custom_images.each do |image|
  unless repositories.key?(image.internal_name)
    puts "#{image.internal_name} image | Not found in container registry. Adding tag"
    non_published_images << image
    next
  end

  puts "#{image.internal_name} image | Getting tags from container registry"
  repo = repositories[image.internal_name]

  begin
    Gitlab.registry_repository_tag(repo.project_id, repo.id, image.tag)
  rescue Gitlab::Error::NotFound
    puts "#{image.internal_name} image | Tag #{image.tag} not found. Adding tag"
    non_published_images << image
  end
end

if non_published_images.empty?
  puts "No new custom images or tags to build"
else
  puts "Custom images to build: #{non_published_images.map(&:internal_name).join(', ')}"
end

puts "Loading pipeline template from #{CHILD_PIPELINE_TEMPLATE_PATH}"
template = ERB.new(CHILD_PIPELINE_TEMPLATE_PATH.read, trim_mode: '<>')

puts "Generate child pipeline at #{CHILD_PIPELINE_OUTPUT_PATH}"
# Evaluate the custom_images_pipeline.yaml.erb template
# with the processed data from the different build.yml
CHILD_PIPELINE_OUTPUT_PATH.open("w") do |f|
  f << template.result_with_hash(
    non_published_images: non_published_images
  )
end
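To illustrate the naming convention handled by prepare_dockerfile_name, here is a standalone sketch (duplicating just that helper) showing how a plain build.yml maps to Dockerfile while a prefixed <name>.build.yml maps to <name>.Dockerfile:

```ruby
require 'pathname'

# Standalone copy of the helper: map a build file name to its Dockerfile name.
def prepare_dockerfile_name(build_filename)
  prefix = build_filename.to_s.delete_suffix('build.yml')
  # An empty prefix means a plain build.yml; otherwise the prefix keeps its trailing dot.
  prefix.empty? ? 'Dockerfile' : "#{prefix}Dockerfile"
end

puts prepare_dockerfile_name(Pathname('build.yml'))      # plain build file
puts prepare_dockerfile_name(Pathname('next.build.yml')) # prefixed build file
```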
Child pipeline template

The child pipeline declares the jobs to run for new images or images that have changed. It has three types of jobs:

  • new_<%= image.sanitized_name %>_branch: Templatized job to run when the child pipeline is triggered in non-default branches. It builds the Dockerfile but doesn’t push the image to the registry. We use it to validate that the Dockerfile can be built successfully before merging to the default branch.
  • new_<%= image.sanitized_name %>_default: Templatized job to run when the child pipeline is triggered in the default branch. It builds the Dockerfile, tags the image with the tag value extracted from the build.yml, and pushes it to the registry.
  • no_changes: Fallback job to be used if there are no other jobs to execute in the pipeline (no build.yml was changed and no new Dockerfile definition was added). This is required because a valid pipeline needs at least one job to run.
# .template-gitlab-ci.yml.erb

stages:
  - build

.build_keys: &build_keys
  stage: build
  tags:
    - pool
  before_script:
    - docker login -u gitlab-ci-token -p $CI_BUILD_TOKEN <registry_url>

.image_in_branch:
  extends: .build_keys
  only:
    - branches
  except:
    - master # adjust to your default branch

.image_in_default:
  extends: .build_keys
  only:
    - master # adjust to your default branch

<% non_published_images.each do |image| %>
new_<%= image.sanitized_name %>_branch:
  extends: .image_in_branch
  script:
    - docker build -t <%= "#{image.full_internal_name}:#{image.tag}" %> -f <%= image.dockerfile_path %> <%= image.context %>
<% end %>

<% non_published_images.each do |image| %>
new_<%= image.sanitized_name %>_default:
  extends: .image_in_default
  script:
    - docker build -t <%= "#{image.full_internal_name}:#{image.tag}" %> -f <%= image.dockerfile_path %> <%= image.context %>
    - docker tag <%= "#{image.full_internal_name}:#{image.tag}" %> <%= image.full_internal_name %>:$CI_COMMIT_SHA
    - docker push <%= "#{image.full_internal_name}:#{image.tag}" %>
    - docker push <%= image.full_internal_name %>:$CI_COMMIT_SHA
<% end %>

<% if non_published_images.length == 0 %>
no_changes:
  stage: build
  tags:
    - pool
  script:
    - echo "No new custom images or tags are being added/changed"
<% end %>
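To see how the templated jobs expand, here is a minimal sketch that renders a reduced version of the branch-job portion of the template for a single made-up image (ruby_2_7_alpine, the <registry_url> placeholder, and the paths are all hypothetical values):

```ruby
require 'erb'
require 'ostruct'

# Reduced version of the branch-job portion of the template.
template = ERB.new(<<~TEMPLATE, trim_mode: '<>')
  <% non_published_images.each do |image| %>
  new_<%= image.sanitized_name %>_branch:
    extends: .image_in_branch
    script:
      - docker build -t <%= image.full_internal_name %>:<%= image.tag %> -f <%= image.dockerfile_path %> <%= image.context %>
  <% end %>
TEMPLATE

# Hypothetical image metadata, shaped like the entries built by the generator script.
image = OpenStruct.new(
  sanitized_name: 'ruby_2_7_alpine',
  full_internal_name: '<registry_url>/ruby',
  tag: '2.7.2-alpine',
  dockerfile_path: 'ruby/Dockerfile',
  context: 'ruby'
)

# Renders one concrete new_ruby_2_7_alpine_branch job definition.
puts template.result_with_hash(non_published_images: [image])
```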

Dockerfile & Gemfile

These two files define how the software is going to be packaged (Dockerfile) and its dependencies (Gemfile). For this automation, the only library used is the gitlab gem.

# Dockerfile
FROM <registry_url>/ruby:2.7.2-alpine

WORKDIR /pipelines

COPY . .

RUN bundle check || bundle install

# Gemfile
# frozen_string_literal: true
source 'https://rubygems.org'

# BUNDLE_GEMFILE=Gemfile bundle install
gem 'gitlab'

Repository pipeline

Finally, the repo pipeline ties everything together as follows:

  1. build_child_pipelines_image: Builds the Docker image from the Dockerfile under the child_images_pipeline folder and tags it as latest (only if there are changes; otherwise the job is skipped).
  2. ci_yaml_for_custom_images: Builds the .gitlab-ci.yml definition for the dynamic child pipeline by running the script and leveraging the pipeline .erb template. The resulting pipeline file (pipeline_custom_images.yaml) is then saved as an artifact to be consumed by the subsequent job.
  3. trigger_pipeline_for_custom_images: Triggers the child pipeline using pipeline_custom_images.yaml as its definition.
# .gitlab-ci.yml

stages:
  - build_pipeline_image
  - build_child_pipelines
  - build_downstream

build_child_pipelines_image:
  stage: build_pipeline_image
  tags:
    - pool
  only:
    changes:
      - pipelines/*
  before_script:
    - docker login -u gitlab-ci-token -p $CI_BUILD_TOKEN <registry_url>
  script:
    - docker build -t <registry_url>/child-images-pipeline:latest -f child_images_pipeline/Dockerfile child_images_pipeline/
    - docker push <registry_url>/child-images-pipeline:latest

ci_yaml_for_custom_images:
  stage: build_child_pipelines
  tags:
    - pool
  image: <registry_url>/child-images-pipeline:latest
  script:
    - cd /pipelines && bundle exec ruby ./generate_custom_images_pipeline.rb
  artifacts:
    paths:
      - pipeline_custom_images.yaml

trigger_pipeline_for_custom_images:
  stage: build_downstream
  needs:
    - ci_yaml_for_custom_images
  trigger:
    include:
      - artifact: pipeline_custom_images.yaml
        job: ci_yaml_for_custom_images
    strategy: depend

Next steps

With this solution in place, let’s briefly mention some areas where it could be improved in the future:

  • Automatic cleanup of images from the registry when a directory or Dockerfile is removed from the repository. This would require instrumentation to evaluate the usage of the image from the registry.
  • Break down the Ruby logic in the script and add tests to verify behavior and prevent regressions. This would need changes in the repo pipeline to include at least a new testing job.
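As a rough sketch of the first idea: once the script has both the registry repository names and the image names collected from the build.yml files, the cleanup candidates are simply the set difference (the names below are made up):

```ruby
# Hypothetical data: names returned by the Registry API vs. names found in build.yml files.
registry_names = ['docker_1', 'docker_2', 'old_image']
repo_names     = ['docker_1', 'docker_2']

# Images present in the registry but no longer defined in the repository
# are candidates for cleanup (after checking they are not still in use).
stale_candidates = registry_names - repo_names
puts stale_candidates.inspect
```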

Disclaimer: The opinions expressed herein are my own personal opinions and do not represent my employer’s view in any way.