Over the past week, I have been improving our documentation experience at work. The issues we tackled recently ranged from implementing a search endpoint to scraping documentation from multiple sources, such as GitLab Wikis and mkdocs websites, among others. Being able to process documentation from different applications fits our goal of making it easy for developers to write their docs. We do this by meeting them where they already write documentation, instead of forcing them to follow specific patterns and guidelines on how to create it.
The first documentation endpoint that I processed was our GitLab Wikis. When analyzing the layout of the wikis, I realized that it was going to take us more time to properly limit the scraped content to the GitLab Wiki and avoid processing other types of content on the GitLab website (e.g., code, issues, MRs, etc.).
After exploring several alternatives, I settled on generating mkdocs websites from Wiki content, mainly for the following reasons:
- The Wiki’s content is written in Markdown, which makes mkdocs a very good match to generate websites with.
- We were already deploying documentation written in Markdown as mkdocs sites for several internal projects.
Wiki Releaser
To automate transforming a GitLab Wiki into a mkdocs website, I created a new GitLab project that hosts all the logic, from cloning the GitLab wiki repo to triggering a deployment of the generated mkdocs docker image in our Kubernetes infrastructure.
The repository contains only 3 files:
- mkdocs.yml: Configuration file used by mkdocs to produce the website.
- Dockerfile: Specification to generate the resulting docker image with the built mkdocs website.
- .gitlab-ci.yml: Pipeline configuration used by GitLab to build and deploy changes to the Wikis as websites.
mkdocs.yml config
Our mkdocs websites are built using the readthedocs theme, with several CSS customizations that are included through the override.css file. As part of the image build process (explained later), the <SITE_NAME> placeholder is replaced by the actual name of the site to be deployed.
site_name: <SITE_NAME>
extra_css:
  - "override.css"
theme:
  name: readthedocs
  collapse_navigation: false
  hljs_languages:
    - yaml
markdown_extensions:
  - admonition
  - fenced_code
  - tables
  - toc:
      slug: true
Dockerfile
The Dockerfile below is built as part of the CI process (explained in the next section), using the invocation:
docker build --build-arg WIKI_FOLDER="$PROJECT_NAME.wiki" --build-arg SITE_NAME="$SITE_NAME" -t <internal_docker_repo>/devops/wiki-releaser/$PROJECT_PATH:$TAG_REF .
which passes down the cloned wiki folder as the WIKI_FOLDER argument and the SITE_NAME variable as an argument for the image build process.
Important points in the Dockerfile definition:
- As mentioned in the mkdocs section above, one step is to replace the <SITE_NAME> placeholder with the SITE_NAME argument in the mkdocs.yml file.
- mkdocs expects an index.md file, which will be the root page of the website. GitLab Wikis have a Home.md (or home.md) page as the root instead, so the Dockerfile renames it during the build process.
FROM python:3.7.2 as build
RUN pip install mkdocs==1.1.1 && mkdir /site
WORKDIR /site
ARG WIKI_FOLDER
ARG SITE_NAME
COPY $WIKI_FOLDER /site/docs
COPY override.css /site/docs
COPY mkdocs.yml mkdocs.yml
# Replace <SITE_NAME> holder with SITE_NAME argument value
RUN sed -i "s/<SITE_NAME>/${SITE_NAME}/g" mkdocs.yml
RUN if [ -f ./docs/Home.md ]; then mv ./docs/Home.md ./docs/index.md; else mv ./docs/home.md ./docs/index.md; fi
RUN mkdocs build
FROM nginx:1.17.2-alpine
COPY --from=build /site/site /usr/share/nginx/html
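The two preparation steps in the build stage (placeholder substitution and home page rename) can also be sketched as a standalone shell function, which makes it easier to try them out locally. This is only a sketch under a few assumptions: prepare_site is a hypothetical helper that is not part of the Wiki Releaser repository, it uses | as the sed delimiter so that a site name containing / does not break the substitution, and it fails with an explicit message when the wiki has no home page.

```shell
#!/bin/sh
# Hypothetical helper (not part of the original repository).
# Usage: prepare_site <site_dir> <site_name>
# Expects <site_dir>/mkdocs.yml and <site_dir>/docs to exist.
prepare_site() {
    site_dir=$1
    site_name=$2

    # Escape backslash, the "|" delimiter and "&" so sed inserts the name literally.
    escaped_name=$(printf '%s' "$site_name" | sed 's/[\\|&]/\\&/g')

    # Replace the <SITE_NAME> placeholder in mkdocs.yml.
    sed "s|<SITE_NAME>|$escaped_name|g" "$site_dir/mkdocs.yml" > "$site_dir/mkdocs.yml.tmp" &&
        mv "$site_dir/mkdocs.yml.tmp" "$site_dir/mkdocs.yml" || return 1

    # mkdocs needs docs/index.md; GitLab Wikis use Home.md or home.md instead.
    if [ -f "$site_dir/docs/Home.md" ]; then
        mv "$site_dir/docs/Home.md" "$site_dir/docs/index.md"
    elif [ -f "$site_dir/docs/home.md" ]; then
        mv "$site_dir/docs/home.md" "$site_dir/docs/index.md"
    else
        echo "No Home.md or home.md found in $site_dir/docs" >&2
        return 1
    fi
}
```

For example, prepare_site /site "$SITE_NAME" would cover the same ground as the two RUN steps that precede mkdocs build.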
GitLab CI
As shown in the .gitlab-ci.yml pipeline configuration below, our CI/CD process for the Wiki Releaser project has the following requirements:
- It is executed only when a trigger is received.
- The trigger needs to provide the PROJECT_PATH, PROJECT_NAME, and SITE_NAME variables, otherwise the build job fails:
  - PROJECT_PATH: Refers to the GitLab path of the project associated with the wiki, in the format <group>/<project_name> (e.g. devs/hello_world).
  - PROJECT_NAME: Refers to the project name of the GitLab project.
  - SITE_NAME: Refers to the name that will be shown on the mkdocs site.
- It will perform a shallow clone of the wiki repo (latest commit only and no tags) for performance reasons.
- It will use the CI_JOB_TOKEN variable to authenticate against the wiki repo for the clone. This token provides read access to all projects that would normally be accessible to the user creating that job.
- It will generate docker images containing the resulting mkdocs website with the following naming pattern:
  - <internal_docker_repo>/devops/wiki-releaser/$PROJECT_PATH:$TAG_REF (TAG_REF holds the commit SHA value)
  - <internal_docker_repo>/devops/wiki-releaser/$PROJECT_PATH:latest
- We use helmfile to handle chart deployments to Kubernetes.
include:
  - project: 'cicd'
    file: 'ci/helm_pipeline.yaml'

variables: &variables
  HELMFILE: docs/helmfile.yaml
  DISABLE_STAGING: 'true'
  HELMFILE_ENV: wiki
  K8S_ENV: <kubernetes_env>
  OTHER_K8S_ENV: -e TAG_REF=latest -e PROJECT_NAME=$PROJECT_NAME -e PROJECT_PATH=$PROJECT_PATH # Variables we pass down to our helmfile deployment logic

.trigger: &trigger
  only:
    refs:
      - triggers
  after_script:
    - echo "Triggered from $PROJECT_PATH wiki using $PROJECT_NAME project and $SITE_NAME for site"

build:
  extends: .build
  <<: *trigger
  image: <internal_docker_repo>/devops/ci-images/docker-with-git:latest
  script:
    # Fail fast when a required trigger variable is missing (quoted so that empty values are detected)
    - if [ -z "$PROJECT_PATH" ]; then exit 1; fi
    - if [ -z "$PROJECT_NAME" ]; then exit 1; fi
    - if [ -z "$SITE_NAME" ]; then exit 1; fi
    # Shallow clone of the wiki repo, authenticated with the job token
    - git clone --depth=1 --no-tags https://gitlab-ci-token:${CI_JOB_TOKEN}@<gitlab_url>/$PROJECT_PATH.wiki.git
    - TAG_REF=$(git -C ./$PROJECT_NAME.wiki rev-parse HEAD)
    - docker build --build-arg WIKI_FOLDER="$PROJECT_NAME.wiki" --build-arg SITE_NAME="$SITE_NAME" -t <internal_docker_repo>/devops/wiki-releaser/$PROJECT_PATH:$TAG_REF .
    - docker push <internal_docker_repo>/devops/wiki-releaser/$PROJECT_PATH:$TAG_REF
    - docker tag <internal_docker_repo>/devops/wiki-releaser/$PROJECT_PATH:$TAG_REF <internal_docker_repo>/devops/wiki-releaser/$PROJECT_PATH:latest
    - docker push <internal_docker_repo>/devops/wiki-releaser/$PROJECT_PATH:latest

# Deployment to kubernetes
deploy:
  <<: *trigger
  environment:
    name: $PROJECT_PATH
    url: https://$PROJECT_NAME.docs.domain
  variables:
    <<: *variables
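The three variable checks at the start of the build script could be folded into a single helper that also reports which variable is missing. The following is a minimal sketch, assuming the CI image provides a POSIX shell; require_vars is a hypothetical name and is not part of the original pipeline.

```shell
#!/bin/sh
# Hypothetical helper (not part of the original pipeline).
# Returns non-zero when any of the named variables is unset or empty.
# Only call it with hard-coded variable names, since it relies on eval.
require_vars() {
    missing=0
    for name in "$@"; do
        # Indirect lookup of the variable whose name is stored in $name.
        eval "value=\${$name:-}"
        if [ -z "$value" ]; then
            echo "Missing required variable: $name" >&2
            missing=1
        fi
    done
    return "$missing"
}
```

In the build job this would be used as a single script line, for example: require_vars PROJECT_PATH PROJECT_NAME SITE_NAME || exit 1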
Triggers
As I mentioned in the previous section, the GitLab pipeline is only executed when it is invoked by an incoming trigger from another repo that wants to build its wiki. For this to happen, we need to set up some configuration options in each of the repos.
Release of a new Wiki version
We configure a Pipeline Trigger and use the generated token to allow access to the dependent projects that invoke this pipeline via the trigger.
Dependent projects
On projects that want to invoke Wiki Releaser to generate mkdocs websites for their wikis, we configure a webhook. This webhook is triggered only for Wiki Page events, and its URL invokes the Wiki Releaser pipeline, passing along the token specified in the trigger definition in the Wiki Releaser project plus all the required variables (i.e. PROJECT_NAME, PROJECT_PATH and SITE_NAME).
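To make the shape of such a webhook URL concrete, here is a small sketch that assembles one. It assumes the webhook form of the pipeline trigger endpoint described in the GitLab documentation (/api/v4/projects/<id>/ref/<ref>/trigger/pipeline with token and variables[...] query parameters). The host, project ID, ref and token in the usage example are placeholders, and urlencode and trigger_url are hypothetical helper names.

```shell
#!/bin/sh
# Percent-encode everything except unreserved URL characters (ASCII input assumed).
urlencode() {
    input=$1
    output=""
    while [ -n "$input" ]; do
        rest=${input#?}            # everything after the first character
        char=${input%"$rest"}      # the first character itself
        case $char in
            [a-zA-Z0-9._~-]) output="$output$char" ;;
            *) output="$output$(printf '%%%02X' "'$char")" ;;
        esac
        input=$rest
    done
    printf '%s' "$output"
}

# Hypothetical helper: prints the webhook URL for a dependent project.
# Arguments: GitLab base URL, Wiki Releaser project ID, ref, trigger token,
#            PROJECT_PATH, PROJECT_NAME, SITE_NAME.
trigger_url() {
    printf '%s/api/v4/projects/%s/ref/%s/trigger/pipeline?token=%s' "$1" "$2" "$3" "$4"
    printf '&variables[PROJECT_PATH]=%s' "$(urlencode "$5")"
    printf '&variables[PROJECT_NAME]=%s' "$(urlencode "$6")"
    printf '&variables[SITE_NAME]=%s\n' "$(urlencode "$7")"
}
```

Calling trigger_url https://gitlab.example.com 42 master TOKEN devs/hello_world hello_world "Hello World" prints a URL whose SITE_NAME value is encoded as Hello%20World and whose PROJECT_PATH value is encoded as devs%2Fhello_world. Encoding the values matters because site names often contain spaces.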
Conclusion
Let’s recap the content of the post. First, I talked a bit about the importance of documentation and how, in my company, we try to follow developers' practices instead of enforcing new ones. Then, I dove into the main problem, which was how to analyze and extract data from GitLab Wikis, and why I settled on using mkdocs to generate websites as a solution. Finally, I introduced the Wiki Releaser project, what its different components are and their purposes, and how the triggers tie everything together.
To conclude, thank you so much for reading this post. Hope you enjoyed reading it as much as I did writing it. See you soon and stay tuned for more!!