YAML Files – Linting and Formatting

Keeping the YAML files in your project tidy is worthwhile. It not only improves readability but also helps avoid misunderstandings and bugs.

Luckily, there are many tools available that help with “linting” and formatting YAML files, and I will introduce two of them in this blog post. I cover the issue of handling long strings thoroughly in another blog post titled: YAML Files – Formatting Long Strings.

Linting

yamllint is an open-source command-line tool written in Python by Adrien Vergé and hosted on GitHub at: https://github.com/adrienverge/yamllint/.

You can install it using pip with:

pip install --user yamllint

Once installed, the tool can be invoked directly from the command line with:

yamllint some_file.yaml

The tool checks the validity of the syntax. It also flags repeated use of the same key, as well as cosmetic issues that do not break parsing, such as line length, trailing spaces, indentation, and spacing before comments.

The output would look like the following example depending on the content of the file:

yamllint some_file.yaml
./some_file.yaml
  24:81     error    line too long (152 > 80 characters)  (line-length)
  26:81     error    line too long (156 > 80 characters)  (line-length)
  39:30     warning  truthy value should be one of [false, true]  (truthy)
  52:81     error    line too long (144 > 80 characters)  (line-length) 

The tool is well maintained and well documented, and the docs can be found under https://yamllint.readthedocs.io/en/stable/.

To use yamllint consistently, it is recommended to add a git hook that invokes yamllint before committing the code to the git repository.

One way to do that is to integrate yamllint with pre-commit, a tool that magically adds git pre-commit hooks to any project. pre-commit (https://pre-commit.com/) is a wonderful tool that I encourage all developers to use to ensure that simple mistakes and bugs are not introduced accidentally into the code. It can also be used to enforce code formatting styles. Integrating yamllint with pre-commit is detailed here: https://yamllint.readthedocs.io/en/stable/integration.html.

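If you are into automation like me, you will want to know that yamllint exits with 0 when no errors have been found and with a non-zero code otherwise. By default, warnings alone do not change the exit code; pass --strict to make yamllint exit with a non-zero code on warnings as well.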
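The exit code is easy to act on from a script. The following is a minimal sketch using only the standard subprocess module; the helper name passes_lint is made up for this post, and the default command assumes that the yamllint executable is on the PATH.

```python
import subprocess


def passes_lint(path, command=("yamllint", "--strict")):
    """Return True when the lint command exits with code 0 for `path`.

    `passes_lint` is a hypothetical helper name. The default command
    assumes the yamllint executable is installed and on the PATH;
    --strict makes warnings count as failures too.
    """
    completed = subprocess.run(
        [*command, path],
        capture_output=True,
        text=True,
        check=False,  # inspect the return code ourselves instead of raising
    )
    if completed.returncode != 0:
        # Show whatever the tool reported before signalling the failure
        print(completed.stdout, end="")
    return completed.returncode == 0
```

A build script could then call passes_lint("some_file.yaml") and stop as soon as it returns False.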

It is also worth mentioning that yamllint can be imported as a Python package and used programmatically to lint YAML files. In fact, I use this technique in my test suite to ensure that the YAML files in a project pass the linting checks.
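As a rough illustration, the sketch below uses the two entry points described in the yamllint documentation, yamllint.linter.run and yamllint.config.YamlLintConfig. The helper name lint_yaml_text, the test name, and the sample YAML are made up for this example, and it assumes that yamllint is installed in the environment that runs the tests.

```python
def lint_yaml_text(text, config_text="extends: default"):
    """Lint a YAML string and return the problems yamllint reports.

    Hypothetical helper for illustration. yamllint is imported inside the
    function so this module still loads where yamllint is not installed.
    """
    from yamllint import linter
    from yamllint.config import YamlLintConfig

    config = YamlLintConfig(config_text)
    return [
        (problem.line, problem.column, problem.level, problem.rule, problem.desc)
        for problem in linter.run(text, config)
    ]


def test_sample_is_clean():
    # In a real test suite, read each YAML file of the project instead
    sample = "---\nname: example\nenabled: true\n"
    assert lint_yaml_text(sample) == []
```

Run under a test runner such as pytest, the test fails as soon as yamllint reports a problem for the sample.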

Formatting

The output of yamllint can be used to direct manual formatting. Let’s take the following example:

yamllint some_file.yaml
./some_file.yaml
  1:1       error    too many blank lines (1 > 0)  (empty-lines)
  2:1       warning  missing document start "---"  (document-start)
  26:14     warning  missing starting space in comment  (comments)
  26:13     warning  missing starting space in comment  (comments)
  28:14     warning  comment not indented like content  (comments-indentation)
  277:81    error    line too long (90 > 80 characters)  (line-length)
  299:27    error    no new line character at the end of file  (new-line-at-end-of-file)

The first two issues indicate that some_file.yaml does not start with “---” but with an empty line. To fix them, we can replace the empty line with “---” (without the quotes). The next warnings concern comments: “missing starting space in comment” means that the comment token # and the comment text after it are not separated by a space. To fix that, we can manually insert the space. Fixing the indentation of comments so that it matches the indentation of the content resolves the next comment-related warning, at 28:14.

The error at 277:81 states that the line is too long for the rules configured for yamllint, where the limit is 80 characters. This issue is not trivial to fix, which is why a later section of this post is dedicated to it. The general idea, however, is to break the long string over several lines without introducing newline characters where they are not needed.

The last error can be fixed by adding a new line at the very end of the document.
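The purely mechanical fixes described above (the leading blank line, the missing document start, and the missing final newline) could also be scripted. The following is only a sketch with a made-up helper name, apply_trivial_fixes. It assumes a single-document file without YAML directives, and it deliberately leaves comments and long lines alone.

```python
def apply_trivial_fixes(text):
    """Apply three of the fixes described above to YAML text.

    Hypothetical helper: it strips leading blank lines, adds the "---"
    document start marker when missing, and ends the text with exactly
    one newline. Comments and long lines are left untouched.
    """
    lines = text.splitlines()
    # Drop blank lines at the top of the file (empty-lines)
    while lines and not lines[0].strip():
        lines.pop(0)
    # Add the document start marker if the file lacks one (document-start)
    if not lines or not lines[0].startswith("---"):
        lines.insert(0, "---")
    # Drop blank lines at the end, then terminate with a single newline
    # (new-line-at-end-of-file)
    while lines and not lines[-1].strip():
        lines.pop()
    return "\n".join(lines) + "\n"
```

For instance, the text "\nkey: value" would come back as "---\nkey: value\n".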

Manual formatting works, but it would be nicer if yamllint fixed these issues as it detects them, like black or other similar tools do. Sadly, this is not planned for yamllint.

Luckily, someone has already thought of this issue and written a hook that integrates with pre-commit and formats broken YAML files. The open-source pre-commit hook is called yamlfmt and can be found here: https://github.com/jumanjihouse/pre-commit-hook-yamlfmt. To use the hook, add it to the .pre-commit-config.yaml file in combination with yamllint, as explained in the documentation:

---
# .pre-commit-config.yaml
repos:
  - repo: https://github.com/adrienverge/yamllint.git
    rev: v1.27.1  # or higher tag
    hooks:
      - id: yamllint
        args: [--format, parsable, --strict]

  - repo: https://github.com/jumanjihouse/pre-commit-hook-yamlfmt
    rev: 0.2.2  # or other specific tag
    hooks:
      - id: yamlfmt

The auto-formatting hook can fix most of the issues highlighted by yamllint, but not all of them. Notably, the long-string errors cannot be resolved with yamlfmt, hence the need to format long strings manually.

There are other tools that focus on fixing YAML issues, such as yamlfix, which can be found at https://github.com/lyz-code/yamlfix, and yamlfixer, which can be found at https://github.com/opt-nc/yamlfixer. I might alter or expand this post to include my experience with both tools once I have finished testing them thoroughly.

Handling long strings

Now let us focus on handling long strings. But what does the error actually mean? In short, it means that a line contains more characters than the line-length rule of yamllint allows.

The default limit is 80 characters. Any line with more than 80 characters triggers the error.
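To make the idea concrete, here is a simplified stand-in for what the rule measures. It is not yamllint's implementation, which has further options that this sketch ignores, and the function name find_long_lines is made up for this post.

```python
def find_long_lines(text, max_length=80):
    """Return (line_number, length) for every line longer than max_length.

    A simplified stand-in for the idea behind the line-length rule, not
    yamllint's implementation; 80 mirrors the default mentioned above.
    """
    return [
        (number, len(line))
        for number, line in enumerate(text.splitlines(), start=1)
        if len(line) > max_length
    ]
```

A line of exactly 80 characters passes; the 81st character is what triggers the report, which is why the errors shown earlier all point at column 81.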

To learn more about handling long strings, please refer to my other post: YAML Files – Formatting Long Strings.