
23 Jun 2021 Configuring Validation with an Ansible Role
AEM OpenCloud is an open-source platform for hosting Adobe Experience Manager (AEM) on the AWS Cloud. Its provisioning processes are mainly driven by Ansible playbooks, with YAML configuration files containing over 100 parameters. This makes it highly configurable, but it also means there are many parameters to get right and many opportunities for typos and unsupported values. Such mistakes cause stack creation to fail, and each failure brings a lengthy turnaround while we wait for resource cleanup. We can save time by taking a fail-fast approach: validating the parameters before committing to the provisioning of resources.
In order to detect the problematic variables early, I developed an Ansible role to validate configurations based on schemas, using Cerberus. Cerberus is “a lightweight and extensible data validation library for Python”. This Ansible role can be executed at the beginning of a playbook to ensure that only validated values are used.
In this post, I introduce a custom Ansible role, named vars_schema_validator, for validating parameter values against specific rules and restrictions. A user of vars_schema_validator must provide the path to a valid YAML schema file in the playbook; we have already created a suitable schema for AEM OpenCloud, but the role is not limited to that. The following diagram shows the system architecture of an AWS deployment that uses Ansible with the variable validator role.

1 Background
Before diving into details, let’s get started with two key concepts.
1.1 What is an Ansible Role?
The role is a robust feature of Ansible that facilitates reuse and modularization of configuration. A role usually contains a group of variables, tasks, files and handlers in a standardized file structure. It is a similar concept to Puppet modules. Like Python packages, roles must be installed before you can use them. Ansible roles can be managed with Ansible Galaxy: use the ansible-galaxy install <the name of role> command to download roles from the Galaxy server.
After installing all required roles, you can include them in any playbook project. By default, Ansible searches for roles in a roles/ directory relative to the playbook file and in /etc/ansible/roles. In Ansible 1.4 and later you can configure an additional roles_path to search for roles. For more details, refer to Role Search Path.
An example playbook is shown below. The roles listed under roles are executed first, and then the tasks run in sequence.
---
- name: Playbook example
  hosts: localhost
  roles:
    - role: '/path/to/your/role'
  tasks:
    - debug: msg=hello
1.2 What is a validation schema?
A validation schema is a YAML file that defines the names of all of the parameters and the rules their values must follow. The schema must be written using the validation rules defined by Cerberus. For example, if a system supports only the Red Hat Enterprise Linux 7 and CentOS 7 operating systems, then the schema for the os_type parameter could be:
os_type:
  type: string
  allowed:
    - rhel7
    - centos7
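To make the meaning of these two rules concrete, here is a minimal Python sketch of what checking os_type amounts to. It is a simplified stand-in written with the standard library only, not Cerberus code; the function name check_os_type is hypothetical, and the error strings merely imitate the style of Cerberus messages.

```python
# Simplified stand-in for the `type: string` and `allowed` rules.
# This is NOT Cerberus code; it only mirrors the rule semantics for illustration.

def check_os_type(value, allowed=("rhel7", "centos7")):
    """Return a list of error strings; an empty list means the value is valid."""
    errors = []
    if not isinstance(value, str):
        # Corresponds to `type: string`
        errors.append("must be of string type")
    elif value not in allowed:
        # Corresponds to the `allowed` list
        errors.append(f"unallowed value {value}")
    return errors


print(check_os_type("centos7"))
print(check_os_type("windows"))
```

In the real role, Cerberus performs these checks for every parameter described in the schema, so none of this has to be written by hand.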
2 Validating configuration with Validator
This section shows the steps to set up the environment and to use the validator in a playbook, with an example. Here are some key features supported by vars_schema_validator:
- supports Ansible 2, 3 and 4
- supports referencing other validation schema files via a custom keyword, !include
2.1 Installation
- Install the required Python packages by running:
$ pip install cerberus==1.3.2
$ pip install ansible==4.0.0
- Install the vars_schema_validator role by running:
$ ansible-galaxy install shinesolutions_opensource.vars_schema_validator
2.2 Usage
Add the role to the playbook and specify where the validator will find the schema it is to use:
roles:
  - role: 'shinesolutions_opensource.vars_schema_validator'
    vars:
      schema: 'path/to/your/schema.yaml'
2.3 Example
Let’s look at how we might use vars_schema_validator to validate a simple YAML configuration file, which we’ll call vars.yaml:
---
global:
  name: Little Joe
  age: 15
  group1:
    os_type: windows
    url: s3://xxx.xxx.xxx.xxx/xxxxxxx
    jvm_mem_opts: -Xms4096m
    user: admin
To validate this configuration we need a schema. It is good practice to create our schema in modular pieces, so we can re-use them. For this example, we’ll demonstrate that by breaking out the schema for group1 into a second file. First we create our top-level schema in schema.yaml:
---
global:
  type: dict
  required: true
  allow_unknown: false
  empty: false
  schema:
    name:
      type: string
    age:
      type: integer
      min: 10
      max: 20
    group1:
      required: true
      schema: !include sub_schema.yaml
and then we need to define sub_schema.yaml to describe group1:
---
os_type:
  type: string
  allowed:
    - rhel7
    - amazon-linux2
    - centos7
url:
  type: string
  regex: (s3://|http://|https://|file://)(.+)
jvm_mem_opts:
  type: string
  allowed:
    - -Xss4m
    - -Xms4096m
    - -Xmx8192m
user:
  type: string
  forbidden:
    - root
    - admin
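The regex rule for url in sub_schema.yaml is the least obvious of these, so here is a small Python sketch for trying the pattern against a few candidate values. It assumes the whole value must match the pattern, which is how I understand the Cerberus regex rule to behave; the helper name url_is_valid and the sample URLs are hypothetical.

```python
import re

# Pattern copied from the `url` rule in sub_schema.yaml.
URL_PATTERN = re.compile(r"(s3://|http://|https://|file://)(.+)")


def url_is_valid(value):
    """Return True when the entire string matches the schema's pattern."""
    # Assumption: the rule requires a match over the whole value.
    return URL_PATTERN.fullmatch(value) is not None


# Hypothetical sample values, chosen to exercise accepted and rejected cases.
for candidate in (
    "s3://bucket/key",
    "https://example.com/a.zip",
    "ftp://example.com/a.zip",
    "s3://",
):
    print(candidate, url_is_valid(candidate))
```

Checking a pattern like this on its own before putting it in a schema makes it easier to see which schemes are accepted and that an empty path after the scheme is rejected.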
To run our example validation we need a playbook, so we will create playbook.yml, which loads the vars.yaml file and uses our role with the schema.yaml we created above:
---
- name: Test vars-schema-validator
  hosts: localhost
  vars_files:
    - ./vars.yaml
  roles:
    - role: 'vars_schema_validator'
      vars:
        schema: './schema.yaml'
And now we can see the results of our simple example by running playbook.yml:
$ ansible-playbook playbook.yml
The validator fails this configuration and tells us where the problems lie:
TASK [vars_schema_validator : Validate Ansible variables using the provided YAML schema] ***
fatal: [localhost]: FAILED! => changed=false
  msg:
    global:
    - group1:
      - os_type:
        - unallowed value windows
        user:
        - unallowed value admin
Two variables violate the rules: os_type uses a value that is not in its allowed list, and user uses a forbidden value. For more details, please refer to the shinesolutions/ansible-role-vars-schema-validator repository on GitHub.
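To show how a nested report like the one above comes about, here is a simplified Python sketch that walks a nested schema and collects violations. It is not the role's code and not Cerberus: the function collect_errors is hypothetical, it handles only the allowed, forbidden and nested schema rules, and it reports each problem under a dotted path rather than in the nested list format Cerberus uses.

```python
# Simplified sketch: walks a nested schema and collects rule violations.
# Only `allowed`, `forbidden` and nested `schema` are handled here.

def collect_errors(document, schema, path=""):
    """Return a dict mapping dotted parameter paths to error messages."""
    errors = {}
    for key, rules in schema.items():
        if key not in document:
            continue  # `required` handling is left out of this sketch
        value = document[key]
        here = f"{path}.{key}" if path else key
        if "schema" in rules and isinstance(value, dict):
            # Descend into the nested dictionary with its own sub-schema.
            errors.update(collect_errors(value, rules["schema"], here))
        if "allowed" in rules and value not in rules["allowed"]:
            errors[here] = f"unallowed value {value}"
        if "forbidden" in rules and value in rules["forbidden"]:
            errors[here] = f"unallowed value {value}"
    return errors


# Cut-down versions of schema.yaml / sub_schema.yaml and vars.yaml.
schema = {"global": {"schema": {"group1": {"schema": {
    "os_type": {"allowed": ["rhel7", "amazon-linux2", "centos7"]},
    "user": {"forbidden": ["root", "admin"]},
}}}}}
config = {"global": {"name": "Little Joe", "age": 15, "group1": {
    "os_type": "windows", "user": "admin"}}}

errors = collect_errors(config, schema)
for location, message in errors.items():
    print(f"{location}: {message}")
```

A fail-fast caller would stop as soon as errors is non-empty, which is what the role does when it fails the playbook run before any provisioning starts.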
3 Conclusion
vars_schema_validator helps ensure that your playbooks run with configurations that satisfy the specified schemas. You could discover mistakes the hard way, by running the playbooks, having provisioning fail, and trying again after the resource cleanup completes, but validating first is faster. The schema also helps with diagnostics, since it shows which values are accepted, which are not, or even the pattern an acceptable value must match. The schemas are self-documenting for developers, and keeping them in source control helps with tracking system changes.
4 References
Galaxy.ansible.com. 2021. Galaxy Documentation — Ansible Documentation. [online] Available at: https://galaxy.ansible.com/docs/index.html [Accessed 2 June 2021].
Docs.ansible.com. 2021. Roles — Ansible Documentation. [online] Available at: https://docs.ansible.com/ansible/latest/user_guide/playbooks_reuse_roles.html [Accessed 2 June 2021].
Docs.python-cerberus.org. (n.d.). Validation Rules — Cerberus is a lightweight and extensible data validation library for Python. [online] Available at: https://docs.python-cerberus.org/en/stable/validation-rules.html [Accessed 2 June 2021].
En.wikipedia.org. 2021. Fail-fast – Wikipedia. [online] Available at: https://en.wikipedia.org/wiki/Fail-fast [Accessed 15 June 2021].