Ansible Advanced - Speed UP Playbook Run with async and polling

Problem Statement

By design, Ansible runs tasks one after another and keeps the connection to the remote node open until the current task completes. Because of this behavior, all subsequent tasks are blocked until then. That is far from ideal when a playbook contains tasks that take longer to complete than the SSH session is allowed to stay open, because such a task will eventually hit a timeout.

A few examples of such tasks are:

  • Rebooting the client and waiting for it to come back before running further tasks
  • Executing a script that takes a long time to run
  • Running long shell commands or software upgrades

The solution to this situation is Ansible's asynchronous mode.

What is Asynchronous mode?

Asynchronous mode lets us control the playbook execution flow by defining how long a long-running task may take and how Ansible waits for it.

To enable asynchronous mode for a task in an Ansible playbook, we use two keywords: async and poll.

async - The value of the async keyword is the maximum time, in seconds, allowed for the task to complete. If the task has not finished when that time is up, it is timed out and reported as failed rather than completed. async also sends the task into the background, so its final execution status can be checked later.

poll - The poll keyword lets us track the status of the job that async started in the background. Its value, in seconds, decides how frequently Ansible checks whether the background task has completed.

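The poll keyword is enabled automatically whenever you use async. If you do not set it, Ansible falls back to a default polling interval (10 seconds in older releases; current releases read it from the DEFAULT_POLL_INTERVAL setting, which defaults to 15 seconds).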

When poll is set to a positive value, Ansible avoids connection timeouts but still blocks the next task in your playbook, waiting until the async task completes, fails, or times out.

Enough of theory :) It's time to get our hands dirty!

Example 1 - poll with positive value

Here is my example playbook. I am updating packages on the worker nodes, a process that usually takes a while. Ansible allows up to 180 seconds for this task to complete. async sends the task into the background, and Ansible checks its status every 10 seconds, as set by the poll value.

---
- name: async and poll example playbook
  hosts: workers
  become: true
  remote_user: ansible_user
  tasks:
    - name: update the system packages
      command: yum update -y
      async: 180 # the total time allowed to complete the package update task
      poll: 10 # Polling Interval in Seconds
      register: package_update

    - name: task-2 to create a test user
      user: name=async_test state=present shell=/bin/bash

Let us execute the playbook.

async $ ansible-playbook -i myinventory async_poll_example1.yml -kK
SSH password:
BECOME password[defaults to SSH password]:

Ansible waits until the first task completes.

async_waits-poll-10-1.png

async_failed-value-less-then-task-completion.png

The first task failed because it did not complete within the 180 seconds allowed by async.

Let me increase the async value to 300 seconds and execute the playbook again.

with_higher_async_value.png

The warning in pink appears because the first task uses the command module to run yum update instead of the yum module itself.

You can use the task below instead of the one above to avoid that warning.

    - name: update the system packages
      yum: update_cache=yes name=* state=latest
      async: 300 # the total time allowed to complete the package update task
      poll: 10 # Polling Interval in Seconds
      register: package_update

Example 2 - Fire and Forget with poll: 0

As I mentioned above, setting poll to a positive value makes Ansible block the next task in your playbook.

To avoid that and continue with the next tasks, we can set the poll value to 0. This is called the fire-and-forget way of running tasks.

---
- name: async and poll example playbook
  hosts: workers
  become: true
  remote_user: ansible_user
  tasks:
    - name: sleep for 60 seconds
      command: /bin/sleep 60
      async: 80 # the total time allowed to complete the sleep task
      poll: 0 # No need to poll just fire and forget the sleep command
      register: sleeping_node

    - name: task-2 to create a test user
      user: name=async_test-2 state=present shell=/bin/bash

Let us execute the playbook with the -v option so that we can see the job ID of the first task and check its status later.

async $ ansible-playbook -i myinventory async_poll_example2.yml -kK -v
Using /etc/ansible/ansible.cfg as config file
SSH password:
BECOME password[defaults to SSH password]:

async_fire-forget-poll-0-sleep.png

You can see from the above output that Ansible ran the second task straight away, without waiting for the first task to complete.
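As an alternative to reading the job ID from the -v output, the registered variable can be printed with a debug task. The following is a minimal sketch that reuses the playbook above; the second task and its message text are illustrative additions, not part of the original playbook.

```yaml
---
- name: print the background job ID (sketch)
  hosts: workers
  become: true
  remote_user: ansible_user
  tasks:
    - name: sleep for 60 seconds in the background
      command: /bin/sleep 60
      async: 80 # the total time allowed to complete the sleep task
      poll: 0 # fire and forget
      register: sleeping_node

    # Illustrative addition: show the job ID that async_status expects as jid
    - name: show the job ID for each host
      debug:
        msg: "{{ inventory_hostname }} job ID: {{ sleeping_node.ansible_job_id }}"
```

Each host gets its own job ID, so the value has to be noted per host.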

Now let us check the status of the first task on both worker nodes.

async $ ansible -i myinventory worker2 -m async_status -a "jid=903212800377.10413" -b -kK -u ansible_user

async $ ansible -i myinventory worker1 -m async_status -a "jid=681459453994.14119" -b -kK -u ansible_user

If you get an error message such as "could not find job", check whether the original job ran with become. If it did, async_status must run with become as well. Running async_status without become against a job that was started with become fails with this message.

async_status_jid.png
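The fire-and-forget pattern also fits the reboot case from the problem statement. The sketch below is one commonly used approach, not the only one: the shell command, the 2-second sleep, and the delay and timeout values are assumptions chosen for illustration and should be adapted to your environment.

```yaml
---
- name: reboot a node without losing the playbook run (sketch)
  hosts: workers
  become: true
  remote_user: ansible_user
  tasks:
    # The short sleep gives Ansible time to close the task cleanly
    # before the node actually goes down (illustrative value).
    - name: schedule the reboot in the background
      shell: sleep 2 && shutdown -r now
      async: 1
      poll: 0

    # delay and timeout are assumed values; tune them to your hardware.
    - name: wait for the node to come back
      wait_for_connection:
        delay: 30
        timeout: 300

    - name: confirm the node answers again
      ping:
```

Recent Ansible releases also ship a dedicated reboot module, which may be the simpler choice when it is available.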

Use async_status within playbook

The async_status module can be used within the playbook as well.

Here is our playbook.

---
- name: async and poll example playbook
  hosts: workers
  become: true
  remote_user: ansible_user
  tasks:
    - name: sleep for 20 seconds
      command: /bin/sleep 20
      async: 30 # the total time allowed to complete the sleep task
      poll: 0 # No need to poll just fire and forget the sleep command
      register: sleeping_node

    - name: task-2 to create a test user
      user: name=async_test-2 state=present shell=/bin/bash

    - name: Checking the Job Status running in background
      async_status:
        jid: "{{ sleeping_node.ansible_job_id }}"
      register: job_result
      until: job_result.finished # Retry within limit until the job status changed to "finished": 1
      retries: 5 # Maximum number of retries to check job status

Let us execute the playbook.

async_status_within_playbook.png
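The same idea extends to several background jobs at once, which is where async really shortens a playbook run. The following is a hedged sketch: the sleep_durations variable, its values, and the retries and delay settings are hypothetical and only serve the illustration. The last task uses the cleanup mode of async_status to remove the job cache files, which Ansible does not remove on its own when poll is 0.

```yaml
---
- name: run several background jobs and collect the results (sketch)
  hosts: workers
  become: true
  remote_user: ansible_user
  vars:
    sleep_durations: [5, 10, 15] # hypothetical values for illustration
  tasks:
    - name: start one background sleep per duration
      command: "/bin/sleep {{ item }}"
      async: 60 # the total time allowed for each sleep job
      poll: 0 # fire and forget, all jobs start without waiting
      loop: "{{ sleep_durations }}"
      register: sleep_jobs

    # retries x delay (10 x 3 = 30 seconds) must cover the longest job.
    - name: wait for every background job to finish
      async_status:
        jid: "{{ item.ansible_job_id }}"
      loop: "{{ sleep_jobs.results }}"
      loop_control:
        label: "{{ item.item }}"
      register: job_result
      until: job_result.finished
      retries: 10
      delay: 3

    - name: remove the job cache files left behind by poll 0
      async_status:
        jid: "{{ item.ansible_job_id }}"
        mode: cleanup
      loop: "{{ sleep_jobs.results }}"
      loop_control:
        label: "{{ item.item }}"
```

Because all three jobs start immediately, the waiting time is bounded by the longest job rather than by the sum of all durations.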

That's all for this article now.

To understand more about asynchronous actions and polling, refer to the official Ansible documentation.

Hope you like the article. Stay Tuned for more.

Thank you. Happy learning!