Use of HTTPAPI Plugin takes more execution time (DCNE-299) #718

Open
jordiasla opened this issue Jan 26, 2025 · 12 comments
Labels
bug Something isn't working jira-sync Sync this issue to Jira

Comments

@jordiasla

Community Note

  • Please vote on this issue by adding a 👍 reaction to the original issue to help the community and maintainers prioritize this request
  • Please do not leave "+1" or other comments that do not add relevant new information or questions, they generate extra noise for issue followers and do not help prioritize the request
  • If you are interested in working on this issue or have submitted a pull request, please leave a comment

Description

  • Use of HTTPAPI Plugin takes more execution time

Affected Module Name(s):

  • aci_bd (but I guess it also applies to all modules)

APIC version and APIC Platform

  • V 6.0(7e) and on-prem

Collection versions

  • cisco.aci 2.10.1

Expected Behavior

To my understanding, the ACI HTTPAPI plugin reduces playbook execution time since it no longer requires a login for every API call. I used it with a role that creates multiple BDs in a loop and measured the time required for the execution.

Actual Behavior

In reality, looping over 190 BDs took more than 3 minutes with the plugin, whereas it took a bit more than one minute without the ACI HTTPAPI plugin. There is a good chance that I am not using the plugin as intended, so the playbooks can be found below.

Playbook tasks to Reproduce

The scenario is with one APIC cluster as host and a playbook using a role.

Without ACI HTTPAPI plugin

Inventory

---
lab_fabric:
  hosts:
    lab_fabric_apic_1:
      apic_host: x.x.x.1
    lab_fabric_apic_2:
      apic_host: x.x.x.2
    lab_fabric_apic_3:
      apic_host: x.x.x.3
  vars:
    apic_validate_certs: false
    apic_username: username
    apic_password: password

Playbook

- name: Manage BDs
  hosts: all
  gather_facts: true
  connection: local
  any_errors_fatal: true
  ignore_errors: false
  run_once: true


  roles:
    - role: roles/manage-bds

Role

- name: Declare aci_login
  ansible.builtin.set_fact:
    aci_login: &aci_login
      hostname: '{{ apic_host }}'
      username: '{{ apic_username }}'
      password: '{{ apic_password }}'
      use_ssl: true
      validate_certs: '{{ apic_validate_certs }}'
      output_path: "{{ playbook_dir }}/../files/aci_data_bds_{{ ansible_date_time.iso8601_basic_short }}.json"

- name: Manage BDs
  cisco.aci.aci_bd:
    <<: *aci_login
    bd: '{{ item.name }}'
    description: '{{ item.description | default(omit) }}'
    tenant: '{{ item.tenant }}'
    vrf: '{{ item.vrf }}'
    l2_unknown_unicast: '{{ item.l2_unknown_unicast | default("flood") }}'
    l3_unknown_multicast: '{{ item.l3_unknown_multicast | default(omit) }}'
    multi_dest: '{{ item.multi_dest_flood | default(omit) }}'
    arp_flooding: '{{ item.arp_flood | default("yes") }}'
    enable_routing: '{{ item.unicast_routing | default("no") }}'
    mac: '{{ item.mac | default(omit) }}'
    name_alias: '{{ item.name_alias | default(omit) }}'
    state: '{{ item.status }}'
  loop: '{{ bds }}'
  loop_control:
    pause: 0.05

With ACI HTTPAPI plugin

Inventory

---
lab_fabric:
  hosts:
    cluster_apic:
      ansible_host: x.x.x.1,x.x.x.2,x.x.x.3
  vars:
    apic_validate_certs: false
    ansible_user: username
    ansible_password: password
    ansible_connection: ansible.netcommon.httpapi
    ansible_network_os: cisco.aci.aci

Playbook

- name: Manage BDs
  hosts: all
  gather_facts: true
  connection: local
  any_errors_fatal: true
  ignore_errors: false
  run_once: true


  roles:
    - role: roles/manage-bds

Role

- name: Manage BDs
  cisco.aci.aci_bd:
    validate_certs: '{{ apic_validate_certs }}'
    output_path: '{{ playbook_dir }}/../files/aci_data_bds_{{ ansible_date_time.iso8601_basic_short }}.json'
    use_ssl: true
    bd: '{{ item.name }}'
    description: '{{ item.description | default(omit) }}'
    tenant: '{{ item.tenant }}'
    vrf: '{{ item.vrf }}'
    l2_unknown_unicast: '{{ item.l2_unknown_unicast | default("flood") }}'
    l3_unknown_multicast: '{{ item.l3_unknown_multicast | default(omit) }}'
    multi_dest: '{{ item.multi_dest_flood | default(omit) }}'
    arp_flooding: '{{ item.arp_flood | default("yes") }}'
    enable_routing: '{{ item.unicast_routing | default("no") }}'
    mac: '{{ item.mac | default(omit) }}'
    name_alias: '{{ item.name_alias | default(omit) }}'
    state: '{{ item.status }}'
  loop: '{{ bds }}'

Important Factoids

The reason I used loop_control with a pause for the non-HTTPAPI run was to work around the throttling behavior of the APIC controller, which returned a connection timeout from NGINX. Even so, the task, once completed, was three times faster than with the HTTPAPI plugin.

@jordiasla jordiasla added the bug Something isn't working label Jan 26, 2025
@akinross akinross added the jira-sync Sync this issue to Jira label Jan 27, 2025
@github-actions github-actions bot changed the title Use of HTTPAPI Plugin takes more execution time Use of HTTPAPI Plugin takes more execution time (DCNE-299) Jan 27, 2025
@akinross
Collaborator

akinross commented Jan 27, 2025

Hi @jordiasla,

Thank you for raising this issue. Could you elaborate a bit more on how your benchmarking was done and provide some of the logs / outputs of the results? We would need to reproduce this to see the actual result and what exactly is causing it.

Also, are you sure you are using the HTTPAPI plugin? In your playbook the connection is set to local.

@akinross
Collaborator

Hi @jordiasla,

I already ran a few tests locally with and without the plugin, and I am consistently getting the same results:

WITH HTTPAPI PLUGIN

- name: Manage BDs
  cisco.aci.aci_bd:
    tenant: abr_benchmark
    name: "abr_{{ item }}"
    state: present
  loop: "{{ range(0, 190, 1) | list }}"

TASKS RECAP *******************************************************************
Monday 27 January 2025  12:38:47 +0000 (0:08:04.405)       0:08:05.161 ******** 

WITHOUT HTTPAPI PLUGIN

- name: Manage BDs
  cisco.aci.aci_bd:
    <<: *aci_info
    tenant: abr_benchmark
    name: "abr_{{ item }}"
    state: present
  loop: "{{ range(0, 190, 1) | list }}"
  loop_control:
    pause: 0.05

TASKS RECAP *******************************************************************
Monday 27 January 2025  11:58:44 +0000 (0:10:09.640)       0:10:09.677 ******** 

@jordiasla
Author

Hi @akinross ,

Thanks a lot for reaching out. As I said, I hope the whole issue is not just a matter of my misunderstanding the benefits of the HTTPAPI plugin. The whole execution of the playbook I sent is wrapped in an invoke script. Within this script I use the Python 'time' library to record the start and end time of the task, and from those I calculate the total runtime. It may not be super accurate, but the order of magnitude of the time difference has been consistent every time I executed it.

I saw your results and indeed there is no difference, so I am doing something wrong... Do you see anything strange in my inventory and role definition?
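A minimal sketch of the wrapper-style timing described in this comment. The actual invoke script is not shown in the thread, so the function name `time_command` and the stand-in command are assumptions for illustration only.

```python
import subprocess
import sys
import time


def time_command(cmd):
    """Run cmd and return (elapsed_seconds, return_code)."""
    start = time.perf_counter()  # monotonic clock, unaffected by wall-clock changes
    completed = subprocess.run(cmd, check=False)
    elapsed = time.perf_counter() - start
    return elapsed, completed.returncode


# Harmless stand-in command so the sketch runs anywhere. A real run would pass
# something like ["ansible-playbook", "-i", "<inventory>", "<playbook>"].
elapsed, rc = time_command([sys.executable, "-c", "pass"])
print(f"exit code {rc}, elapsed {elapsed:.3f}s")
```

A wrapper like this measures the whole process, including Ansible startup and fact gathering, so it cannot tell which task is slow. Per-task timings need a profiling callback inside Ansible itself.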

@akinross
Collaborator

akinross commented Jan 27, 2025

Hi @jordiasla,

Are you sure you are using the HTTPAPI plugin? In your playbook the connection is set to local, and Ansible's precedence order might override the connection to local in this case.

Furthermore, you could take validate_certs and use_ssl from the inventory by setting ansible_httpapi_use_ssl and ansible_httpapi_validate_certs. This way you do not have to define them at task level. I do not see how this would make a difference performance-wise, though.

Could you share your invoke script and your entire ansible playbook / role setup to my email? I can take a look at it.

Also could you run your tests again with the following https://docs.ansible.com/ansible/latest/collections/ansible/posix/profile_tasks_callback.html and provide the output?

Finally, from my results I would argue that there is a slight difference: with the plugin enabled it is roughly 20% faster. I think that makes sense, because the login is only executed when the cookie is invalid.
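The "roughly 20%" figure can be checked against the two task recaps posted earlier in the thread (0:08:04.405 with the plugin, 0:10:09.640 without). A small sketch, where `recap_to_seconds` is a hypothetical helper name:

```python
def recap_to_seconds(stamp):
    """Convert an 'H:MM:SS.mmm' duration from a task recap into seconds."""
    hours, minutes, seconds = stamp.split(":")
    return int(hours) * 3600 + int(minutes) * 60 + float(seconds)


# Task durations taken from the parenthesised values in the two recaps.
with_plugin = recap_to_seconds("0:08:04.405")
without_plugin = recap_to_seconds("0:10:09.640")

# Fraction of the non-plugin runtime that the plugin run avoided.
reduction = (without_plugin - with_plugin) / without_plugin
print(f"{with_plugin:.3f}s vs {without_plugin:.3f}s: {reduction:.1%} less time with the plugin")
```

Note that the run without the plugin also includes the 0.05 s loop pause, which adds up to roughly 9.5 s over 190 items, so the gap attributable to the plugin itself is slightly smaller.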

@jordiasla
Author

Hi @akinross ,

I have isolated my code just for this execution, again with and without HTTPAPI. I have sent you an e-mail with the details, the behavior more or less remained the same even though I incorporated your suggestions.

Without HTTPAPI

Monday 27 January 2025  17:25:49 +0200 (0:01:35.674)       0:01:37.382 ********

===============================================================================

roles/manage-bds : Manage BDs ------------------------------------------ 95.67s

Gathering Facts --------------------------------------------------------- 1.62s

roles/manage-bds : Declare aci_login ------------------------------------ 0.05s

With HTTPAPI

Monday 27 January 2025  17:30:12 +0200 (0:03:13.676)       0:03:16.114 ********

===============================================================================

roles/manage-bds : Manage BDs ----------------------------------------- 193.68s

Gathering Facts --------------------------------------------------------- 2.38s

It looks like I am clearly missing something :(

I really appreciate your time on this.
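Dividing the two task times above by the loop size gives a per-BD cost, which makes the gap easier to reason about. This sketch assumes the isolated run still looped over the 190 BDs mentioned in the original report.

```python
ITEMS = 190  # assumption: same number of BDs as in the original report

# "Manage BDs" task durations in seconds, from the two recaps above.
task_seconds = {
    "without HTTPAPI": 95.67,
    "with HTTPAPI": 193.68,
}

per_item = {label: total / ITEMS for label, total in task_seconds.items()}
for label, seconds in per_item.items():
    print(f"{label}: {seconds:.3f}s per BD")

# Additional time each loop iteration takes when the plugin is in use.
extra = per_item["with HTTPAPI"] - per_item["without HTTPAPI"]
print(f"extra cost per BD with HTTPAPI: {extra:.3f}s")
```

Under that assumption, each iteration costs about half a second more with the plugin than without it in this environment.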

@shrsr
Collaborator

shrsr commented Jan 27, 2025

@jordiasla I agree with Akini (@akinross). Can you please run your playbook using the HTTPAPI plugin with the following modifications? Setting connection=local forces Ansible NOT to use the plugin. I would also reduce the number of BDs to just a few to save time, and first make sure the config below is actually using the plugin.

- name: Manage BDs
  hosts: all
  any_errors_fatal: true
  ignore_errors: false
  run_once: true


  roles:
    - role: roles/manage-bds

@jordiasla
Author

Thanks a lot @shrsr for the recommendation. I actually removed connection=local before my last post, but the behavior was the same. Also, I use a large number of BDs to take advantage of the optimization for bulk deployments, where I guess the difference will be more evident. I need to prepare for migrations with more than 600 objects in every maintenance window, so I really need to speed things up, as it currently takes more than 30 minutes.

@akinross
Collaborator

Hi @jordiasla,

I ran your playbooks against my APIC (a single-APIC cluster for testing) and still see roughly the same results as before. I did not run them from your Python script.

WITH

TASKS RECAP *******************************************************************
Tuesday 28 January 2025  08:25:36 +0000 (0:09:03.770)       0:09:04.636 *******
===============================================================================
roles/manage-bds : Manage BDs ------------------------------------------543.77s
Gathering Facts --------------------------------------------------------- 0.82s

WITHOUT

TASKS RECAP *******************************************************************
Tuesday 28 January 2025  08:39:08 +0000 (0:11:07.950)       0:11:08.600 *******
===============================================================================
roles/manage-bds : Manage BDs ----------------------------------------- 667.95s
Gathering Facts --------------------------------------------------------- 0.62s
roles/manage-bds : Declare aci_login ------------------------------------ 0.01s

Could you run your tests again without python but invoking directly with ansible-playbook -i inventory/lab_fabric.yaml playbooks/manage-bds.yaml and setting only a single apic in the cluster section?

@jordiasla
Author

This is really frustrating... I did it as @akinross proposed and results did not change

WITHOUT

Tuesday 28 January 2025  11:39:11 +0200 (0:01:34.772)       0:01:36.420 ******* 
==========================================================
roles/manage-bds : Manage BDs ------------------------------------------------------- 94.77s
Gathering Facts --------------------------------------------------------------------------- 1.57s
roles/manage-bds : Declare aci_login --------------------------------------------------- 0.04s

WITH

Tuesday 28 January 2025  14:56:07 +0200 (0:03:09.360)       0:03:11.576 ******* 
==========================================================
roles/manage-bds : Manage BDs ----------------------------------------------------- 189.36s
Gathering Facts --------------------------------------------------------------------------- 2.16s

I will try to repeat it in a different host and come back.

@akinross
Collaborator

akinross commented Jan 29, 2025

I did some additional testing and the results are now closer: HTTPAPI is still faster, but by much less than before:

  • WITH HTTPAPI PLUGIN
TASKS RECAP *******************************************************************
Wednesday 29 January 2025  11:04:47 +0000 (0:01:39.662)       0:01:40.405 *****
===============================================================================
roles/manage-bds : Manage BDs ------------------------------------------ 99.66s
Gathering Facts --------------------------------------------------------- 0.72s
  • WITHOUT HTTPAPI PLUGIN
TASKS RECAP *******************************************************************
Wednesday 29 January 2025  11:20:14 +0000 (0:01:44.566)       0:01:45.182 *****
===============================================================================
roles/manage-bds : Manage BDs ----------------------------------------- 104.57s
Gathering Facts --------------------------------------------------------- 0.58s
roles/manage-bds : Declare aci_login ------------------------------------ 0.01s

The difference between this test run and the earlier ones is the location of the APIC: in this run the APIC is much closer than the lab in the US. I suspect there is a turning point where the API call itself becomes faster than the initialisation and overhead of the connection object. I compared my API request times (averages over multiple requests) between the two APIC locations, with the following results:

US

Post to login: Status: 200 OK Size: 1.75 KB Time: 714 ms
Post create BD: Status: 200 OK Size: 30 Bytes Time: 714 ms

AMS

Post to login: Status: 200 OK Size: 2.47 KB Time: 129 ms
Post create BD: Status: 200 OK Size: 30 Bytes Time: 109 ms
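The login timings above give a rough upper bound on what a persistent session can save. The sketch below assumes a deliberately simplified model in which every task logs in when the plugin is not used and only the first task logs in when it is. It ignores any connection overhead, and `login_time_saved` is a hypothetical helper name.

```python
ITEMS = 190  # loop size used in the benchmarks in this thread

# Average request times in milliseconds, as measured above.
timings_ms = {
    "US": {"login": 714, "create_bd": 714},
    "AMS": {"login": 129, "create_bd": 109},
}


def login_time_saved(login_ms, items=ITEMS):
    """Seconds saved if only the first task logs in instead of every task."""
    return (items - 1) * login_ms / 1000


saved = {site: login_time_saved(t["login"]) for site, t in timings_ms.items()}
for site, seconds in saved.items():
    print(f"{site}: up to {seconds:.1f}s of login time avoided")
```

For comparison, and assuming the earlier runs targeted the US APIC and the latest run the AMS one, the measured gaps were about 124 s and 5 s respectively. The AMS gap is well below the modelled saving, which is consistent with a fixed per-task overhead offsetting most of the benefit when the API itself is fast.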

Could you do a check on the API request times via Postman or Thunder Client or any other tool?
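If no GUI client is at hand, averaged request times can also be collected with a few lines of Python. This is only a sketch: `average_ms` is a hypothetical helper, and the stand-in workload must be replaced by a function that sends one login or create-BD request to the controller.

```python
import statistics
import time


def average_ms(call, repeats=5):
    """Call `call()` `repeats` times and return the mean duration in milliseconds."""
    samples = []
    for _ in range(repeats):
        start = time.perf_counter()
        call()
        samples.append((time.perf_counter() - start) * 1000)
    return statistics.mean(samples)


# Stand-in workload so the sketch runs without an APIC.
result = average_ms(lambda: time.sleep(0.01), repeats=3)
print(f"average: {result:.1f} ms")
```

Averaging several samples smooths out one-off effects such as connection setup on the first request.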

@akinross
Collaborator

akinross commented Feb 7, 2025

Hi @jordiasla, did you get a chance to look at the above answer?

@jordiasla
Author

Hi all,

First of all, allow me to apologize for "ghosting" this thread. I had some urgent, work-related issues, so I had to step back from this testing.

What I did next was to test on two additional setups (as localhost) and one more fabric. The results did not change in any case.

With on Linux VM

Friday 07 February 2025  21:15:27 +0200 (0:03:15.258)       0:03:17.521 ******* 
=============================================================================== 
roles/manage-bds : Manage BDs ----------------------------------------- 195.26s
Gathering Facts --------------------------------------------------------- 2.19s

Without on Linux VM

Friday 07 February 2025  21:20:36 +0200 (0:01:34.590)       0:01:36.202 ******* 
=============================================================================== 
roles/manage-bds : Manage BDs ------------------------------------------ 94.59s
Gathering Facts --------------------------------------------------------- 1.54s
roles/manage-bds : Declare aci_login ------------------------------------ 0.04s


With on Windows VM + WSL

Friday 07 February 2025  21:30:14 +0200 (0:03:29.293)       0:03:34.719 ******* 
=============================================================================== 
roles/manage-bds : Manage BDs ----------------------------------------- 209.29s
Gathering Facts --------------------------------------------------------- 5.33s

Without on Windows VM + WSL

Friday 07 February 2025  22:13:55 +0200 (0:03:27.534)       0:03:32.250 ******* 
=============================================================================== 
roles/manage-bds : Manage BDs ----------------------------------------- 207.53s
Gathering Facts --------------------------------------------------------- 4.62s
roles/manage-bds : Declare aci_login ------------------------------------ 0.05s


With on Windows workstation + WSL

Saturday 08 February 2025  00:52:00 +0200 (0:03:54.621)       0:03:58.088 *****
===============================================================================
roles/manage-bds : Manage BDs ----------------------------------------- 234.62s
Gathering Facts --------------------------------------------------------- 3.30s

Without on Windows workstation + WSL

Saturday 08 February 2025  00:57:24 +0200 (0:03:50.420)       0:03:53.004 *****
===============================================================================
roles/manage-bds : Manage BDs ----------------------------------------- 230.42s
Gathering Facts --------------------------------------------------------- 2.52s
roles/manage-bds : Declare aci_login ------------------------------------ 0.02s

Also, the first and last sets of tests were run against the same ACI fabric. It looks like WSL somehow impacts the execution, but I still cannot understand why HTTPAPI is so much worse in the first set.
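The six "Manage BDs" task times above can be reduced to one with/without ratio per environment, which makes the pattern easier to see. A small sketch using only the numbers posted in this comment:

```python
# "Manage BDs" task durations in seconds, copied from the recaps above.
runs = {
    "Linux VM": {"with": 195.26, "without": 94.59},
    "Windows VM + WSL": {"with": 209.29, "without": 207.53},
    "Windows workstation + WSL": {"with": 234.62, "without": 230.42},
}

# A ratio above 1 means the run with HTTPAPI was slower.
ratios = {name: t["with"] / t["without"] for name, t in runs.items()}
for name, ratio in ratios.items():
    print(f"{name}: with/without = {ratio:.2f}x")
```

On the Linux VM the HTTPAPI run takes about twice as long, while under WSL the two variants are within about 2% of each other, mainly because the run without the plugin is much slower there.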

I will try to check with postman and come back.

thanks
iordanis
