
Slow query to AD #1105

Open · santosshen opened this issue Nov 1, 2023 · 3 comments

@santosshen

from datetime import datetime

from ldap3 import Server, Connection, set_config_parameter, SUBTREE, NONE

from ad.test.info import AD_SERVER_INFO

set_config_parameter('RESTARTABLE_SLEEPTIME', 0.5)
set_config_parameter('RESPONSE_WAITING_TIMEOUT', 2)

LDAP_BASE = AD_SERVER_INFO['LDAP_BASE']
AD_SERVER_IP = AD_SERVER_INFO['AD_SERVER_IP']
AD_USER = AD_SERVER_INFO['AD_USER']
AD_PASSWORD = AD_SERVER_INFO['AD_PASSWORD']

LDAP_SERVER = Server(host=AD_SERVER_IP, port=389, get_info=NONE, use_ssl=False)
LDAP_CONN = Connection(server=LDAP_SERVER, user=AD_USER, password=AD_PASSWORD, auto_bind=True,
                       client_strategy='RESTARTABLE', receive_timeout=2, check_names=False)


def paged_search_object():
    start_time = datetime.now()
    object_list = []
    response = LDAP_CONN.extend.standard.paged_search(
        search_base=LDAP_BASE,
        search_filter='(objectCategory=*)',
        search_scope=SUBTREE,
        attributes=['distinguishedName', 'ou', 'cn']
    )
    for res in response:
        if res.get('attributes'):
            object_list.append(dict(res['attributes']))
    end_time = datetime.now()
    time_elapsed = end_time - start_time
    seconds_elapsed = time_elapsed.total_seconds()
    print(len(object_list))
    print(f'{seconds_elapsed} s')
    return object_list


paged_search_object()
paged_search_object()

Output:

5829
3.489604 s
5829
3.40749 s

Using Java's javax.naming.ldap, searching the same 5829 objects takes about 900 ms; with Python's ldap3 it takes about 3.4 s.

Is there a problem with my query?

@santosshen (Author)

Total time: 5.2746 s
Function: paged_search_object at line 21

Line #      Hits         Time  Per Hit   % Time  Line Contents
==============================================================
    21                                           @func_line_time
    22                                           def paged_search_object():
    23         1         39.0     39.0      0.0      start_time = datetime.now()
    24         1          7.0      7.0      0.0      object_list = []
    25         2         76.0     38.0      0.0      response = LDAP_CONN.extend.standard.paged_search(
    26         1          3.0      3.0      0.0          search_base=LDAP_BASE,
    27         1          4.0      4.0      0.0          search_filter='(objectCategory=*)',
    28         1          3.0      3.0      0.0          search_scope=SUBTREE,
    29         1          5.0      5.0      0.0          attributes=['distinguishedName', 'ou', 'cn']
    30                                               )
    31      5830   52466375.0   8999.4     99.5      for res in response:
    32      5829      46411.0      8.0      0.1          if res.get('attributes'):
    33      5829     232829.0     39.9      0.4              object_list.append(dict(res['attributes']))
    34         1         39.0     39.0      0.0      end_time = datetime.now()
    35         1         17.0     17.0      0.0      time_elapsed = end_time - start_time
    36         1         16.0     16.0      0.0      seconds_elapsed = time_elapsed.total_seconds()
    37         1        142.0    142.0      0.0      print(len(object_list))
    38         1         68.0     68.0      0.0      print(f'{seconds_elapsed} s')
    39         1          4.0      4.0      0.0      return object_list
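Nearly all of the time (99.5%) is spent iterating the generator, i.e., in the network round trips that fetch each page. paged_search defaults to 100 entries per page, so 5829 objects take roughly 59 round trips. One way to test whether round-trip latency dominates is to request larger pages via the paged_size argument. A minimal sketch reusing the connection above (1000 is arbitrary here, but it matches AD's usual MaxPageSize cap):

response = LDAP_CONN.extend.standard.paged_search(
    search_base=LDAP_BASE,
    search_filter='(objectCategory=*)',
    search_scope=SUBTREE,
    attributes=['distinguishedName', 'ou', 'cn'],
    paged_size=1000  # default is 100 -> ~59 round trips for 5829 entries
)

If latency is the bottleneck, wall time should shrink roughly in proportion to the number of pages.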

@Gu-f commented Apr 18, 2024

I think this is a problem in the ldap3 library code: it issues one request per page, so N pages mean N sequential round trips to the AD service rather than one bulk pull.
The loop is inside this method:

# /ldap3/extend/standard/PagedSearch.py
    while cookie:
        result = connection.search(search_base,
                                   search_filter,
                                   search_scope,
                                   dereference_aliases,
                                   attributes,
                                   size_limit,
                                   time_limit,
                                   types_only,
                                   get_operational_attributes,
                                   controls,
                                   paged_size,
                                   paged_criticality,
                                   None if cookie is True else cookie)

        if not connection.strategy.sync:
            response, result = connection.get_response(result)
        else:
            if connection.strategy.thread_safe:
                _, result, response, _ = result
            else:
                response = connection.response
                result = connection.result

This design seems better suited to lazily iterating pages than to pulling everything at once. While looking into it I also ran into another problem; details in #1141.
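For comparison, the same pagination can be driven by hand with the Simple Paged Results control (RFC 2696), which makes the per-page round trip explicit and lets you choose the page size. A sketch, assuming the LDAP_CONN, LDAP_BASE and SUBTREE from the original report and a server that honors the control (AD does):

# Manual pagination: each search() call is one round trip;
# the server's cookie links consecutive pages (RFC 2696).
PAGED_CTRL = '1.2.840.113556.1.4.319'  # Simple Paged Results control OID

objects = []
cookie = None
while True:
    LDAP_CONN.search(search_base=LDAP_BASE,
                     search_filter='(objectCategory=*)',
                     search_scope=SUBTREE,
                     attributes=['distinguishedName', 'ou', 'cn'],
                     paged_size=1000,        # larger pages -> fewer round trips
                     paged_cookie=cookie)
    objects.extend(dict(e['attributes']) for e in LDAP_CONN.response
                   if e.get('attributes'))
    cookie = LDAP_CONN.result['controls'][PAGED_CTRL]['value']['cookie']
    if not cookie:                           # empty cookie: no more pages
        break

print(len(objects))

With paged_size=1000, the 5829 entries above should arrive in six round trips instead of roughly sixty.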

@santosshen (Author)

#1147
