Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Performance improvement for arrays with small second dimension #9

Open
max9111 opened this issue Nov 18, 2019 · 0 comments
Open

Performance improvement for arrays with small second dimension #9

max9111 opened this issue Nov 18, 2019 · 0 comments

Comments

@max9111
Copy link

max9111 commented Nov 18, 2019

Let's start with a small example (possible speedup by a factor of 20)

Two issues are reducing performance:

  • The compiler doesn't know the second shape of the input arrays (unable to unroll the loop)
  • Calculating the square root is costly and can be avoided in the loop
import numpy as np
import hausdorff
import numba

XA = np.random.rand(10_000,2)
XB = np.random.rand(10_000,2)

@numba.jit(nopython=True, fastmath=True)
def cust_euclidian_hausdorff(XA, XB):
    nA = XA.shape[0]
    nB = XB.shape[0]
    
    cmax = 0.
    for i in range(nA):
        cmin = np.inf
        for j in range(nB):
            d = (XA[i,0]- XB[j,0])**2+(XA[i,1]- XB[j,1])**2
            if d<cmin:
                cmin = d
            if cmin<cmax:
                break
        if cmin>cmax and np.inf>cmin:
            cmax = cmin
    for j in range(nB):
        cmin = np.inf
        for i in range(nA):
            d = (XA[i,0]- XB[j,0])**2+(XA[i,1]- XB[j,1])**2
            if d<cmin:
                cmin = d
            if cmin<cmax:
                break
        if cmin>cmax and np.inf>cmin:
            cmax = cmin
    return np.sqrt(cmax)

#As shown above
%timeit cust_euclidian_hausdorff(u, v)
#21.7 ms ± 171 µs per loop (mean ± std. dev. of 7 runs, 1 loop each)

#With calculating sqrt in the loop, instead of comparing squared distances
%timeit cust_euclidian_hausdorff(u, v)
#33.2 ms ± 966 µs per loop (mean ± std. dev. of 7 runs, 1 loop each)

#Standard implementation
%timeit hausdorff.hausdorff_distance(u,v)
#413 ms ± 3.24 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)

Possible workarounds

  • Implement unrolled function for arrays with small second dimension
  • Provide a method to generate a specialized function (with size of the second dimension as input, the compiler will do the remaining part)

Example for function generation

distance.py

import numba
import numpy as np
from math import sqrt, pow, cos, sin, asin

def euclidean(n):
    @numba.jit(nopython=True, fastmath=True)
    def inner(array_x, array_y):
        ret = 0.
        for i in range(n):
            ret += (array_x[i]-array_y[i])**2
        return ret

    @numba.jit(nopython=True, fastmath=True)
    def outer(val):
        return sqrt(val)
    return inner, outer

hausdorff.py

import numpy as np
import numba
import hausdorff.distances as distances
from inspect import getmembers

def gen_hausdorff_dist(inner_dist,outer_dist,shape_1):
    @numba.jit(nopython=True, fastmath=True)
    def _hausdorff_dist(XA, XB):
        assert XA.ndim == 2 and XB.ndim == 2, \
            'arrays must be 2-dimensional'
        assert XA.shape[1] == shape_1, \
            'arrays must have predifened shape[1]'
        assert XB.shape[1] == shape_1, \
            'arrays must have predifened shape[1]'
        
        nA = XA.shape[0]
        nB = XB.shape[0]
        cmax = 0.
        for i in range(nA):
            cmin = np.inf
            for j in range(nB):
                d = inner_dist(XA[i,:], XB[j,:])
                if d<cmin:
                    cmin = d
                if cmin<cmax:
                    break
            if cmin>cmax and np.inf>cmin:
                cmax = cmin
        for j in range(nB):
            cmin = np.inf
            for i in range(nA):
                d = inner_dist(XA[i,:], XB[j,:])
                if d<cmin:
                    cmin = d
                if cmin<cmax:
                    break
            if cmin>cmax and np.inf>cmin:
                cmax = cmin
        return outer_dist(cmax)
    return _hausdorff_dist

def gen_hausdorff_distance(shape_1, distance='euclidean'):
    distance_function = getattr(distances, distance)
    inner_dist,outer_dist = distance_function(shape_1)
    
    return gen_hausdorff_dist(inner_dist,outer_dist,shape_1)

Timings

#Only declare it one if possible, redeclaring in a loop will lead to recompilation
hausdorff_distance=hausdorff.gen_hausdorff_distance(u.shape[1])
%timeit hausdorff_distance(u, v)
20.4 ms ± 41 µs per loop (mean ± std. dev. of 7 runs, 1 loop each)
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

1 participant