Say I have a Numpy array of N = 10 random float numbers:
import numpy as np
np.random.seed(99)
N = 10
arr = np.random.uniform(0., 10., size=(N,))
print(arr)
out[1]: [6.72278559 4.88078399 8.25495174 0.31446388 8.08049963
5.6561742 2.97622499 0.46695721 9.90627399 0.06825733]
I want to find all unique pairs of numbers that are not different from each other more than a tolerance tol = 1. (i.e. absolute difference <= 1). Specifically, I want to get all unique pairs of indexes. The indexes of each close pair should be sorted, and all close pairs should be sorted by the first index. I managed to write the following working code:
def all_close_pairs(arr, tol=1.):
res = set()
for i, x1 in enumerate(arr):
for j, x2 in enumerate(arr):
if i == j:
continue
if np.isclose(x1, x2, rtol=0., atol=tol):
res.add(tuple(sorted([i, j])))
res = np.array(list(res))
return res[res[:,0].argsort()]
print(all_close_pairs(arr, tol=1.))
out[2]: [[1 5]
[2 4]
[3 7]
[3 9]
[7 9]]
However, in reality I have an array of N = 1000 numbers, and my code becomes extremely slow due to the nested for loops. I believe there are much more efficient ways to do this with Numpy vectorization. Does anyone know the fastest way to do this in Numpy?