SQL-Style Inner Join in Python?
Solution 1:
I would suggest a method similar to a hash join (a discriminative join):
l = [('a', 'beta'), ('b', 'alpha'), ('c', 'beta')]
r = [('b', 37), ('c', 22), ('j', 93)]
d = {}
for t in l:
    d.setdefault(t[0], ([], []))[0].append(t[1:])
for t in r:
    d.setdefault(t[0], ([], []))[1].append(t[1:])
from itertools import product
ans = [(k,) + l + r for k, v in d.items() for l, r in product(*v)]
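This results in the following (row order follows dict iteration order; on Python 3.7+ the 'b' row comes first):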
[('c', 'beta', 22), ('b', 'alpha', 37)]
This has lower complexity, closer to O(n+m) than O(nm), because it avoids computing the full product(l, r) and then filtering it, as the naive method would.
Mostly from Fritz Henglein's "Relational algebra with discriminative joins and lazy products".
It can also be written as:
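from itertools import product

def accumulate(it):
    # Group rows by their first field: [(a, b, ...)] -> {a: [(b, ...)]}
    d = {}
    for e in it:
        d.setdefault(e[0], []).append(e[1:])
    return d

l = accumulate([('a', 'beta'), ('b', 'alpha'), ('c', 'beta')])
r = accumulate([('b', 37), ('c', 22), ('j', 93)])
# Join only the keys that are present on both sides.
ans = [(k,) + lv + rv for k in l.keys() & r.keys() for lv, rv in product(l[k], r[k])]
This accumulates both lists separately (turning [(a, b, ...)] into {a: [(b, ...)]}) and then computes the intersection of their key sets. This looks cleaner. Plain dictionaries do not support l & r directly, so the key views are intersected with l.keys() & r.keys(); set(l) & set(r) works as well. The loop variables are named lv and rv so that they do not shadow the dictionaries l and r inside the comprehension.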
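As a rough illustration of the complexity argument in this answer, the following sketch wraps the grouping approach in a function and checks it against a naive nested-loop join on input with repeated keys. The names hash_join and naive_join are made up for this example, and rows are assumed to be tuples whose first field is the join key.

```python
from collections import defaultdict
from itertools import product


def hash_join(left, right):
    """Inner join on the first field of each tuple (hypothetical helper)."""
    # Group the remaining fields of every row by its key.
    lgroups = defaultdict(list)
    for row in left:
        lgroups[row[0]].append(row[1:])
    rgroups = defaultdict(list)
    for row in right:
        rgroups[row[0]].append(row[1:])
    # Only keys present on both sides produce output rows.
    return [
        (k,) + lv + rv
        for k in lgroups.keys() & rgroups.keys()
        for lv, rv in product(lgroups[k], rgroups[k])
    ]


def naive_join(left, right):
    """Reference version: compares every pair of rows, O(n*m)."""
    return [a + b[1:] for a in left for b in right if a[0] == b[0]]


left = [('a', 'beta'), ('b', 'alpha'), ('c', 'beta'), ('c', 'gamma')]
right = [('b', 37), ('c', 22), ('c', 5), ('j', 93)]

# Row order may differ between the two versions, so compare sorted results.
assert sorted(hash_join(left, right)) == sorted(naive_join(left, right))
print(sorted(hash_join(left, right)))
```

Duplicate keys on either side yield every combination, which matches SQL inner-join semantics. The work done is proportional to the input sizes plus the number of output rows, rather than to every pair of input rows.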
Solution 2:
There is no built-in method. Adding a package such as numpy would give extra functionality, I assume.
But if you want to solve it without using any extra packages, you can use a one-liner like this:
ar1 = [('a', 'beta'), ('b', 'alpha'), ('c', 'beta')]
ar2 = [('b', 37), ('c', 22), ('j', 93)]
final_ar = [tuple(list(i)+[j[1]]) for i in ar1 for j in ar2 if i[0]==j[0]]
print(final_ar)
Output:
[('b', 'alpha', 37), ('c', 'beta', 22)]
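The one-liner above scans ar2 once for every row of ar1. If the keys in ar2 happen to be unique (an assumption that holds for this sample data but is not stated in the question), a plain dictionary lookup could avoid the nested scan, still without extra packages. The name lookup is only illustrative.

```python
ar1 = [('a', 'beta'), ('b', 'alpha'), ('c', 'beta')]
ar2 = [('b', 37), ('c', 22), ('j', 93)]

# Assumption: each key appears at most once in ar2, so a dict can index it.
lookup = {key: value for key, value in ar2}

# One pass over ar1; rows whose key is missing from ar2 are dropped.
final_ar = [row + (lookup[row[0]],) for row in ar1 if row[0] in lookup]
print(final_ar)  # [('b', 'alpha', 37), ('c', 'beta', 22)]
```

If ar2 can contain repeated keys, this shortcut would silently keep only the last value per key, so the grouping approach from Solution 1 is the safer choice in that case.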