Is There Any Good Way To Optimize The Speed Of This Python Code?
Solution 1:
I profiled the code and found that k_one_third() and k_two_third() are the bottlenecks. The two functions share some duplicated calculations. By merging them into a single function and decorating it with @numba.jit(parallel=True), I got a 4x speedup.
import numpy as np
from numba import jit

@jit(parallel=True)
def k_one_two_third(x):
    x0 = x ** (1/3)
    x1 = np.exp(-x ** 2)
    x2 = np.exp(-x)
    one = (2*x1/x0 + 4*x2/(x**(1/6)*(x0 + 1)))**2
    two = (2*x**(5/2)*x2/(x**3 + 6) + x1/x**(2/3))**2
    return one, two
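A plain-NumPy version of the merged function (a sketch without Numba; the helper name is mine) is handy for verifying that computing one and two together gives the same results as the original separate functions:

```python
import numpy as np

def k_one_two_third_np(x):
    # NumPy-only version of the merged function, for result checking
    x0 = x ** (1/3)
    x1 = np.exp(-x ** 2)
    x2 = np.exp(-x)
    # the shared subexpressions x0, x1, x2 are computed only once
    # and reused by both outputs -- this is the duplicated work
    # that merging the two functions removes
    one = (2*x1/x0 + 4*x2/(x**(1/6)*(x0 + 1)))**2
    two = (2*x**(5/2)*x2/(x**3 + 6) + x1/x**(2/3))**2
    return one, two

x = np.linspace(0.5, 3.0, 16)
one, two = k_one_two_third_np(x)
```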
Solution 2:
As said in the comments, large parts of the code should be rewritten to get the best performance. I have only modified the Simpson integration and tweaked @HYRY's answer a bit. This speeds up the calculation from 26.15 s to 1.76 s (15x) on the test data you provided. Replacing the np.einsum calls with simple loops should bring this below one second. (About 0.4 s comes from the improved integration; k_one_two_third(x) accounted for 24 s.)

For tips on getting performance out of Numba, see its performance documentation. The latest Numba version (0.39), the Intel SVML package, and options like fastmath=True make quite a big impact on your example.
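As a minimal illustration of those decorator options (the trivial kernel and names are mine; this sketch falls back to plain NumPy when Numba is not installed):

```python
import numpy as np

try:
    import numba as nb

    # the options discussed above: parallel loops, relaxed FP math,
    # and NumPy-style division semantics; SVML is picked up
    # automatically when the Intel SVML package is installed
    @nb.njit(parallel=True, fastmath=True, error_model='numpy')
    def scaled_exp(x):
        out = np.empty_like(x)
        for i in nb.prange(x.shape[0]):
            out[i] = 2.0 * np.exp(-x[i])
        return out
except ImportError:
    # plain-NumPy fallback with identical results
    def scaled_exp(x):
        return 2.0 * np.exp(-x)
```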
Code
import numpy as np
import numba as nb

# a bit faster than HYRY's version
@nb.njit(parallel=True, fastmath=True, error_model='numpy')
def k_one_two_third(x):
    one = np.empty(x.shape, dtype=x.dtype)
    two = np.empty(x.shape, dtype=x.dtype)
    for i in nb.prange(x.shape[0]):
        for j in range(x.shape[1]):
            for k in range(x.shape[2]):
                x0 = x[i,j,k] ** (1/3)
                x1 = np.exp(-x[i,j,k] ** 2)
                x2 = np.exp(-x[i,j,k])
                one[i,j,k] = (2*x1/x0 + 4*x2/(x[i,j,k]**(1/6)*(x0 + 1)))**2
                two[i,j,k] = (2*x[i,j,k]**(5/2)*x2/(x[i,j,k]**3 + 6) + x1/x[i,j,k]**(2/3))**2
    return one, two
# improved integration
@nb.njit(fastmath=True)
def simpson_nb(y, dx):
    s = y[0] + y[-1]
    n = y.shape[0] // 2
    for i in range(n - 1):
        s += 4.*y[i*2 + 1]
        s += 2.*y[i*2 + 2]
    s += 4*y[(n - 1)*2 + 1]
    return (dx/3.)*s
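The same loop, written as a pure-Python sketch (no Numba; function name mine), can be sanity-checked against an integral with a known value. Note that it assumes an odd number of uniformly spaced samples:

```python
import numpy as np

def simpson_const_dx(y, dx):
    # composite Simpson's rule for uniformly spaced samples,
    # mirroring the simpson_nb loop without the Numba decorator
    s = y[0] + y[-1]
    n = y.shape[0] // 2
    for i in range(n - 1):
        s += 4.0 * y[i*2 + 1]   # odd interior points, weight 4
        s += 2.0 * y[i*2 + 2]   # even interior points, weight 2
    s += 4.0 * y[(n - 1)*2 + 1]  # last odd point
    return (dx / 3.0) * s

# integral of sin(x) over [0, pi] is exactly 2
x = np.linspace(0.0, np.pi, 101)
approx = simpson_const_dx(np.sin(x), x[1] - x[0])
```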
@nb.jit(fastmath=True)
def spectrum(freq_c, number_bin, frequency, gamma, theta):
    theta_gamma_factor = np.einsum('i,j->ij', theta**2, gamma**2)
    theta_gamma_factor += 1.
    t_g_bessel_factor = 1. - 1./theta_gamma_factor
    number = np.concatenate((number_bin, np.zeros((number_bin.shape[0], 1), dtype=number_bin.dtype)), axis=1)
    number_theta_gamma = np.einsum('jk, ik->ijk', theta_gamma_factor**2*1./gamma**3, number)
    final = np.empty((np.size(frequency), np.size(freq_c[:,0]), np.size(theta)))
    # assume that dx is const. on integration
    # speed improvement over scipy.simps is about 4x
    # the Numba version vs. scipy.simps(y, x) is about 60x
    dx = gamma[1] - gamma[0]
    for i in range(np.size(frequency)):
        b_n_omega_theta_gamma = frequency[i]**2*number_theta_gamma
        eta = theta_gamma_factor**(1.5)*frequency[i]/2.
        eta = np.einsum('jk, ik->ijk', eta, 1./freq_c)
        one, two = k_one_two_third(eta)
        bessel_eta = np.einsum('jl, ijl->ijl', t_g_bessel_factor, one)
        bessel_eta += two
        integrand = np.multiply(bessel_eta, b_n_omega_theta_gamma, out=bessel_eta)
        # reorder array
        for j in range(integrand.shape[0]):
            for k in range(integrand.shape[1]):
                final[i, j, k] = simpson_nb(integrand[j, k, :], dx)
    return final
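The einsum signatures used in spectrum() can be verified on toy shapes (the array names and sizes below are placeholders for the real inputs, not the actual test data):

```python
import numpy as np

rng = np.random.default_rng(0)

# 'i,j->ij' is an outer product, as used for theta_gamma_factor
theta = np.array([0.1, 0.2, 0.3])
gamma = np.array([1.0, 2.0])
outer = np.einsum('i,j->ij', theta**2, gamma**2)
assert np.allclose(outer, np.outer(theta**2, gamma**2))

# 'jk, ik->ijk' broadcasts a (j,k) array against an (i,k) array,
# as used for number_theta_gamma and eta
a = rng.random((3, 2))   # stands in for theta_gamma_factor**2 / gamma**3
b = rng.random((4, 2))   # stands in for number
c = np.einsum('jk, ik->ijk', a, b)
```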