How To Fit A Column Of A Dataframe Into Poisson Distribution In Python
I have been trying to find a way to fit some of my columns (which contain user click data) to a Poisson distribution in Python. These columns (e.g., click_website_1, click_website_2
Solution 1:
Here is a quick way to check whether your data follows a Poisson distribution: plot the probability mass function (pmf) under the assumption that the data is Poisson-distributed with rate parameter lambda = data.mean().
import numpy as np
from scipy.special import factorial  # scipy.misc.factorial was removed in newer SciPy releases

def poisson(k, lamb):
    """Poisson pmf; lamb is the rate parameter being fitted."""
    return (lamb**k / factorial(k)) * np.exp(-lamb)
# Collect the click counts, since we will need them later (df is your DataFrame).
clicks = df["click_website_1"]
Here we use the pmf of the Poisson distribution.
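As a sanity check on the hand-written pmf, it can be compared against `scipy.stats.poisson.pmf`. This is only a minimal sketch: the function name `poisson_pmf`, the rate `3.2` and the range of counts are arbitrary choices for illustration and do not come from the original data.

```python
import numpy as np
from scipy.special import factorial
from scipy.stats import poisson as scipy_poisson


def poisson_pmf(k, lamb):
    """Hand-written Poisson pmf: P(K = k) = lamb**k * exp(-lamb) / k!"""
    return (lamb**k / factorial(k)) * np.exp(-lamb)


k = np.arange(0, 15)  # counts 0..14 (illustrative range)
lamb = 3.2            # arbitrary rate, chosen only for this check

manual = poisson_pmf(k, lamb)
reference = scipy_poisson.pmf(k, lamb)

# Largest absolute difference between the two implementations.
max_abs_diff = np.max(np.abs(manual - reference))
print(max_abs_diff)
```

For small counts the two agree to floating-point precision. For large counts the library version is probably the safer choice, because `factorial(k)` overflows to `inf` in float64 once k exceeds 170.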
Now let's do some modeling: from the data (click_website_1) we'll estimate the Poisson parameter using the maximum likelihood estimate (MLE), which turns out to be just the sample mean.
lamb = clicks.mean()
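The claim that the MLE equals the sample mean can be checked numerically by minimizing the negative log-likelihood directly. The sketch below uses synthetic Poisson counts as a stand-in for the real click column; the seed, the true rate `4.0` and the sample size are assumptions made only for this illustration.

```python
import numpy as np
from scipy.optimize import minimize_scalar
from scipy.stats import poisson

# Synthetic stand-in for the click column (not the original data).
rng = np.random.default_rng(0)
clicks = rng.poisson(lam=4.0, size=2000)


def neg_log_likelihood(lamb):
    """Negative Poisson log-likelihood of the sample for a given rate."""
    return -poisson.logpmf(clicks, lamb).sum()


# Search for the rate that minimizes the negative log-likelihood.
result = minimize_scalar(
    neg_log_likelihood,
    bounds=(1e-6, clicks.max() + 1.0),
    method="bounded",
)

lamb_mle = result.x
lamb_mean = clicks.mean()
print(lamb_mle, lamb_mean)
```

The two printed values should agree up to the optimizer's tolerance, which is why the mean can be used directly instead of running an optimizer.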
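# Plot the pmf using lamb as the estimate for `lambda`.
# Sort the counts first so the curve is drawn in increasing order of k,
# and reset the index so the x-axis follows that sorted order.
clicks.sort_values().reset_index(drop=True).apply(poisson, args=(lamb,)).plot()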
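Plotting the fitted pmf alone does not show how well it matches the data, so one possible next step is to put the observed relative frequency of each count next to the fitted probability. The sketch below is one way to do that, not the original answer's method. The DataFrame is synthetic (seed, rate and size are assumptions), and `comparison` is a hypothetical name introduced here.

```python
import numpy as np
import pandas as pd
from scipy.stats import poisson

# Synthetic stand-in for the real DataFrame (not the original data).
rng = np.random.default_rng(42)
df = pd.DataFrame({"click_website_1": rng.poisson(lam=3.0, size=5000)})

clicks = df["click_website_1"]
lamb = clicks.mean()  # MLE of the Poisson rate

# Every count value from 0 up to the largest one observed.
k = np.arange(0, clicks.max() + 1)

# Observed share of rows for each count; counts that never occur get 0.
observed = clicks.value_counts(normalize=True).reindex(k, fill_value=0.0)

# Probability of each count under the fitted Poisson distribution.
expected = pd.Series(poisson.pmf(k, lamb), index=k)

comparison = pd.DataFrame({"observed": observed, "expected": expected})
print(comparison)
```

If matplotlib is installed, `comparison.plot(kind="bar")` draws the two columns side by side. Large, systematic gaps between them suggest the column is not well described by a Poisson distribution.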