I am facing the following problem: I have a (large) sample of unevenly distributed points $(X_i,Y_i)$ in a 2D space, and I would like to determine the local extrema of the density of the distribution.
I have tried the KernelDensity class in sklearn.
Does KernelDensity allow estimating the density at a point that is not in the sample?
If so, I cannot find the right syntax.
Here is an example:
import numpy as np
import pandas as pd
mean0=[0,0]
cov0=[[1,0],[0,1]]
mean1=[3,3]
cov1=[[1,0.2],[0.2,1]]
A=pd.DataFrame(np.vstack((np.random.multivariate_normal(mean0, cov0, 5000),np.random.multivariate_normal(mean1, cov1, 5000))))
A.columns=['X','Y']
A.describe()
from sklearn.neighbors import KernelDensity
kde = KernelDensity(bandwidth=0.04, metric='euclidean',
kernel='gaussian', algorithm='ball_tree')
kde.fit(A)
If I run this query:
kde.score_samples([(0,0)])
I get a negative number, which is clearly not a density:
array([-2.88134574])
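For reference, score_samples returns the logarithm of the estimated density rather than the density itself, which is why the value is negative; np.exp(-2.88134574) is about 0.056. Below is a minimal sketch of evaluating the density at arbitrary points, including points that are not in the sample. The random seed, the bandwidth of 0.3 and the query points are assumptions chosen for illustration, not values from the example above.

```python
import numpy as np
from sklearn.neighbors import KernelDensity

# Same two-component mixture as above; the seed is an assumption for reproducibility.
rng = np.random.default_rng(0)
sample = np.vstack((
    rng.multivariate_normal([0, 0], [[1, 0], [0, 1]], 5000),
    rng.multivariate_normal([3, 3], [[1, 0.2], [0.2, 1]], 5000),
))

# Bandwidth 0.3 is an illustrative choice, not a tuned value.
kde = KernelDensity(bandwidth=0.3, kernel="gaussian").fit(sample)

# Query points need not belong to the sample; shape is (n_queries, n_features).
query = np.array([[0.0, 0.0], [1.5, 1.5], [6.0, 6.0]])

log_density = kde.score_samples(query)  # log of the estimated density
density = np.exp(log_density)           # the estimated density itself

print(log_density)
print(density)
```

The exponentiated values are non-negative, and they are larger near the centre of a cluster than far out in the tail.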
I don't know if this is the right approach. I would then like to pass that function to an optimizer to find the local extrema (which library/function would you recommend?).
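One possible way to do this, sketched here under assumptions, is to minimise the negative log-density with scipy.optimize.minimize from several starting points; each run converges to a nearby local maximum of the estimated density. The starting points, the seed and the bandwidth of 0.5 are hypothetical choices for illustration. A very small bandwidth such as 0.04 tends to give a bumpy estimate with many spurious local maxima.

```python
import numpy as np
from scipy.optimize import minimize
from sklearn.neighbors import KernelDensity

# Same two-component mixture as in the example; the seed is an assumption.
rng = np.random.default_rng(0)
sample = np.vstack((
    rng.multivariate_normal([0, 0], [[1, 0], [0, 1]], 5000),
    rng.multivariate_normal([3, 3], [[1, 0.2], [0.2, 1]], 5000),
))

# A fairly wide bandwidth (assumed, not tuned) keeps the estimate smooth.
kde = KernelDensity(bandwidth=0.5, kernel="gaussian").fit(sample)


def neg_log_density(point):
    """Negative log-density at a single 2D point (minimising it maximises the density)."""
    # score_samples expects a 2D array of shape (n_queries, n_features).
    return -kde.score_samples(np.asarray(point).reshape(1, -1))[0]


# Hypothetical starting points, one near each suspected mode.
starts = [(0.5, -0.5), (2.5, 3.5)]

modes = []
for start in starts:
    # Nelder-Mead is derivative-free, so no gradient of the KDE is needed.
    result = minimize(neg_log_density, x0=np.array(start), method="Nelder-Mead")
    modes.append(result.x)
modes = np.array(modes)

print(modes)
```

In practice the starting points could be taken from a coarse grid or from a random subset of the sample, with duplicate results merged afterwards. Local minima of the density can be searched for in the same way by minimising the log-density itself, although on an unbounded domain the density also decays towards zero far from the data.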
Is there an easy way to do that in SageMath?
Thank you very much for your help.