Curse of dimensionality with 1D, 2D, and 3D example
A quick analysis shows how the distance between 60 random points grows as dimensionality increases. First, random points are drawn in one dimension:
# 1-Dimension Plot
>>> import numpy as np
>>> import pandas as pd
>>> import matplotlib.pyplot as plt
>>> one_d_data = np.random.rand(60,1)
>>> one_d_data_df = pd.DataFrame(one_d_data)
>>> one_d_data_df.columns = ["1D_Data"]
>>> one_d_data_df["height"] = 1
>>> plt.figure()
>>> plt.scatter(one_d_data_df["1D_Data"],one_d_data_df["height"])
>>> plt.yticks([])
>>> plt.xlabel("1-D points")
>>> plt.show()
In the following graph, all 60 data points lie very close together in one dimension:
[Figure: scatter plot of 60 random points in one dimension]
Here we repeat the same experiment in 2D space, drawing 60 random points with x and y coordinates and plotting them:
# 2-Dimensions Plot
>>> two_d_data = np.random.rand(60,2)
>>> two_d_data_df = pd.DataFrame(two_d_data)
>>> two_d_data_df.columns = ["x_axis","y_axis"]
>>> plt.figure()
>>> plt.scatter(two_d_data_df["x_axis"],two_d_data_df["y_axis"])
>>> plt.xlabel("x_axis")
>>> plt.ylabel("y_axis")
>>> plt.show()
In the 2D graph, larger gaps appear between points, even though the number of points is still 60:
[Figure: scatter plot of 60 random points in two dimensions]
Finally, 60 data points are drawn in 3D space, where the gaps between points widen further still. This illustrates visually that as the number of dimensions increases, the same number of points is spread over far more space, which makes it harder for a classifier to detect the signal:
# 3-Dimensions Plot
>>> three_d_data = np.random.rand(60,3)
>>> three_d_data_df = pd.DataFrame(three_d_data)
>>> three_d_data_df.columns = ["x_axis","y_axis","z_axis"]
>>> from mpl_toolkits.mplot3d import Axes3D
>>> fig = plt.figure()
>>> ax = fig.add_subplot(111, projection='3d')
>>> ax.scatter(three_d_data_df["x_axis"],three_d_data_df["y_axis"],three_d_data_df["z_axis"])
>>> plt.show()
[Figure: scatter plot of 60 random points in three dimensions]
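The plots show the effect only by eye, so it can help to put a number on it. The following is a minimal sketch, not part of the original analysis: it estimates the average distance from each point to its nearest neighbour for 60 uniform random points in one, two, and three dimensions. The helper `mean_nearest_neighbor_distance`, the fixed seed, and the number of repeated trials are all assumptions introduced here for illustration.

```python
import numpy as np

def mean_nearest_neighbor_distance(points):
    """Average Euclidean distance from each point to its closest other point."""
    # Pairwise differences via broadcasting: shape (n, n, d)
    diffs = points[:, np.newaxis, :] - points[np.newaxis, :, :]
    dists = np.sqrt((diffs ** 2).sum(axis=2))
    # Exclude each point's zero distance to itself
    np.fill_diagonal(dists, np.inf)
    return dists.min(axis=1).mean()

rng = np.random.default_rng(42)  # fixed seed (assumption) for repeatability
n_points = 60                    # same sample size as in the plots
n_trials = 200                   # repeat draws to smooth out sampling noise

mean_nn = {}
for dim in (1, 2, 3):
    trials = [
        mean_nearest_neighbor_distance(rng.random((n_points, dim)))
        for _ in range(n_trials)
    ]
    mean_nn[dim] = float(np.mean(trials))
    print(f"{dim}-D: mean nearest-neighbour distance = {mean_nn[dim]:.3f}")
```

If the visual impression is right, the printed distance should rise with each added dimension, meaning that every point has fewer close neighbours to learn from.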