Detecting distribution skew

Could you elaborate on this topic?

Hello @Ashish_Sharma6
Skewness occurs when a distribution is not symmetric, typically showing a longer tail on one side.
Several methods can be used to detect skewness in data:

  • Visualization: In a plot such as a histogram, a skewed distribution looks asymmetric, with most data points concentrated on one side and a long tail on the other.
  • Skewness Coefficient: Skewness can be quantified using a skewness coefficient. Commonly used coefficients include Pearson’s skewness coefficient and the Fisher-Pearson standardized moment coefficient. A positive value indicates right skewness, a negative value indicates left skewness, and a value near zero indicates an approximately symmetric distribution.
  • Kurtosis: Kurtosis measures how heavy the tails of a distribution are. It does not measure asymmetry itself, but high kurtosis values indicate heavy tails, which can accompany skewness, so it is best read together with the skewness coefficient.
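
To make the two numeric checks above concrete, here is a minimal sketch in plain Python. It assumes the population (biased) form of the Fisher-Pearson coefficient, g1 = m3 / m2^1.5, and excess kurtosis, m4 / m2^2 - 3. The function names and the sample lists are made up for illustration; in practice a statistics library would normally be used instead.

```python
def central_moment(values, k):
    """k-th central moment (population form, divides by n)."""
    n = len(values)
    mean = sum(values) / n
    return sum((x - mean) ** k for x in values) / n


def sample_skewness(values):
    """Fisher-Pearson standardized moment coefficient g1 = m3 / m2**1.5."""
    return central_moment(values, 3) / central_moment(values, 2) ** 1.5


def excess_kurtosis(values):
    """Excess kurtosis m4 / m2**2 - 3 (0 for a normal distribution)."""
    return central_moment(values, 4) / central_moment(values, 2) ** 2 - 3.0


# Hypothetical toy data, chosen only to show the sign of the coefficient.
symmetric = [1, 2, 3, 4, 5]
right_skewed = [1, 1, 2, 2, 3, 20]          # long tail to the right
left_skewed = [-x for x in right_skewed]    # mirror image, tail to the left

for name, data in [("symmetric", symmetric),
                   ("right_skewed", right_skewed),
                   ("left_skewed", left_skewed)]:
    print(name, sample_skewness(data), excess_kurtosis(data))
```

The symmetric list gives a skewness of zero, the right-tailed list a positive value, and its mirror image a negative value of the same size, while the excess kurtosis is identical for the two mirrored lists. That is why kurtosis alone cannot tell you the direction of the skew.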

Handling Distribution Skew:

  • Data Transformation: Applying transformations such as logarithm, square root, or cube root can reduce the impact of skewness and make the data more symmetric.

  • Outlier Removal: Removing extreme outliers can help mitigate the effect of skewness on the data distribution.

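  • Data Scaling: Scaling the data (for example standardization or min-max scaling) puts features on a comparable range, which can improve model performance. Linear scaling on its own does not change the shape of the distribution, so it is usually combined with a transformation when the goal is to reduce skew.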
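
As a rough illustration of the handling options above, the sketch below compares the skewness coefficient of a small right-skewed sample before and after a log transform, and after min-max scaling. The data values and the helper name `skewness` are hypothetical, and `math.log1p` (log(1 + x)) is just one possible choice of transform that assumes non-negative data.

```python
import math


def skewness(values):
    """Fisher-Pearson coefficient g1 = m3 / m2**1.5 (population form)."""
    n = len(values)
    mean = sum(values) / n
    m2 = sum((x - mean) ** 2 for x in values) / n
    m3 = sum((x - mean) ** 3 for x in values) / n
    return m3 / m2 ** 1.5


# Hypothetical right-skewed sample with a few large values in the tail.
data = [1, 2, 2, 3, 4, 5, 8, 15, 40, 120]

# Data transformation: log1p compresses large values more than small ones.
log_data = [math.log1p(x) for x in data]

# Data scaling: min-max scaling is a linear map onto [0, 1].
lo, hi = min(data), max(data)
scaled = [(x - lo) / (hi - lo) for x in data]

print("original:", skewness(data))
print("log1p   :", skewness(log_data))
print("min-max :", skewness(scaled))
```

On this sample the log transform brings the coefficient noticeably closer to zero, while min-max scaling leaves it unchanged apart from floating-point rounding, since a linear rescaling does not alter the shape of the distribution.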

If you’d like to dive deeper into the topic, check out this new specialization: Mathematics for Machine Learning and Data Science Specialization

Regards
Isaak

Of note, one can start directly with the third course of this specialization without having taken the previous two.

Hello @Tom_Pham
The fact that you can start with the third course without taking the previous two offers some flexibility to learners who might already have a background in certain areas of mathematics.

If you don’t have a solid background in mathematics, I would advise you to follow the sequence, because the concepts build on each other and the third course assumes knowledge from the preceding linear algebra and calculus courses.

Regards Isaak
