states clustering

K-Means clustering of Indian states using socioeconomic indicators with PCA visualization.

This project clusters the 28 Indian states into three meaningful groups using key socioeconomic indicators such as literacy rate, per capita income, population density, and unemployment rate.

The analysis normalizes the indicators before applying K-Means, so that no single indicator dominates the distance calculation because of its scale, and then uses PCA to project the data onto two components for a 2D visualization. Each cluster is interpreted through statistical summaries of its indicators to highlight regional development patterns, making the results useful for policy analysis, economic insights, and further machine learning studies.
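The pipeline described above can be sketched roughly as follows. This is a minimal illustration, not the project's notebook code: the column names are assumed, the 28 rows are randomly generated placeholders rather than real state data, and `StandardScaler` is one plausible choice of normalization.

```python
import numpy as np
import pandas as pd
from sklearn.cluster import KMeans
from sklearn.decomposition import PCA
from sklearn.preprocessing import StandardScaler

# Placeholder data: 28 synthetic rows standing in for the states.
# Column names and value ranges are assumptions for illustration only.
rng = np.random.default_rng(0)
features = [
    "literacy_rate",
    "per_capita_income",
    "population_density",
    "unemployment_rate",
]
df = pd.DataFrame({
    "literacy_rate": rng.uniform(60, 95, 28),
    "per_capita_income": rng.uniform(50_000, 500_000, 28),
    "population_density": rng.uniform(20, 1200, 28),
    "unemployment_rate": rng.uniform(1, 10, 28),
})

# Standardize each indicator to zero mean and unit variance so that
# large-valued columns (e.g. income) do not dominate the distances.
X = StandardScaler().fit_transform(df[features])

# Group the rows into three clusters.
kmeans = KMeans(n_clusters=3, n_init=10, random_state=0)
df["cluster"] = kmeans.fit_predict(X)

# Project the standardized indicators onto two principal components
# to obtain coordinates for a 2D scatter plot.
pca = PCA(n_components=2)
coords = pca.fit_transform(X)
df["pc1"], df["pc2"] = coords[:, 0], coords[:, 1]

# Per-cluster means of the original (unscaled) indicators,
# used to interpret what each cluster represents.
summary = df.groupby("cluster")[features].mean()
print(summary.round(2))
```

The `pc1` and `pc2` columns can be passed to a matplotlib or seaborn scatter plot, colored by `cluster`, to produce the 2D view. Summarizing on the unscaled columns keeps the cluster means in their original units, which is easier to interpret than standardized values.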

Tech used: Python, pandas, scikit-learn, matplotlib, seaborn, Jupyter Notebook.

References

Census of India 2011: Official population census used as the source for literacy rate and population density statistics across Indian states.

NSDP Per Capita Income (2023–24): State-wise per capita income data sourced from Net State Domestic Product (NSDP) estimates.

RBI Unemployment Statistics (2022–23): Unemployment rate data averaged from RBI-reported rural and urban employment statistics.