Cluster Analysis

“Cluster analysis, often referred to as segmentation or taxonomy analysis, is an exploratory method aimed at uncovering underlying structure in data. In Data Analytics, we frequently encounter large datasets whose observations share inherent similarities. To organize such data, we sort the observations into groups, or ‘clusters,’ based on how closely they resemble one another. Cluster analysis encompasses a range of methodologies, which fall broadly into hierarchical and non-hierarchical methods.”

“Hierarchical methods fall into two main categories: agglomerative and divisive. Agglomerative methods start with each observation in its own cluster and repeatedly merge the two most similar clusters. Merging continues until all observations sit in a single cluster; the analyst then selects the most suitable number of clusters from the sequence of solutions produced along the way. Divisive methods work in reverse: all observations start in a single cluster, which is split successively into smaller clusters. Agglomerative methods are the more prevalent choice and will be the main focus of this discussion.”
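The agglomerative procedure can be illustrated with a minimal sketch in plain Python. The post does not say how similarity between clusters is measured, so this sketch assumes Euclidean distance with single linkage (the distance between two clusters is the distance between their closest members). The function name `agglomerative`, the `n_clusters` stopping parameter, and the sample points are hypothetical choices made for illustration.

```python
from math import dist  # Euclidean distance, Python 3.8+


def agglomerative(points, n_clusters):
    """Naive single-linkage agglomerative clustering (illustrative only)."""
    # Start with every observation in its own cluster.
    clusters = [[p] for p in points]
    while len(clusters) > n_clusters:
        best = None  # (distance, i, j) for the closest pair of clusters
        for i in range(len(clusters)):
            for j in range(i + 1, len(clusters)):
                # Single linkage (assumed): distance between closest members.
                d = min(dist(a, b) for a in clusters[i] for b in clusters[j])
                if best is None or d < best[0]:
                    best = (d, i, j)
        _, i, j = best
        # Merge the most similar pair and drop the absorbed cluster.
        clusters[i].extend(clusters[j])
        del clusters[j]
    return clusters


# Made-up sample data: two tight pairs and one isolated point.
points = [(1.0, 1.0), (1.2, 0.9), (5.0, 5.1), (5.2, 4.9), (9.0, 1.0)]
result = agglomerative(points, 3)
for cluster in result:
    print(cluster)
```

Calling the function with `n_clusters=1` runs the merging all the way to a single cluster, which matches the description above. In practice one would record each merge and inspect the sequence of solutions rather than fix the cluster count up front.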

“The most widely used non-hierarchical method is K-means clustering. With this method, we partition a collection of n observations into K clusters, where K is chosen in advance and each observation is assigned to the cluster whose mean is nearest. K-means clustering is particularly useful when predefined group labels are unavailable and our goal is to assign similar data points to the predetermined number of groups (K).”
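As a rough illustration of K-means, the sketch below uses the standard alternating scheme (often called Lloyd's algorithm): assign each point to its nearest centroid, then move each centroid to the mean of its assigned points. The post does not specify an initialization or distance measure, so random initial centroids drawn from the data and Euclidean distance are assumptions here. The names `k_means`, `iterations`, and `seed`, along with the sample points, are hypothetical.

```python
import random
from math import dist  # Euclidean distance, Python 3.8+


def k_means(points, k, iterations=100, seed=0):
    """Basic K-means via alternating assignment and update steps (sketch)."""
    rng = random.Random(seed)
    # Assumed initialization: pick k distinct observations as centroids.
    centroids = rng.sample(points, k)
    groups = []
    for _ in range(iterations):
        # Assignment step: each point joins its nearest centroid's group.
        groups = [[] for _ in range(k)]
        for p in points:
            nearest = min(range(k), key=lambda c: dist(p, centroids[c]))
            groups[nearest].append(p)
        # Update step: move each centroid to the mean of its group.
        # An empty group keeps its previous centroid.
        new_centroids = [
            tuple(sum(coord) / len(g) for coord in zip(*g)) if g else centroids[i]
            for i, g in enumerate(groups)
        ]
        if new_centroids == centroids:  # assignments have stabilized
            break
        centroids = new_centroids
    return centroids, groups


# Made-up sample data: two well-separated groups of three points.
points = [(1.0, 1.0), (1.5, 2.0), (1.0, 0.5),
          (8.0, 8.0), (9.0, 9.5), (8.5, 9.0)]
centroids, groups = k_means(points, 2)
print(centroids)
print(groups)
```

The result can depend on the initial centroids, so real analyses commonly rerun the algorithm from several starting points and keep the best solution.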
