14  Hierarchical Clustering

14.1 Hierarchical Clustering

library(haven)  # provides read_spss(); harmless if functions.R already loads it
source('functions.R')
set.seed(311265)
data <- read_spss('datos_para_cluster.sav')

bw8hcluster(data,
            vars = c("X1", "X2", "X3", "X4", "X5", "X6", "X7"),
            weight_var = "NULL",  # passed as the string "NULL" (echoed in the output header)
            id_var = 'ID',
            standardize = FALSE,
            distance = "squared_euclidean",
            method = "ward",
            dist_matrix = TRUE,
            agglom_schedule = TRUE,
            dendrogram = TRUE,
            membership = "range",
            n_clusters = NULL,    # if membership is "single"
            min_clusters = 2,     # if membership is "range"
            max_clusters = 4,     # if membership is "range"
            run_nbclust = TRUE,   # check
            elbow = TRUE,         # check
            silhouette = TRUE,    # check
            json_path = "hcluster_export.json",  # check
            # --- PERFORMANCE FILTERS (easy to modify) ---
            limit_dist_matrix = 100,  # maximum N for printing the distance matrix
            limit_agglom_steps = 50,  # maximum number of final steps shown in the schedule
            publish = TRUE)
NULL

Hierarchical Cluster Analysis

Method: WARD | Measure: SQUARED_EUCLIDEAN | Standardized: FALSE

Weighting applied via variable: NULL (frequencies rounded)

1. Agglomeration Schedule

Note: Only the last 50 merge steps are shown, for performance and analytical relevance.
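For readers without access to `functions.R`, an agglomeration schedule of this kind can be approximated in base R. This is only a sketch, not the implementation behind `bw8hcluster()`: it assumes the built-in `USArrests` data as a stand-in for `datos_para_cluster.sav`, and it assumes that Ward's method on squared Euclidean distances maps to `hclust(..., method = "ward.D")` applied to squared distances.

```r
# Stand-in data (assumption): built-in USArrests instead of the SPSS file.
x <- as.matrix(USArrests)

# Squared Euclidean distances, no standardization (standardize = FALSE).
d2 <- dist(x, method = "euclidean")^2

# With squared distances as input, "ward.D" follows Ward's criterion.
hc <- hclust(d2, method = "ward.D")

# One row per merge step. In hc$merge, negative entries are single cases
# and positive entries refer to clusters formed at an earlier step.
schedule <- data.frame(
  step        = seq_len(nrow(hc$merge)),
  cluster_1   = hc$merge[, 1],
  cluster_2   = hc$merge[, 2],
  coefficient = hc$height
)

# Show only the last steps, in the spirit of limit_agglom_steps.
last_steps <- tail(schedule, 10)
print(last_steps)
```

The largest jumps in `coefficient` over the final steps are the usual informal cue for where to stop merging.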


2. Proximity Matrix


3. Cluster Membership
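With `membership = "range"`, `min_clusters = 2` and `max_clusters = 4`, the membership table presumably holds one column per solution. A minimal base-R sketch of that idea, again using `USArrests` as an assumed stand-in data set and `cutree()` rather than the internals of `bw8hcluster()`:

```r
# Stand-in data (assumption) and the same Ward / squared Euclidean setup.
x  <- as.matrix(USArrests)
hc <- hclust(dist(x)^2, method = "ward.D")

# cutree() accepts a vector of k values and returns one column per solution.
membership <- cutree(hc, k = 2:4)
colnames(membership) <- paste0("clusters_", 2:4)

# First rows of the membership table and the cluster sizes per solution.
print(head(membership))
print(apply(membership, 2, table))
```

Because the solutions come from the same tree, they are nested: each cluster in the 3-cluster solution lies entirely inside one cluster of the 2-cluster solution.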


4. Validation: Elbow Method
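The elbow method plots the total within-cluster sum of squares against the number of clusters and looks for the point where the curve flattens. The sketch below illustrates the computation in base R; the stand-in data (`USArrests`), the helper name `wss_for_k` and the range of k are assumptions, not part of the original analysis.

```r
# Stand-in data (assumption) and the same Ward / squared Euclidean setup.
x  <- as.matrix(USArrests)
hc <- hclust(dist(x)^2, method = "ward.D")

# Hypothetical helper: total within-cluster sum of squares for a k-cluster cut.
wss_for_k <- function(k) {
  groups <- cutree(hc, k = k)
  per_cluster <- sapply(split(as.data.frame(x), groups), function(g) {
    # Squared deviations of each case from its cluster centroid.
    sum(scale(as.matrix(g), center = TRUE, scale = FALSE)^2)
  })
  sum(per_cluster)
}

ks  <- 1:6
wss <- sapply(ks, wss_for_k)

# The "elbow" is where adding a cluster stops reducing WSS substantially.
plot(ks, wss, type = "b",
     xlab = "Number of clusters", ylab = "Total within-cluster SS")
```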

5. Validation: Silhouette Analysis
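Silhouette analysis scores each case by how much closer it is to its own cluster than to the nearest other cluster; the average width per solution ranges from -1 to 1, with higher values indicating better separation. A minimal sketch with `cluster::silhouette()` follows. Using the squared Euclidean distances for the silhouette, and `USArrests` as the data, are assumptions made here for illustration.

```r
library(cluster)  # recommended package shipped with R; provides silhouette()

# Stand-in data (assumption) and the same Ward / squared Euclidean setup.
x  <- as.matrix(USArrests)
d2 <- dist(x)^2
hc <- hclust(d2, method = "ward.D")

# Average silhouette width for each solution in the 2..4 range.
ks <- 2:4
avg_width <- sapply(ks, function(k) {
  sil <- silhouette(cutree(hc, k = k), d2)
  mean(sil[, "sil_width"])
})
names(avg_width) <- ks

print(round(avg_width, 3))
```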


6. Optimal Number of Clusters Diagnostic (NbClust)

*** : The Hubert index is a graphical method of determining the number of clusters.
                In the plot of Hubert index, we seek a significant knee that corresponds to a 
                significant increase of the value of the measure i.e the significant peak in Hubert
                index second differences plot. 
 
*** : The D index is a graphical method of determining the number of clusters. 
                In the plot of D index, we seek a significant knee (the significant peak in Dindex
                second differences plot) that corresponds to a significant increase of the value of
                the measure. 
 
******************************************************************* 
* Among all indices:                                                
* 14 proposed 2 as the best number of clusters 
* 5 proposed 3 as the best number of clusters 
* 5 proposed 4 as the best number of clusters 

                   ***** Conclusion *****                            
 
* According to the majority rule, the best number of clusters is  2 
 
 
******************************************************************* 
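The majority rule reported above is a simple vote count: each index proposes a number of clusters, and the most frequently proposed value wins. The tally can be reproduced from the counts printed in the output; the variable names are illustrative only.

```r
# Votes taken from the NbClust summary above: number of indices per proposed k.
votes <- c("2" = 14, "3" = 5, "4" = 5)

# Majority rule: pick the k proposed by the most indices.
best_k <- as.integer(names(votes)[which.max(votes)])

# Share of the 24 voting indices that support the winning solution.
support <- max(votes) / sum(votes)

print(best_k)
print(round(support, 2))
```

Here 14 of 24 indices favor two clusters, so the recommendation is a clear majority rather than a narrow plurality, although the 3- and 4-cluster solutions still deserve a look against the dendrogram.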


7. Dendrogram
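A dendrogram comparable to the one produced by the report can be drawn with base R graphics. This sketch assumes the `USArrests` stand-in data and highlights the 2-cluster solution suggested by the majority rule; it is not the plotting code used by `bw8hcluster()`.

```r
# Stand-in data (assumption) and the same Ward / squared Euclidean setup.
x  <- as.matrix(USArrests)
hc <- hclust(dist(x)^2, method = "ward.D")

# Draw the tree; labels are suppressed to keep large samples readable.
plot(hc, labels = FALSE, hang = -1,
     main = "Dendrogram (Ward, squared Euclidean)",
     xlab = "", sub = "", ylab = "Fusion coefficient")

# Outline the two clusters of the k = 2 solution on the existing plot.
rect.hclust(hc, k = 2, border = "red")
```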