source('functions.R')
14 Hierarchical clustering
14.1 Hierarchical clustering
set.seed(311265)
data <- read_spss('datos_para_cluster.sav')
bw8hcluster(data,
            vars = c("X1", "X2", "X3", "X4", "X5", "X6", "X7"),
            weight_var = "NULL",
            id_var = 'ID',
            standardize = FALSE,
            distance = "squared_euclidean",
            method = "ward",
            dist_matrix = TRUE,
            agglom_schedule = TRUE,
            dendrogram = TRUE,
            membership = "range",
            n_clusters = NULL,                   # if membership = "single"
            min_clusters = 2,                    # if membership = "range"
            max_clusters = 4,                    # if membership = "range"
            run_nbclust = TRUE,                  # check
            elbow = TRUE,                        # check
            silhouette = TRUE,                   # check
            json_path = "hcluster_export.json",  # check
            # --- PERFORMANCE FILTERS (easy to adjust) ---
            limit_dist_matrix = 100,   # Maximum N for printing the distance matrix
            limit_agglom_steps = 50,   # Maximum number of final stages shown in the agglomeration schedule
            publish = TRUE)
NULL
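The internals of `bw8hcluster()` live in `functions.R` and are not shown here. As a rough base R sketch of the same kind of analysis, one could write the following. It assumes squared Euclidean distances, Ward linkage through `hclust(method = "ward.D")` (pairing `"ward.D"` with squared distances is one common way to obtain Ward's criterion), and membership for 2 to 4 clusters. The simulated `toy` data frame is a placeholder, not the contents of `datos_para_cluster.sav`.

```r
# Placeholder data: 30 cases and 7 numeric variables named like those in the call above
set.seed(1)
toy <- as.data.frame(matrix(rnorm(30 * 7), ncol = 7))
names(toy) <- paste0("X", 1:7)

# Squared Euclidean distances on the raw variables (mirrors standardize = FALSE)
d2 <- dist(toy, method = "euclidean")^2

# Ward linkage (assumption: "ward.D" applied to squared distances)
hc <- hclust(d2, method = "ward.D")

# Cluster membership for every solution in the range 2 to 4 (one column per k)
membership <- cutree(hc, k = 2:4)

# Agglomeration schedule: which two clusters merge at each stage, and at what height
schedule <- data.frame(stage       = seq_len(nrow(hc$merge)),
                       cluster_1   = hc$merge[, 1],
                       cluster_2   = hc$merge[, 2],
                       coefficient = hc$height)

tail(schedule, 5)   # last stages only, similar in spirit to limit_agglom_steps
head(membership)
```

In `hc$merge`, negative entries refer to single cases and positive entries to clusters formed at an earlier stage.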
Hierarchical Cluster Analysis
Method: WARD | Measure: SQUARED_EUCLIDEAN | Standardized: FALSE
Weighting applied through the variable: NULL (frequency rounding)
1. Agglomeration Schedule
Note: Only the last 50 merge stages are shown, for reasons of performance and analytical relevance.
2. Proximity Matrix
3. Cluster Membership
4. Validation: Elbow Method
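The elbow criterion tracks the total within-cluster sum of squares as the number of clusters grows and looks for the point where the decrease flattens. A minimal base R sketch follows, using simulated placeholder data and a hypothetical helper `total_wss()` that is not taken from `functions.R`:

```r
# Placeholder data and a Ward tree on squared Euclidean distances (assumptions)
set.seed(1)
toy <- as.data.frame(matrix(rnorm(30 * 7), ncol = 7))
hc <- hclust(dist(toy)^2, method = "ward.D")

# Hypothetical helper: total within-cluster sum of squares for one partition
total_wss <- function(data, groups) {
  per_cluster <- sapply(split(data, groups), function(g) {
    # squared deviations of each case from its cluster centroid
    sum(scale(g, center = TRUE, scale = FALSE)^2)
  })
  sum(per_cluster)
}

k_values <- 1:6
wss_by_k <- sapply(k_values, function(k) total_wss(toy, cutree(hc, k = k)))

# The "elbow" is the k after which the curve stops dropping sharply
plot(k_values, wss_by_k, type = "b",
     xlab = "Number of clusters",
     ylab = "Total within-cluster sum of squares")
```

Because the partitions cut from one tree are nested, the curve never increases with k; the judgment call is where it bends.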
5. Validation: Silhouette Analysis
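Silhouette analysis compares, for each case, how close it is to its own cluster versus the nearest other cluster; the average width per solution can then be compared across values of k. The sketch below uses `silhouette()` from the `cluster` package on simulated placeholder data. Using plain (unsquared) Euclidean distances for the silhouette is an assumption here, since the source does not show which distance `bw8hcluster()` passes on.

```r
library(cluster)  # provides silhouette()

# Placeholder data, not the chapter's data set
set.seed(1)
toy <- as.data.frame(matrix(rnorm(30 * 7), ncol = 7))

d  <- dist(toy)                        # Euclidean distances (assumption)
hc <- hclust(d^2, method = "ward.D")   # Ward tree on squared distances (assumption)

# Average silhouette width for each solution from 2 to 4 clusters
avg_sil <- sapply(2:4, function(k) {
  sil <- silhouette(cutree(hc, k = k), d)
  mean(sil[, "sil_width"])
})
names(avg_sil) <- 2:4

avg_sil  # higher values suggest better separated clusters
```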
6. Optimal Number of Clusters Diagnostic (NbClust)
*** : The Hubert index is a graphical method of determining the number of clusters.
In the plot of Hubert index, we seek a significant knee that corresponds to a
significant increase of the value of the measure i.e the significant peak in Hubert
index second differences plot.
*** : The D index is a graphical method of determining the number of clusters.
In the plot of D index, we seek a significant knee (the significant peak in Dindex
second differences plot) that corresponds to a significant increase of the value of
the measure.
*******************************************************************
* Among all indices:
* 14 proposed 2 as the best number of clusters
* 5 proposed 3 as the best number of clusters
* 5 proposed 4 as the best number of clusters
***** Conclusion *****
* According to the majority rule, the best number of clusters is 2
*******************************************************************
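The majority rule in this summary is a plain tally: each index proposes a number of clusters, and the most frequently proposed value is reported. The arithmetic can be reproduced from the counts printed above; the vector `votes` is typed in by hand for illustration and is not read from the NbClust result object.

```r
# Counts copied from the summary above: proposed number of clusters -> number of indices
votes <- c("2" = 14, "3" = 5, "4" = 5)

# Most frequently proposed number of clusters
best_k <- as.integer(names(votes)[which.max(votes)])

# Share of the voting indices that support the winning solution
support <- max(votes) / sum(votes)

best_k               # 2
round(support, 2)    # 0.58
```

Here 14 of the 24 voting indices favor two clusters, so the majority is clear but not unanimous; the three-cluster and four-cluster solutions each keep five votes.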