Categorical Latent Variables - R assignment

6 important questions on Categorical Latent Variables - R assignment

How can you fit a normal mixture model for 2:12 potential clusters, and plot the BIC values for each model?

library("mclust") # R is case-sensitive: Library() would fail

clustbic <- mclustBIC(data, G = 2:12) # fit models with 2 to 12 clusters
clustbic
plot(clustbic) # plot the BIC values for every model type and number of clusters

Printing clustbic shows the three best models according to BIC.

The mclust package defines BIC so that larger values indicate a better fit, so the model with the highest BIC is preferred.
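To make the sign convention concrete, here is a minimal sketch in base R. It assumes the definition given in the mclust documentation, BIC = 2 * log-likelihood - (number of parameters) * log(number of observations). The helper name bic_mclust and all numbers are hypothetical and only serve as an illustration.

```r
# mclust-style BIC: larger is better (the conventional BIC is the negative of this).
bic_mclust <- function(loglik, n_params, n_obs) {
  2 * loglik - n_params * log(n_obs)
}

# Hypothetical values, for illustration only.
bic_small <- bic_mclust(loglik = -520, n_params = 5,  n_obs = 200)
bic_large <- bic_mclust(loglik = -515, n_params = 11, n_obs = 200)

# The small gain in log-likelihood does not offset the penalty for extra parameters.
bic_small > bic_large
```

With this convention, the simpler model would be preferred here because its BIC is higher.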

After your exploratory fitting to find the number of clusters, how can you fit the best-fitting model to your data?

clustbic <- mclustBIC(data, G = 2:12) # BIC for every model type with 2 to 12 clusters

clusfit <- Mclust(data, x = clustbic) # fits the best model according to BIC, reusing the BIC results
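As a self-contained illustration of the two steps above, the sketch below runs them on the faithful dataset that ships with R. The dataset and the smaller range G = 2:6 are assumptions chosen only to keep the example quick; substitute your own data and range.

```r
library(mclust)  # assumes the mclust package is installed

# 'faithful' stands in for your own data here.
bic <- mclustBIC(faithful, G = 2:6)  # exploratory step: BIC for every candidate model
fit <- Mclust(faithful, x = bic)     # refit only the best model according to BIC

summary(fit)   # overview of the selected model
fit$modelName  # covariance structure of the selected model
fit$G          # number of clusters in the selected model
```

Passing the BIC object through x avoids refitting every candidate model a second time.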

When you've fitted your model, how can you obtain some interesting statistics from the model?

clusfit$parameters$pro # mixing proportions: the estimated share of each cluster (class probabilities)
clusfit$z # posterior probability of each cluster for every individual (one row per individual)
clusfit$classification # the cluster each individual is assigned to (highest posterior probability)
clusfit$parameters$mean # mean of each variable (question) per cluster; variables in rows, clusters in columns
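The sketch below shows how these quantities relate to each other, using a small hand-made posterior matrix in base R. The matrix z is hypothetical and stands in for clusfit$z; the derived values mirror clusfit$classification, clusfit$uncertainty and, approximately for a converged fit, clusfit$parameters$pro.

```r
# Hypothetical posterior probabilities: 4 individuals (rows), 2 clusters (columns).
z <- rbind(c(0.9, 0.1),
           c(0.2, 0.8),
           c(0.6, 0.4),
           c(0.7, 0.3))

rowSums(z)  # each individual's posterior probabilities sum to 1

# Hard classification: the cluster with the highest posterior probability.
classification <- max.col(z, ties.method = "first")

# Uncertainty of that classification: 1 minus the highest posterior probability.
uncertainty <- 1 - apply(z, 1, max)

# Averaging the posteriors over individuals gives the share of each cluster.
proportions <- colMeans(z)
```

Individuals with high uncertainty are the ones that sit between clusters.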

How can you identify whether the clusters are well separated?

# perform dimensionality reduction to plot the clusters
clustred <- MclustDR(clusfit)

plot(clustred, what = "boundaries", ngrid = 200)
plot(clustred, what = "density", dimens = 1)

What is a strategy to identify what the effects of individual characteristics are on the clustering?

In a data frame that contains the characteristics of every individual (characteristics as columns, individuals as rows), add the cluster assignment as a column.
Then you can group by one or more characteristics and calculate what percentage of the people in a cluster have each characteristic.
This way you can examine main effects as well as 2-way, 3-way, and higher-order combinations of characteristics.

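library(dplyr) # provides %>%, group_by(), count(), mutate(), arrange()

raters %>%
  group_by(cluster) %>%
  count(age_group) %>%                    # n = number of people per age group within each cluster
  mutate(clust_tot = sum(n)) %>%          # cluster size (the data are still grouped by cluster)
  mutate(clust_prop = n / clust_tot) %>%  # share of the cluster that falls in this age group
  arrange(desc(clust_prop))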

This shows which characteristics are over- or under-represented in each cluster, that is, which characteristics are associated with cluster membership.
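The same idea can be sketched in base R, which also makes the step from a main effect to a 2-way combination explicit. The raters data frame below is hypothetical, as is the gender column; only cluster and age_group appear in the notes.

```r
# Hypothetical data: one row per individual, cluster assignment added as a column.
raters <- data.frame(
  cluster   = c(1, 1, 1, 2, 2, 2, 2, 1),
  age_group = c("young", "young", "old", "old", "old", "young", "old", "young"),
  gender    = c("f", "m", "f", "f", "m", "m", "f", "f")
)

# Main effect: proportion of each age group within each cluster (rows sum to 1).
age_by_cluster <- prop.table(table(raters$cluster, raters$age_group), margin = 1)
print(age_by_cluster)

# 2-way: proportion of each age_group x gender combination within each cluster.
combo <- interaction(raters$age_group, raters$gender)
combo_by_cluster <- prop.table(table(raters$cluster, combo), margin = 1)
print(round(combo_by_cluster, 2))
```

Higher-order combinations follow the same pattern by adding more columns to interaction().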

What is the effect of assigning a cluster only to individuals whose highest conditional (posterior) probability is .8 or more, and NA otherwise?

This causes your clusters to become better separated.
Otherwise you also assign individuals who do not clearly belong to any cluster (because their conditional probabilities are roughly equal across clusters).
Including these people dilutes the essence of the cluster.
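A minimal base-R sketch of this thresholding step follows. The matrix z is hypothetical and stands in for clusfit$z; the .8 cutoff is the one named in the question.

```r
# Hypothetical posterior probabilities: 4 individuals (rows), 3 clusters (columns).
z <- rbind(c(0.95, 0.03, 0.02),
           c(0.40, 0.35, 0.25),
           c(0.10, 0.85, 0.05),
           c(0.34, 0.33, 0.33))

max_prob <- apply(z, 1, max)               # highest posterior probability per individual
hard     <- max.col(z, ties.method = "first")  # cluster that attains it

# Keep the assignment only when it is clear-cut; otherwise mark it as NA.
confident <- ifelse(max_prob >= 0.8, hard, NA)
confident
```

Here the first and third individuals keep their cluster, while the second and fourth become NA because no cluster reaches .8.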

The questions on this page originate from the summary of the following study material:

BDS: psychometrics
