Start from random groups of similar size. — make_Z_random

A large number ($n$) of observations are randomly assigned to $q$ clusters. Because different random starts can converge to different local maxima, it is recommended to repeat Multimix runs with several different seeds when searching for the maximum of the log-likelihood.

make_Z_random(D, seed = NULL)

Arguments

D: an object of class multimixSettings -- see data_organise for more information.
seed: a positive integer to use as a random number seed.

Value

A matrix of dimension $n \times q$, where $n$ is the number of observations in D$dframe and $q$ is the number of clusters in the model as specified by D$numClusters.
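
To illustrate the shape of such a matrix, here is a minimal base R sketch of a random start with groups of similar size. The helper random_start() and its arguments are hypothetical, introduced only for this illustration; this is not the package's own code, and make_Z_random may build its matrix differently.

```r
# Hypothetical helper (an assumption for illustration, not part of the package):
# builds an n x q indicator matrix with exactly one 1 per row.
random_start <- function(n, q, seed = NULL) {
  # Only fix the random number stream when a seed is supplied
  if (!is.null(seed)) set.seed(seed)
  # Balanced labels 1..q recycled to length n, then shuffled
  labels <- sample(rep_len(seq_len(q), n))
  Z <- matrix(0L, nrow = n, ncol = q)
  # Matrix indexing: set entry (i, labels[i]) to 1 for every row i
  Z[cbind(seq_len(n), labels)] <- 1L
  Z
}

Z1 <- random_start(n = 475, q = 2, seed = 1)
Z2 <- random_start(n = 475, q = 2, seed = 2)

dim(Z1)            # n rows, q columns
colSums(Z1)        # cluster sizes, differing by at most one
identical(Z1, Z2)  # compare the starts produced by two different seeds
```

Reusing a seed reproduces the same start, so a run can be repeated exactly. Varying the seed gives the different starting points recommended above.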

Details

Also consider making additional clusters from observations with low probabilities of belonging to any cluster in a previous clustering.

Examples

data(cancer.df)
D = data_organise(cancer.df, numClusters = 2)
Z = make_Z_random(D)
table(Z)
#> Z
#>   0   1 
#> 475 475