Prepare data for use with multimix
data_organise(
dframe,
numClusters,
numIter = 1000,
cdep = NULL,
lcdep = NULL,
minpstar = 1e-09
)
dframe --- a data frame containing the data set you wish to model.
numClusters --- the number of clusters you wish to fit.
numIter --- the maximum number of steps that the EM algorithm will run before terminating.
cdep --- a list of multivariate normal cells.
lcdep --- a list of location cells.
minpstar --- minimum denominator for application of Bayes' Rule.
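The page does not show how minpstar is applied internally. A common pattern in EM code, sketched here as an assumption rather than as the package's implementation, is to floor the Bayes' Rule denominator before dividing, so that observations whose densities underflow to zero do not produce NaN posteriors. The function posterior_probs and its arguments are hypothetical and not part of multimix.

```r
# Hypothetical helper (not a multimix function): posterior cluster
# membership probabilities with a floor on the Bayes' Rule denominator.
posterior_probs <- function(dens, mix_props, minpstar = 1e-09) {
  # dens: n x K matrix of component densities, one row per observation
  # mix_props: length-K vector of mixing proportions
  weighted <- sweep(dens, 2, mix_props, `*`)  # proportion times density
  pstar <- rowSums(weighted)                  # Bayes' Rule denominator
  pstar <- pmax(pstar, minpstar)              # floor avoids dividing by zero
  weighted / pstar                            # row i is divided by pstar[i]
}

dens <- rbind(c(0.20, 0.05),
              c(0, 0))  # second row: densities underflowed to zero
z <- posterior_probs(dens, mix_props = c(0.5, 0.5))
```

Without the floor, the second row would be 0/0 and evaluate to NaN; with it, the row stays finite.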
An object of class multimixSettings which is a list
with the following elements:
cdep --- a list of multivariate normal cells.
clink --- column numbers of univariate normal variables.
cprods --- a list over MVN cells containing a matrix of
pair-wise products of columns in the cell, columns
ordered by pair.index.
cvals --- a list over MVN cells containing a matrix of columns of variables in the cell
cvals2 --- a list over MVN cells containing a matrix of squared columns of variables in the cell
dframe --- the data.frame of variables
discvar --- logical: is the variable discrete TRUE/FALSE
dlevs --- for discrete cells: number of levels
dlink --- column numbers of univariate discrete variables
dvals --- a list over discrete cells of level indicator matrices
lc --- logical: is continuous variable belonging to OT cell TRUE/FALSE
lcdep --- a list of OT cells
lcdisc --- column numbers of discrete variables in OT cells
lclink --- column numbers of continuous variables in OT cells
lcprods --- a list over OT cells containing a matrix of pair-wise products of continuous columns in the cell, columns ordered by pair.index
lcvals --- a list over OT cells containing a matrix of continuous columns of variables in the cell
lcvals2 --- a list over OT cells containing a matrix of squared continuous columns of variables in the cell
ld --- logical: is discrete variable belonging to OT cell TRUE/FALSE
ldlevs --- for discrete variables in OT cells: number of levels
ldlink --- column numbers of OT discrete variables
ldvals --- a list over OT cells of level indicator matrices
ldxc --- a list over OT cells whose members are lists over levels of matrices of the cell continuous variables whose columns are multiplied by the level indicator column
mc --- logical: is continuous variable not in OT cell TRUE/FALSE
md --- logical: is discrete variable not in OT cell TRUE/FALSE
minpstar --- minimum denominator for application of Bayes' Rule
n --- number of observations
numIter --- the maximum number of steps that the EM algorithm will run before terminating
oc --- logical: is continuous variable in univariate cell TRUE/FALSE
olink --- column numbers of continuous univariate cells
op --- length(olink)
ovals --- n by op matrix of continuous univariate variables
ovals2 --- n by op matrix of squared continuous univariate variables
numClusters --- the number of clusters in the model.
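To make the matrix-valued elements more concrete, the following base R sketch builds analogues of cvals, cvals2, cprods, dvals and dlevs by hand on a toy data frame. It does not call multimix; the data, the column names and the use of combn() for pair ordering are assumptions for illustration, and the package's own ordering via pair.index may differ.

```r
# Toy data; column names are made up for illustration
toy <- data.frame(
  x1 = c(1, 2, 3),
  x2 = c(4, 5, 6),
  x3 = c(7, 8, 9),
  g  = factor(c("a", "b", "a"))
)

# Continuous columns of one cell (cf. cvals) and their squares (cf. cvals2)
vals  <- as.matrix(toy[, c("x1", "x2", "x3")])
vals2 <- vals^2

# Pair-wise products of columns (cf. cprods), pairs ordered as combn() lists them
pairs <- combn(ncol(vals), 2)
prods <- vals[, pairs[1, ], drop = FALSE] * vals[, pairs[2, ], drop = FALSE]
colnames(prods) <- paste(colnames(vals)[pairs[1, ]],
                         colnames(vals)[pairs[2, ]], sep = ":")

# Level indicator matrix for a discrete variable (cf. dvals)
# and its number of levels (cf. dlevs)
ind <- outer(as.character(toy$g), levels(toy$g), "==") * 1
colnames(ind) <- levels(toy$g)
nlev <- nlevels(toy$g)
```

Each row of ind contains a single 1 marking the observed level, which is the usual meaning of a level indicator matrix.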
library(multimix)
# Load the example data set and prepare it for a two-cluster model
data(cancer.df)
D = data_organise(cancer.df, numClusters = 2)