\name{mona}
\title{
Monothetic Analysis
}
\usage{
mona(x)


}
\arguments{
\item{x}{
data matrix or data frame in which each row corresponds to an observation,
and each column corresponds to a variable. All variables must be binary.
A limited number of missing values (NAs) is allowed. Every observation
must have at least one value different from NA. No variable should have
half of its values missing. There must be at least one variable which has
no missing values. A variable whose non-missing values are all identical
is not allowed.


}}
\value{
an object of class \code{"mona"} representing the clustering.
See \code{\link{mona.object}} for details.


}
\description{
\code{mona} is fully described in chapter 7 of Kaufman and Rousseeuw (1990).
It is "monothetic" in the sense that each division is based on a
single (well-chosen) variable, whereas most other hierarchical methods
(including \code{agnes} and \code{diana}) are "polythetic", i.e. they use
all variables together.


The \code{mona} algorithm constructs a hierarchy of clusterings,
starting with one large cluster containing all objects.
Clusters are divided until all objects in the same cluster have
identical values for all variables.
At each stage, all clusters are divided according to the values of one
variable. A cluster is divided into one cluster with all objects having
value 1 for that variable, and another cluster with all objects having
value 0 for that variable.


The variable used for splitting a cluster is the variable with the maximal
total association to the other variables, computed over the objects in the
cluster to be split. The association between variables f and g
is given by a(f,g)*d(f,g) - b(f,g)*c(f,g), where a(f,g), b(f,g), c(f,g),
and d(f,g) are the cell counts of the 2-by-2 contingency table of f and g.
[That is, a(f,g) (resp. d(f,g)) is the number of objects for which f and g
both have value 0 (resp. value 1); b(f,g) (resp. c(f,g)) is the number of
objects for which f has value 0 (resp. 1) and g has value 1 (resp. 0).]
The total association of a variable f is the sum of its associations to all
other variables.


This algorithm does not work with missing values. The data are therefore
revised first, i.e. all missing values are filled in. To do this, the same
measure of association between variables is used as in the algorithm. When
variable f has missing values, the variable g with the largest absolute
association to f is determined. When the association between f and g is
positive, any missing value of f is replaced by the value of g for the same
object. If the association between f and g is negative, then any missing
value of f is replaced by the value of 1-g for the same
object.


}
\section{BACKGROUND}{
Cluster analysis divides a dataset into groups (clusters) of objects that
are similar to each other. Hierarchical methods like \code{agnes}, \code{diana}, and
\code{mona} construct a hierarchy of clusterings, with the number of clusters
ranging from one to the number of objects. Partitioning methods like \code{pam},
\code{clara}, and \code{fanny} require that the number of clusters be given by
the user.


}
\references{
Kaufman, L. and Rousseeuw, P.J. (1990). Finding Groups in Data: An
Introduction to Cluster Analysis. Wiley, New York.


}
\seealso{
\code{\link{mona.object}}, \code{\link{plot.mona}}.


}
\examples{
mona1 <- mona(catalyst[1:3])
print(mona1)
plot(mona1)
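
## A minimal sketch of the association measure from the description,
## evaluated on a small made-up binary matrix. This is an illustration
## only, not the internal code of mona. The names toy, assoc, A and total
## are hypothetical and not part of the package.
toy <- cbind(f = c(0, 0, 1, 1, 1, 0),
             g = c(0, 0, 1, 1, 0, 1),
             h = c(1, 1, 0, 0, 1, 1))
## Signed association a*d - b*c between two binary vectors u and v
assoc <- function(u, v) {
  a  <- sum(u == 0 & v == 0)   # both variables equal 0
  b  <- sum(u == 0 & v == 1)   # u is 0, v is 1
  cc <- sum(u == 1 & v == 0)   # u is 1, v is 0
  d  <- sum(u == 1 & v == 1)   # both variables equal 1
  a * d - b * cc
}
p <- ncol(toy)
A <- matrix(0, p, p, dimnames = list(colnames(toy), colnames(toy)))
for (i in seq_len(p)) for (j in seq_len(p))
  if (i != j) A[i, j] <- assoc(toy[, i], toy[, j])
## Assumption: absolute values are summed so that negative associations
## do not cancel; the description above states only the signed form.
total <- rowSums(abs(A))
## Candidate variable for the first split under that assumption
names(which.max(total))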

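## A minimal sketch of the missing-value fill-in from the description,
## again on made-up data and not the internal code of mona. The names
## toy2, assoc2, a_f, best and filled are hypothetical.
toy2 <- cbind(f = c(0, 0, 1, 1, NA, 0),
              g = c(0, 0, 1, 1, 1, 1),
              h = c(1, 1, 0, 0, 0, 1))
assoc2 <- function(u, v) {
  ## Assumption: only objects observed on both variables are counted
  ok <- !is.na(u) & !is.na(v)
  u <- u[ok]
  v <- v[ok]
  sum(u == 0 & v == 0) * sum(u == 1 & v == 1) -
    sum(u == 0 & v == 1) * sum(u == 1 & v == 0)
}
others <- setdiff(colnames(toy2), "f")
a_f <- sapply(others, function(v) assoc2(toy2[, "f"], toy2[, v]))
## Variable with the largest absolute association to f
best <- others[which.max(abs(a_f))]
miss <- is.na(toy2[, "f"])
filled <- toy2
## Positive association: copy the value of best; negative: use 1 - best
filled[miss, "f"] <- if (a_f[best] > 0) toy2[miss, best] else 1 - toy2[miss, best]
filled[, "f"]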

}
\keyword{all}
\keyword{cluster}
