=================================================================
#                    HAPLO STATS News File
#                     
#        This file documents software changes up to version 1.3.0
#
#        format is as such
#        ---------------------
#        function.name:  title for issue
#        explanation of issue, status, and recommendations 

=========================================================
#### changes made between release 1.2.5 and 1.3.0  #####
=========================================================

--------------------------
seqhap: sequential haplotype selection in a set of loci
Selects loci for haplotype associations, as 
described in Yu and Schaid [2007].  The method performs three tests 
for association of a binary trait over a set of bi-allelic loci. 
When evaluating each locus, loci close to it are added in a sequential
manner based on the Mantel-Haenszel test. 

--------------------------
geno1to2: convert geno from 1- to 2-column
Convert a 1-column minor-allele-count matrix to two-column 
allele codes.
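
A minimal base R sketch of this kind of conversion, not the package's
own code.  The helper name count_to_alleles and the allele coding
(1 = major allele, 2 = minor allele) are assumptions made for
illustration.

```r
# Hypothetical helper (not part of haplo.stats): expand a matrix of
# minor-allele counts (0, 1, 2; one column per locus) into two allele
# columns per locus.  Assumed coding: 1 = major allele, 2 = minor allele.
count_to_alleles <- function(geno1) {
  geno1 <- as.matrix(geno1)
  a1 <- ifelse(geno1 == 2, 2, 1)   # first allele is minor only for count 2
  a2 <- ifelse(geno1 >= 1, 2, 1)   # second allele is minor for count 1 or 2
  out <- matrix(NA_real_, nrow = nrow(geno1), ncol = 2 * ncol(geno1))
  out[, seq(1, ncol(out), by = 2)] <- a1   # odd columns: first allele
  out[, seq(2, ncol(out), by = 2)] <- a2   # even columns: second allele
  out
}

# Three subjects, two loci, coded as minor-allele counts.
geno1 <- matrix(c(0, 1, 2,
                  2, 0, 1), ncol = 2)
geno2 <- count_to_alleles(geno1)
```

Missing counts (NA) stay NA in both allele columns of that locus.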

---------------------
plot.haplo.score.slide: handle near-zero p-values
For asymptotic p-values near zero, set to epsilon.
For simulated p-values near zero, set to 0.5 divided by the number of
simulations performed.
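
The rule above can be sketched in base R as follows.  The helper name
floor_pval and the choice of .Machine$double.eps as epsilon are
assumptions; this entry does not state the epsilon the package uses.
The floor matters when p-values are shown on a log scale, where a
p-value of exactly zero would not be finite.

```r
# Hypothetical helper (not part of haplo.stats): floor p-values away
# from zero.
#   asymptotic p-values: floor at eps (assumed: machine epsilon)
#   simulated p-values:  floor at 0.5 / n.sim
floor_pval <- function(p, n.sim = NULL, eps = .Machine$double.eps) {
  lower <- if (is.null(n.sim)) eps else 0.5 / n.sim
  pmax(p, lower)
}

p.asym <- floor_pval(c(0, 0.03))                # asymptotic case
p.sim  <- floor_pval(c(0, 0.03), n.sim = 1000)  # simulated case
```

With 1000 simulations the smallest non-zero simulated p-value is
1/1000, so only p-values of zero are moved by the floor of 0.5/1000.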

-----------------------
haplo.design: create design matrix for haplotypes
In response to many requests made for getting columns for haplotype
effects to use in glm, survival, or other regression models, we
created a function to set up this kind of design matrix.  There are
issues surrounding the use of these effect columns, as outlined in the
user manual.  

----------------------
Ginv: svd problems continue
The Matrix library svd function has changed for S-PLUS 8.0.1.
Therefore, revert to the default svd function when computing the
generalized inverse.


=========================================================
#### changes made between release 1.2.0 and 1.2.5  #####
=========================================================

-----------------------------------
haplo.glm: Iterative steps efficiency
In consecutive steps of the IRWLS algorithm in haplo.glm, the starting
values for re-fitting the glm model were not updated to the most
recent estimates.  They are now updated, which saves about 20% of run
time in haplo.glm.

-----------------------------------
haplo.score: haplo.effect allow additive, dominant, recessive 
A new option to make haplo.score more flexible.  Previously the scores
for haplotypes were computed assuming an additive effect for all
haplotypes.  A new parameter, haplo.effect, is in place to allow
either additive, dominant, or recessive effects.  

-----------------------------------
haplo.score:  min.count parameter
Haplotypes to score are selected either by a minimum frequency,
skip.haplo, or by a new option, min.count.  The min.count option is
based on the same idea as that used in haplo.glm: the minimum
expected count of a haplotype in the population should be large enough
that parameter estimates and standard errors are computed accurately.
The min.count option became necessary when haplo.effect was added,
because under the dominant or recessive models the number of persons
actually having a haplotype effect can be fewer than the expected
count over the population (e.g., haplotype pair h1/h2 is coded as 0
for both haplotypes under the recessive model, and h1/h1 is coded as 1
under the dominant model).
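
A small base R illustration of why the count matters.  It assumes, for
simplicity, that each subject's number of copies of haplotype h1 is
known exactly (in practice haplotype pairs are inferred with posterior
probabilities); the data are made up.

```r
# Copies of haplotype h1 carried by six hypothetical subjects.
n.copies <- c(0, 1, 1, 2, 0, 1)

additive  <- n.copies                    # 0, 1 or 2 copies
dominant  <- as.numeric(n.copies >= 1)   # 1 if at least one copy
recessive <- as.numeric(n.copies == 2)   # 1 only for h1/h1

# Haplotype count versus number of subjects with a non-zero code.
counts <- c(additive  = sum(additive),
            dominant  = sum(dominant),
            recessive = sum(recessive))
```

Here h1 occurs 5 times in the sample, but only 4 subjects carry an
effect under the dominant coding and only 1 under the recessive coding.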

---------------------------------------
haplo.em:  improved reliability of C routines
Previously, problems had been observed when running haplo.em and
haplo.glm on linux 64-bit machines, because of issues with the storage
of integers in R.  In R, all integers are stored as int, whereas C
long integers are stored differently on 64-bit and 32-bit machines.
We get around this problem by using int types for all integers, which
are only used for indices of other data structures.  We find the
maximum integer value on the system and, if the indices are going to
exceed that maximum, issue a warning from C.  

--------------------------------------
haplo.glm and Ginv: improvement of standard error calculations
Under some extreme circumstances, such as haplo.glm modeling
haplotypes with rare frequencies, or high variance in the response,
the standard error estimates were unreliable.
The issue was traced to the Ginv function in haplo.stats, which needed
a smaller epsilon to decide on the rank of the information matrix. 
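
To show how an epsilon can decide the rank, here is a minimal
svd-based generalized inverse in base R.  It is a sketch, not the
package's Ginv: the name ginv_sketch, the relative threshold rule, and
the eps values are all assumptions.

```r
# Hypothetical sketch: generalized inverse via svd(), treating singular
# values at or below eps * (largest singular value) as zero.
ginv_sketch <- function(x, eps = 1e-6) {
  s <- svd(x)
  keep <- s$d > eps * max(s$d)            # singular values counted in the rank
  # V %*% diag(1/d) %*% t(U), restricted to the retained components
  inv <- s$v[, keep, drop = FALSE] %*%
    (t(s$u[, keep, drop = FALSE]) / s$d[keep])
  list(Ginv = inv, rank = sum(keep))
}

# A nearly singular matrix: the second singular value is 1e-8.
x <- diag(c(1, 1e-8))
rank.loose <- ginv_sketch(x, eps = 1e-6)$rank    # small value dropped
rank.tight <- ginv_sketch(x, eps = 1e-10)$rank   # small value retained

# For a well-conditioned matrix the result matches the ordinary inverse.
m <- matrix(c(2, 1, 1, 3), 2)
g <- ginv_sketch(m)
```

With the larger epsilon the matrix is treated as rank 1; with the
smaller epsilon it is treated as full rank.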


=========================================================
#### changes made between release 1.1.1 and 1.2.0  #####
=========================================================

---------------------
haplo.em:  fixed memory leak
Versions up to 1.1.1 had either one or two memory leaks in haplo.em.
They are fixed.

---------------------
All .C functions: Long Integers warning for 64-bit machine
Due to problems with long integers between 32-bit and 64-bit machines
using R, all integers used in C functions will use unsigned integers.

---------------------
haplo.glm:  haplo.effect="recessive"
Estimation stops if no columns are left in the model.matrix for 
homozygotes with the haplotype.  Haplotypes for which no subject has 
a posterior probability of being homozygous are grouped into the 
baseline effect.
Guidelines for rare haplotypes are explained further in the manual.

---------------------
haplo.glm:  na.action default
When not specified, na.action was set to something other than
the intended 'na.geno.keep'.  The default setting now works.

---------------------
haplo.cc: New Function for Case-Control Analysis
New function added to combine methods of haplo.score,
haplo.group and haplo.glm into one set of output for Case-Control
data.  Choose haplotypes for analysis by haplo.min.count only, not a 
frequency cut-off.

------------------
haplo.score: skip.haplo new default
Default for skip.haplo is now 5/(nrow(geno)*2)
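
As a worked example of this default, assuming geno has one row per
subject (so nrow(geno)*2 is the number of haplotypes in the sample),
the cut-off corresponds to an expected haplotype count of 5.  The
sample size of 200 is made up.

```r
n.subj <- 200                      # hypothetical number of subjects, nrow(geno)
skip.haplo <- 5 / (n.subj * 2)     # default frequency cut-off
expected.count <- skip.haplo * n.subj * 2   # haplotype copies at the cut-off
```

For 200 subjects the default cut-off is a frequency of 0.0125.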

------------------
haplo.glm: haplo.freq.min and haplo.min.count control parameters
Haplotypes used in the glm are still chosen by haplo.freq.min, but
the default is based on a minimum expected count of 5 in the
sample. The better choice for selecting haplotypes is
haplo.min.count. The issue is documented in the manual and help files.

-----------------
haplo.score: max-stat simulated p-value
A better description of this is included in the manual and help file

------------------
haplo.em.control and haplo.em:  defaults for control parameters changed
The defaults for two control parameters changed:
max.iter = 5000, changed from 500
insert.batch.size = 6, changed from 4
 
------------------
locus: name clash with the genetics package
The genetics package for R has a function named locus, which conflicts
with locus from haplo.stats.  We do not plan to change it, so be
aware of the possible clash if you use these two packages together.

------------------
haplo.scan:  new function
New function for analyzing a genome region with case-control
data.  It searches for a trait locus by sliding a fixed-width window 
over each marker locus and scanning all possible haplotype lengths 
within the window.


=================================================================
#### changes made prior to release 1.1.1  #####
=================================================================

----------------------
haplo.glm: Warnings for non-integer weights
glm.fit for R does not allow non-integer weights for subjects, whereas
S-PLUS does.  Use a glm.fit.nowarn function for R to ignore warnings.

---------------------
haplo.glm: Character Alleles
Local settings for strings as factors cause confusion when keeping
the original character allele values.  To ensure consistency of allele
codes, use setupGeno() and then, in the haplo.glm call, use allele.lev
as documented in the manual and help files.

---------------------
haplo.score.slide: add to package
Run haplo.score on all contiguous subsets of size n.slide from the 
loci in a genotype matrix (geno).
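
The contiguous subsets can be enumerated as in the base R sketch below.
The helper names slide_windows and locus_cols are hypothetical, and the
column mapping assumes a geno matrix with two adjacent columns per
locus.

```r
# Hypothetical helper: first and last locus index of every contiguous
# window of n.slide loci among n.loci loci.
slide_windows <- function(n.loci, n.slide) {
  start <- seq_len(max(n.loci - n.slide + 1, 0))
  cbind(start = start, end = start + n.slide - 1)
}

# Columns of a two-column-per-locus geno matrix covered by one window:
# locus j is assumed to occupy columns 2*j - 1 and 2*j.
locus_cols <- function(start, end) (2 * start - 1):(2 * end)

w <- slide_windows(n.loci = 5, n.slide = 3)   # windows 1-3, 2-4, 3-5
cols.w2 <- locus_cols(w[2, "start"], w[2, "end"])
```

Five loci with n.slide = 3 give three windows; the second window
(loci 2 to 4) covers geno columns 3 through 8.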

---------------------
haplo.score: simulations controlled for precision
Employ simulation precision criteria for p-values, adopted from
Besag and Clifford [1991].  Control simulations with 
score.sim.control.

