This function wraps the fastPHASE executable to fit a hidden Markov model (HMM) to either unphased genotype data or phased haplotype data. fastPHASE fits the HMM to the input data and writes the corresponding parameter estimates to four separate files. Because fastPHASE is not an R package, the user must download the executable separately. Visit http://scheet.org/software.html for information on how to obtain fastPHASE.

runFastPhase(fp_path, X_file, out_path = NULL, K = 12, numit = 25,
  phased = FALSE, seed = 1)

Arguments

fp_path

a string with the path to the directory with the fastPHASE executable.

X_file

a string with the path of the genotype input file containing X in fastPHASE format (as created by writeXtoInp).

out_path

a string with the path of the directory in which the parameter estimates will be saved (default: NULL). If this is equal to NULL, a temporary file in the R temporary directory will be used.

K

the number of hidden states for each haplotype sequence (default: 12).

numit

the number of EM iterations (default: 25).

phased

whether the data are already phased (default: FALSE).

seed

the random seed for the EM algorithm (default: 1).
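
Because fastPHASE is an external program, it can help to check the arguments above before calling runFastPhase. The following is a minimal sketch, not part of SNPknock: the helper name `check_fastphase_inputs` and its messages are hypothetical, and it only tests that the two paths exist and that `K` and `numit` are positive whole numbers.

```r
# Hypothetical pre-flight check (not part of SNPknock).
# Returns a character vector of problems; an empty vector means none were found.
check_fastphase_inputs <- function(fp_path, X_file, K = 12, numit = 25) {
  problems <- character(0)

  # The fastPHASE executable must exist at the given path ("~" is expanded).
  if (!file.exists(path.expand(fp_path))) {
    problems <- c(problems, sprintf("fastPHASE not found at '%s'", fp_path))
  }

  # The input file must exist; system.file() returns "" when nothing is found.
  if (!file.exists(X_file)) {
    problems <- c(problems, sprintf("input file not found: '%s'", X_file))
  }

  # K and numit must each be a single positive whole number.
  is_pos_int <- function(x) {
    is.numeric(x) && length(x) == 1 && !is.na(x) && x >= 1 && x == round(x)
  }
  if (!is_pos_int(K)) problems <- c(problems, "K must be a positive integer")
  if (!is_pos_int(numit)) problems <- c(problems, "numit must be a positive integer")

  problems
}

# Usage: report any problems for a path that is unlikely to exist.
print(check_fastphase_inputs(file.path(tempdir(), "no_such_fastPHASE"), ""))
```

An empty result does not guarantee that fastPHASE will run successfully; it only rules out the most common setup mistakes.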

Value

A string containing the path of the directory in which the parameter estimates were saved. This is useful to find the data when the default option for `out_path` is used and the output is written in an R temporary directory.

Details

The software fastPHASE saves the parameter estimates in four separate files whose names begin with the string contained in `out_path` and end with:

  • "_rhat.txt"

  • "_alphahat.txt"

  • "_thetahat.txt"

  • "_origchars"

The HMM for the genotype data can then be loaded from these files by calling loadHMM.
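
The naming scheme above can be sketched as follows. The helper `fastphase_output_files` is hypothetical (it is not part of SNPknock) and simply appends the four suffixes to the string returned by runFastPhase. The final call assumes that loadHMM takes the four file paths in the order r, alpha, theta, origchars; check the loadHMM help page before relying on this.

```r
# Hypothetical helper (not part of SNPknock): build the four output file
# names from the prefix returned by runFastPhase.
fastphase_output_files <- function(out_prefix) {
  suffixes <- c(r = "_rhat.txt", alpha = "_alphahat.txt",
                theta = "_thetahat.txt", char = "_origchars")
  files <- paste0(out_prefix, suffixes)
  names(files) <- names(suffixes)
  files
}

# Example with a made-up prefix in the R temporary directory.
files <- fastphase_output_files(file.path(tempdir(), "example"))
print(files)

# Load the HMM only if SNPknock is installed and all four files exist.
# Assumption: loadHMM accepts the four paths positionally in this order.
if (requireNamespace("SNPknock", quietly = TRUE) && all(file.exists(files))) {
  hmm <- SNPknock::loadHMM(files[["r"]], files[["alpha"]],
                           files[["theta"]], files[["char"]])
}
```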

References

Scheet P, Stephens M (2006). “A fast and flexible statistical model for large-scale population genotype data: applications to inferring missing genotypes and haplotypic phase.” Am. J. Hum. Genet., 78, 629-644. doi: 10.1086/502802.

See also

Other fastPHASE: loadHMM, writeXtoInp

Examples

fp_path = "~/bin/fastPHASE" # Path to the fastPHASE executable

# Run fastPHASE on unphased genotypes
# Specify the path to the genotype input file in ".inp" format.
# An example file containing unphased genotypes can be found in the package installation folder.
X_file = system.file("extdata", "genotypes.inp", package = "SNPknock")
fp_outPath = runFastPhase(fp_path, X_file)
#> SNPknock could find the fastPHASE executable: '~/bin/fastPHASE' does not exist.
#> If you have not downloaded it yet, you can obtain fastPHASE from: http://scheet.org/software.html

# Run fastPHASE on phased haplotypes
# An example file containing phased haplotypes can be found in the package installation folder.
H_file = system.file("extdata", "haplotypes.inp", package = "SNPknock")
fp_outPath = runFastPhase(fp_path, H_file, phased=TRUE)
#> SNPknock could find the fastPHASE executable: '~/bin/fastPHASE' does not exist.
#> If you have not downloaded it yet, you can obtain fastPHASE from: http://scheet.org/software.html