Data management of large-scale whole-genome sequence variant calls (Development version only)
UTILITIES
seqAlleleCount() and seqGetAF_AC_Missing() return NA instead of zero when all genotypes are missing at a site
seqGDS2VCF() does not output the FORMAT column if there is no selected sample (e.g., site-only VCF files)
seqGetData(, "$chrom_pos2") is similar to seqGetData(, "$chrom_pos"), except that duplicates are given a suffix ("_1", "_2" or >2)
NEW FEATURES
seqGDS2BED() can convert to PLINK BED files with the best-guess genotypes when there are only numeric dosages in the GDS file
seqEmptyFile() outputs an empty GDS file
seqUnitCreate(), seqUnitSubset() and seqUnitMerge()
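As an illustration of the seqGDS2BED() entry above: a PLINK BED file stores hard genotype calls, so numeric dosages have to be reduced to a best-guess genotype first. The base-R sketch below shows one plausible rule (round to the nearest of 0, 1, 2 and keep missing values missing). The helper name best_guess() and the rounding rule are assumptions made for illustration, not the package's actual implementation.

```r
# Hypothetical helper (not part of SeqArray): turn numeric dosages into
# best-guess genotype calls by rounding to the nearest value in {0, 1, 2}.
best_guess <- function(dosage) {
  g <- pmin(pmax(round(dosage), 0), 2)  # clamp to the valid range, NA stays NA
  storage.mode(g) <- "integer"          # keep dim and dimnames, store as integer
  g
}

# Toy dosage matrix: 2 samples (rows) by 3 variants (columns)
dosage <- matrix(c(0.1, 1.8, NA, 0.9, 1.2, 2.0), nrow = 2,
                 dimnames = list(c("s1", "s2"), c("v1", "v2", "v3")))

calls <- best_guess(dosage)
print(calls)
```

A missing dosage stays missing in the result, which matches how a BED file can encode a missing genotype.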
seqFilterPush() and seqFilterPop()
seqGet2bGeno() and seqGetAF_AC_Missing()
seqGetData(, "$dosage_sp") for a sparse matrix of dosages
seqAlleleFreq(), seqAlleleCount(), seqMissing()
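To show why a sparse representation of dosages (the "$dosage_sp" entry above) can pay off, here is a base-R sketch that keeps only the non-zero entries of a toy dosage matrix as (sample, variant, dosage) triplets and then rebuilds the dense matrix. The triplet data frame is an illustrative stand-in chosen for this sketch; it is not the object that seqGetData() returns.

```r
# Toy dosage matrix: 3 samples (rows) by 4 variants (columns), mostly zeros
dosage <- matrix(c(0, 0, 1,
                   0, 2, 0,
                   0, 0, 0,
                   1, 0, 0), nrow = 3,
                 dimnames = list(paste0("s", 1:3), paste0("v", 1:4)))

# Keep only the non-zero cells as (row, column, value) triplets
nz <- which(dosage != 0, arr.ind = TRUE)
triplets <- data.frame(sample  = unname(nz[, "row"]),
                       variant = unname(nz[, "col"]),
                       dosage  = dosage[nz])
print(triplets)

# Rebuild the dense matrix from the triplets to confirm nothing was lost
dense <- matrix(0, nrow(dosage), ncol(dosage))
dense[cbind(triplets$sample, triplets$variant)] <- triplets$dosage
```

With rare variants most dosages are zero, so the triplet form stores far fewer values than the dense matrix.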
seqMulticoreSetup() for setting a multicore cluster according to a numeric value assigned to the argument 'parallel'
seqGDS2VCF(), seqGDS2SNP(), seqGDS2BED(), seqVCF2GDS(), seqSummary(), seqCheck() and seqMerge()
seqMissing(), seqAlleleCount() and seqAlleleFreq()
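The UTILITIES note at the top of this changelog says that seqAlleleCount() returns NA instead of zero when all genotypes are missing at a site. The base-R sketch below illustrates that rule on a toy genotype array laid out as allele x sample x variant. The helper ref_allele_count() is hypothetical and only mimics the described behaviour; it is not the package code.

```r
# Hypothetical helper (not part of SeqArray): count reference alleles (coded 0)
# per variant, returning NA when every genotype at the site is missing.
ref_allele_count <- function(geno) {
  apply(geno, 3L, function(g) {
    if (all(is.na(g))) NA_integer_ else sum(g == 0L, na.rm = TRUE)
  })
}

# Toy genotypes: 2 alleles x 2 samples x 3 variants; NA marks a missing allele
geno <- array(c(0L, 1L, 0L, 0L,    # variant 1: fully observed
                NA, NA, NA, NA,    # variant 2: all genotypes missing
                1L, NA, 0L, 1L),   # variant 3: partly missing
              dim = c(2L, 2L, 3L))

counts <- ref_allele_count(geno)
print(counts)
```

Returning NA for the all-missing site keeps "no data" distinct from "allele never observed", which a count of zero would conflate.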
summary.SeqUnitListClass()
seqSNP2GDS() if SNP dosage GDS is the input
seqUnitApply() works correctly with selected samples if 'parallel' is a non-fork cluster
seqVCF2GDS() and seqVCF_Header() work correctly if the VCF header has white space
seqGDS2BED() with selected samples for sex and phenotype information
seqGDS2VCF() if there is no integer genotype
seqSetFilter() for unsorted sample and variant indices
seqSetFilterAnnotID() for unsorted variant index
seqSetFilterPos(): new options 'ref' and 'alt', 'multi.pos=TRUE' by default
seqAddValue() for packing an indexing variable
seqSetFilter() to enable or disable the warning
seqNewVarData() and seqListVarData() for variable-length data
seqApply() and seqBlockApply()
seqGetData() always has names if there is more than one input variable name
seqGDS2VCF() should output "." instead of NA in the FILTER column
seqGetData() should support factor when '.padNA=TRUE' or '.tolist=TRUE'
seqGDS2VCF() with factor variables
seqSummary(gds, "$filter") should return a data frame with zero rows if 'annotation/filter' is not a factor
NEW FEATURES
seqAddValue()
UTILITIES
seqBED2GDS()
vignette("SeqArray") can work directly
seqParallel()
BUG FIXES
seqBED2GDS(, verbose=FALSE) should have no display
CHANGES
seqVCF2GDS() and seqVCF_Header()
seqDigest() requires the digest package
seqSNP2GDS() imports dosage GDS files
seqVCF_Header() allows a BCF file as an input
seqRecompress()
seqCheck() for checking the data integrity of a SeqArray GDS file
seqGDS2SNP() exports dosage GDS files
seqVCF2GDS() and seqVCF_Header() are able to import site-only VCF files (i.e., VCF with no sample)
seqVCF2GDS() and seqBCF2GDS() since reading from connections in text mode is buffered for R >= v3.5.0
Reading from connections in text mode is buffered for R >= 3.5.0.
No use of buff in the new version (>=3.5.0) of R_ext/Connections.h:
struct Rconn {
    ...
    unsigned char *buff;
    size_t buff_len, buff_stored_len, buff_pos;
};
Install:
library(devtools)
install_github("zhengxwen/SeqArray", ref="1d5ab05fa8ae8b754feab62f41ab00a182d54793")
R_GetConnection() to accelerate text import and export
library("devtools")
install_github("zhengxwen/SeqArray", ref="v1.11.18")