VCFRecord

Wrapper around bcf1_t

Because it uses bcf1_t internally, it must conform to the BCF2 part of the VCFv4.2 specs rather than the loosey-goosey VCF specs; i.e., INFO, CONTIG, and FILTER records must exist in the header.

TODO: Does this need to be kept in a consistent state? Ideally, VCFWriter would reject invalid records, but we are already informed that a record is invalid (e.g. if the contig is not found) while building this struct, and bcf_write1 will unfortunately segfault. I'd like to avoid expensive validate() calls for every record before writing if possible, which means keeping this consistent. However, it is not clear what to do if an error occurs when using the setters herein.

2019-01-23 struct->class to mirror SAMRecord -- faster if reference type?

2019-01-23 WIP: getters for chrom, pos, id, ref, alt are complete (untested)

After parsing a BCF or VCF line, bcf1_t must be unpacked (not applicable when building bcf1_t from scratch). Depending on the information needed, this can be done to various levels, with a performance tradeoff. The unpacking symbols are listed under Detailed Description.

Constructors

this
this(T* h, bcf1_t* b, int MAX_UNPACK)
this(VCFHeader* vcfhdr, string chrom, int pos, string id, string _ref, string alt, float qual, SS filter)
this(VCFHeader* vcfhdr, string line, int MAX_UNPACK)

VCFRecord

Destructor

~this
~this()

Destructor. Copying is disabled to prevent a double-free (which should not come up except when writeln'ing).

Members

Functions

add
void add(const(char)[] tag, T data)

Add INFO or FORMAT key:value pairs to a record. Adds a single data point or a vector of values, or values for each sample (if tagType == FORMAT).

addFilter
int addFilter(string f)

Add a filter; from htslib: "If flt_id is PASS, all existing filters are removed first. If other than PASS, existing PASS is removed."
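The quoted PASS rule can be illustrated with a toy model on a plain string array. This is only a sketch of the described behaviour, not htslib's implementation: addFilterModel and the filter names q10 and s50 are hypothetical, and skipping duplicates is an assumption.

```d
import std.algorithm : canFind, filter;
import std.array : array;
import std.stdio : writeln;

/// Toy model of the quoted rule on a plain string list (hypothetical helper,
/// not htslib code).
string[] addFilterModel(string[] current, string f)
{
    if (f == "PASS")
        return ["PASS"]; // adding PASS removes all existing filters first

    // Adding anything other than PASS removes an existing PASS.
    auto kept = current.filter!(x => x != "PASS").array;
    if (!kept.canFind(f))
        kept ~= f; // assumption: a filter already present is not added twice
    return kept;
}

void main()
{
    string[] flt = ["PASS"];
    flt = addFilterModel(flt, "q10"); // PASS is dropped
    flt = addFilterModel(flt, "s50");
    writeln(flt);
    flt = addFilterModel(flt, "PASS"); // q10 and s50 are dropped
    writeln(flt);
}
```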

addFormat
void addFormat(string tag, T[] data)

Update FORMAT (sample info; columns 9+). Templated on data type; calls one of bcf_update_format_{int32,float,string,flag}.

addFormat
void addFormat(string tag, T data)

Update FORMAT (sample info; columns 9+). Templated on data type; calls one of bcf_update_format_{int32,float,string,flag}.

addID
int addID(const(char)[] id)

Append an ID (column 3) to the record. NOTE: htslib performs duplicate checking

addInfo
void addInfo(string tag, T data)
void addInfo(string tag, T[] data)

Update INFO (pan-sample info; column 8). Adds a tag:value to the INFO column. NOTE: the tag must already exist in the header. Templated on data type; calls one of bcf_update_info_{int32,float,string,flag}. Both singletons and arrays are supported.
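As a rough illustration of "templated on data type", the following sketch picks an updater name at compile time with static if. The helper infoUpdaterFor is hypothetical and the exact type-to-function mapping is an assumption; the real method may dispatch differently.

```d
import std.stdio : writeln;
import std.traits : isFloatingPoint, isIntegral, isSomeString;

/// Hypothetical helper: name of the htslib updater a D type might map to.
/// The mapping is an assumption made for illustration only.
string infoUpdaterFor(T)()
{
    static if (is(T == bool))
        return "bcf_update_info_flag";
    else static if (isIntegral!T)
        return "bcf_update_info_int32";
    else static if (isFloatingPoint!T)
        return "bcf_update_info_float";
    else static if (isSomeString!T)
        return "bcf_update_info_string";
    else
        static assert(0, "unsupported INFO type: " ~ T.stringof);
}

void main()
{
    // Each call is resolved at compile time from the template argument.
    writeln(infoUpdaterFor!int());
    writeln(infoUpdaterFor!float());
    writeln(infoUpdaterFor!string());
    writeln(infoUpdaterFor!bool());
}
```

An unsupported type fails at compile time rather than at run time, which is one reason to dispatch with static if instead of a runtime switch.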

alleles
void alleles(string a)

Set alleles; comma-separated list

alleles
void alleles(string[] a)

Set alleles; array

allelesAsArray
string[] allelesAsArray()

All alleles getter (array)

altAllelesAsArray
string[] altAllelesAsArray()

Alternate alleles getter version 1: ["A", "ACTG", ...]

altAllelesAsString
string altAllelesAsString()

Alternate alleles getter version 2: "A,ACTG,..."
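The two getter forms carry the same information. A minimal sketch of converting between them with Phobos, using hypothetical helper names (altsToArray, altsToString) and made-up alleles:

```d
import std.array : join, split;
import std.stdio : writeln;

/// Hypothetical helper: "A,ACTG" -> ["A", "ACTG"]
string[] altsToArray(string alts)
{
    return alts.split(",");
}

/// Hypothetical helper: ["A", "ACTG"] -> "A,ACTG"
string altsToString(string[] alts)
{
    return alts.join(",");
}

void main()
{
    auto arr = altsToArray("A,ACTG");
    writeln(arr);               // array form (version 1)
    writeln(altsToString(arr)); // string form (version 2)
}
```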

chrom
string chrom()

Get chromosome (CHROM)

chrom
void chrom(const(char)[] c)

Set chromosome (CHROM)

filter
string filter()

Get FILTER column (htslib has no getter for this, sadly)

filter
void filter(string f)

Set the FILTER column to f

filter
void filter(string[] fs)

Set the FILTER column to f0,f1,f2... TODO: determine definitively whether "." is replaced with "PASS"

hasFilter
bool hasFilter(string filter)

Determine whether the given filter is present in FILTER. Logs a warning if the filter does not exist. "PASS" and "." can be used interchangeably.

id
string id()

Get ID string

id
int id(const(char)[] id)

Set a new ID string; a comma-separated list is allowed, but no duplicate checking is performed

pos
long pos()

Get position (POS, column 2). NB: internally BCF uses 0-based coordinates; we only show +1 when printing a VCF line with toString (which calls vcf_format)
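The off-by-one between the two conventions can be sketched as follows. The helper names are hypothetical, and the example assumes the getter returns the internal 0-based value, which the note above suggests but does not state outright.

```d
import std.stdio : writefln;

/// Hypothetical helper: 0-based coordinate (as stored in BCF) to the
/// 1-based coordinate printed in a VCF line.
long toOneBased(long zeroBased)
{
    return zeroBased + 1;
}

/// Hypothetical helper: 1-based VCF coordinate back to 0-based.
long toZeroBased(long oneBased)
{
    return oneBased - 1;
}

void main()
{
    long stored = 99; // made-up internal position
    writefln("internal %d is printed as POS %d", stored, toOneBased(stored));
}
```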

pos
void pos(long p)

Set position (POS, column 2)

qual
float qual()

Get variant quality (QUAL, column 6)

qual
void qual(float q)

Set variant quality (QUAL, column 6)

refAllele
string refAllele()

Reference allele getter

refLen
long refLen()

REF allele length

removeAllFilters
void removeAllFilters()

Remove all entries in FILTER

removeFilter
int removeFilter(string f)

Remove a filter by name

removeFilter
int removeFilter(int fid)

Remove a filter by numeric id

setAlleles
void setAlleles(string _ref, string alt)

Set alleles; alt can be comma separated

setAlleles
void setAlleles(string _ref, string alt, ...)

Set alleles; min. 2 alleles (ref, alt1); unlimited alts may be specified

setRefAllele
void setRefAllele(const(char)* r)

Set REF allele only. Param r is a \0-terminated C string. TODO: UNTESTED

toString
string toString()

Return a string representation of the VCFRecord (i.e. as it would appear in a .vcf file). As a bonus, there is a kstring_t memory leak.

Variables

line
bcf1_t* line;

Undocumented in source.

vcfheader
VCFHeader* vcfheader;

Undocumented in source.

Detailed Description

BCF_UN_ALL

all

BCF_UN_SHR

all shared information (BCF_UN_STR|BCF_UN_FLT|BCF_UN_INFO)

                     BCF_UN_STR
                    /          BCF_UN_FLT
                   |          /      BCF_UN_INFO
                   |         |      /     ________________________ BCF_UN_FMT
                   V         V     V     /       |       |       |
#CHROM POS ID REF ALT QUAL FILTER INFO FORMAT NA00001 NA00002 NA00003 ...
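The unpack levels are bit flags that combine with bitwise OR. The sketch below redeclares them locally, with values mirroring htslib's BCF_UN_* macros, so it compiles without htslib. The helper hasBits is hypothetical and is a pure bit test; it says nothing about how bcf_unpack itself interprets a level.

```d
import std.stdio : writefln;

// Values mirror htslib's BCF_UN_* macros; redeclared here only so the
// sketch is self-contained.
enum int BCF_UN_STR  = 1; // up to ALT inclusive
enum int BCF_UN_FLT  = 2; // up to FILTER
enum int BCF_UN_INFO = 4; // up to INFO
enum int BCF_UN_SHR  = BCF_UN_STR | BCF_UN_FLT | BCF_UN_INFO; // all shared information
enum int BCF_UN_FMT  = 8; // FORMAT and each sample
enum int BCF_UN_ALL  = BCF_UN_SHR | BCF_UN_FMT; // everything

/// Hypothetical helper: are all bits of `flag` set in `level`?
bool hasBits(int level, int flag)
{
    return (level & flag) == flag;
}

void main()
{
    writefln("BCF_UN_SHR = %d, BCF_UN_ALL = %d", BCF_UN_SHR, BCF_UN_ALL);
    writefln("SHR includes INFO bit: %s", hasBits(BCF_UN_SHR, BCF_UN_INFO));
    writefln("SHR includes FMT bit:  %s", hasBits(BCF_UN_SHR, BCF_UN_FMT));
}
```

Passing a smaller level as MAX_UNPACK trades available fields for speed, as described above; code that only reads CHROM through ALT would, under this reading, need no more than BCF_UN_STR.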

Meta