VCFRecord

Wrapper around bcf1_t

Because it uses bcf1_t internally, a record must conform to the BCF2 part of the VCFv4.2 spec rather than the looser text VCF rules; i.e., the INFO, contig, and FILTER entries it uses must be defined in the header.

TODO: Does this need to be kept in a consistent state? Ideally, VCFWriter would reject invalid records, but we only learn that a record is invalid (e.g. contig not found) while building this struct, and bcf_write1 will actually segfault on such a record. I'd like to avoid an expensive validate() call for every record before writing, which means keeping the record consistent at all times. However, it is not clear what to do if an error occurs in one of the setters herein.

2019-01-23 struct->class to mirror SAMRecord -- faster if reference type?

2019-01-23 WIP: getters for chrom, pos, id, ref, alt are complete (untested)

After parsing a BCF or VCF line, the bcf1_t must be unpacked (not applicable when building a bcf1_t from scratch). Depending on the information needed, this can be done to various levels, with a performance tradeoff. The unpacking symbols are listed under Detailed Description.

Constructors

this
this(T* h, bcf1_t* b, int MAX_UNPACK)
this(VCFHeader* vcfhdr, string chrom, int pos, string id, string _ref, string alt, float qual, SS filter)
this(VCFHeader* vcfhdr, string line, int MAX_UNPACK)

VCFRecord

Destructor

~this
~this()

Destructor. Copying is disabled to prevent a double-free (which should not come up except when writeln'ing).

Members

Functions

add
void add(const(char)[] tag, T data)

Add INFO or FORMAT key:value pairs to a record: a single datapoint, a vector of values, or values for each sample (if tagType == FORMAT).

addFilter
int addFilter(string f)

Add a filter; from htslib: "If flt_id is PASS, all existing filters are removed first. If other than PASS, existing PASS is removed."
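The quoted rule can be modeled on a plain array of filter names. The following is a minimal, self-contained sketch and not dhtslib code: `addFilterModel` is a hypothetical helper, real records store numeric header ids rather than strings, and the duplicate check is an assumption of this model.

```d
import std.algorithm : canFind, filter;
import std.array : array;

/// Hypothetical model of the quoted rule on a plain string[]
/// (real records hold numeric header ids, not names).
string[] addFilterModel(string[] current, string f)
{
    // Adding PASS removes all existing filters first.
    if (f == "PASS")
        return ["PASS"];

    // Adding anything other than PASS removes an existing PASS ...
    auto kept = current.filter!(x => x != "PASS").array;

    // ... and appends the new filter unless it is already set
    // (skipping duplicates is an assumption of this model).
    if (!kept.canFind(f))
        kept ~= f;
    return kept;
}
```

For example, under this model adding "q10" to a record whose FILTER is "PASS" leaves only "q10", while adding "PASS" to a record carrying "q10" and "s50" leaves only "PASS".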

addFormat
void addFormat(string tag, T[] data)

Add FORMAT values from an array. Compare htslib: auto bcf_update_format_int32(const(bcf_hdr_t) *hdr, bcf1_t *line, const(char) *key, int *values, int n)

addFormat
void addFormat(string tag, T data)

Add a single FORMAT value. Compare htslib: bcf_update_format_{int32,float,string,flag}

addID
int addID(const(char)[] id)

Append an ID (htslib performs duplicate checking)

addInfo
void addInfo(string tag, T data)

Add a tag:value to the INFO column -- tag must already exist in the header

addInfo
void addInfo(string tag, T[] data)

Add a tag:value to the INFO column -- tag must already exist in the header. This overload handles a vector of values for the tag.

hasFilter
bool hasFilter(string filter)

Determine whether a FILTER is present on the record. Logs a warning if the filter does not exist. "PASS" and "." can be used interchangeably.
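The interchangeability of "PASS" and "." can be sketched on a plain array of filter names. This is an illustrative model only: `hasFilterModel` is a hypothetical helper, and treating a record with no filters as PASS is an assumption made here, not something this page states.

```d
import std.algorithm : canFind;

/// Hypothetical model of the lookup on a plain string[] of filter names.
bool hasFilterModel(string[] current, string name)
{
    // "." is treated as a synonym for "PASS".
    if (name == ".")
        name = "PASS";

    // Assumption: a record with no filters at all counts as PASS.
    if (name == "PASS" && current.length == 0)
        return true;

    return current.canFind(name);
}
```

The model does not cover the warning case (a filter name missing from the header), since it has no header to consult.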

removeAllFilters
void removeAllFilters()

Remove all entries in FILTER

removeFilter
int removeFilter(string f)

Remove a filter by name

removeFilter
int removeFilter(int fid)

Remove a filter by numeric id

setAlleles
void setAlleles(string _ref, string alt)

Set alleles; alt can be comma separated

setAlleles
void setAlleles(string _ref, string alt, ...)

Set alleles; min. 2 alleles (ref, alt1); unlimited alts may be specified

setRefAllele
void setRefAllele(const(char)* r)

Set REF allele only. Parameter r is a \0-terminated C string. TODO: UNTESTED

toString
string toString()
Undocumented in source. Be warned that the author may not have intended to support it.

Properties

alleles
string alleles [@property setter]

Set alleles; comma-separated list

alleles
string[] alleles [@property setter]

Set alleles; array

allelesAsArray
string[] allelesAsArray [@property getter]

All alleles getter (array)

altAllelesAsArray
string[] altAllelesAsArray [@property getter]

Alternate alleles getter version 1: ["A", "ACTG", ...]

altAllelesAsString
string altAllelesAsString [@property getter]

Alternate alleles getter version 2: "A,ACTG,..."
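The two alternate-allele forms carry the same information, and the comma-separated form is also what the string setters accept. The sketch below converts between them using only the standard library; `altsToArray`, `altsToString`, and `allAlleles` are hypothetical helpers, and placing REF first in the combined array is an assumption.

```d
import std.array : join, split;

/// Hypothetical helper: "A,ACTG" -> ["A", "ACTG"].
string[] altsToArray(string alts)
{
    return alts.split(",");
}

/// Hypothetical helper: ["A", "ACTG"] -> "A,ACTG".
string altsToString(string[] alts)
{
    return alts.join(",");
}

/// Hypothetical helper: REF followed by every ALT
/// (REF-first ordering is an assumption of this sketch).
string[] allAlleles(string refAllele, string alts)
{
    return [refAllele] ~ altsToArray(alts);
}
```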

chrom
string chrom [@property getter]

Get chromosome (CHROM)

chrom
const(char)[] chrom [@property setter]

Set chromosome (CHROM)

filter
string filter [@property getter]

Get FILTER column (nothing in htslib sadly)

filter
string filter [@property setter]

Set the FILTER column to f

filter
string[] filter [@property setter]

Set the FILTER column to f0,f1,f2... TODO: determine definitively whether "." is replaced with "PASS"

id
string id [@property getter]

Get ID string

id
const(char)[] id [@property setter]

Sets new ID string; comma-separated list allowed but no dup checking performed

pos
int pos [@property getter]

Get position (POS)

pos
int pos [@property setter]

Set position (POS)

qual
float qual [@property getter]

Get variant quality (QUAL)

qual
float qual [@property setter]

Set variant quality (QUAL)

refAllele
string refAllele [@property getter]

Reference allele getter

refLen
int refLen [@property getter]

REF allele length

Variables

line
bcf1_t* line;

htslib structured record. TODO: change to 'b' for better internal consistency? (vcf.h/c actually use line quite a bit in fn params)

vcfheader
VCFHeader* vcfheader;

corresponding header (required)

Detailed Description

BCF_UN_ALL

all

BCF_UN_SHR

all shared information (BCF_UN_STR|BCF_UN_FLT|BCF_UN_INFO)

                         BCF_UN_STR
                        /            BCF_UN_FLT
                       |            /       BCF_UN_INFO
                       |           |       /   ______________________________ BCF_UN_FMT
                       V           V      V   /           |        |        |
#CHROM  POS  ID  REF  ALT  QUAL  FILTER  INFO  FORMAT  NA00001  NA00002  NA00003 ...
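The unpacking symbols are bit flags, so the composite levels are plain bitwise ORs. The sketch below mirrors the values from htslib's vcf.h for illustration only; real code should use the constants exposed by the htslib bindings. `covers` is a hypothetical helper that performs a bit test and nothing more.

```d
// Values mirrored from htslib's vcf.h for illustration; in real code,
// use the constants exposed by the htslib bindings instead.
enum int BCF_UN_STR  = 1; // up to ALT inclusive
enum int BCF_UN_FLT  = 2; // up to FILTER
enum int BCF_UN_INFO = 4; // up to INFO
enum int BCF_UN_SHR  = BCF_UN_STR | BCF_UN_FLT | BCF_UN_INFO; // all shared information
enum int BCF_UN_FMT  = 8; // FORMAT and each sample
enum int BCF_UN_ALL  = BCF_UN_SHR | BCF_UN_FMT;               // everything

/// Hypothetical helper: pure bit test of whether `level` includes `wanted`.
bool covers(int level, int wanted)
{
    return (level & wanted) == wanted;
}
```

A caller that only needs CHROM through INFO could pass BCF_UN_SHR as MAX_UNPACK and skip the per-sample columns, which is where the performance tradeoff mentioned above comes from.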

Meta