VCFRecord
Disable copying to prevent a double free (which should not come up except when writeln'ing)
Destructor
Add INFO or FORMAT key:value pairs to a record. Adds a single data point, a vector of values, or values for each sample (if tagType == FORMAT)
Add a filter; from htslib: "If flt_id is PASS, all existing filters are removed first. If other than PASS, existing PASS is removed."
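The quoted htslib rule can be illustrated with a small stand-alone model. This is only a sketch: `FilterColumn` and its `add` method are hypothetical names, the code does not call htslib or this wrapper, and skipping duplicates is an assumption.

```d
import std.algorithm : canFind;
import std.stdio : writeln;

/// Toy stand-in for the FILTER column (hypothetical; not the wrapper's type).
struct FilterColumn
{
    string[] filters;

    /// Adding PASS removes all existing filters first;
    /// adding anything else removes an existing PASS.
    void add(string flt)
    {
        if (flt == "PASS")
        {
            filters = ["PASS"];
            return;
        }
        if (filters.canFind(flt))
            return; // assumption: a filter that is already set is not added twice

        string[] kept;
        foreach (f; filters)
            if (f != "PASS")
                kept ~= f;
        filters = kept ~ flt;
    }
}

void main()
{
    FilterColumn col;
    col.add("PASS");
    assert(col.filters == ["PASS"]);
    col.add("q10");                 // PASS is dropped
    assert(col.filters == ["q10"]);
    col.add("s50");
    assert(col.filters == ["q10", "s50"]);
    col.add("PASS");                // everything else is dropped
    assert(col.filters == ["PASS"]);
    writeln(col.filters);
}
```

The model keeps the column as a plain string array so the PASS handling is visible; the real record stores numeric filter ids that refer to the header.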
auto bcf_update_format_int32(const(bcf_hdr_t) *hdr, bcf1_t *line, const(char) *key, int *values, int n) // @suppress(dscanner.style.undocumented_declaration)
bcf_update_format_{int32,float,string,char}
Append an ID (htslib performs duplicate checking)
Add a tag:value to the INFO column -- tag must already exist in the header
Ditto; this version handles a vector of values for the tag
Determine whether a given FILTER is present. Logs a warning if the filter does not exist. "PASS" and "." can be used interchangeably.
Remove all entries in FILTER
Remove a filter by name
Remove a filter by numeric id
Set alleles; alt can be comma separated
Set alleles; min. 2 alleles (ref, alt1); unlimited alts may be specified
Set REF allele only. The parameter r is a \0-terminated C string. TODO: UNTESTED
Set alleles; comma-separated list
Set alleles; array
All alleles getter (array)
Alternate alleles getter version 1: ["A", "ACTG", ...]
Alternate alleles getter version 2: "A,ACTG,..."
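As a rough illustration of the comma-separated form and the two getter styles, here is a stand-alone sketch. `Alleles` and `parseAlleles` are hypothetical names, and the code assumes the first entry of the list is REF and the rest are ALT; it does not use the wrapper or htslib.

```d
import std.array : join, split;
import std.stdio : writeln;

/// Hypothetical container for one record's alleles.
struct Alleles
{
    string refAllele;
    string[] alts;
}

/// Split "ref,alt1,alt2,..." into REF and ALT parts.
/// Assumption: the first entry is REF; zero or more ALT alleles may follow.
Alleles parseAlleles(string commaSeparated)
{
    auto parts = commaSeparated.split(",");
    assert(parts.length >= 1 && parts[0].length > 0, "REF allele is required");
    return Alleles(parts[0], parts[1 .. $]);
}

void main()
{
    auto a = parseAlleles("A,ACTG,T");
    assert(a.refAllele == "A");
    assert(a.alts == ["ACTG", "T"]);       // array form, like getter version 1
    assert(a.alts.join(",") == "ACTG,T");  // string form, like getter version 2
    assert(a.refAllele.length == 1);       // REF allele length
    writeln(a.refAllele, " -> ", a.alts.join(","));
}
```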
FIXED FIELDS
Get chromosome (CHROM)
Set chromosome (CHROM)
Get FILTER column (nothing in htslib sadly)
Set the FILTER column to f
Set the FILTER column to f0,f1,f2... TODO: determine definitively whether "." is replaced with "PASS"
Get ID string
Sets new ID string; comma-separated list allowed but no dup checking performed
Get position (POS)
Set position (POS)
Get variant quality (QUAL)
Set variant quality (QUAL)
Reference allele getter
REF allele length
htslib structured record TODO: change to 'b' for better internal consistency? (vcf.h/c actually use line quite a bit in fn params)
corresponding header (required);
Unpack everything (shared information plus FORMAT and samples)
Unpack all shared information (BCF_UN_STR|BCF_UN_FLT|BCF_UN_INFO)
                     BCF_UN_STR
                    /          BCF_UN_FLT
                   |          /      BCF_UN_INFO
                   |         |      /     ________________________ BCF_UN_FMT
                   V         V     V     /       |       |       |
#CHROM POS ID REF ALT QUAL FILTER INFO FORMAT NA00001 NA00002 NA00003 ...
Wrapper around bcf1_t
Because it uses bcf1_t internally, it must conform to the BCF2 part of the VCFv4.2 specs, rather than the loosey-goosey VCF specs. i.e., INFO, CONTIG, FILTER records must exist in the header.
TODO: Does this need to be kept in a consistent state? Ideally, VCFWriter would reject invalid ones, but we are informed that it is invalid (e.g. if contig not found) while building this struct; bcf_write1 will actually segfault, unfortunately. I'd like to avoid expensive validate() calls for every record before writing if possible, which means keeping this consistent. However, not sure what to do if error occurs when using the setters herein?
2019-01-23 struct->class to mirror SAMRecord -- faster if reference type?
2019-01-23 WIP: getters for chrom, pos, id, ref, alt are complete (untested)
After parsing a BCF or VCF line, bcf1_t must be unpacked (not applicable when building bcf1_t from scratch). Depending on the information needed, this can be done to various levels, with a performance tradeoff. The unpacking symbols (BCF_UN_STR, BCF_UN_FLT, BCF_UN_INFO, BCF_UN_FMT) select the level.
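The unpacking levels are bit flags that can be combined. The sketch below restates them as D constants with the values defined in htslib's vcf.h; `requestedGroups` is a hypothetical helper that only reports which flag bits are set and does not reproduce what bcf_unpack does internally.

```d
import std.stdio : writeln;

// Values as defined in htslib's vcf.h, restated here so the sketch is self-contained.
enum int BCF_UN_STR  = 1;  // up to ALT inclusive
enum int BCF_UN_FLT  = 2;  // up to FILTER
enum int BCF_UN_INFO = 4;  // up to INFO
enum int BCF_UN_SHR  = BCF_UN_STR | BCF_UN_FLT | BCF_UN_INFO; // all shared information
enum int BCF_UN_FMT  = 8;  // FORMAT and each sample
enum int BCF_UN_ALL  = BCF_UN_SHR | BCF_UN_FMT;               // everything

static assert(BCF_UN_SHR == 7);
static assert(BCF_UN_ALL == 15);

/// Hypothetical helper: name the column groups whose flag bits are set in `which`.
string[] requestedGroups(int which)
{
    string[] groups;
    if (which & BCF_UN_STR)  groups ~= "up to ALT";
    if (which & BCF_UN_FLT)  groups ~= "FILTER";
    if (which & BCF_UN_INFO) groups ~= "INFO";
    if (which & BCF_UN_FMT)  groups ~= "FORMAT and samples";
    return groups;
}

void main()
{
    assert(requestedGroups(BCF_UN_SHR) == ["up to ALT", "FILTER", "INFO"]);
    assert(requestedGroups(BCF_UN_ALL).length == 4);
    writeln(requestedGroups(BCF_UN_STR | BCF_UN_FMT));
}
```

Requesting only the flags a caller needs is what makes the performance tradeoff possible: a reader interested in REF and ALT alone has no reason to pay for unpacking INFO or the per-sample fields.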