For backwards compatibility
////////////////////////////////////////////////////////
@param iter Iterator to free
Iterator with multiple regions
Flags for hts_idx_load3() (and also sam_idx_load3(), tbx_idx_load3())
////////////////////////////////////////////////////////
Compression type
Specific format (SAM, BAM, CRAM, BCF, VCF, TBI, BED, etc.)
File I/O
Broad format category (sequence data, variant data, index, regions, etc.)
Mostly CRAM only, but this could also include other format options
Profile options for encoding; primarily used at present in CRAM but also usable in BAM as a synonym for deflate compression levels.
REQUIRED_FIELDS
Endianness
Compute the level of a bin in a binning index
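As a rough illustration of what "level" means here, the sketch below assumes the usual 8-way binning tree, in which bin 0 is the root and each bin has eight children. The names bin_parent() and bin_level() are hypothetical and are not taken from the header.

```c
/* Hypothetical helpers; they assume an 8-way binning tree rooted at bin 0. */
static int bin_parent(int bin)
{
    return (bin - 1) >> 3;      /* eight children share one parent */
}

static int bin_level(int bin)
{
    int level = 0;
    while (bin > 0) {           /* walk up until the root bin is reached */
        bin = bin_parent(bin);
        ++level;
    }
    return level;
}
```

Under that assumption the root is level 0, bins 1 to 8 are level 1, bins 9 to 72 are level 2, and so on.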
! @abstract Determine whether a given htsFile contains a valid EOF block @return 3 for a non-EOF checkable filetype; 2 for an unseekable file type where EOF cannot be checked; 1 for a valid EOF block; 0 if the EOF marker is absent when it should be present; -1 (with errno set) on failure @discussion Check if the BGZF end-of-file (EOF) marker is present
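One way a caller might interpret the documented return codes is sketched below. The function eof_check_describe() is a hypothetical helper, and the wording of each message (for example that a missing marker suggests truncation) is an assumption rather than something stated in the header.

```c
/* Hypothetical helper: maps the documented EOF-check return codes to text. */
static const char *eof_check_describe(int ret)
{
    switch (ret) {
    case 3:  return "file type has no EOF marker to check";
    case 2:  return "unseekable file, EOF cannot be checked";
    case 1:  return "valid EOF block present";
    case 0:  return "EOF marker absent (assumed: file may be truncated)";
    default: return "error while checking, see errno";
    }
}
```

Only a return of 0 or a negative value indicates a problem; 1, 2 and 3 all mean that no missing marker was detected.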
! @abstract Close a file handle, flushing buffered data for output streams @param fp The file handle to be closed @return 0 for success, or negative if an error occurred.
! @abstract Determine format by peeking at the start of a file @param fp File opened for reading, positioned at the beginning @param fmt Format structure that will be filled out on return @return 0 for success, or negative if an error occurred.
! @abstract Introspection on the features enabled in htslib, string form
! @abstract Introspection on the features enabled in htslib
! @abstract Get a human-readable description of the file format @param fmt Format structure holding type, version, compression, etc. @return Description string, to be freed by the caller after use.
! @abstract Returns a string containing the file format extension. @param format Format structure containing the file type. @return A string ("sam", "bam", etc) or "?" for unknown formats.
Wrapper function for free(). Enables memory deallocation across a DLL boundary. Should be used by all applications that are compiled with a different standard library than htslib and that call htslib methods returning dynamically allocated data.
! @abstract Returns the file's format information @param fp The file handle @return Read-only pointer to the file's htsFormat.
! @abstract Read a line (and its \n or \r\n terminator) from a file @param fp The file handle @param delimiter Unused, but must be '\n' (or KS_SEP_LINE) @param str The line (not including the terminator) is written here @return Length of the string read; -1 on end-of-file; <= -2 on error
! @abstract Open an existing stream as a SAM/BAM/CRAM/VCF/BCF/etc file @param fn The already-open file handle @param mode Open mode, as per hts_open()
@param idx Index structure to free
@param idx Index @param final_offset Last file offset @return 0 on success; non-zero on failure.
@param idx Index @return One of HTS_FMT_CSI, HTS_FMT_BAI or HTS_FMT_TBI
@param idx The index @param l_meta Pointer to where the length of the extra data is stored @return Pointer to the extra data if present; NULL otherwise
@param idx Index @return Unplaced reads count
@param n Initial number of targets @param fmt Format, one of HTS_FMT_CSI, HTS_FMT_BAI or HTS_FMT_TBI @param offset0 Initial file offset @param min_shift Number of bits for the minimal interval @param n_lvls Number of levels in the binning index @return An initialised hts_idx_t struct on success; NULL on failure
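To show how min_shift and n_lvls interact, the sketch below assumes that each level of the binning index subdivides its parent eight ways (three bits per level), so the largest addressable coordinate is 2^(min_shift + 3 * n_lvls). The name index_max_pos() is hypothetical.

```c
#include <stdint.h>

/* Hypothetical helper: coordinate span covered by a binning index,
 * assuming 3 bits of resolution are added per level. */
static int64_t index_max_pos(int min_shift, int n_lvls)
{
    return (int64_t)1 << (min_shift + 3 * n_lvls);
}
```

With the BAI parameters (min_shift 14, n_lvls 5) this gives 2^29, which is the familiar 512 Mbp limit of that format.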
@param fn BAM/BCF/etc filename, to which .bai/.csi/etc will be added or the extension substituted, to search for an existing index file. In case of a non-standard naming, the file name can include the name of the index file delimited with HTS_IDX_DELIM. @param fmt One of the HTS_FMT_* index formats @return The index, or NULL if an error occurred.
@param fn Input BAM/BCF/etc filename @param fnidx The input index filename @return The index, or NULL if an error occurred.
@param fn Input BAM/BCF/etc filename @param fnidx The input index filename @param fmt One of the HTS_FMT_* index formats @param flags Flags to alter behaviour (see description) @return The index, or NULL if an error occurred.
@param idx Index @return The number of targets
@param idx Index @param tid Target id @param beg Range start (zero-based) @param end Range end (zero-based, half-open) @param offset File offset @param is_mapped Range corresponds to a mapped read @return 0 on success; -1 on failure
@param idx Index to be written @param fn Input BAM/BCF/etc filename, to which .bai/.csi/etc will be added @param fmt One of the HTS_FMT_* index formats @return 0 if successful, or negative if an error occurred.
@param idx Index to be written @param fn Input BAM/BCF/etc filename @param fnidx Output filename, or NULL to add .bai/.csi/etc to @a fn @param fmt One of the HTS_FMT_* index formats @return 0 if successful, or negative if an error occurred.
@param idx Index @param[out] n Location to store the number of targets @param getid Callback function to get the name for a target ID @param hdr Header from indexed file @return An array of pointers to the names on success; NULL on failure
@param idx The index @param l_meta Length of data @param meta Pointer to the extra data @param is_copy If not zero, a copy of the data is taken @return 0 on success; -1 on failure (out of memory).
@param idx Index @param tid Target identifier @param name Target name @return Index number of name in names list on success; -1 on failure.
@param iter Iterator to free
@param fp Input file handle @param iter Iterator @param r Pointer to record placeholder @return >= 0 on success, -1 when there is no more data, < -1 on error
@param fp Input file handle @param iter Iterator @param r Pointer to record placeholder @param data Data passed to the readrec callback @return >= 0 on success, -1 when there is no more data, < -1 on error
@param idx Index @param tid Target ID @param beg Start of region @param end End of region @param readrec Callback to read a record from the input file @return An iterator on success; NULL on failure
@param idx Index @param reg Region specifier @param getid Callback function to return the target ID for a name @param hdr Input file header @param itr_query Callback function returning an iterator for a numeric tid, start and end position @param readrec Callback to read a record from the input file @return An iterator on success; NULL on error
@param idx Index @param reglist Region list @param count Number of items in region list @param getid Callback to convert names to target IDs @param hdr Indexed file header (passed to getid) @param itr_specific Filetype-specific callback function @param readrec Callback to read an input file record @param seek Callback to seek in the input file @param tell Callback to return current input file location @return An iterator on success; NULL on failure
@discussion Normally HTSlib cleans up automatically when your program exits, whether that is via exit(3) or returning from main(). However if you have dlopen(3)ed HTSlib and wish to close it before your main program exits, you must call hts_lib_shutdown() before dlclose(3).
! @abstract Open a sequence data (SAM/BAM/CRAM) or variant data (VCF/BCF) or possibly-compressed textual line-orientated file
@param fn The file name or "-" for stdin/stdout. For indexed files with a non-standard naming, the file name can include the name of the index file delimited with HTS_IDX_DELIM
@param mode Mode matching / [rwa][bcefFguxz0-9]* /
@discussion With 'r' opens for reading; any further format mode letters are ignored as the format is detected by checking the first few bytes or BGZF blocks of the file.
With 'w' or 'a' opens for writing or appending, with format specifier letters:
  b      binary format (BAM, BCF, etc) rather than text (SAM, VCF, etc)
  c      CRAM format
  g      gzip compressed
  u      uncompressed
  z      bgzf compressed
  [0-9]  zlib compression level
and with non-format option letters (for any of 'r'/'w'/'a'):
  e      close the file on exec(2) (opens with O_CLOEXEC, where supported)
  x      create the file exclusively (opens with O_EXCL, where supported)
Note that there is a distinction between 'u' and '0': the first yields plain uncompressed output whereas the latter outputs uncompressed data wrapped in the zlib format.
@example
  [rw]b  .. compressed BCF, BAM, FAI
  [rw]bu .. uncompressed BCF
  [rw]z  .. compressed VCF
  [rw]   .. uncompressed VCF
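The mode grammar can be checked syntactically with a few lines of C. The sketch below assumes the pattern is meant as one of 'r', 'w' or 'a' followed by any number of the characters b, c, e, f, F, g, u, x, z or a digit. mode_is_valid() is a hypothetical helper; it only tests the shape of the string and says nothing about how the library itself interprets each letter.

```c
#include <string.h>

/* Hypothetical helper: returns 1 if mode looks like [rwa][bcefFguxz0-9]*,
 * 0 otherwise. Purely a syntax check. */
static int mode_is_valid(const char *mode)
{
    const char *p;

    if (mode == NULL || mode[0] == '\0' || strchr("rwa", mode[0]) == NULL)
        return 0;
    for (p = mode + 1; *p != '\0'; ++p)
        if (strchr("bcefFguxz0123456789", *p) == NULL)
            return 0;
    return 1;
}
```

For example "r", "wb", "wbu" and "wz9" pass this check, while "", "xb" and "wq" do not.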
! @abstract Open a SAM/BAM/CRAM/VCF/BCF/etc file @param fn The file name or "-" for stdin/stdout @param mode Open mode, as per hts_open() @param fmt Optional format specific parameters @discussion See hts_open() for description of fn and mode. // TODO Update documentation for s/opts/fmt/ Opts contains a format string (sam, bam, cram, vcf, bcf) which will, if defined, override mode. Opts also contains a linked list of hts_opt structures to apply to the open file handle. These can contain things like pointers to the reference or information on compression levels, block sizes, etc.
Parses arg and appends it to the option list.
Applies an hts_opt option list to a given htsFile.
Frees an hts_opt list.
The number may be expressed in scientific notation, and optionally may contain commas in the integer part (before any decimal point or E notation). @param str String to be parsed @param strend If non-NULL, set on return to point to the first character in @a str after those forming the parsed number @param flags Or'ed-together combination of HTS_PARSE_* flags @return Converted value of the parsed number.
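A much simplified sketch of this kind of parsing is given below. It drops commas from the integer part and lets strtod() handle any fraction or exponent. parse_number_sketch() is a hypothetical name, the flags argument is not modelled at all, and the behaviour on overflow or on a trailing comma is not meant to match the library.

```c
#include <ctype.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical, simplified stand-in: not the library's implementation. */
static long long parse_number_sketch(const char *str, char **strend)
{
    char buf[64];
    size_t n = 0, int_len;
    const char *p = str, *tail;
    char *end;
    double value;

    while (isspace((unsigned char)*p)) ++p;
    if (*p == '+' || *p == '-') buf[n++] = *p++;

    /* Integer part: keep digits, drop commas. */
    while ((isdigit((unsigned char)*p) || *p == ',') && n < sizeof buf - 1) {
        if (*p != ',') buf[n++] = *p;
        ++p;
    }
    int_len = n;
    tail = p;

    /* Copy anything that could belong to a fraction or an exponent. */
    while (*p != '\0' && strchr(".eE+-0123456789", *p) != NULL
           && n < sizeof buf - 1)
        buf[n++] = *p++;
    buf[n] = '\0';

    value = strtod(buf, &end);
    if (end == buf) {                    /* nothing numeric was found */
        if (strend != NULL) *strend = (char *)str;
        return 0;
    }
    /* Map the characters strtod() consumed back onto the original string. */
    if (strend != NULL)
        *strend = (char *)tail + ((size_t)(end - buf) - int_len);
    return (long long)value;
}
```

With this sketch "1,000,000" and "1e6" both convert to 1000000, and parsing "2,500x" leaves the end pointer on the 'x'.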
Accepts a string file format (sam, bam, cram, vcf, bcf) optionally followed by a comma separated list of key=value options and splits these up into the fields of htsFormat struct.
Tokenise options as (key(=value)?,)*(key(=value)?)? NB: No provision for ',' appearing in the value! Add backslashing rules?
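The grammar above can be sketched as a small splitter. Everything in this block is illustrative: struct kv, tokenise_opts() and the fixed buffer sizes are hypothetical, and, as the note says, no escaping of ',' inside a value is attempted.

```c
#include <string.h>

#define KV_MAX_LEN 64

/* Hypothetical container for one key or key=value token. */
struct kv {
    char key[KV_MAX_LEN];
    char val[KV_MAX_LEN];
    int  has_val;
};

/* Splits s on ',' and each token on the first '='.
 * Returns the number of tokens stored, or -1 if a key or value is too long. */
static int tokenise_opts(const char *s, struct kv *out, int max)
{
    int n = 0;

    while (*s != '\0' && n < max) {
        size_t len = strcspn(s, ",");           /* token ends at ',' or NUL */
        const char *eq = memchr(s, '=', len);
        size_t klen = eq ? (size_t)(eq - s) : len;
        size_t vlen = eq ? len - klen - 1 : 0;

        if (klen >= KV_MAX_LEN || vlen >= KV_MAX_LEN) return -1;
        if (len > 0) {                          /* skip empty tokens */
            memcpy(out[n].key, s, klen);
            out[n].key[klen] = '\0';
            if (eq != NULL) memcpy(out[n].val, eq + 1, vlen);
            out[n].val[vlen] = '\0';
            out[n].has_val = (eq != NULL);
            ++n;
        }
        s += len;
        if (*s == ',') ++s;
    }
    return n;
}
```

Given a made-up input such as "level=5,flag,threads=4", the sketch yields three tokens, the second of which has no value.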
@param str String to be parsed @param beg Set on return to the 0-based start of the region @param end Set on return to the 1-based end of the region @return Pointer to the colon or '\0' after the reference sequence name, or NULL if @a str could not be parsed.
@param str String to be parsed @param beg Set on return to the 0-based start of the region @param end Set on return to the 1-based end of the region @return Pointer to the colon or '\0' after the reference sequence name, or NULL if @a str could not be parsed.
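The coordinate convention (0-based start, 1-based end) is easy to get wrong, so a simplified sketch follows. parse_region_sketch() is a hypothetical function. It assumes plain decimal coordinates without commas, splits on the first colon, and treats a missing end as "to the end of the sequence"; the real parsers may differ on all of these points.

```c
#include <limits.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical, simplified region parser for "name", "name:beg" and
 * "name:beg-end". Returns a pointer to the colon or to the terminating
 * NUL after the name, or NULL if the coordinates cannot be parsed. */
static const char *parse_region_sketch(const char *str, long *beg, long *end)
{
    const char *colon = strchr(str, ':');
    char *p;
    long b;

    *beg = 0;
    *end = LONG_MAX;            /* assumption: whole sequence by default */
    if (colon == NULL) return str + strlen(str);

    b = strtol(colon + 1, &p, 10);
    if (p == colon + 1 || b < 1) return NULL;
    *beg = b - 1;               /* 1-based inclusive start -> 0-based */

    if (*p == '-') {
        const char *q = p + 1;
        if (*q != '\0') {
            long e = strtol(q, &p, 10);
            if (p == q || e < b) return NULL;
            *end = e;           /* 1-based inclusive end, kept as is */
        } else {
            p = (char *)q;      /* "name:beg-" means open-ended */
        }
    }
    return *p == '\0' ? colon : NULL;
}
```

So "chr1:100-200" gives beg 99 and end 200, which as a half-open interval [99, 200) covers the same 101 bases as the 1-based inclusive range 100 to 200.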
@param str String to be parsed @param tid Set on return (if not NULL) to be reference index (-1 if invalid) @param beg Set on return to the 0-based start of the region @param end Set on return to the 1-based end of the region @param getid Function pointer. Called if not NULL to set tid. @param hdr Caller data passed to getid. @param flags Bitwise HTS_PARSE_* flags listed above. @return Pointer to the byte after the end of the entire region specifier (including any trailing comma) on success, or NULL if @a str could not be parsed.
! @abstract Parse comma-separated list or read list from a file @param list File name or comma-separated list @param is_file Non-zero if @a list is a file name; zero if it is a comma-separated list @param _n Size of the output array (number of items read) @return NULL on failure or pointer to newly allocated array of strings
@param argv Char array of target:interval elements, e.g. chr1:2500-3600, chr1:5100, chr2 @param argc Number of items in the array @param r_count Pointer to the number of items in the resulting region list @param hdr Header for the sam/bam/cram file @param getid Callback to convert target names to target ids. @return A region list on success, NULL on failure
@param reglist Region list @param count Number of items in the list
@hideinitializer Macro to expand a dynamic array of a given type
@hideinitializer Macro to expand a dynamic array, zeroing any newly-allocated memory
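The general shape of such a growth macro is sketched below. EXPAND_SKETCH is a hypothetical macro written for illustration only: it assumes a doubling growth policy, reports failure through an extra ok argument, and does not guard against size overflow, none of which is claimed to match the macros documented here.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical sketch. Ensures ptr (capacity m elements of type_t) can
 * hold n elements, doubling the capacity as needed. If zero is non-zero,
 * newly added elements are cleared. On allocation failure ptr and m are
 * left unchanged and ok is set to 0. */
#define EXPAND_SKETCH(type_t, n, m, ptr, zero, ok) do {                      \
    (ok) = 1;                                                                \
    if ((size_t)(n) > (size_t)(m)) {                                         \
        size_t new_m_ = (m) ? (size_t)(m) : 1;                               \
        type_t *tmp_;                                                        \
        while (new_m_ < (size_t)(n)) new_m_ *= 2;                            \
        tmp_ = realloc((ptr), new_m_ * sizeof(type_t));                      \
        if (tmp_ == NULL) { (ok) = 0; break; }                               \
        if (zero)                                                            \
            memset(tmp_ + (m), 0, (new_m_ - (size_t)(m)) * sizeof(type_t));  \
        (ptr) = tmp_;                                                        \
        (m) = new_m_;                                                        \
    }                                                                        \
} while (0)
```

A caller would keep ptr and m together (for example int *a = NULL; size_t m = 0;) and invoke the macro before writing to index n - 1.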
! @abstract Adds a cache of decompressed blocks, potentially speeding up seeks. This may not work for all file types (currently it is bgzf only). @param fp The file handle @param n The size of cache, in bytes
! @abstract Sets a filter expression @return 0 for success, negative on failure @discussion To clear an existing filter, specify expr as NULL.
! @abstract Sets a specified CRAM option on the open file handle. @param fp The file handle of the open file. @param opt The CRAM_OPT_* option. @param ... Optional arguments, dependent on the option used. @return 0 for success, or negative if an error occurred.
! @abstract Create extra threads to aid compress/decompression for this file @param fp The file handle @param p A pool of worker threads, previously allocated by hts_create_threads(). @return 0 for success, or negative if an error occurred.
! @abstract Create extra threads to aid compress/decompression for this file @param fp The file handle @param n The number of worker threads to create @return 0 for success, or negative if an error occurred. @notes This function creates non-shared threads for use solely by fp. The hts_set_thread_pool function is the recommended alternative.
! @abstract Get the htslib version number @return For released versions, a string like "N.N.N"; or git describe output if using a library built within a Git repository.
@param ref Reference sequence @param l_ref Length of reference @param query Query sequence @param l_query Length of query sequence @param iqual Query base qualities @param c Alignment parameters @param[out] state Output alignment @param[out] q Phred scaled posterior probability of state[i] being wrong @return Phred-scaled likelihood score, or INT_MIN on failure.
hts_file_type() - Convenience function to determine file type
Build params
Whether ./configure was used or vanilla Makefile
Transport specific
Compression options
Whether --enable-plugins was used
! These HTS_IDX_* macros are used as special tid values for hts_itr_query()/etc, producing iterators operating as follows: - HTS_IDX_NOCOOR iterates over unmapped reads sorted at the end of the file - HTS_IDX_START iterates over the entire file - HTS_IDX_REST iterates from the current position to the end of the file - HTS_IDX_NONE always returns "no more alignment records" When one of these special tid values is used, beg and end are ignored. When REST or NONE is used, idx is also ignored and may be NULL.
! @abstract Compile-time HTSlib version number, for use in #if checks @return For released versions X.Y.Z, an integer of the form XYYYZZ; useful for preprocessor conditionals such as #if HTS_VERSION >= 101000 // Check for v1.10 or later
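The XYYYZZ layout can be made concrete with a little arithmetic. VERSION_ENCODE below is a hypothetical macro, not part of the header; it simply reproduces the documented layout so that the v1.10 example can be checked.

```c
/* Hypothetical macro reproducing the XYYYZZ layout:
 * X * 100000 + Y * 100 + Z, so version 1.10.0 becomes 101000. */
#define VERSION_ENCODE(x, y, z) ((x) * 100000 + (y) * 100 + (z))

#if VERSION_ENCODE(1, 10, 0) != 101000
#error "encoding does not match the documented XYYYZZ layout"
#endif
```

Because the result is an integer constant expression, the same comparison works in both #if directives and ordinary C code.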
! @abstract Table for converting a 4-bit encoded nucleotide to about 2 bits. Returns 0/1/2/3 for 1/2/4/8 (i.e., A/C/G/T), or 4 otherwise (0 or ambiguous).
! @abstract Table for converting a 4-bit encoded nucleotide to an IUPAC ambiguity code letter (or '=' when given 0).
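The two conversions described by these tables can be sketched as small functions. nt16_to_int() and nt16_to_char() are hypothetical names; the letter string assumes the standard 4-bit nucleotide encoding used by BAM, in which A, C, G and T are the bit values 1, 2, 4 and 8.

```c
/* Hypothetical helper: 4-bit code -> 0..3 for A/C/G/T, 4 for anything else. */
static int nt16_to_int(unsigned code)
{
    switch (code & 0xf) {
    case 1:  return 0;   /* A */
    case 2:  return 1;   /* C */
    case 4:  return 2;   /* G */
    case 8:  return 3;   /* T */
    default: return 4;   /* 0 or an ambiguity code */
    }
}

/* Hypothetical helper: 4-bit code -> IUPAC letter, '=' for 0. */
static char nt16_to_char(unsigned code)
{
    return "=ACMGRSVTWYHKDBN"[code & 0xf];
}
```

Each ambiguity letter corresponds to the bitwise OR of its constituent bases, so for instance code 5 (A|G) maps to 'R' and code 15 maps to 'N'.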
? index key
Revised MAQ error model
@brief File handle returned by hts_open() etc. This structure should be considered opaque by end users. There should be no need to access most fields directly in user code, and in cases where it is desirable accessor functions such as hts_get_format() are provided.
A combined thread pool and queue allocation size. The pool should already be defined, but qsize may be zero to indicate an appropriate queue size is taken from the pool.
@brief File iterator that can handle multiple target regions. This structure should be considered opaque by end users. It does both the stepping inside the file and the filtering of alignments. It can operate in single or multi-region mode, and depending on this, it uses different fields.
MD5 implementation
Options for cache, (de)compression, threads, CRAM, etc.
64-bit start, end coordinate pair tracking max (internally used in hts.c)
Region list used in iterators (NB: apparently confined to single contig/tid)
Probabilistic banded glocal alignment. See https://doi.org/10.1093/bioinformatics/btr076
@file htslib/hts.h Format-neutral I/O, indexing, and iterator API functions.