Index FASTA or FASTQ files and extract subsequences.
The fai file index columns for FASTA are:
- chromosome name
- chromosome length: number of bases
- offset: number of bytes to skip to get to the first base
  from the beginning of the file, including the length
  of the sequence description string (>chr ..\n)
- line length: number of bases per line (excluding \n)
- binary line length: number of bytes, including \n
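The columns above are enough to seek directly to any base without scanning the file. The following is a minimal sketch of that arithmetic, not the library's implementation; the function names `fai_byte_offset` and `fetch`, the 0-based half-open coordinates, and the sample record are assumptions made for illustration.

```python
import io

def fai_byte_offset(pos, offset, line_bases, line_bytes):
    """Byte position in the file of the 0-based base `pos`.

    Each complete line before the target contributes `line_bytes` bytes
    (bases plus line terminator); the remainder is the column in its line.
    """
    return offset + (pos // line_bases) * line_bytes + (pos % line_bases)

def fetch(handle, start, end, offset, line_bases, line_bytes):
    """Return bases [start, end) from a binary, seekable file handle."""
    first = fai_byte_offset(start, offset, line_bases, line_bytes)
    last = fai_byte_offset(end - 1, offset, line_bases, line_bytes)
    handle.seek(first)
    raw = handle.read(last - first + 1)
    # Drop the line terminators that fall inside the requested span.
    return raw.replace(b"\n", b"").replace(b"\r", b"").decode("ascii")

# Hypothetical record: 10 bases wrapped at 4 bases per line.
# Its index row would be: chr1  10  11  4  5
# (offset 11 is the length of the description line ">chr1 demo\n").
data = b">chr1 demo\nACGT\nTTGG\nCA\n"
print(fetch(io.BytesIO(data), 2, 7, offset=11, line_bases=4, line_bytes=5))
```

The difference between the binary line length and the line length is the size of the line terminator, which is why both columns are stored.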
The index for FASTQ has the same five columns plus a sixth, the quality offset:
- chromosome name
- chromosome length: number of bases
- sequence offset: number of bytes to skip to get to the first base
  from the beginning of the file, including the length
  of the sequence description string (@chr ..\n)
- line length: number of bases per line (excluding \n)
- binary line length: number of bytes, including \n
- quality offset: number of bytes to skip from the beginning of the file
  to get to the first quality value in the indexed entry.
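As an illustration of the two layouts, the sketch below parses one index row into named fields, accepting either five columns (FASTA) or six (FASTQ). It assumes the rows are tab-separated, as written by `samtools faidx`; the names `FaiRecord` and `parse_fai_line` are hypothetical and not part of any library.

```python
from typing import NamedTuple, Optional

class FaiRecord(NamedTuple):
    name: str                          # chromosome (or read) name
    length: int                        # number of bases
    offset: int                        # byte offset of the first base
    line_bases: int                    # bases per line, excluding \n
    line_bytes: int                    # bytes per line, including \n
    qual_offset: Optional[int] = None  # FASTQ only: byte offset of first quality

def parse_fai_line(line):
    """Parse one tab-separated index row with 5 (FASTA) or 6 (FASTQ) columns."""
    fields = line.rstrip("\r\n").split("\t")
    if len(fields) not in (5, 6):
        raise ValueError(f"expected 5 or 6 columns, got {len(fields)}")
    numbers = [int(f) for f in fields[1:]]
    return FaiRecord(fields[0], *numbers)

fasta_row = parse_fai_line("chr1\t10\t11\t4\t5\n")
fastq_row = parse_fai_line("read1\t10\t7\t4\t5\t22\n")
print(fasta_row.qual_offset is None, fastq_row.qual_offset)
```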
The FASTQ version of the index uses line length and binary line length
for both the sequence and the quality values, so they must be line
wrapped in the same way.
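Because the two blocks share one pair of line-length columns, the same read routine can serve both, with only the starting offset changing. The sketch below shows this under stated assumptions: `read_wrapped` and `fetch_fastq` are hypothetical helper names, and the sample record is made up for illustration.

```python
import io

def read_wrapped(handle, offset, length, line_bases, line_bytes):
    """Read `length` symbols starting at byte `offset`, skipping line terminators."""
    full_lines, rest = divmod(length, line_bases)
    # Full lines contribute their terminators too; the last partial line does not need one.
    nbytes = full_lines * line_bytes + rest
    handle.seek(offset)
    raw = handle.read(nbytes)
    return raw.replace(b"\n", b"").replace(b"\r", b"").decode("ascii")[:length]

def fetch_fastq(handle, row):
    """Return (sequence, qualities) for one six-column FASTQ index row."""
    _name, length, seq_off, line_bases, line_bytes, qual_off = row
    seq = read_wrapped(handle, seq_off, length, line_bases, line_bytes)
    qual = read_wrapped(handle, qual_off, length, line_bases, line_bytes)
    return seq, qual

# Hypothetical record: sequence and qualities both wrapped at 4 per line.
# Sequence starts at byte 7 (after "@read1\n"); qualities start at byte 22.
data = b"@read1\nACGT\nTTGG\nCA\n+\nIIII\nHHHH\nGG\n"
row = ("read1", 10, 7, 4, 5, 22)
print(fetch_fastq(io.BytesIO(data), row))
```

If the quality lines were wrapped at a different width than the sequence lines, the byte count computed for the quality block would be wrong, which is the reason for the constraint above.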
Opaque structure representing a FASTA index.