Class BlockCompressedOutputStream

  • All Implemented Interfaces:
    LocationAware, Closeable, Flushable, AutoCloseable
    Direct Known Subclasses:
    TerminatorlessBlockCompressedOutputStream

    public class BlockCompressedOutputStream
    extends OutputStream
    implements LocationAware
    Writer for a file that is a series of gzip blocks (BGZF format). The caller treats it as an ordinary OutputStream; internally, a gzip block is written whenever the number of buffered, not-yet-written uncompressed bytes reaches a threshold. The advantage of BGZF over conventional gzip is that it allows seeking without scanning through the entire file up to the position being sought. Clients should not call flush() unless they understand the consequences, because it forces a gzip block to be written even if the number of buffered bytes has not reached the threshold. close(), on the other hand, must be called when writing is finished, to force the last gzip block to be written. See http://samtools.sourceforge.net/SAM1.pdf for details of the BGZF file format.
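    The block-writing behaviour described above can be illustrated with a toy stream built only on the JDK. This is a sketch under stated assumptions, not the library's implementation: ToyBlockGzipOutputStream, its blockSize parameter, and its helper methods are hypothetical names, and the output is a plain concatenation of gzip members without the extra header field that real BGZF blocks carry. It shows why close() is required (the last partial block is only written then) and why an early flush() changes the block layout.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.OutputStream;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.zip.GZIPInputStream;
import java.util.zip.GZIPOutputStream;

/** Hypothetical toy writer: emits one gzip member per full buffer. Not real BGZF. */
final class ToyBlockGzipOutputStream extends OutputStream {
    private final OutputStream out;
    private final byte[] buffer;   // uncompressed bytes not yet written as a block
    private int buffered = 0;
    private int blocksWritten = 0;

    ToyBlockGzipOutputStream(OutputStream out, int blockSize) {
        this.out = out;
        this.buffer = new byte[blockSize];
    }

    @Override
    public void write(int b) throws IOException {
        buffer[buffered++] = (byte) b;
        if (buffered == buffer.length) {
            writeBlock();          // threshold reached: emit a block
        }
    }

    /** Forces a block even if the buffer is not full, which changes the block layout. */
    @Override
    public void flush() throws IOException {
        if (buffered > 0) {
            writeBlock();
        }
        out.flush();
    }

    /** Writes the last partial block, then closes the underlying stream. */
    @Override
    public void close() throws IOException {
        flush();
        out.close();
    }

    int blocksWritten() {
        return blocksWritten;
    }

    private void writeBlock() throws IOException {
        ByteArrayOutputStream member = new ByteArrayOutputStream();
        try (GZIPOutputStream gzip = new GZIPOutputStream(member)) {
            gzip.write(buffer, 0, buffered);
        }
        member.writeTo(out);       // one complete gzip member per block
        buffered = 0;
        blocksWritten++;
    }

    /** Writes data into sink; flushes once after flushAfter bytes if flushAfter >= 0. Returns the block count. */
    static int compress(byte[] data, int blockSize, int flushAfter, ByteArrayOutputStream sink) {
        ToyBlockGzipOutputStream toy = new ToyBlockGzipOutputStream(sink, blockSize);
        try {
            if (flushAfter >= 0) {
                toy.write(data, 0, flushAfter);
                toy.flush();
                toy.write(data, flushAfter, data.length - flushAfter);
            } else {
                toy.write(data);
            }
            toy.close();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return toy.blocksWritten();
    }

    /** Reads back all concatenated gzip members (readAllBytes needs Java 9 or later). */
    static byte[] decompress(byte[] compressed) {
        try (GZIPInputStream in = new GZIPInputStream(new ByteArrayInputStream(compressed))) {
            return in.readAllBytes();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        byte[] data = "0123456789".getBytes(StandardCharsets.US_ASCII);
        int plain = compress(data, 4, -1, new ByteArrayOutputStream());
        int flushed = compress(data, 4, 1, new ByteArrayOutputStream());
        System.out.println("blocks without flush: " + plain);
        System.out.println("blocks with one early flush: " + flushed);
    }
}
```

    Both runs decompress to the same bytes; only the number and size of the blocks differ, which is the sense in which flush() affects the output format.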
    • Method Detail

      • setDefaultCompressionLevel

        public static void setDefaultCompressionLevel​(int compressionLevel)
        Sets the GZip compression level used by subsequently created BlockCompressedOutputStream objects that do not specify a compression level.
        Parameters:
        compressionLevel - 1 <= compressionLevel <= 9
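        To give a feel for what the level range means, the sketch below compresses the same input with the JDK's java.util.zip.Deflater at levels 1 and 9 and inflates it again. It is an illustration only: DeflaterLevelSketch and its methods are hypothetical names, the output is zlib-wrapped deflate data rather than gzip or BGZF, and it does not show how this class applies its default level.

```java
import java.io.ByteArrayOutputStream;
import java.util.Arrays;
import java.util.zip.DataFormatException;
import java.util.zip.Deflater;
import java.util.zip.Inflater;

/** Hypothetical helper comparing JDK Deflater levels; not part of the documented class. */
final class DeflaterLevelSketch {

    /** Compresses data with the given level (1 = fastest, 9 = best compression). */
    static byte[] deflate(byte[] data, int level) {
        Deflater deflater = new Deflater(level);
        try {
            deflater.setInput(data);
            deflater.finish();
            ByteArrayOutputStream out = new ByteArrayOutputStream();
            byte[] chunk = new byte[1024];
            while (!deflater.finished()) {
                int n = deflater.deflate(chunk);
                out.write(chunk, 0, n);
            }
            return out.toByteArray();
        } finally {
            deflater.end();   // release native resources
        }
    }

    /** Inflates data produced by deflate(). */
    static byte[] inflate(byte[] compressed) {
        Inflater inflater = new Inflater();
        try {
            inflater.setInput(compressed);
            ByteArrayOutputStream out = new ByteArrayOutputStream();
            byte[] chunk = new byte[1024];
            while (!inflater.finished()) {
                int n = inflater.inflate(chunk);
                if (n == 0 && inflater.needsInput()) {
                    break;    // truncated input: stop instead of looping forever
                }
                out.write(chunk, 0, n);
            }
            return out.toByteArray();
        } catch (DataFormatException e) {
            throw new IllegalStateException("corrupt deflate data", e);
        } finally {
            inflater.end();
        }
    }

    public static void main(String[] args) {
        byte[] data = new byte[10_000];
        Arrays.fill(data, (byte) 'A');
        for (int level : new int[] {1, 9}) {
            System.out.println("level " + level + ": " + deflate(data, level).length + " bytes");
        }
    }
}
```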
      • getDefaultCompressionLevel

        public static int getDefaultCompressionLevel()
      • setDefaultDeflaterFactory

        public static void setDefaultDeflaterFactory​(DeflaterFactory deflaterFactory)
        Sets the default DeflaterFactory that will be used for all instances unless specified otherwise in the constructor. If this method is not called the default is a factory that will create the JDK Deflater.
        Parameters:
        deflaterFactory - non-null default factory.
      • getDefaultDeflaterFactory

        public static DeflaterFactory getDefaultDeflaterFactory()
      • maybeBgzfWrapOutputStream

        public static BlockCompressedOutputStream maybeBgzfWrapOutputStream​(File location,
                                                                            OutputStream output)
        Parameters:
        location - May be null. Used for error messages, and for checking file termination.
        output - May or may not already be a BlockCompressedOutputStream.
        Returns:
        A BlockCompressedOutputStream, either by wrapping the given OutputStream, or by casting if it already is a BCOS.
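        The wrap-or-cast behaviour described in the return value can be sketched with a JDK stream standing in for BlockCompressedOutputStream. WrapOrCastSketch and maybeWrap are hypothetical names, BufferedOutputStream is only a stand-in, and the sketch ignores the location parameter; it assumes nothing about how the real method is implemented beyond what is stated above.

```java
import java.io.BufferedOutputStream;
import java.io.ByteArrayOutputStream;
import java.io.OutputStream;

/** Hypothetical stand-in for the wrap-or-cast pattern, using BufferedOutputStream. */
final class WrapOrCastSketch {

    /** Returns output itself if it already has the target type, otherwise wraps it. */
    static BufferedOutputStream maybeWrap(OutputStream output) {
        if (output instanceof BufferedOutputStream) {
            return (BufferedOutputStream) output;   // already the right type: cast, no double wrapping
        }
        return new BufferedOutputStream(output);    // otherwise wrap exactly once
    }

    public static void main(String[] args) {
        OutputStream raw = new ByteArrayOutputStream();
        BufferedOutputStream once = maybeWrap(raw);
        BufferedOutputStream twice = maybeWrap(once);
        System.out.println("same instance on second call: " + (once == twice));
    }
}
```

        Casting instead of wrapping again avoids nesting one compressing stream inside another, which would compress the data twice.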
      • addIndexer

        public void addIndexer​(OutputStream outputStream)
        Adds a GZIIndexer to the block compressed output stream to be written to the specified output stream. See GZIIndex for details on the index. Note that the stream will be written to disk entirely when close() is called.
        Throws:
        RuntimeException - if this method is called after output has already been written to the stream.
      • write

        public void write​(byte[] bytes)
                   throws IOException
        Writes bytes.length bytes from the specified byte array to this output stream. The general contract for write(bytes) is that it should have exactly the same effect as the call write(bytes, 0, bytes.length).
        Overrides:
        write in class OutputStream
        Parameters:
        bytes - the data
        Throws:
        IOException
      • write

        public void write​(byte[] bytes,
                          int startIndex,
                          int numBytes)
                   throws IOException
        Writes numBytes bytes from the specified byte array, starting at offset startIndex, to this output stream. The general contract for write(bytes, startIndex, numBytes) is that some of the bytes in the array are written to the output stream in order; element bytes[startIndex] is the first byte written and bytes[startIndex+numBytes-1] is the last byte written by this operation.
        Overrides:
        write in class OutputStream
        Parameters:
        bytes - the data
        startIndex - the start offset in the data
        numBytes - the number of bytes to write
        Throws:
        IOException
      • flush

        public void flush()
                   throws IOException
        WARNING: flush() affects the output format, because it causes the current contents of uncompressedBuffer to be compressed and written, even if it isn't full. Unless you know what you're doing, don't call flush(). Instead, call close(), which will flush any unwritten data before closing the underlying stream.
        Specified by:
        flush in interface Flushable
        Overrides:
        flush in class OutputStream
        Throws:
        IOException
      • write

        public void write​(int bite)
                   throws IOException
        Writes the specified byte to this output stream. The general contract for write is that one byte is written to the output stream. The byte to be written is the eight low-order bits of the argument bite. The 24 high-order bits of bite are ignored.
        Specified by:
        write in class OutputStream
        Parameters:
        bite - the byte to write; only its eight low-order bits are used
        Throws:
        IOException
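        The low-order-byte rule is the general OutputStream.write(int) contract, so it can be demonstrated with any JDK stream. The sketch below uses ByteArrayOutputStream as a stand-in; LowOrderByteSketch and writeInts are hypothetical names introduced for illustration.

```java
import java.io.ByteArrayOutputStream;

/** Hypothetical demo of the write(int) contract using a JDK stream as a stand-in. */
final class LowOrderByteSketch {

    /** Writes each int with write(int) and returns the bytes that actually reached the stream. */
    static byte[] writeInts(int... values) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        for (int value : values) {
            out.write(value);   // only the eight low-order bits are kept
        }
        return out.toByteArray();
    }

    public static void main(String[] args) {
        byte[] written = writeInts(0x41, 0x1234, 0x141);
        for (byte b : written) {
            System.out.printf("0x%02X%n", b & 0xFF);
        }
    }
}
```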
      • getFilePointer

        public long getFilePointer()
        Encodes the virtual file pointer. The upper 48 bits are the byte offset into the compressed stream of a block; the lower 16 bits are the byte offset into the uncompressed stream inside the block.
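        The 48/16-bit layout can be made concrete with a few lines of bit arithmetic. This is a minimal sketch of the encoding as described here, not the library's own code: VirtualFilePointerSketch and its method names are hypothetical.

```java
/** Hypothetical helper showing the 48-bit block address / 16-bit in-block offset layout. */
final class VirtualFilePointerSketch {
    private static final int SHIFT = 16;
    private static final long OFFSET_MASK = 0xFFFFL;
    private static final long MAX_BLOCK_ADDRESS = (1L << 48) - 1;

    /** Packs a compressed-stream block address and an uncompressed in-block offset into one long. */
    static long encode(long blockAddress, int blockOffset) {
        if (blockAddress < 0 || blockAddress > MAX_BLOCK_ADDRESS) {
            throw new IllegalArgumentException("block address does not fit in 48 bits: " + blockAddress);
        }
        if (blockOffset < 0 || blockOffset > OFFSET_MASK) {
            throw new IllegalArgumentException("block offset does not fit in 16 bits: " + blockOffset);
        }
        return (blockAddress << SHIFT) | blockOffset;
    }

    /** Upper 48 bits: byte offset of the block in the compressed stream. */
    static long blockAddress(long virtualFilePointer) {
        return virtualFilePointer >>> SHIFT;   // unsigned shift keeps large addresses positive
    }

    /** Lower 16 bits: byte offset inside the uncompressed block. */
    static int blockOffset(long virtualFilePointer) {
        return (int) (virtualFilePointer & OFFSET_MASK);
    }

    public static void main(String[] args) {
        long pointer = encode(1000L, 42);
        System.out.println(pointer);                 // 1000 * 65536 + 42 = 65536042
        System.out.println(blockAddress(pointer));
        System.out.println(blockOffset(pointer));
    }
}
```

        Because the two parts are packed into one value, the result is not a plain byte offset: comparing two pointers orders positions correctly, but subtracting them does not yield a byte count.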
      • getPosition

        public long getPosition()
        Description copied from interface: LocationAware
        The current offset, in bytes, of this stream/writer/file. Or, if this is an iterator/producer, the offset (in bytes) of the END of the most recently returned record (since a produced record corresponds to something that has been read already). See class javadoc for more. Note that for BGZF files, this does not represent an actual file position, but a virtual file pointer.
        Specified by:
        getPosition in interface LocationAware