Class CBZip2OutputStream

java.lang.Object
java.io.OutputStream
org.apache.tools.bzip2.CBZip2OutputStream
All Implemented Interfaces:
java.io.Closeable, java.io.Flushable, java.lang.AutoCloseable, BZip2Constants

public class CBZip2OutputStream
extends java.io.OutputStream
implements BZip2Constants
An output stream that compresses data into the BZip2 format (without the file header characters) and writes the result to another stream.

The compression requires large amounts of memory. Thus you should call the close() method as soon as possible, to force CBZip2OutputStream to release the allocated memory.

You can reduce the amount of allocated memory, and possibly increase compression speed, by choosing a smaller blocksize, although this may lower the compression ratio. To avoid unnecessary memory allocation, do not use a blocksize larger than the size of the input.

You can compute the memory usage for compressing by the following formula:

 400k + (9 * blocksize)
 

To get the memory required for decompression by CBZip2InputStream use

 65k + (5 * blocksize)
 
Memory usage by blocksize

Blocksize  Compression memory usage  Decompression memory usage
100k       1300k                     565k
200k       2200k                     1065k
300k       3100k                     1565k
400k       4000k                     2065k
500k       4900k                     2565k
600k       5800k                     3065k
700k       6700k                     3565k
800k       7600k                     4065k
900k       8500k                     4565k

For decompression CBZip2InputStream allocates less memory if the bzipped input is smaller than one block.
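
The two formulas and the table above can be reproduced with a few lines of arithmetic. The sketch below is illustrative only: the class MemoryEstimate and its methods are hypothetical names, not part of org.apache.tools.bzip2, and all values are expressed in the same "k" units the table uses. It also shows one possible way to pick the largest blocksize whose estimated compression footprint fits a given budget.

```java
/** Hypothetical helper for illustration; not part of org.apache.tools.bzip2. */
final class MemoryEstimate {
    // Blocksize levels accepted by the constructor, in units of 100k.
    static final int MIN_BLOCKSIZE = 1;
    static final int MAX_BLOCKSIZE = 9;

    /** Estimated compression memory in k: 400k + (9 * blocksize). */
    static int compressionK(int level) {
        return 400 + 9 * (level * 100);
    }

    /** Estimated decompression memory in k: 65k + (5 * blocksize). */
    static int decompressionK(int level) {
        return 65 + 5 * (level * 100);
    }

    /** Largest level whose compression estimate fits budgetK, or -1 if none fits. */
    static int largestLevelWithin(int budgetK) {
        for (int level = MAX_BLOCKSIZE; level >= MIN_BLOCKSIZE; level--) {
            if (compressionK(level) <= budgetK) {
                return level;
            }
        }
        return -1;
    }

    public static void main(String[] args) {
        // Prints one row per blocksize: blocksize, compression, decompression.
        for (int level = MIN_BLOCKSIZE; level <= MAX_BLOCKSIZE; level++) {
            System.out.printf("%dk %dk %dk%n",
                    level * 100, compressionK(level), decompressionK(level));
        }
    }
}
```

For example, a budget of 4000k admits a blocksize of at most 400k under this estimate, since 400k + (9 * 400k) = 4000k.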

Instances of this class are not threadsafe.

TODO: Update to BZip2 1.0.1

  • Field Summary

    Fields 
    Modifier and Type Field Description
    protected static int CLEARMASK
    This constant is accessible by subclasses for historical purposes.
    protected static int DEPTH_THRESH
    This constant is accessible by subclasses for historical purposes.
    protected static int GREATER_ICOST
    This constant is accessible by subclasses for historical purposes.
    protected static int LESSER_ICOST
    This constant is accessible by subclasses for historical purposes.
    static int MAX_BLOCKSIZE
    The maximum supported blocksize == 9.
    static int MIN_BLOCKSIZE
    The minimum supported blocksize == 1.
    protected static int QSORT_STACK_SIZE
    This constant is accessible by subclasses for historical purposes.
    protected static int SETMASK
    This constant is accessible by subclasses for historical purposes.
    protected static int SMALL_THRESH
    This constant is accessible by subclasses for historical purposes.
    protected static int WORK_FACTOR
    This constant is accessible by subclasses for historical purposes.
  • Constructor Summary

    Constructors 
    Constructor Description
    CBZip2OutputStream(java.io.OutputStream out)
    Constructs a new CBZip2OutputStream with a blocksize of 900k.
    CBZip2OutputStream(java.io.OutputStream out, int blockSize)
    Constructs a new CBZip2OutputStream with specified blocksize.
  • Method Summary

    Modifier and Type Method Description
    static int chooseBlockSize(long inputLength)
    Chooses a blocksize based on the given length of the data to compress.
    void close()  
    protected void finalize()
    Overridden to close the stream.
    void finish()  
    void flush()  
    int getBlockSize()
    Returns the blocksize parameter specified at construction time.
    protected static void hbMakeCodeLengths(char[] len, int[] freq, int alphaSize, int maxLen)
    This method is accessible by subclasses for historical purposes.
    void write(byte[] buf, int offs, int len)
    void write(int b)

    Methods inherited from class java.io.OutputStream

    nullOutputStream, write

    Methods inherited from class java.lang.Object

    clone, equals, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
  • Field Details

    • MIN_BLOCKSIZE

      public static final int MIN_BLOCKSIZE
      The minimum supported blocksize == 1.
      See Also:
      Constant Field Values
    • MAX_BLOCKSIZE

      public static final int MAX_BLOCKSIZE
      The maximum supported blocksize == 9.
      See Also:
      Constant Field Values
    • SETMASK

      protected static final int SETMASK
      This constant is accessible by subclasses for historical purposes. If you don't know what it means then you don't need it.
      See Also:
      Constant Field Values
    • CLEARMASK

      protected static final int CLEARMASK
      This constant is accessible by subclasses for historical purposes. If you don't know what it means then you don't need it.
      See Also:
      Constant Field Values
    • GREATER_ICOST

      protected static final int GREATER_ICOST
      This constant is accessible by subclasses for historical purposes. If you don't know what it means then you don't need it.
      See Also:
      Constant Field Values
    • LESSER_ICOST

      protected static final int LESSER_ICOST
      This constant is accessible by subclasses for historical purposes. If you don't know what it means then you don't need it.
      See Also:
      Constant Field Values
    • SMALL_THRESH

      protected static final int SMALL_THRESH
      This constant is accessible by subclasses for historical purposes. If you don't know what it means then you don't need it.
      See Also:
      Constant Field Values
    • DEPTH_THRESH

      protected static final int DEPTH_THRESH
      This constant is accessible by subclasses for historical purposes. If you don't know what it means then you don't need it.
      See Also:
      Constant Field Values
    • WORK_FACTOR

      protected static final int WORK_FACTOR
      This constant is accessible by subclasses for historical purposes. If you don't know what it means then you don't need it.
      See Also:
      Constant Field Values
    • QSORT_STACK_SIZE

      protected static final int QSORT_STACK_SIZE
      This constant is accessible by subclasses for historical purposes. If you don't know what it means then you don't need it.

      If you are ever unlucky enough to get a stack overflow while sorting, increase this constant and try again. In practice I have never seen the stack go above 27 elements, so this limit seems very generous.

      See Also:
      Constant Field Values
  • Constructor Details

    • CBZip2OutputStream

      public CBZip2OutputStream(java.io.OutputStream out) throws java.io.IOException
      Constructs a new CBZip2OutputStream with a blocksize of 900k.

      Attention: The caller is responsible for writing the two BZip2 magic bytes "BZ" to the specified stream before calling this constructor.

      Parameters:
      out - the destination stream.
      Throws:
      java.io.IOException - if an I/O error occurs in the specified stream.
      java.lang.NullPointerException - if out == null.
    • CBZip2OutputStream

      public CBZip2OutputStream(java.io.OutputStream out, int blockSize) throws java.io.IOException
      Constructs a new CBZip2OutputStream with specified blocksize.

      Attention: The caller is responsible for writing the two BZip2 magic bytes "BZ" to the specified stream before calling this constructor.

      Parameters:
      out - the destination stream.
      blockSize - the block size in units of 100k.
      Throws:
      java.io.IOException - if an I/O error occurs in the specified stream.
      java.lang.IllegalArgumentException - if (blockSize < 1) || (blockSize > 9).
      java.lang.NullPointerException - if out == null.
      See Also:
      MIN_BLOCKSIZE, MAX_BLOCKSIZE
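
Because both constructors expect the "BZ" magic bytes to already be on the destination stream, a caller typically writes them first. The following is a minimal sketch of that step using only the JDK; the class BZip2HeaderSketch and its methods are hypothetical names. The commented lines indicate where the compressor would be layered on, assuming Apache Ant is on the classpath, with try-with-resources closing it promptly so that its memory is released.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.OutputStream;

/** Hypothetical helper for illustration; not part of org.apache.tools.bzip2. */
final class BZip2HeaderSketch {
    /** Returns the two magic bytes "BZ" that must precede the compressed data. */
    static byte[] magic() {
        return new byte[] {'B', 'Z'};
    }

    /** Writes the magic bytes to the destination stream. */
    static void writeMagic(OutputStream out) throws IOException {
        out.write(magic());
    }

    public static void main(String[] args) throws IOException {
        ByteArrayOutputStream raw = new ByteArrayOutputStream();
        writeMagic(raw);

        // Assumption: with Apache Ant available, the compressor would be
        // constructed on top of the same stream at this point, for example:
        //
        //   try (OutputStream bz = new CBZip2OutputStream(raw, 9)) {
        //       bz.write(payload);
        //   }

        System.out.println(raw.size() + " header bytes written");
    }
}
```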
  • Method Details

    • hbMakeCodeLengths

      protected static void hbMakeCodeLengths(char[] len, int[] freq, int alphaSize, int maxLen)
      This method is accessible by subclasses for historical purposes. If you don't know what it does then you don't need it.
      Parameters:
      len - char[]
      freq - int[]
      alphaSize - int
      maxLen - int
    • chooseBlockSize

      public static int chooseBlockSize(long inputLength)
      Chooses a blocksize based on the given length of the data to compress.
      Parameters:
      inputLength - The length of the data which will be compressed by CBZip2OutputStream.
      Returns:
      The blocksize, between MIN_BLOCKSIZE and MAX_BLOCKSIZE both inclusive. For a negative inputLength this method returns MAX_BLOCKSIZE always.
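
To make the contract above concrete, here is a sketch of a selection rule that satisfies it. Only the 1 to 9 range and the negative-length rule come from the description above; the class BlockSizeSketch, the method pickBlockSize, the reading of "100k" as 100,000 bytes, and the one-level-per-100k heuristic are assumptions for illustration and may differ from what chooseBlockSize actually does.

```java
/** Hypothetical helper for illustration; not the library's algorithm. */
final class BlockSizeSketch {
    static final int MIN_BLOCKSIZE = 1;
    static final int MAX_BLOCKSIZE = 9;

    // Assumption: one blocksize unit is treated as 100,000 bytes.
    private static final long UNIT = 100_000L;

    /** Picks the smallest level whose block covers the input, clamped to 1..9. */
    static int pickBlockSize(long inputLength) {
        if (inputLength < 0) {
            // Unknown length: fall back to the maximum, as the contract states.
            return MAX_BLOCKSIZE;
        }
        // Ceiling division written so that it cannot overflow for large inputs.
        long units = inputLength / UNIT + (inputLength % UNIT == 0 ? 0 : 1);
        return (int) Math.max(MIN_BLOCKSIZE, Math.min(MAX_BLOCKSIZE, units));
    }
}
```

With the real class, the value returned by chooseBlockSize(long) can be passed directly as the blockSize argument of the two-argument constructor.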
    • write

      public void write(int b) throws java.io.IOException
      Specified by:
      write in class java.io.OutputStream
      Throws:
      java.io.IOException
    • finalize

      protected void finalize() throws java.lang.Throwable
      Overridden to close the stream.
      Overrides:
      finalize in class java.lang.Object
      Throws:
      java.lang.Throwable
    • finish

      public void finish() throws java.io.IOException
      Throws:
      java.io.IOException
    • close

      public void close() throws java.io.IOException
      Specified by:
      close in interface java.lang.AutoCloseable
      Specified by:
      close in interface java.io.Closeable
      Overrides:
      close in class java.io.OutputStream
      Throws:
      java.io.IOException
    • flush

      public void flush() throws java.io.IOException
      Specified by:
      flush in interface java.io.Flushable
      Overrides:
      flush in class java.io.OutputStream
      Throws:
      java.io.IOException
    • getBlockSize

      public final int getBlockSize()
      Returns the blocksize parameter specified at construction time.
      Returns:
      the blocksize specified at construction time.
    • write

      public void write(byte[] buf, int offs, int len) throws java.io.IOException
      Overrides:
      write in class java.io.OutputStream
      Throws:
      java.io.IOException