Python API

class tamp.Compressor(f, *, window: int = 10, literal: int = 8, dictionary: bytearray | None = None, lazy_matching: bool = False, extended: bool = True, dictionary_reset: bool = False, append: bool = False)

Compresses data to a file or stream.

Parameters:

f (Union[str, Path, FileLike]) -- Path/FileHandle/Stream to write compressed data to.
window (int) -- Size of the window buffer in bits. Higher values typically yield higher compression ratios at a higher computational cost. A buffer of the same size is required at decompression time. Valid range: [8, 15]. Defaults to 10 (1024-byte buffer).
literal (int) -- Number of bits used in each byte of data. The default of 8 bits can store all data. Another common value is 7, for storing ASCII characters where the most significant bit is always 0. Smaller values result in higher compression ratios at no additional computational cost. Valid range: [5, 8].
dictionary (Optional[bytearray]) -- Use the given initialized buffer in place. At decompression time, the same initialized buffer must be provided. window must agree with the dictionary size. A pre-allocated buffer that should carry the default initialization must first be initialized with initialize_dictionary().
lazy_matching (bool) -- Use roughly 50% more CPU to get 0-2% better compression.
extended (bool) -- Use extended compression format. Defaults to True.
dictionary_reset (bool) -- Enable dictionary resets. Must be True to call reset_dictionary() or to use append=True. Defaults to False.
append (bool) -- Initialize for appending to an existing stream instead of writing a header. Writes a FLUSH token that, combined with the previous stream's trailing FLUSH, triggers a dictionary reset in the decompressor. Requires dictionary_reset=True.
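The arithmetic behind window and literal can be sketched in plain Python. The helpers window_bytes and min_literal_bits below are hypothetical names used only for illustration; they are not part of tamp and merely restate the ranges documented above.

```python
def window_bytes(window: int) -> int:
    """Size in bytes of the buffer implied by a window of `window` bits."""
    if not 8 <= window <= 15:
        raise ValueError("window must be in the range [8, 15]")
    return 1 << window


def min_literal_bits(data: bytes) -> int:
    """Smallest documented `literal` value able to represent every byte in `data`."""
    widest = max(data, default=0).bit_length()
    return max(5, widest)  # the documented valid range is [5, 8]


ascii_text = b"hello world"
sizes = [window_bytes(w) for w in (8, 10, 15)]
bits = min_literal_bits(ascii_text)  # pure ASCII never sets the top bit
```

Under these assumptions, ASCII input fits in literal=7, while arbitrary binary data generally needs the default of 8.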

write(data: bytes | bytearray) → int

Compress data to stream.

Parameters:: data (Union[bytes, bytearray]) -- Data to be compressed.
Returns:: Number of compressed bytes written. May be zero when data is filling up internal buffers.
Return type:: int

flush(write_token: bool = True) → int

Flushes all internal buffers.

This compresses any data remaining in the input buffer, and flushes any remaining data in the output buffer to disk.

Parameters:: write_token (bool) -- If appropriate, write a FLUSH token. Defaults to True.
Returns:: Number of compressed bytes flushed to disk.
Return type:: int

reset_dictionary() → int

Reset the dictionary and internal state, writing a double-FLUSH signal.

This allows the decompressor to re-initialize its dictionary at the corresponding point in the stream. Useful for long streams where the data characteristics change significantly.

Returns:: Number of compressed bytes written.
Return type:: int
Raises:: ValueError -- If the compressor was not initialized with dictionary_reset=True.

close() → int

Flushes all internal buffers and closes the output file or stream, if tamp opened it.

Returns:: Number of compressed bytes flushed to disk.
Return type:: int

__enter__() → Compressor

Use Compressor as a context manager.

with tamp.Compressor("output.tamp") as f:
    f.write(b"foo")

__exit__(exc_type, exc_value, traceback): Calls close() on context manager exit.

class tamp.TextCompressor(f, *, window: int = 10, literal: int = 8, dictionary: bytearray | None = None, lazy_matching: bool = False, extended: bool = True, dictionary_reset: bool = False, append: bool = False)

Compresses text to a file or stream.

Parameters:

f (Union[str, Path, FileLike]) -- Path/FileHandle/Stream to write compressed data to.
window (int) -- Size of the window buffer in bits. Higher values typically yield higher compression ratios at a higher computational cost. A buffer of the same size is required at decompression time. Valid range: [8, 15]. Defaults to 10 (1024-byte buffer).
literal (int) -- Number of bits used in each byte of data. The default of 8 bits can store all data. Another common value is 7, for storing ASCII characters where the most significant bit is always 0. Smaller values result in higher compression ratios at no additional computational cost. Valid range: [5, 8].
dictionary (Optional[bytearray]) -- Use the given initialized buffer in place. At decompression time, the same initialized buffer must be provided. window must agree with the dictionary size. A pre-allocated buffer that should carry the default initialization must first be initialized with initialize_dictionary().
lazy_matching (bool) -- Use roughly 50% more CPU to get 0-2% better compression.
extended (bool) -- Use extended compression format. Defaults to True.
dictionary_reset (bool) -- Enable dictionary resets. Must be True to call reset_dictionary() or to use append=True. Defaults to False.
append (bool) -- Initialize for appending to an existing stream instead of writing a header. Writes a FLUSH token that, combined with the previous stream's trailing FLUSH, triggers a dictionary reset in the decompressor. Requires dictionary_reset=True.

write(data: str) → int

Compress text to stream.

Parameters:: data (str) -- Text to be compressed.
Returns:: Number of compressed bytes written. May be zero when data is filling up internal buffers.
Return type:: int

__enter__() → Compressor

Use Compressor as a context manager.

with tamp.Compressor("output.tamp") as f:
    f.write(b"foo")

__exit__(exc_type, exc_value, traceback): Calls close() on context manager exit.

close() → int

Flushes all internal buffers and closes the output file or stream, if tamp opened it.

Returns:: Number of compressed bytes flushed to disk.
Return type:: int

flush(write_token: bool = True) → int

Flushes all internal buffers.

This compresses any data remaining in the input buffer, and flushes any remaining data in the output buffer to disk.

Parameters:: write_token (bool) -- If appropriate, write a FLUSH token. Defaults to True.
Returns:: Number of compressed bytes flushed to disk.
Return type:: int

reset_dictionary() → int

Reset the dictionary and internal state, writing a double-FLUSH signal.

This allows the decompressor to re-initialize its dictionary at the corresponding point in the stream. Useful for long streams where the data characteristics change significantly.

Returns:: Number of compressed bytes written.
Return type:: int
Raises:: ValueError -- If the compressor was not initialized with dictionary_reset=True.

tamp.compress(data: bytes | str, *, window: int = 10, literal: int = 8, dictionary: bytearray | None = None, lazy_matching: bool = False, extended: bool = True) → bytes

Compresses data in a single call.

Parameters:

data (Union[str, bytes]) -- Data to compress.
window (int) -- Size of the window buffer in bits. Higher values typically yield higher compression ratios at a higher computational cost. A buffer of the same size is required at decompression time. Valid range: [8, 15]. Defaults to 10 (1024-byte buffer).
literal (int) -- Number of bits used in each byte of data. The default of 8 bits can store all data. Another common value is 7, for storing ASCII characters where the most significant bit is always 0. Valid range: [5, 8].
dictionary (Optional[bytearray]) -- Use the given initialized buffer in place. At decompression time, the same initialized buffer must be provided. window must agree with the dictionary size. A pre-allocated buffer that should carry the default initialization must first be initialized with initialize_dictionary().
lazy_matching (bool) -- Use roughly 50% more CPU to get 0-2% better compression.
extended (bool) -- Use extended compression format. Defaults to True.

Returns:

Compressed data

Return type:

bytes

class tamp.Decompressor(f, *, dictionary: bytearray | None = None)

Decompresses a file or stream of tamp-compressed data.

Can be used as a context manager to automatically handle file opening and closing:

with tamp.Decompressor("compressed.tamp") as f:
    decompressed_data = f.read()

Parameters:

f (Union[file, str]) -- Path or file-like object to read compressed bytes from.
dictionary (Optional[bytearray]) -- Use the given initialized buffer in place. At compression time, the same initialized buffer must be provided. Must be at least 2**window bytes long for the stream's window (read from the header). The first 2**window bytes are used in place as the decompression window; bytes past that are never read or written. A pre-allocated buffer that should carry the default initialization must first be initialized with initialize_dictionary().

readinto(buf: bytearray) → int

Decompresses data into provided buffer.

Parameters:: buf (bytearray) -- Buffer to decode data into.
Returns:: Number of bytes decompressed into buffer.
Return type:: int

read(size: int = -1) → bytearray

Decompresses data to bytes.

Parameters:: size (int) -- Maximum number of bytes to return. If a negative value is provided, all data will be returned. Defaults to -1.
Returns:: Decompressed data.
Return type:: bytearray

close()[source]: Closes the input file or stream, if tamp opened it.

__enter__()

Use Decompressor as a context manager.

with tamp.Decompressor("output.tamp") as f:
    decompressed_data = f.read()

__exit__(exc_type, exc_value, traceback): Calls close() on context manager exit.

class tamp.TextDecompressor(f, *, dictionary: bytearray | None = None)

Decompresses a file or stream of tamp-compressed data into text.

Parameters:

f (Union[file, str]) -- Path or file-like object to read compressed bytes from.
dictionary (Optional[bytearray]) -- Use the given initialized buffer in place. At compression time, the same initialized buffer must be provided. Must be at least 2**window bytes long for the stream's window (read from the header). The first 2**window bytes are used in place as the decompression window; bytes past that are never read or written. A pre-allocated buffer that should carry the default initialization must first be initialized with initialize_dictionary().

read(size: int = -1) → str

Decompresses data to text.

Parameters:: size (int) -- Maximum number of bytes to return. If a negative value is provided, all data will be returned. Defaults to -1.
Returns:: Decompressed text.
Return type:: str

__enter__()

Use Decompressor as a context manager.

with tamp.Decompressor("output.tamp") as f:
    decompressed_data = f.read()

__exit__(exc_type, exc_value, traceback): Calls close() on context manager exit.

close()[source]: Closes the input file or stream, if tamp opened it.

readinto(buf: bytearray) → int

Decompresses data into provided buffer.

Parameters:: buf (bytearray) -- Buffer to decode data into.
Returns:: Number of bytes decompressed into buffer.
Return type:: int

tamp.decompress(data: bytes, *, dictionary: bytearray | None = None) → bytearray

Decompresses data in a single call.

Parameters:

data (bytes) -- Tamp-compressed data to decompress.
dictionary (Optional[bytearray]) -- Use the given initialized buffer in place. At compression time, the same initialized buffer must be provided. The decompression stream's window must agree with the dictionary size. A pre-allocated buffer that should carry the default initialization must first be initialized with initialize_dictionary().

Returns:

Decompressed data.

Return type:

bytearray

tamp.open(f, mode='rb', **kwargs)

Opens a file for compressing/decompressing.

Example usage:

with tamp.open("file.tamp", "w") as f:
    # Opens a compressor in text-mode
    f.write("example text")

with tamp.open("file.tamp", "r") as f:
    # Opens a decompressor in text-mode
    assert f.read() == "example text"

Parameters:

f (Union[str, Path]) -- PathLike object to open.
mode (str) --
Opening mode. Must be some combination of {"r", "w", "b"}.
- Read-text-mode ("r") will return a tamp.TextDecompressor. Read data will be str.
- Read-binary-mode ("rb") will return a tamp.Decompressor. Read data will be bytes.
- Write-text-mode ("w") will return a tamp.TextCompressor. str must be provided to write().
- Write-binary-mode ("wb") will return a tamp.Compressor. bytes must be provided to write().
kwargs -- Passed along to class constructor.

Returns:

File-like object for compressing/decompressing.

tamp.initialize_dictionary(source, seed=None, literal=8)

Initialize Dictionary.

The character table used for seeding depends on the literal bit width: for literal=7 or 8, common English text and markup characters are used; for literal=5 or 6, common English letters (" etaoinshrdlcumw") downshifted to the target bit width are used instead.

For v1 backwards compatibility, pass literal=8 (the default) when the extended header flag is not set.

Parameters:

source (Union[int, bytearray]) -- If a bytearray, it is populated with initial data. If an int, a bytearray of the indicated size is allocated and initialized.
literal (int) -- Number of literal bits (5-8). Selects the appropriate seed character table. Defaults to 8.

Returns:

Initialized window dictionary.

Return type:

bytearray

exception tamp.ExcessBitsError: Raised when provided data has more bits than the expected literal bits.