6. SNAPPY

group SNAPPY_API

A lightweight compression algorithm. It is designed for speed of compression and decompression, rather than for the utmost in space savings.

To get better compression ratios on data with long repeated sequences, or on data that is similar to other data, while still compressing fast, consider first running BMDiff and then compressing its output with Snappy.

Generic compression/decompression routines.

size_t Compress(Source *source, Sink *sink)

Compress the bytes read from “*source” and append to “*sink”. Return the number of bytes written.

Parameters

Direction

Description

source

in,out

A Source is an interface that yields a sequence of bytes. You can initialize one by calling snappy::ByteArraySource(inBuf, inBufLen), where inBuf is the pointer to the original data and inBufLen is the size of inBuf in bytes.

sink

in,out

A Sink is an interface that consumes a sequence of bytes. You can initialize one by calling snappy::UncheckedByteArraySink(dest), where dest is the pointer to the destination buffer.

Returns:

Result

Description

Success

Return the number of bytes written.

Failure

Returns 0 on failure or if NULL parameters are passed.
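The Source and Sink roles described above can be pictured with a small stand-in. This is only a sketch: the classes `ArraySource` and `ArraySink` and the helper `CopyAll` are hypothetical names in a `sketch` namespace, not the library's own types. The method names (`Available`, `Peek`, `Skip`, `Append`) are modeled on upstream Snappy's interfaces, and it is an assumption that this build exposes the same ones.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

namespace sketch {  // hypothetical stand-ins, not the library's classes

// Yields the bytes of a fixed array, front to back.
class ArraySource {
 public:
  ArraySource(const char* data, std::size_t size) : ptr_(data), left_(size) {}
  std::size_t Available() const { return left_; }  // bytes not yet consumed
  const char* Peek(std::size_t* len) const { *len = left_; return ptr_; }
  void Skip(std::size_t n) { assert(n <= left_); ptr_ += n; left_ -= n; }

 private:
  const char* ptr_;
  std::size_t left_;
};

// Consumes bytes by appending them to a caller-provided buffer.
// Like an "unchecked" sink, it performs no bounds checks.
class ArraySink {
 public:
  explicit ArraySink(char* dest) : dest_(dest) {}
  void Append(const char* bytes, std::size_t n) {
    for (std::size_t i = 0; i < n; ++i) *dest_++ = bytes[i];
  }

 private:
  char* dest_;
};

// Drains the source into the sink and returns the number of bytes moved.
inline std::size_t CopyAll(ArraySource* source, ArraySink* sink) {
  std::size_t total = 0;
  while (source->Available() > 0) {
    std::size_t len = 0;
    const char* chunk = source->Peek(&len);
    sink->Append(chunk, len);
    source->Skip(len);  // the source is consumed, hence the "in,out" direction
    total += len;
  }
  return total;
}

}  // namespace sketch
```

Because reading advances the source, a source that has been passed to one routine generally cannot be reused for another without being recreated.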

bool GetUncompressedLength(Source *source, uint32_t *result)

Find the uncompressed length of the given stream, as given by the header. Note that the true length could deviate from this; the stream could e.g. be truncated.

Parameters

Direction

Description

source

in,out

A Source is an interface that yields a sequence of bytes. You can initialize one by calling snappy::ByteArraySource(inBuf, inBufLen), where inBuf is the pointer to the compressed data and inBufLen is the size of inBuf in bytes.

result

out

Uncompressed length of the given stream is stored here.

Note

This call leaves “*source” in a state that is unsuitable for further operations, such as RawUncompress(). You will need to rewind or recreate the source yourself before attempting any further calls.

Returns:

Result

Description

Success

Returns true if the data in the source is not corrupted.

Failure

Returns false if the data in the source is corrupted.
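To show what "as given by the header" can mean in practice, here is a minimal sketch of decoding a length header. It assumes the stream layout in upstream Snappy's format description, where the stream starts with the uncompressed length stored as a little-endian varint. The function name `ReadLengthHeader` is hypothetical, and whether this build parses the header the same way is an assumption.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Decodes a little-endian base-128 varint from the first bytes of `data`.
// Returns false if the buffer ends before the varint does, or if the
// value does not fit in 32 bits.
inline bool ReadLengthHeader(const unsigned char* data, std::size_t size,
                             std::uint32_t* result) {
  assert(result != nullptr);
  std::uint32_t value = 0;
  for (std::size_t i = 0; i < size && i < 5; ++i) {
    const std::uint32_t byte = data[i];
    const std::uint32_t payload = byte & 0x7Fu;
    // The fifth byte may only contribute the top 4 bits of a 32-bit value.
    if (i == 4 && payload > 0x0Fu) return false;
    value |= payload << (7 * i);
    if ((byte & 0x80u) == 0) {  // high bit clear: last byte of the varint
      *result = value;
      return true;
    }
  }
  return false;  // ran out of bytes, or the varint is longer than five bytes
}
```

The header only states the intended length, so a successful parse says nothing about whether the rest of the stream is complete.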

bool Uncompress(Source *compressed, Sink *uncompressed)

Decompresses “*compressed” to “*uncompressed”.

Parameters

Direction

Description

compressed

in,out

A Source is an interface that yields a sequence of bytes. You can initialize one by calling snappy::ByteArraySource(inBuf, inBufLen), where inBuf is the pointer to the compressed data and inBufLen is the size of inBuf in bytes.

uncompressed

in,out

A Sink is an interface that consumes a sequence of bytes. You can initialize one by calling snappy::UncheckedByteArraySink(dest), where dest is the pointer to the destination buffer.

Returns:

Result

Description

Success

Returns true if successful.

Failure

Returns false if the decompression fails.

size_t UncompressAsMuchAsPossible(Source *compressed, Sink *uncompressed)

This routine decompresses as much of “compressed” as possible into the sink. It returns the number of valid bytes added to the sink (extra invalid bytes may have been added due to errors; the caller should ignore those). The emitted data typically has length GetUncompressedLength(), but may be shorter if an error is encountered.

Parameters

Direction

Description

compressed

in,out

A Source is an interface that yields a sequence of bytes. You can initialize one by calling snappy::ByteArraySource(inBuf, inBufLen), where inBuf is the pointer to the compressed data and inBufLen is the size of inBuf in bytes.

uncompressed

in,out

A Sink is an interface that consumes a sequence of bytes. You can initialize one by calling snappy::UncheckedByteArraySink(dest), where dest is the pointer to the destination buffer.

Returns:

Result

Description

Success

Returns the number of valid bytes added to the sink (extra invalid bytes may have been added due to errors; the caller should ignore those).

Failure

Returns 0 if the message is corrupted and could not be decompressed or NULL parameters are passed.
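Since the sink may hold extra invalid bytes after an error, a caller typically keeps only the valid prefix. The following sketch shows one way to do that when the destination is a std::string. The helper `KeepValidPrefix` is a hypothetical name, not part of the library.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Trims `buffer` to its first `valid_bytes` bytes, discarding anything the
// decompressor may have written past the valid data. Returns true when the
// valid data has the full expected length, i.e. nothing was lost.
inline bool KeepValidPrefix(std::string* buffer, std::size_t valid_bytes,
                            std::size_t expected_length) {
  assert(buffer != nullptr && valid_bytes <= buffer->size());
  buffer->resize(valid_bytes);  // drop any invalid trailing bytes
  return valid_bytes == expected_length;
}
```

Here `valid_bytes` stands for the value returned by UncompressAsMuchAsPossible() and `expected_length` for the value reported by GetUncompressedLength().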

Higher-level string based routines.

These should be sufficient for most users.

size_t Compress(const char *input, size_t input_length, std::string *compressed)

Sets “*compressed” to the compressed version of “input[0..input_length-1]”. Original contents of “*compressed” are lost.

Parameters

Direction

Description

input

in

This is the buffer that holds the data to compress.

input_length

in

Length of the input buffer.

compressed

out

This is the string in which the compressed data is stored.

Attention

REQUIRES: “input[]” is not an alias of “*compressed”.

Returns:

Result

Description

Success

Return the number of bytes written.

Failure

Returns 0 on failure or if NULL parameters are passed.
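The aliasing requirement above can be checked in debug builds before calling the string based routines. This is a minimal sketch under the assumption that "alias" means the input range overlaps the storage of the output string. The helper `OverlapsString` is a hypothetical name, not part of the library.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <string>

// Returns true if [input, input + input_length) overlaps the character
// storage currently owned by `target`.
inline bool OverlapsString(const char* input, std::size_t input_length,
                           const std::string& target) {
  assert(input != nullptr);
  if (input_length == 0 || target.empty()) return false;  // empty ranges never overlap
  // std::less gives a total order even for pointers into unrelated objects.
  const std::less<const char*> before;
  const char* target_begin = target.data();
  const char* target_end = target_begin + target.size();
  return before(input, target_end) && before(target_begin, input + input_length);
}
```

A caller could, for example, assert `!OverlapsString(input, input_length, *compressed)` before compressing.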

bool Uncompress(const char *compressed, size_t compressed_length, std::string *uncompressed)

Decompresses “compressed[0..compressed_length-1]” to “*uncompressed”. Original contents of “*uncompressed” are lost.

Parameters

Direction

Description

compressed

in

This is a buffer which contains compressed data.

compressed_length

in

This is the length of the compressed buffer.

uncompressed

out

Uncompressed data is stored in this buffer.

Attention

REQUIRES: “compressed[]” is not an alias of “*uncompressed”.

Returns:

Result

Description

Success

Returns true if the data in “compressed” is decompressed successfully.

Failure

Returns false if the decompression fails.

Lower-level character array based routines.

These may be useful for efficiency reasons in certain circumstances.

void RawCompress(const char *input, size_t input_length, char *compressed, size_t *compressed_length)

Compresses the data stored in “input[0..input_length-1]” and stores the result in the array pointed to by “compressed”.

“*compressed_length” is set to the length of the compressed output.

Parameters

Direction

Description

input

in

This is the buffer that holds the data to compress.

input_length

in

Length of the input buffer.

compressed

out

This is a buffer in which compressed data is stored.

compressed_length

out

The length of the compressed data is stored here.

Attention

REQUIRES: “compressed” must point to an area of memory that is at least “MaxCompressedLength(input_length)” bytes in length.

Note

- Example:

char* output = new char[snappy::MaxCompressedLength(input_length)];
size_t output_length;
RawCompress(input, input_length, output, &output_length);
... Process(output, output_length) ...
delete [] output;

Returns:

void

bool RawUncompress(const char *compressed, size_t compressed_length, char *uncompressed)

Given data in “compressed[0..compressed_length-1]” generated by calling the snappy::Compress routine, this routine stores the uncompressed data to uncompressed[0..GetUncompressedLength(compressed)-1].

Parameters

Direction

Description

compressed

in

This is a buffer which contains compressed data.

compressed_length

in

This is the length of the compressed buffer.

uncompressed

out

Uncompressed data is stored in this buffer.

Returns:

Result

Description

Success

Returns true if successful.

Failure

Returns false if the message is corrupted and could not be decompressed.

bool RawUncompress(Source *compressed, char *uncompressed)

Given data from the byte source “compressed” generated by calling the snappy::Compress routine, this routine stores the uncompressed data to uncompressed[0..N-1], where N is the length reported by GetUncompressedLength().

Parameters

Direction

Description

compressed

in,out

A Source is an interface that yields a sequence of bytes. You can initialize one by calling snappy::ByteArraySource(inBuf, inBufLen), where inBuf is the pointer to the compressed data and inBufLen is the size of inBuf in bytes.

uncompressed

out

Uncompressed data is stored in this buffer.

Returns:

Result

Description

Success

Returns true if successful.

Failure

Returns false if the message is corrupted and could not be decompressed.

bool RawUncompressToIOVec(const char *compressed, size_t compressed_length, const struct iovec *iov, size_t iov_cnt)

Given data in “compressed[0..compressed_length-1]” generated by calling the snappy::Compress routine, this routine stores the uncompressed data to the iovec “iov”. The number of physical buffers in “iov” is given by iov_cnt and their cumulative size must be at least GetUncompressedLength(compressed). The individual buffers in “iov” must not overlap with each other.

Parameters

Direction

Description

compressed

in

This is a buffer which contains compressed data.

compressed_length

in

This is the length of the compressed buffer.

iov

in,out

The struct iovec defines one vector element. Normally, this structure is used as an array of multiple elements.

iov_cnt

in

This is the number of iovec structures in the iov array.

Returns:

Result

Description

Success

Returns true if successful.

Failure

Returns false if the message is corrupted and could not be decompressed.
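The cumulative size requirement on “iov” can be verified before the call. This is a minimal sketch using the POSIX `struct iovec` from `<sys/uio.h>`. The helpers `TotalIovCapacity` and `IovCanHold` are hypothetical names, not part of the library.

```cpp
#include <sys/uio.h>  // struct iovec (POSIX)

#include <cassert>
#include <cstddef>

// Sums the capacity of all buffers described by iov[0..iov_cnt-1].
inline std::size_t TotalIovCapacity(const struct iovec* iov, std::size_t iov_cnt) {
  assert(iov != nullptr || iov_cnt == 0);
  std::size_t total = 0;
  for (std::size_t i = 0; i < iov_cnt; ++i) total += iov[i].iov_len;
  return total;
}

// Returns true if the buffers together can hold `uncompressed_length` bytes,
// where `uncompressed_length` is the value reported by GetUncompressedLength().
inline bool IovCanHold(const struct iovec* iov, std::size_t iov_cnt,
                       std::size_t uncompressed_length) {
  return TotalIovCapacity(iov, iov_cnt) >= uncompressed_length;
}
```

This check covers only the size requirement; the caller is still responsible for ensuring the buffers do not overlap.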

bool RawUncompressToIOVec(Source *compressed, const struct iovec *iov, size_t iov_cnt)

Given data from the byte source “compressed” generated by calling the snappy::Compress routine, this routine stores the uncompressed data to the iovec “iov”. The number of physical buffers in “iov” is given by iov_cnt and their cumulative size must be at least GetUncompressedLength(compressed). The individual buffers in “iov” must not overlap with each other.

Parameters

Direction

Description

compressed

in,out

A Source is an interface that yields a sequence of bytes. You can initialize one by calling snappy::ByteArraySource(inBuf, inBufLen), where inBuf is the pointer to the compressed data and inBufLen is the size of inBuf in bytes.

iov

in,out

The struct iovec defines one vector element. Normally, this structure is used as an array of multiple elements.

iov_cnt

in

This is the number of iovec structures in the iov array.

Returns:

Result

Description

Success

Returns true if successful.

Failure

Returns false if the message is corrupted and could not be decompressed.

Helper Functions.

size_t MaxCompressedLength(size_t source_bytes)

This function determines the maximal size of the compressed representation of input data that is “source_bytes” bytes in length.

Parameters

Direction

Description

source_bytes

in

The size of source in bytes.

Returns:

Result

Description

Success

Returns the maximal size of the compressed representation of input data that is “source_bytes” bytes in length.
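For a sense of scale, the bound can be sketched with the formula used in upstream Snappy, 32 + n + n/6. It is an assumption that this build keeps the same formula, since the reference above does not state it. The names `MaxCompressedLengthSketch` and `RequiredCapacity` are hypothetical.

```cpp
#include <cassert>
#include <cstddef>

// Worst-case compressed size as computed by upstream Snappy: 32 + n + n/6.
// ASSUMPTION: the library documented here uses the same bound.
constexpr std::size_t MaxCompressedLengthSketch(std::size_t source_bytes) {
  return 32 + source_bytes + source_bytes / 6;
}

// Capacity a caller would reserve for the RawCompress() output buffer.
inline std::size_t RequiredCapacity(std::size_t input_length) {
  const std::size_t bound = MaxCompressedLengthSketch(input_length);
  assert(bound >= input_length);  // guards against size_t wrap-around
  return bound;
}
```

Under this formula an incompressible input can grow by roughly one sixth plus a small constant, so in real code the output buffer should always be sized with MaxCompressedLength() rather than with the input length.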

bool GetUncompressedLength(const char *compressed, size_t compressed_length, size_t *result)

Gets the uncompressed length of the data in “compressed[]” and stores it in “*result”.

This operation takes O(1) time.

Parameters

Direction

Description

compressed

in

This is a buffer which contains compressed data.

compressed_length

in

This is the length of the compressed buffer.

result

out

This is the pointer to type size_t where the uncompressed length is stored.

Attention

REQUIRES: “compressed[]” was produced by RawCompress() or Compress().

Returns:

Result

Description

Success

Returns true on successful parsing.

Failure

Returns false on parsing error.

bool GetUncompressedLengthFromMTCompressedBuffer(const char *compressed, size_t compressed_length, size_t *result)

Gets the uncompressed length from a buffer produced by the AOCL multithreaded compressor and stores it in “*result”.

This operation takes O(1) time.

Parameters

Direction

Description

compressed

in

This is a buffer which contains compressed data (along with the RAP frame).

compressed_length

in

This is the length of the compressed buffer (including the RAP frame).

result

out

This is the pointer to type size_t where the uncompressed length is stored.

Attention

REQUIRES: “compressed[]” was produced by RawCompress() or Compress() in AOCL’s multithreaded mode.

Returns:

Result

Description

Success

Returns true on successful parsing.

Failure

Returns false on parsing error.

bool IsValidCompressedBuffer(const char *compressed, size_t compressed_length)

Returns true iff the contents of “compressed[]” can be uncompressed successfully. Does not return the uncompressed data. Takes time proportional to compressed_length, but is usually at least a factor of four faster than actual decompression.

Parameters

Direction

Description

compressed

in

This is a buffer which contains compressed data.

compressed_length

in

This is the length of the compressed buffer.

Returns:

Result

Description

Success

Returns true iff the contents of “compressed[]” can be uncompressed successfully.

Failure

Returns false if the contents of “compressed[]” cannot be uncompressed.

bool IsValidCompressed(Source *compressed)

Returns true iff the contents of “compressed” can be uncompressed successfully. Does not return the uncompressed data. Takes time proportional to the length of *compressed, but is usually at least a factor of four faster than actual decompression. On success, consumes all of *compressed. On failure, consumes an unspecified prefix of *compressed.

Parameters

Direction

Description

compressed

in,out

A Source is an interface that yields a sequence of bytes. You can initialize one by calling snappy::ByteArraySource(inBuf, inBufLen), where inBuf is the pointer to the compressed data and inBufLen is the size of inBuf in bytes.

Returns:

Result

Description

Success

Returns true iff the contents of “compressed” can be uncompressed successfully.

Failure

Returns false if the contents of “compressed” cannot be uncompressed.

Variables

static constexpr int kBlockLog = 16
static constexpr size_t kBlockSize = 1 << kBlockLog
static constexpr int kMaxHashTableBits = 14
static constexpr size_t kMaxHashTableSize = 1 << kMaxHashTableBits
static constexpr int kMinHashTableBits = 8
static constexpr size_t kMinHashTableSize = 1 << kMinHashTableBits
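The shifts above work out to concrete sizes, as this sketch shows. The constants are redeclared in a hypothetical `sketch` namespace so the snippet stands alone. The helper `BlockCount` is also hypothetical and assumes, as in upstream Snappy, that the compressor processes its input in pieces of at most kBlockSize bytes.

```cpp
#include <cassert>
#include <cstddef>

namespace sketch {  // local copies for illustration only

constexpr int kBlockLog = 16;
constexpr std::size_t kBlockSize = std::size_t{1} << kBlockLog;
constexpr int kMaxHashTableBits = 14;
constexpr std::size_t kMaxHashTableSize = std::size_t{1} << kMaxHashTableBits;
constexpr int kMinHashTableBits = 8;
constexpr std::size_t kMinHashTableSize = std::size_t{1} << kMinHashTableBits;

static_assert(kBlockSize == 65536, "blocks are 64 KiB");
static_assert(kMaxHashTableSize == 16384, "at most 16384 hash table entries");
static_assert(kMinHashTableSize == 256, "at least 256 hash table entries");

// Number of kBlockSize pieces needed to cover `input_length` bytes.
// ASSUMPTION: input is processed block by block, as in upstream Snappy.
constexpr std::size_t BlockCount(std::size_t input_length) {
  return input_length / kBlockSize + (input_length % kBlockSize != 0 ? 1 : 0);
}

}  // namespace sketch
```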