6. SNAPPY#
- group SNAPPY_API
A light-weight compression algorithm. It is designed for speed of compression and decompression, rather than for the utmost in space savings.
For better compression ratios on data with long repeated sequences, or on data that is similar to other data, while still compressing fast, consider running BMDiff first and then compressing its output with Snappy.
Generic compression/decompression routines.
-
size_t Compress(Source *source, Sink *sink)#
Compress the bytes read from “*source” and append to “*sink”. Return the number of bytes written.
Parameters
Direction
Description
source
in,out
A Source is an interface that yields a sequence of bytes. You can initialize one by calling snappy::ByteArraySource(inBuf, inBufLen), where inBuf is the pointer to the original data and inBufLen is the size of inBuf.
sink
in,out
A Sink is an interface that consumes a sequence of bytes. You can initialize one by calling snappy::UncheckedByteArraySink(dest), where dest is the pointer to the destination buffer.
- Returns:
Result
Description
Success
Returns the number of bytes written.
Failure
Returns 0 on failure or if NULL parameters are passed.
-
bool GetUncompressedLength(Source *source, uint32_t *result)#
Find the uncompressed length of the given stream, as given by the header. Note that the true length could deviate from this; the stream could e.g. be truncated.
Parameters
Direction
Description
source
in,out
A Source is an interface that yields a sequence of bytes. You can initialize one by calling snappy::ByteArraySource(inBuf, inBufLen), where inBuf is the pointer to the input data and inBufLen is the size of inBuf.
result
out
Uncompressed length of the given stream is stored here.
Note
Also note that this leaves “*source” in a state that is unsuitable for further operations, such as RawUncompress(). You will need to rewind or recreate the source yourself before attempting any further calls.
- Returns:
Result
Description
Success
Returns true if the length header in the source is parsed successfully.
Failure
Returns false if the data inside the source is corrupted and the header cannot be parsed.
-
bool Uncompress(Source *compressed, Sink *uncompressed)#
Decompresses “compressed” to “*uncompressed”.
Parameters
Direction
Description
compressed
in,out
A Source is an interface that yields a sequence of bytes. You can initialize one by calling snappy::ByteArraySource(inBuf, inBufLen), where inBuf is the pointer to the input data and inBufLen is the size of inBuf.
uncompressed
in,out
A Sink is an interface that consumes a sequence of bytes. You can initialize one by calling snappy::UncheckedByteArraySink(dest), where dest is the pointer to the destination buffer.
- Returns:
Result
Description
Success
Returns true if successful.
Failure
Returns false if the decompression fails.
-
size_t UncompressAsMuchAsPossible(Source *compressed, Sink *uncompressed)#
This routine decompresses as much of the “compressed” as possible into sink. It returns the number of valid bytes added to sink (extra invalid bytes may have been added due to errors; the caller should ignore those). The emitted data typically has length GetUncompressedLength(), but may be shorter if an error is encountered.
Parameters
Direction
Description
compressed
in,out
A Source is an interface that yields a sequence of bytes. You can initialize one by calling snappy::ByteArraySource(inBuf, inBufLen), where inBuf is the pointer to the input data and inBufLen is the size of inBuf.
uncompressed
in,out
A Sink is an interface that consumes a sequence of bytes. You can initialize one by calling snappy::UncheckedByteArraySink(dest), where dest is the pointer to the destination buffer.
- Returns:
Result
Description
Success
It returns the number of valid bytes added to sink (extra invalid bytes may have been added due to errors; the caller should ignore those)
Failure
Returns 0 if the message is corrupted and could not be decompressed, or if NULL parameters are passed.
Higher-level string based routines.
These should be sufficient for most users.
-
size_t Compress(const char *input, size_t input_length, std::string *compressed)#
Sets “*compressed” to the compressed version of “input[0,input_length-1]”. Original contents of *compressed are lost.
Parameters
Direction
Description
input
in
This is the buffer where the data we want to compress is accessible.
input_length
in
Length of the input buffer.
compressed
in,out
This is a buffer in which compressed data is stored.
- Attention
REQUIRES: “input[]” is not an alias of “*compressed”.
- Returns:
Result
Description
Success
Returns the number of bytes written.
Failure
Returns 0 on failure or if NULL parameters are passed.
-
bool Uncompress(const char *compressed, size_t compressed_length, std::string *uncompressed)#
Decompresses “compressed[0,compressed_length-1]” to “*uncompressed”. Original contents of “*uncompressed” are lost.
Parameters
Direction
Description
compressed
in
This is a buffer which contains compressed data.
compressed_length
in
This is the length of the compressed buffer.
uncompressed
out
Uncompressed data is stored in this buffer.
- Attention
REQUIRES: “compressed[]” is not an alias of “*uncompressed”.
- Returns:
Result
Description
Success
Returns true if the data in “compressed” is successfully decompressed.
Failure
Returns false if the decompression fails.
Lower-level character array based routines.
These may be useful for efficiency reasons in certain circumstances.
-
void RawCompress(const char *input, size_t input_length, char *compressed, size_t *compressed_length)#
Takes the data stored in “input[0..input_length-1]”, compresses it, and stores the result in the array pointed to by “compressed”.
“*compressed_length” is set to the length of the compressed output.
Parameters
Direction
Description
input
in
This is the buffer where the data we want to compress is accessible.
input_length
in
Length of the input buffer.
compressed
out
This is a buffer in which compressed data is stored.
compressed_length
out
The length of the data after compression is stored in this.
- Attention
REQUIRES: “compressed” must point to an area of memory that is at least “MaxCompressedLength(input_length)” bytes in length.
Note
- Example:
char* output = new char[snappy::MaxCompressedLength(input_length)];
size_t output_length;
RawCompress(input, input_length, output, &output_length);
... Process(output, output_length) ...
delete [] output;
- Returns:
void
-
bool RawUncompress(const char *compressed, size_t compressed_length, char *uncompressed)#
Given data in “compressed[0..compressed_length-1]” generated by calling the Snappy::Compress routine, this routine stores the uncompressed data to uncompressed[0..GetUncompressedLength(compressed)-1] .
Parameters
Direction
Description
compressed
in
This is a buffer which contains compressed data.
compressed_length
in
This is the length of the compressed buffer.
uncompressed
out
Uncompressed data is stored in this buffer.
- Returns:
Result
Description
Success
Returns true if successful.
Failure
Returns false if the message is corrupted and could not be decompressed.
-
bool RawUncompress(Source *compressed, char *uncompressed)#
Given data from the byte source “compressed” generated by calling the Snappy::Compress routine, this routine stores the uncompressed data to uncompressed[0..GetUncompressedLength(compressed)-1].
Parameters
Direction
Description
compressed
in,out
A Source is an interface that yields a sequence of bytes. You can initialize one by calling snappy::ByteArraySource(inBuf, inBufLen), where inBuf is the pointer to the input data and inBufLen is the size of inBuf.
uncompressed
out
Uncompressed data is stored in this buffer.
- Returns:
Result
Description
Success
Returns true if successful.
Failure
Returns false if the message is corrupted and could not be decompressed.
-
bool RawUncompressToIOVec(const char *compressed, size_t compressed_length, const struct iovec *iov, size_t iov_cnt)#
Given data in “compressed[0..compressed_length-1]” generated by calling the Snappy::Compress routine, this routine stores the uncompressed data to the iovec “iov”. The number of physical buffers in “iov” is given by iov_cnt and their cumulative size must be at least GetUncompressedLength(compressed). The individual buffers in “iov” must not overlap with each other.
Parameters
Direction
Description
compressed
in
This is a buffer which contains compressed data.
compressed_length
in
This is the length of the compressed buffer.
iov
in,out
The struct iovec defines one vector element. Normally, this structure is used as an array of multiple elements.
iov_cnt
in
This is the number of iovec structures in the iov array.
- Returns:
Result
Description
Success
Returns true if successful.
Failure
Returns false if the message is corrupted and could not be decompressed.
-
bool RawUncompressToIOVec(Source *compressed, const struct iovec *iov, size_t iov_cnt)#
Given data from the byte source ‘compressed’ generated by calling the Snappy::Compress routine, this routine stores the uncompressed data to the iovec “iov”. The number of physical buffers in “iov” is given by iov_cnt and their cumulative size must be at least GetUncompressedLength(compressed). The individual buffers in “iov” must not overlap with each other.
Parameters
Direction
Description
compressed
in,out
A Source is an interface that yields a sequence of bytes. You can initialize one by calling snappy::ByteArraySource(inBuf, inBufLen), where inBuf is the pointer to the input data and inBufLen is the size of inBuf.
iov
in,out
The struct iovec defines one vector element. Normally, this structure is used as an array of multiple elements.
iov_cnt
in
This is the number of iovec structures in the iov array.
- Returns:
Result
Description
Success
Returns true if successful.
Failure
Returns false if the message is corrupted and could not be decompressed.
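Both RawUncompressToIOVec() overloads require the cumulative size of the buffers in “iov” to be at least the uncompressed length. A minimal sketch of a caller-side check for that requirement follows. The helper name IovecHasCapacity is hypothetical and not part of the Snappy API, and the sketch does not check the separate requirement that the buffers must not overlap.

```cpp
#include <cassert>
#include <cstddef>
#include <sys/uio.h>  // struct iovec (POSIX)

// Hypothetical pre-flight check, not a Snappy function: returns true when
// the buffers described by iov can hold uncompressed_length bytes in total.
inline bool IovecHasCapacity(const struct iovec* iov, std::size_t iov_cnt,
                             std::size_t uncompressed_length) {
    assert(iov != nullptr || iov_cnt == 0);
    std::size_t total = 0;
    for (std::size_t i = 0; i < iov_cnt; ++i) {
        total += iov[i].iov_len;
        // Stop as soon as the requirement is met; this also keeps the
        // running sum from growing further than needed.
        if (total >= uncompressed_length) return true;
    }
    return total >= uncompressed_length;
}
```

A caller would typically obtain uncompressed_length from GetUncompressedLength() and run this check before calling RawUncompressToIOVec().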
Helper Functions.
-
size_t MaxCompressedLength(size_t source_bytes)#
This function determines the maximal size of the compressed representation of input data that is “source_bytes” bytes in length.
Parameters
Direction
Description
source_bytes
in
The size of source in bytes.
- Returns:
Result
Description
Success
Returns the maximal size of the compressed representation of input data that is “source_bytes” bytes in length.
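To give a feel for how this bound grows, the sketch below mirrors the formula used by upstream Snappy, 32 + n + n/6. The names MaxCompressedLengthSketch and MakeOutputBuffer are hypothetical, and the assumption that the AOCL build uses the same formula is not confirmed by this page, so call snappy::MaxCompressedLength() itself when sizing real buffers.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-alone mirror of the bound used by upstream Snappy.
// This is an assumption about the implementation, not the library call.
constexpr std::size_t MaxCompressedLengthSketch(std::size_t source_bytes) {
    return 32 + source_bytes + source_bytes / 6;
}

// Worst-case sizes for an empty input and for a 64 KiB input.
static_assert(MaxCompressedLengthSketch(0) == 32, "fixed overhead");
static_assert(MaxCompressedLengthSketch(65536) == 76490, "64 KiB input");

// Example use: allocate a destination buffer of the worst-case size,
// as RawCompress() requires.
inline std::vector<char> MakeOutputBuffer(std::size_t input_length) {
    std::vector<char> out(MaxCompressedLengthSketch(input_length));
    assert(out.size() >= input_length);  // the bound never shrinks the input
    return out;
}
```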
-
bool GetUncompressedLength(const char *compressed, size_t compressed_length, size_t *result)#
Gets the uncompressed length of the data in the compressed buffer, as recorded in its header.
This operation takes O(1) time.
Parameters
Direction
Description
compressed
in
This is a buffer which contains compressed data.
compressed_length
in
This is the length of the compressed buffer.
result
out
This is the pointer to type size_t where the uncompressed length is stored.
- Attention
REQUIRES: “compressed[]” was produced by RawCompress() or Compress().
- Returns:
Result
Description
Success
Returns true on successful parsing.
Failure
Returns false on parsing error.
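The constant-time behaviour follows from the Snappy stream format, which stores the uncompressed length at the start of the stream as a little-endian base-128 varint of at most 5 bytes. The sketch below decodes such a header on its own. ParseLengthHeaderSketch is a hypothetical name used for illustration, not the library's implementation, and it does not handle the RAP frame used by the multithreaded mode.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical helper: decodes the varint length header found at the start
// of a Snappy stream. It reads at most 5 bytes regardless of buffer size.
// Returns false when the buffer ends early or the value exceeds 32 bits.
inline bool ParseLengthHeaderSketch(const char* compressed,
                                    std::size_t compressed_length,
                                    std::size_t* result) {
    assert(result != nullptr);
    std::uint32_t value = 0;
    for (std::size_t i = 0; i < 5 && i < compressed_length; ++i) {
        const std::uint32_t byte = static_cast<unsigned char>(compressed[i]);
        const std::uint32_t payload = byte & 0x7Fu;
        // The fifth byte may carry only the top 4 bits of a 32-bit value.
        if (i == 4 && payload > 0x0Fu) return false;
        value |= payload << (7 * i);
        if ((byte & 0x80u) == 0) {  // high bit clear: last byte of the varint
            *result = value;
            return true;
        }
    }
    return false;  // truncated header or too many continuation bytes
}
```

For example, a header consisting of the single byte 0x40 decodes to 64, and the bytes 0xFE 0xFF 0x7F decode to 2097150.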
-
bool GetUncompressedLengthFromMTCompressedBuffer(const char *compressed, size_t compressed_length, size_t *result)#
Gets the uncompressed length of the data in a buffer produced by the AOCL multithreaded compressor.
This operation takes O(1) time.
Parameters
Direction
Description
compressed
in
This is a buffer which contains compressed data (along with the RAP frame).
compressed_length
in
This is the length of the compressed buffer (including the RAP frame).
result
out
This is the pointer to type size_t where the uncompressed length is stored.
- Attention
REQUIRES: “compressed[]” was produced by RawCompress() or Compress() in AOCL’s multithreaded mode.
- Returns:
Result
Description
Success
Returns true on successful parsing.
Failure
Returns false on parsing error.
-
bool IsValidCompressedBuffer(const char *compressed, size_t compressed_length)#
Returns true iff the contents of “compressed[]” can be uncompressed successfully. Does not return the uncompressed data. Takes time proportional to compressed_length, but is usually at least a factor of four faster than actual decompression.
Parameters
Direction
Description
compressed
in
This is a buffer which contains compressed data.
compressed_length
in
This is the length of the compressed buffer.
- Returns:
Result
Description
Success
Returns true iff the contents of “compressed[]” can be uncompressed successfully.
Failure
Returns false otherwise.
-
bool IsValidCompressed(Source *compressed)#
Returns true iff the contents of “compressed” can be uncompressed successfully. Does not return the uncompressed data. Takes time proportional to *compressed length, but is usually at least a factor of four faster than actual decompression. On success, consumes all of *compressed. On failure, consumes an unspecified prefix of *compressed.
Parameters
Direction
Description
compressed
in,out
A Source is an interface that yields a sequence of bytes. You can initialize one by calling snappy::ByteArraySource(inBuf, inBufLen), where inBuf is the pointer to the input data and inBufLen is the size of inBuf.
- Returns:
Result
Description
Success
Returns true iff the contents of “compressed” can be uncompressed successfully.
Failure
Returns false otherwise.
Variables
-
static constexpr int kBlockLog = 16#
-
static constexpr size_t kBlockSize = 1 << kBlockLog#
-
static constexpr int kMaxHashTableBits = 14#
-
static constexpr size_t kMaxHashTableSize = 1 << kMaxHashTableBits#
-
static constexpr int kMinHashTableBits = 8#
-
static constexpr size_t kMinHashTableSize = 1 << kMinHashTableBits#
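The shift expressions above evaluate to a 64 KiB block size and hash tables of 256 to 16384 entries. The sketch below restates the constants locally so it compiles on its own and checks those values. BlockCount is a hypothetical helper, and it assumes the input is processed in kBlockSize-sized blocks, which this page does not state explicitly.

```cpp
#include <cassert>
#include <cstddef>

// Local copies of the documented constants so the sketch is self-contained.
constexpr int kBlockLog = 16;
constexpr std::size_t kBlockSize = std::size_t{1} << kBlockLog;
constexpr int kMaxHashTableBits = 14;
constexpr std::size_t kMaxHashTableSize = std::size_t{1} << kMaxHashTableBits;
constexpr int kMinHashTableBits = 8;
constexpr std::size_t kMinHashTableSize = std::size_t{1} << kMinHashTableBits;

static_assert(kBlockSize == 65536, "64 KiB per block");
static_assert(kMaxHashTableSize == 16384, "largest hash table");
static_assert(kMinHashTableSize == 256, "smallest hash table");

// Hypothetical helper: number of kBlockSize-sized blocks needed to cover
// an input of input_length bytes (assumes block-by-block processing).
inline std::size_t BlockCount(std::size_t input_length) {
    const std::size_t blocks =
        input_length / kBlockSize + (input_length % kBlockSize != 0 ? 1 : 0);
    assert(input_length == 0 || blocks > 0);
    return blocks;
}
```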