binary-tools

JavaScript binary tools for any environment.
Copyright (c) 2023-2025 Rafael da Silva Rocha.
https://rochars.com/binary-tools

  • Blazing fast!
  • Zero dependencies!
  • No polyfills!
  • Compatible with any environment with JavaScript support*
  • Available both as an ES6 module and as ES3/CommonJS
  • Native support for 64-bit, 128-bit, 256-bit, 512-bit and 1024-bit integers!
  • Native support for the bfloat16 brain floating point format
  • Native support for the 16-bit half-precision floating point format
  • Native support for 24-bit, 40-bit and 48-bit integers
  • Support user-defined integer types (like Uint12)
  • Support ASCII and ISO 8859-1 strings and characters
  • Support UTF-8, UTF-16 and UTF-32 strings
  • Tailored for real-life use cases

* Certain features (like BigInts) will only work in environments that support them. This lib is not a polyfill.

Tests

You can run the tests in your browser:
https://rochars.com/binary-tools/test/dist/browser

This library is tested using datasets with thousands of values generated with Python's struct module + other sources. Smaller types are fully tested with all possible values, signed and unsigned. Other types are tested with up to tens of thousands of test values per type plus special cases.

Unicode strings are tested using datasets with all the code points in the Unicode code space, along with other tests with strings and complex emoji combinations.

Install

Download it from https://rochars.com. It may be obtained under many different open source licenses. All functions are available in all versions.

You may also use a package manager:

npm install https://rochars.com/[email protected]

Or load it as a module from a CDN:

import * as btools from "https://rochars.com/[email protected]/index.js";

My CDN is powered by the kind folks at Cloudflare. Use it.

Using in Node.js

The examples below assume you already installed binary-tools as a dependency in your project.

const btools = require('binary-tools');

// Pack a signed 16-bit integer; returns an
// array with the number represented as bytes
let packed = btools.pack('h', -32765);
// packed == [3, 128]

let unpacked = btools.unpack('h', [255, 127]);
// unpacked = [32767]

// Pack a signed 16-bit integer to an existing byte buffer
// Start writing at index 4 of the buffer
// 'false' sets clamp=false, which is the default behaviour
// when 'clamp' is omitted.
let buffer = new Uint8Array(12);
btools.packTo('h', [1077], buffer, false, 4);

Or use ES6 imports:

import * as btools from 'binary-tools';

let packed = btools.pack('h', -32765);

Using in the Browser

Use the binary-tools.js file in the /dist folder:

<script src="./binary-tools/dist/binary-tools.js"></script>
<script>
  var packed = btools.pack('f', 2.1474836);
</script>

If you are targeting only modern environments you may use the ./index.js file in the root folder instead of ./dist/binary-tools.js. The ./index.js file has ES6-style exports.

Browser compatibility

The file ./dist/binary-tools.js is transpiled to ES3 and is compatible with IE6+. It should work in all modern browsers as well.

Numbers larger than 53-bit are only available on engines that support ECMAScript 2020.

Cross-browser tests powered by BrowserStack.

Using in Deno / ES6 module

Use the ./index.js file in the root folder:

import * as btools from "./binary-tools/index.js";

let packed = btools.pack('h', -32765);
// packed == [3, 128]

Or load it from a CDN:

import * as btools from "https://rochars.com/[email protected]/index.js";

Quick Reference

Difference of pack/unpack and write/read functions

The packing functions pack() and packTo() pack single values or structs, either returning an array with the values as bytes (in the case of pack()) or writing the packed values directly to a byte buffer (the case of packTo()).

Unpacking with unpack() or unpackTo() reads a single value or a struct from a buffer and either returns the values in an array (in the case of unpack()) or writes the values to an Array or TypedArray (in the case of unpackTo()).

Packing and unpacking functions can work with format strings made of many types (like 'hlhlhddd' or 'hlhlhtx4h').


The writing functions write zero or many values of a single type to a byte buffer (in the case of writeTo()) or return an array with the values represented as bytes (in the case of write()).

Reading with read() or readTo() reads zero or many values of a single type from a byte buffer and either returns the values in an array (in the case of read()) or writes the values directly to a TypedArray or Array (in the case of readTo()).

Reading and writing functions only work with a single type at a time (like 'h' or '>f'). Any other type present in the format string after the first one will be ignored ('>Hhh' will be handled as '>H', for example). Byte order operators work the same way they do in the packing/unpacking functions.

Reading and writing functions do not use repeat count numbers. Repeat count numbers are just for packing and unpacking functions. read() and readTo() use optional start and end params to specify a slice of the input buffer for reading. If no start and end are specified, then the reading will cover the entire buffer. If just the start param is specified, then it will read from that index until the end of the buffer.

write() and writeTo() always write the entire input array of values, and writeTo() has an optional index param that specifies the index in the output buffer at which to start writing.


In summary:

pack() and unpack() are for multiple types at the same time, write() and read() are for many values of a single type.

If you are working with long sequences of values of the same type, read/write functions are much faster, and probably what you should use.

If you are working with file headers or other types of pre-defined structures of many different types, or a small number of values, then pack/unpack functions are more practical, and probably what you should use.


Types, operators and format strings

Format strings specify the types being packed/unpacked and how to pack or unpack them. Format strings for packing or unpacking structures may look like this:

  • 'HhHdH'

where each character represents a type of data.

The operators

A format string may be preceded by operators. A format string with an operator may look like this:

  • '>HhHdH'

where the > is a byte order operator that indicates big endian. Byte order, size and alignment operators may only appear at the start of the format string; the % and & padding operators described further down go at the end.

The size and alignment operators:

The size and alignment operators are used to both align the data and set the proper size of some types.

The operators are:

  • '' - no operator; default sizes will be used and no alignment will be made
  • '~' - the same as no operator
  • '@' - non-standard types will be padded so as to align to the closest C type that fits their size. If a value is provided for the @ operator then data will be aligned according to that value.
  • '#' - Similar to @, but uses a different set of rules.

The @ and # operators may be used in combination with a value to better specify the architecture. The value must appear immediately before the operator in the format string and must be a multiple of 8.

// No value for @; will only enforce all data is aligned
// as standard C types
btools.pack('@Ibhb', [1, 2, 3, 4]);

// Set 32-bit; enforce all data is aligned as standard
// C types and align the data to fit a 32-bit architecture
btools.pack('32@Ibhb', [1, 2, 3, 4]);

// Set 64-bit; enforce all data is aligned as standard
// C types and align the data to fit a 64-bit architecture
btools.pack('64@Ibhb', [1, 2, 3, 4]);

If the @ or # operators are used with a value then data will be aligned according to the defined architecture. The size of some standard types may change depending on the value used for the operator. As of version 2.0.x, only types 'l' and 'L' have sizes that may vary according to the architecture:

// 32-bit architecture and @ operator; type L will use 4 bytes
btools.pack('32@Ld', [1, 2]);

// 64-bit architecture and @ operator; type L will use 8 bytes
// Note that we need to change the type to BigInt:
btools.pack('64@Ld', [1n, 2]);

// With the # operator types l and L always use 4 bytes:
btools.pack('32#Ld', [1, 2]);
btools.pack('64#ld', [1, 2]);

Structs will always be padded at the end as needed to fit the value of the operator.

// Defining a 32-bit architecture:
btools.pack('32@Ibhb', [1, 2, 3, 4]);
// will output [1, 0, 0, 0,   2, 0, 3, 0,   4, 0, 0, 0]
// In Python this would be the same as doing
// struct.pack("@Ibhb0i", 1, 2, 3, 4) on a 32-bit system.
// Note the 0i at the end of the Python format string to
// enforce padding until the full size is reached

// The same data, now with a 64-bit architecture:
btools.pack('64@Ibhb', [1, 2, 3, 4]);
// will output [1, 0, 0, 0, 2, 0, 3, 0,   4, 0, 0, 0, 0, 0, 0, 0]

The % operator can be used at the end of the format string to prevent padding at the end:

// Using % to prevent padding at the end:
btools.pack('64@bhb%', [1, 1, 1]);
// will output [1, 0, 1, 0, 1]

// With no %
btools.pack('64@bhb', [1, 1, 1]);
// will output [1, 0, 1, 0, 1, 0, 0, 0]

The & operator is used to force padding at the end, which is already the default behavior. Using & is the same as using nothing:

// Using & to make padding explicit:
btools.pack('64@bhb&', [1, 1, 1]);
// will output [1, 0, 1, 0, 1, 0, 0, 0]

The presence of the @ or # operators in a format string will force non-standard types to be aligned according to the closest standard C type that fits their size, but their size won't change.

Since alignment is based on the behavior of C compilers, alignment operators are normally used with types that are either compatible with C types or at least have a C type that is a close relative:

// 32-bit architecture:
btools.pack('32@Ibhb', [1, 2, 3, 4]);
// will output [1, 0, 0, 0,   2, 0, 3, 0,   4, 0, 0, 0]

// 64-bit architecture:
btools.pack('64@Ibhb', [1, 2, 3, 4]);
// will output [1, 0, 0, 0, 2, 0, 3, 0,   4, 0, 0, 0, 0, 0, 0, 0]

// Setting the width to 16-bit enforces even offsets between members:
btools.pack('16@Ibhb', [1, 2, 3, 4]);
// will output [1, 0, 0, 0,   2, 0,   3, 0,   4, 0]

// Without operator, types are packed with no padding at all and
// no sizes are changed:
btools.pack('Ibhb', [1, 2, 3, 4]);
btools.pack('~Ibhb', [1, 2, 3, 4]); // the ~ operator is the same as no operator
// will both output [1, 0, 0, 0,   2,   3, 0,   4]

The alignment operators may be used with the non-standard types too. They cause non-standard types to be aligned as the closest C type with a size that fits:

// 32-bit architecture, using a 3-byte type:
btools.pack('32@tbhb', [1, 2, 3, 4]);
// will output [1, 0, 0, 0,   2, 0, 3, 0,   4, 0, 0, 0]
// the 3-byte 't' type was padded to be aligned in a 4-byte boundary

// 64-bit architecture:
btools.pack('64@tbhb', [1, 2, 3, 4]);
// will output [1, 0, 0, 0, 2, 0, 3, 0,   4, 0, 0, 0, 0, 0, 0, 0]

// Setting the width to 8-bit just enforces system-specific sizes;
// even if no alignment is done, the 3-byte type is padded to fit 4 bytes
btools.pack('8@tbhb', [1, 2, 3, 4]);
btools.pack('@tbhb', [1, 2, 3, 4]); // no value for @, the same as 8@
// will both output [1, 0, 0, 0,   2,   3, 0,   4]

// Without operator, types are packed with no padding at all and
// no sizes are changed:
btools.pack('tbhb', [1, 2, 3, 4]);
btools.pack('~tbhb', [1, 2, 3, 4]); // the ~ operator is the same as no operator
// will both output [1, 0, 0,   2,   3, 0,   4]

Note that even if the alignment of non-standard types is adjusted, their sizes won't change. When using those non-standard types it is assumed that the applications consuming the data agree on the standards in use when reading and writing the non-standard types.

Writing and reading

The @ and # operators can be used with writing and reading functions, too; in that case they will enforce that non-standard types are aligned as standard types when writing, and will consider the standard type alignment when reading:

// Writing an array of values of a 3-byte type:
btools.write('@t', [1, 2, 3, 4]);
// will output [1, 0, 0, 0,   2, 0, 0, 0,   3, 0, 0, 0,   4, 0, 0, 0]
// the 3-byte 't' type was padded to be aligned in a 4-byte boundary

// Reading an array of values of a 3-byte type:
btools.read('@t', [1, 0, 0, 0,   2, 0, 0, 0,   3, 0, 0, 0,   4, 0, 0, 0]);
// will output [1, 2, 3, 4]
// the padding was considered when reading a 3-byte type

Alignment

With the @ operator data will be aligned according to these standards:

  • single-byte types will be 1-byte aligned
  • two-byte types will be 2-byte aligned
  • four-byte types will be 4-byte aligned
  • eight-byte types will be 4-byte aligned on 32-bit architectures and 8-byte aligned on 64-bit architectures
  • types that use more than 8 bytes will be aligned by the nearest multiple of 8.

Examples:

// Packing a struct with 3 values: the first uses
// 1 byte, the second uses 2 bytes and the third
// uses 8 bytes. Aligning as 32-bit:
btools.pack('32@bhd', [1, 2, 3]);
// will output [1, 0, 2, 0,   0, 0, 0, 0,   0, 0, 8, 64]
// The 8-byte type 'd' alignment is 4 bytes.

// The same data, now 64-bit:
btools.pack('64@bhd', [1, 2, 3]);
// will output [1, 0, 2, 0, 0, 0, 0, 0,   0, 0, 0, 0, 0, 0, 8, 64]
// The 8-byte type 'd' alignment is 8 bytes
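
The offsets in the examples above can be reproduced with a small layout calculation. This is a minimal sketch, not the library's implementation: sketchLayout is a hypothetical helper, it only covers member sizes of 1, 2, 4 and 8 bytes, and it assumes that a member aligns to its own size capped at the architecture width, with trailing padding up to that width.

```javascript
// Hypothetical helper, not part of binary-tools: computes member offsets
// and the padded total size of a struct from the member sizes (in bytes)
// and the architecture width (in bits).
function sketchLayout(sizes, archBits) {
  const archBytes = archBits / 8;
  const offsets = [];
  let offset = 0;
  for (const size of sizes) {
    // Assumption: a member aligns to its own size, capped at the
    // architecture width (so 8-byte types align to 4 bytes on 32-bit).
    const align = Math.min(size, archBytes);
    offset = Math.ceil(offset / align) * align;
    offsets.push(offset);
    offset += size;
  }
  // Assumption: the struct is padded at the end up to the architecture width.
  const total = Math.ceil(offset / archBytes) * archBytes;
  return { offsets, total };
}

// 'bhd' has member sizes 1, 2 and 8
console.log(sketchLayout([1, 2, 8], 32)); // offsets [0, 2, 4], total 12
console.log(sketchLayout([1, 2, 8], 64)); // offsets [0, 2, 8], total 16
```

The totals match the 12-byte and 16-byte outputs shown for '32@bhd' and '64@bhd'.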

With the # operator data will be aligned following the same rules used for the @ operator except for the following case:

  • 8-byte types will be 8-byte aligned in both 32-bit and 64-bit

Examples:

// 8-byte types are 8-byte aligned in both 32 and 64-bit:

btools.pack('32#Id', [1, 1.1]);
// will output [1, 0, 0, 0, 0, 0, 0, 0,  154, 153, 153, 153, 153, 153, 241, 63]

btools.pack('64#Id', [1, 1.1]);
// will output [1, 0, 0, 0, 0, 0, 0, 0,  154, 153, 153, 153, 153, 153, 241, 63]

Sizes

With the @ operator sizes will be adjusted according to these standards:

  • Types 'l' and 'L' will use 8 bytes for any architecture equal to or greater than 64-bit, 4 bytes otherwise. The default size for 'l' and 'L' is 4 bytes. Note that this causes the JavaScript type used for 'l' and 'L' to change to BigInt.

With the # operator sizes will be adjusted according to these standards:

  • Types 'l' and 'L' will use 4 bytes for any architecture.

The types 'i' and 'I' always use 4 bytes regardless of architecture.

As of version 2.0.x, only types 'l' and 'L' have sizes that may change according to the architecture.

The byte order operators:

Byte order operators indicate the byte order of the data, big endian or little endian.

  • '' - no operator; little endian, UTF-16 and UTF-32 endianness determined by BOM
  • '=' - little endian, UTF-16 and UTF-32 endianness determined by BOM
  • '<' - little endian
  • '>' - big endian
  • '!' - big endian, UTF-16 and UTF-32 endianness determined by BOM

Byte order operators may be used alone or with alignment operators. If using alignment operators then the alignment operator must always come first in the string:

// Using > to indicate big endian
btools.write('@>t', [1, 2, 3, 4]);
// will output [0, 0, 1, 0,   0, 0, 2, 0,   0, 0, 3, 0,   0, 0, 4, 0]

The standard type codes:

The types below can be easily mapped to C types, despite some implementation details:

  • 'x' - null pad byte, uses 1 byte
  • 'c' - single ASCII character, uses 1 byte
  • 'C' - single ISO-8859-1 character, uses 1 byte
  • 's' - ASCII string, uses 1 byte per character
  • 'S' - ISO-8859-1 string, uses 1 byte per character
  • 'b' - 8-bit signed integer, uses 1 byte
  • 'B' - 8-bit unsigned integer, uses 1 byte
  • '?' - boolean, uses 1 byte
  • 'h' - 16-bit signed integer, uses 2 bytes
  • 'H' - 16-bit unsigned integer, uses 2 bytes
  • 'i' - 32-bit signed integer, uses 4 bytes
  • 'I' - 32-bit unsigned integer, uses 4 bytes
  • 'l' - 32-bit signed integer, uses either 4 or 8 bytes
  • 'L' - 32-bit unsigned integer, uses either 4 or 8 bytes
  • 'f' - 32-bit floating point, uses 4 bytes
  • 'd' - 64-bit floating point, uses 8 bytes
  • 'q' - 64-bit signed integers, uses 8 bytes
  • 'Q' - 64-bit unsigned integers, uses 8 bytes

The non-standard type codes:

  • 'u' - UTF-8 string, may use 1 to 4 bytes per character
  • 'U' - UTF-16 string, may use 2 or 4 bytes per character
  • 'V' - UTF-32 string, uses 4 bytes per character
  • 'N' - 4-bit unsigned integers, uses 1 byte per pair
  • 'e' - 16-bit floating point, uses 2 bytes
  • 'g' - 16-bit brain floating point, uses 2 bytes
  • 't' - 24-bit signed integer, uses 3 bytes
  • 'T' - 24-bit unsigned integer, uses 3 bytes
  • 'j' - 40-bit signed integer, uses 5 bytes
  • 'J' - 40-bit unsigned integer, uses 5 bytes
  • 'k' - 48-bit signed integer, uses 6 bytes
  • 'K' - 48-bit unsigned integer, uses 6 bytes
  • 'A' - 128-bit unsigned integers, uses 16 bytes
  • 'Y' - 256-bit unsigned integers, uses 32 bytes
  • 'O' - 512-bit unsigned integers, uses 64 bytes
  • 'M' - 1024-bit unsigned integers, uses 128 bytes

A format character may be preceded by an integral repeat count. For example, the format string '4h' means exactly the same as 'hhhh'.
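
The equivalence between '4h' and 'hhhh' can be pictured as a plain string expansion. The helper below is a hypothetical illustration, not the library's parser: it assumes a format string with numeric type codes only, no value in front of an alignment operator, and none of the special cases listed under "Some considerations" (string types, zero counts, trailing counts).

```javascript
// Hypothetical helper for illustration only; not how binary-tools
// parses format strings internally.
function expandRepeatCounts(format) {
  // Replace every "<count><code>" pair with the code repeated <count> times.
  return format.replace(/(\d+)(\D)/g, (match, count, code) =>
    code.repeat(Number(count))
  );
}

console.log(expandRepeatCounts('4h'));     // 'hhhh'
console.log(expandRepeatCounts('>2Hd3b')); // '>HHdbbb'
```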

Repeat count numbers are just used for the packing and unpacking functions: pack(), unpack(), packTo() and unpackTo().

The reading functions read() and readTo() do not use repeat count numbers; they use optional start and end params to specify a slice of the input buffer for reading. The writing functions write() and writeTo() always write the entire input array of values.

Notice that while this module uses some names and conventions based on Python's struct module, it is not a JavaScript re-implementation of Python's struct module, nor does it intend to be. Some types (like 'c') work differently than they do in Python, for example. Python's struct module was, however, heavily used to generate the datasets used to test this lib, along with other sources.

Some considerations:

  • pad bytes 'x' are always packed as 0 and ignored when unpacking
  • boolean '?' always packs values as either 0 or 1, and unpacks them as either true or false
  • a repeat count of 0 in the format string is ignored ('0h' is the same as '1h', which is the same as 'h').
  • repeat counts placed at the end of the string will be ignored
  • the types 'c' and 'C' represent single characters. A repeat count number for types 'c' and 'C' represents the number of single characters in sequence - the format string '5c' on unpack [97,97,97,97,97] will return ['a','a','a','a','a'].
  • the types 's' and 'S' represent strings. A repeat count number for types 's' and 'S' represents the number of characters in the string - the format string '5s' on unpack [97,97,97,97,97] will return 'aaaaa'.
  • when packing or unpacking strings of single-byte characters (types 's' and 'S'), each code in the format string represents an independent string. The string size is always determined by the repeat count number, and sequential 's' or 'S' characters mean independent strings.
  • when packing, strings are padded with null bytes as appropriate to make them fit.
  • Integers greater than 53-bit (like types q and Q) only work in environments that support BigInt.

Creating custom types

You may define your own integer types with the setType(typeSymbol, bits, signed) function. typeSymbol must be a single character. bits must be an integer greater than 1, and signed is a boolean indicating whether the type is signed. The signed param is optional and defaults to false.

// Define a 12-bit signed integer
btools.setType('z', 12, true);
// Use your type just like you would use any other:
btools.pack('z', 257);

// Define an 11-bit unsigned integer; 'z' is already taken
// by the type above, so use another symbol
btools.setType('y', 11);
btools.pack('y', 560);

The typeSymbol must not be one already in use, nor a reserved symbol. Other than that, you are free to use whatever character you want. The reserved symbols as of version 2.0 are:

x c C s S u U V N b B ? h H i I e g t T l L f j J k K d q Q A Y O M

Integer types with bits=0 and bits=1 are reserved for pad bytes 'x' and booleans '?'.

Byte order

To set the byte order to big endian, use the '>' or the '!' operator:

let packed = pack('>HH', [1, 1]);
// will return [0, 1, 0, 1]

To set to little endian, use the '<' or the '=' operator:

let packed = pack('<HH', [1, 1]);
// will return [1, 0, 1, 0]

The byte order operators for little endian '=' and '<' and big endian '!' and '>' work exactly the same for all types except UTF-16 and UTF-32 strings.

For UTF-16 and UTF-32 strings, '<' and '>' enforce endianness regardless of a BOM (and cause errors to be thrown if the BOM is present and contradicts the endianness defined in the format string), while '!' and '=' assume big-endian and little-endian, respectively, but are overridden by the byte order mark (BOM) if a BOM is present in the string.

If using '<' or '>' with UTF-16 or UTF-32, a BOM will be necessarily included in the string when packing in case a BOM is not already present; operators '!' and '=' do not automatically include a BOM when packing UTF-16 or UTF-32.
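
For reference, the UTF-16 BOM is the code point U+FEFF, stored as FF FE in little endian and FE FF in big endian. The sketch below only shows how those two byte patterns can be recognized; detectUtf16Bom is a hypothetical helper and says nothing about how binary-tools handles the BOM internally.

```javascript
// Hypothetical helper: reports the endianness announced by a UTF-16 BOM,
// or null when the byte sequence does not start with one.
function detectUtf16Bom(bytes) {
  if (bytes.length >= 2) {
    if (bytes[0] === 0xff && bytes[1] === 0xfe) return 'little';
    if (bytes[0] === 0xfe && bytes[1] === 0xff) return 'big';
  }
  return null;
}

// 'a' (U+0061) as UTF-16 with a BOM, in both byte orders
console.log(detectUtf16Bom([0xff, 0xfe, 0x61, 0x00])); // 'little'
console.log(detectUtf16Bom([0xfe, 0xff, 0x00, 0x61])); // 'big'
console.log(detectUtf16Bom([0x61, 0x00]));             // null
```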

The byte order operator may be omitted and will default to '=' regardless of the endianness of the host machine:

let packed = pack('H', 1);
// will return [1, 0] the same as
// pack('=H', 1)
// or
// pack('<H', 1);

Byte order operators placed anywhere except the start of the format string will be ignored. If more than one byte order operator is present, the first one from left to right is the one that will be used.
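
To see what the two byte orders mean at the byte level, the standard DataView API can produce the same layouts as the examples above. This is an independent sketch that does not use binary-tools; uint16Bytes is a hypothetical helper name.

```javascript
// Hypothetical helper built on the standard DataView API.
function uint16Bytes(value, littleEndian) {
  const view = new DataView(new ArrayBuffer(2));
  view.setUint16(0, value, littleEndian);
  return [view.getUint8(0), view.getUint8(1)];
}

console.log(uint16Bytes(1, false));   // [0, 1]   big endian, like '>H'
console.log(uint16Bytes(1, true));    // [1, 0]   little endian, like '<H'
console.log(uint16Bytes(1000, true)); // [232, 3]
```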

Pad bytes

Pad bytes (type 'x'), when present in the format string, don't need (and should not have) a matching element in the array of values. Pad bytes are only used with packing and unpacking functions. They should not be used with writing and reading functions.


Bytes as hex strings

To work with bytes as hex strings, use hexToBytes() and bytesToHex() to format your inputs and outputs:

btools.hexToBytes('ffff00');
// return [255, 255, 0]
btools.bytesToHex([255, 255, 0]);
// return 'ffff00'

hexToBytes() always returns an Array. bytesToHex() accepts an Array or a typed array as input, and always returns a string.

Byte arrays with invalid values will cause a RangeError to be thrown:

btools.bytesToHex([-1, 256, 0]);
// throw a RangeError

Invalid hex strings will throw errors:

btools.hexToBytes('fffff'); // odd number of characters
// throw a 'Invalid hex string' Error 

btools.hexToBytes('ffffxf'); // invalid character
// throw a 'Invalid hex string' RangeError
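
The conversion itself is simple enough to sketch in plain JavaScript. The functions below are hypothetical stand-ins written for illustration; their names, error types and messages are assumptions and may differ from what hexToBytes() and bytesToHex() actually do.

```javascript
// Hypothetical sketch of a hex string to byte array conversion.
function sketchHexToBytes(hex) {
  // Reject odd lengths and anything that is not a hex digit.
  if (hex.length % 2 !== 0 || /[^0-9a-fA-F]/.test(hex)) {
    throw new Error('Invalid hex string');
  }
  const bytes = [];
  for (let i = 0; i < hex.length; i += 2) {
    bytes.push(parseInt(hex.slice(i, i + 2), 16));
  }
  return bytes;
}

// Hypothetical sketch of the reverse conversion; accepts an Array or a
// typed array.
function sketchBytesToHex(bytes) {
  return Array.from(bytes, (b) => {
    if (!Number.isInteger(b) || b < 0 || b > 255) {
      throw new RangeError('Byte out of range: ' + b);
    }
    return b.toString(16).padStart(2, '0');
  }).join('');
}

console.log(sketchHexToBytes('ffff00'));      // [255, 255, 0]
console.log(sketchBytesToHex([255, 255, 0])); // 'ffff00'
```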

pack and packTo

pack(format, values, clamp) will return an Array with the bytes of the passed values. values can be a single item or an array.

let packed = pack('h', 1000);
// return [232, 3]
packed = pack('>hh', [1000, 1]);
//return [3, 232, 0, 1]

packTo(format, values, buffer, clamp, index) will write the bytes of the value to a buffer (any Array-like object). Writing starts on index. If no index is given, index=0 is assumed.

// Create a Uint8Array with size=4
let buffer = new Uint8Array(4);
// Start writing on index=2, writing the bytes of the 16-bit value to
// buffer[2] and buffer[3]. buffer[0] and buffer[1] are left untouched
packTo('h', 402, buffer, false, 2);

index can be omitted and will default to zero:

// Create a Uint8Array with size=4
let buffer = new Uint8Array(4);
// Start writing on index=0, writing the bytes of the 16-bit value to
// buffer[0] and buffer[1]. buffer[2] and buffer[3] are left untouched
packTo('h', 402, buffer);

If the output buffer size is smaller than required by the data and output is a typed array, it throws a Bad buffer length error:

// Create a Uint8Array with size=1
let buffer = new Uint8Array(1);
packTo('h', 402, buffer);
//Error: Bad buffer length

write and writeTo

write(format, values) will return an Array with the bytes of the passed values.

let packed = write('h', [1000, 1, 1, 1]);
// return [232, 3, 1, 0, 1, 0, 1, 0]

writeTo(format, values, buffer, index) will write the bytes of the value to the provided buffer (any Array-like object). Writing starts on index. If index is omitted, index=0 is assumed.

// Create a Uint8Array with size=4
let buffer = new Uint8Array(4);
// Write all values to the buffer
writeTo('h', [402, 1], buffer);

write(), writeTo(), read() and readTo() should be used with a single type at a time and are meant to read/write either large datasets or long sequences where all values are of the same type (such as in media files).

read/write functions work with a single type at a time. Any extra types defined in the format string after the first one will be ignored ('>Hbb' will be handled as '>H').

If the output buffer size is smaller than required by the data and output is a typed array, it throws a Bad buffer length error:

// Create a Uint8Array with size=2, but the values
// need size=4
let buffer_ = new Uint8Array(2);
btools.writeTo('h', [1, 1], buffer_);
// Error: Bad buffer length

// Create a Uint8Array with size=1, but the value uses
// 2 bytes
buffer_ = new Uint8Array(1);
btools.writeTo('h', [1], buffer_);
// Error: Bad buffer length

// The size matches the number of values and the type size;
// this is OK
buffer_ = new Uint8Array(2);
btools.writeTo('h', [1], buffer_);
// buffer_ is now [1, 0]

Packing null, false, true and undefined

Attempts to pack or write integers with the following values:

  • undefined
  • null
  • true
  • false

will throw a TypeError.

If you wish to pack true as 1 and false as 0 you should use the boolean ('?') type.

Notice that boolean ('?') type will throw a TypeError if used with values that are not of type 'boolean' and clamp is set to false.

If clamp is set to true, type boolean ('?') will pack undefined, null and false as 0 and pack any other value (of any type) as 1. This includes the Unicode NULL character, empty objects and empty arrays as long as the empty array is the value itself, inside the values array, for example:

btools.pack('7?', [1, {}, [], -1024, 'a', '\u0000', BigInt(7)], true);
// all values in this array will be packed as 1, as clamp=true
// this would throw a TypeError if clamp was false or unspecified

Unpacking and input buffer length

When unpacking values from a byte buffer, insufficient bytes will throw a Bad buffer length error.

// throws a 'Bad buffer length' error
let buffer = [0xff];
btools.unpack('H', buffer);

// throws a 'Bad buffer length' error (start reading on index=2),
// attempt to unpack a 16-bit number from a single byte
buffer = [0xff, 0xff, 0xff];
btools.unpack('H', buffer, false, 2);

// do not throw error (start reading on index=1),
// so skip the first byte and only read the last 2 bytes as a 16-bit number
buffer = [0xff, 0xff, 0xff];
btools.unpack('H', buffer, false, 1); 

For the read() and readTo() methods, insufficient bytes will throw a Bad buffer length error and extra bytes in the input buffer will be ignored if present.

Note that with read() and readTo() if the input buffer is empty then no error is thrown. In this case readTo() will write nothing to the output array and read() will return an empty array, just like writeTo() and write() also do not throw errors when writing empty arrays. An error is only thrown if there are bytes, but not enough bytes to read the given data type.

// readTo()

// throws a 'Bad buffer length' error (insufficient bytes)
let output = [];
let buffer = [0xff];
btools.readTo('H', buffer, output, false, 0, buffer.length);

// does not throw an error; the extra byte is ignored
output = [];
buffer = [0xff, 0xff, 0xff];
btools.readTo('H', buffer, output, false, 0, buffer.length);
// output will be [65535]

// read()

// throws a 'Bad buffer length' error (insufficient bytes)
buffer = [0xff];
btools.read('H', buffer, false, 0, buffer.length);

// does not throw an error; the extra byte is ignored
buffer = [0xff, 0xff, 0xff];
output = btools.read('H', buffer, false, 0, buffer.length);
// output will be [65535]

// does not throw an error; the output buffer remains unchanged
output = [0, 0, 0, 0];
btools.readTo('H', [], output);
// output is still [0, 0, 0, 0]

// does not throw an error; returns an empty array
output = btools.read('H', []);
// output will be []

// write() likewise returns an empty
// array if the input is an empty array
output = btools.write('H', []);
// output will be []

// and writeTo() will not change the output
// buffer when writing empty arrays
output = [1, 1, 1, 1];
btools.writeTo('H', [], output);
// output will be [1, 1, 1, 1]

Floating-point numbers

  • Floating-point numbers follow the IEEE 754 standard.
  • NaN is packed as quiet NaN. Both quiet NaN and signaling NaN can be unpacked, both unpacked as NaN. Unpacking NaN with extra information on the significand is supported and will also result in NaN (extra information will be lost).
  • Support packing and unpacking negative zeros.
  • Support packing and unpacking Infinity and negative Infinity
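
The special values listed above have well-defined IEEE 754 bit patterns, which can be inspected with the standard DataView API. This is an independent sketch, not library code; float32Bytes is a hypothetical helper. NaN is left out on purpose, because the exact NaN bit pattern written by a JavaScript engine is implementation-defined.

```javascript
// Hypothetical helper: the little-endian bytes of a 32-bit float.
function float32Bytes(value) {
  const view = new DataView(new ArrayBuffer(4));
  view.setFloat32(0, value, true); // true = little endian
  return Array.from(new Uint8Array(view.buffer));
}

console.log(float32Bytes(-0));        // [0, 0, 0, 128]   only the sign bit set
console.log(float32Bytes(Infinity));  // [0, 0, 128, 127]
console.log(float32Bytes(-Infinity)); // [0, 0, 128, 255]
```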

Minifloats

Native support for 16-bit half-precision numbers (format code 'e') and for brain floating point numbers (format code 'g').
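
The bfloat16 format keeps the sign, the 8 exponent bits and the top 7 fraction bits of a 32-bit float, so a rough conversion only needs the upper two bytes. The sketch below assumes plain truncation of the low 16 bits; sketchBfloat16Bytes is a hypothetical helper, and the library may round differently.

```javascript
// Hypothetical helper: little-endian bfloat16 bytes obtained by keeping
// the upper 16 bits of the float32 representation.
function sketchBfloat16Bytes(value) {
  const view = new DataView(new ArrayBuffer(4));
  view.setFloat32(0, value, false); // big endian: sign and exponent come first
  // Assumption: truncation, with no rounding of the discarded bits.
  const high = view.getUint16(0, false);
  return [high & 0xff, high >> 8];
}

console.log(sketchBfloat16Bytes(1));  // [128, 63]  (0x3F80)
console.log(sketchBfloat16Bytes(-2)); // [0, 192]   (0xC000)
```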


Integers

  • Overflow on integers will throw a RangeError.
  • Packing values other than integers will throw a TypeError.
  • You may clamp the input to avoid RangeError by setting clamp to true.

To clamp integers on overflow and avoid RangeError, set the optional clamp param to true:

// Set clamp to true; values will be packed as their max or min values
// on overflow. In this case, packing an array of unsigned 8-bit ints
write('B', [1, 259, 2], true);
// will return [1, 255, 2]

// Set clamp to false; overflows cause a RangeError
write('B', [1, 259, 2], false);
// will throw a RangeError; this is the same as
write('B', [1, 259, 2]); // (omitting the clamp param)
// which will also throw a RangeError
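
Clamping means replacing an out-of-range value with the nearest bound of the type. The sketch below shows the idea for integer types of up to 53 bits; clampInteger is a hypothetical helper, not the function binary-tools uses.

```javascript
// Hypothetical helper: clamp a Number to the range of an integer type
// with the given bit width (valid for widths up to 53 bits).
function clampInteger(value, bits, signed) {
  const min = signed ? -(2 ** (bits - 1)) : 0;
  const max = signed ? 2 ** (bits - 1) - 1 : 2 ** bits - 1;
  return Math.min(max, Math.max(min, value));
}

console.log(clampInteger(259, 8, false)); // 255
console.log(clampInteger(-200, 8, true)); // -128
console.log(clampInteger(2, 8, false));   // 2
```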

Signed integers

Signed integers are two's complement.
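
As a worked illustration of two's complement, the 16-bit values used earlier in this document can be converted by hand. The helpers below are hypothetical and independent of binary-tools; they use little-endian byte order, the default for the 'h' type.

```javascript
// Hypothetical helper: signed 16-bit integer to little-endian bytes.
function int16ToBytes(value) {
  // Adding 2^16 to a negative value yields its two's complement bit pattern.
  const unsigned = value < 0 ? value + 0x10000 : value;
  return [unsigned & 0xff, (unsigned >> 8) & 0xff];
}

// Hypothetical helper: little-endian bytes back to a signed 16-bit integer.
function bytesToInt16(bytes) {
  const unsigned = bytes[0] | (bytes[1] << 8);
  // Patterns with the top bit set represent negative numbers.
  return unsigned >= 0x8000 ? unsigned - 0x10000 : unsigned;
}

console.log(int16ToBytes(-32765));     // [3, 128]
console.log(bytesToInt16([255, 127])); // 32767
console.log(bytesToInt16([3, 128]));   // -32765
```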

64-bit, 128-bit, 256-bit, 512-bit and 1024-bit integers

binary-tools has native support for packing and unpacking 64-bit, 128-bit, 256-bit, 512-bit and 1024-bit integers on environments that support BigInt.

64-bit numbers are available as signed (type 'q') and unsigned (type 'Q'). 128, 256, 512 and 1024-bit numbers are only available as unsigned (types 'A', 'Y', 'O', 'M'). You can define signed variations for them if you need them:

// create a 128-bit signed integer
btools.setType('a', 128, true);
btools.pack('a', [-1n]);
// will output [255, 255, 255, 255, 255, 255, 255, 255,
//              255, 255, 255, 255, 255, 255, 255, 255]

Internally, all BigInts are represented using BigInt(), not 0n notation.
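
The byte layout of such wide integers can be sketched with BigInt arithmetic alone. The helper below is hypothetical and is not how binary-tools is implemented; it uses BigInt() calls rather than 0n literals, and relies on the standard BigInt.asUintN to wrap negative values into their two's complement pattern.

```javascript
// Hypothetical helper: little-endian bytes of an integer of any width
// that is a multiple of 8 bits. Requires BigInt support.
function bigIntToBytesLE(value, bits) {
  // Wrap the value to an unsigned integer of the requested width.
  let remaining = BigInt.asUintN(bits, BigInt(value));
  const bytes = [];
  for (let i = 0; i < bits / 8; i++) {
    bytes.push(Number(remaining & BigInt(255)));
    remaining >>= BigInt(8);
  }
  return bytes;
}

console.log(bigIntToBytesLE(1, 64));    // [1, 0, 0, 0, 0, 0, 0, 0]
console.log(bigIntToBytesLE(-1, 128));  // sixteen bytes, all 255
```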

Nibbles

With the non-standard type 'N' (nibbles, 4-bit integers) values will be packed and unpacked as pairs, each pair in a single byte. You may use nibbles in the format string just like any type:

// packing a signed 16-bit integer and 2 nibbles
btools.pack('hNN', [1,  15, 15]);
// will return [1, 0,   255] (1, 0 is the integer, 255 is the nibble pair)

// unpacking a signed 16-bit integer and 2 nibbles
btools.unpack('hNN', [1, 0,  255]);
// will return [1, 15, 15]

Packing and unpacking Nibbles

You may pack or unpack a single nibble, or odd numbers of nibbles, but one nibble will occupy a full byte just like a pair would:

// packing a single nibble will only write the high nibble (upper 4 bits).
// 15 (max value of a nibble) is used to better illustrate the result:
btools.pack('N', [15]);
// will return [240] (240 = 11110000)

// packing an odd number of nibbles
btools.pack('NNN', [15, 15,  15]);
// will return [255, 240] (255 is the first pair, 240 is the last lonely nibble)

Values of type 'N' are unsigned. The max value for a nibble is 15; any value greater than that will cause a RangeError:

btools.pack('N', [16])
// will throw a RangeError

Reading and writing Nibbles

Single nibbles or odd numbers of nibbles are only available for packing and unpacking functions. When using the writing functions write and writeTo only pairs of nibbles can be written; if the input array has an odd count of elements, an error will be thrown:

btools.write('N', [15, 0,  15, 0,  15]);
// will throw an error

var buff = new Uint8Array(3);
btools.writeTo('N', [1, 0,  1, 0,  15], buff);
// will throw an error

var buff = new Uint8Array(3);
btools.writeTo('N', [1, 0,  1, 0,  15, 0], buff);
// this is fine

When using the reading functions read and readTo, nibbles will always be unpacked as pairs:

btools.read('N', [16]);
// will return [1, 0]

btools.read('N', [240]);
// will return [15, 0]

btools.read('N', [255]);
// will return [15, 15]
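
The pairs returned above are simply the high and low halves of each byte. A minimal sketch of that split, using a hypothetical unpackNibbles helper that is not part of binary-tools:

```javascript
// Illustrative sketch only; unpackNibbles is a hypothetical helper.
function unpackNibbles(bytes) {
  var nibbles = [];
  for (var i = 0; i < bytes.length; i++) {
    nibbles.push(bytes[i] >> 4); // high nibble
    nibbles.push(bytes[i] & 15); // low nibble
  }
  return nibbles;
}

unpackNibbles([16]);  // [1, 0]
unpackNibbles([240]); // [15, 0]
unpackNibbles([255]); // [15, 15]
```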

Single-byte characters and strings

Types 'c' and 'C' represent single characters. Type 'c' represents a single ASCII character (from 0 to 127). Type 'C' represents a single ISO-8859-1 character (from 0 to 255), covering the first 256 characters of Unicode. Both always use one byte.

Types 's' and 'S' represent strings. Type 's' represents a string of ASCII characters, and type 'S' represents a string of ISO-8859-1 characters.

Unless you need to enforce that the characters are 7-bit ASCII characters, you should generally use types 'C' and 'S'.

Single characters (types 'c' and 'C') behave the same way as the number types when it comes to format strings:

btools.pack('cH2c', ['a', 1, 'b', 'b']);
// will output [97, 1, 0, 98, 98]

For string types, the repeat count number represents the size of the string in bytes. Since every character uses exactly one byte, it is also the number of characters:

btools.pack('sH2s', ['a', 1, 'bb']);
// will output [97, 1, 0, 98, 98]

If a string has fewer characters than specified in the format, it will be padded with null bytes to fit the size defined in the format string:

btools.pack('sH4s', ['a', 1, 'bb']);
// will output [97, 1, 0, 98, 98, 0, 0]

If a string has more characters than specified in the format, it will be trimmed to fit the size defined in the format string:

btools.pack('sH2s', ['a', 1, 'bbbb']);
// will output [97, 1, 0, 98, 98]
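
The padding and trimming rules for fixed-size single-byte strings can be sketched as follows. latin1ToSlot is a hypothetical helper written for this example; it only mirrors the behavior described above and is not the library's code.

```javascript
// Illustrative sketch only; latin1ToSlot is a hypothetical helper.
function latin1ToSlot(str, size) {
  var bytes = [];
  for (var i = 0; i < size; i++) {
    if (i < str.length) {
      var code = str.charCodeAt(i);
      if (code > 255) {
        throw new RangeError('Not an ISO-8859-1 character');
      }
      bytes.push(code);
    } else {
      // Pad the rest of the slot with null bytes.
      bytes.push(0);
    }
  }
  // Characters beyond `size` are never visited, so the string is trimmed.
  return bytes;
}

latin1ToSlot('bb', 4);   // [98, 98, 0, 0]
latin1ToSlot('bbbb', 2); // [98, 98]
```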

Unicode strings

Unicode strings (types 'u', 'U' and 'V') are normally used with reading/writing functions, but may be used with packing/unpacking functions as well.

For Unicode strings the repeat count number on the format string when using pack(), packTo(), unpack() or unpackTo() represent the number of bytes that will be encoded or decoded, not the number of characters.

You must take into consideration that a Unicode character may use more than one byte, so simply counting the number of characters will not give you the correct number of bytes needed to encode the string; btools has functions to calculate the number of bytes of a Unicode string so you can check whether it fits in the buffer. Read more about these functions below.

Repeat count numbers are mainly used with Unicode when you have a fixed space for a string on a buffer, but the string may or may not use all the bytes reserved for it. In that case, the string is packed and the remaining bytes are set to NULL (0). On the other hand, if the string uses more bytes than defined in the format string, it will be trimmed as appropriate to make it fit, the same behavior as with single-byte character strings.

For example, if a UTF-8 string occupies 4 bytes when encoded, but there is a slot of 256 bytes for a UTF-8 string in a file header, the 4 bytes of the encoded string are written to the buffer while the remaining 252 bytes are written as NULL, and the writing head is moved to the index immediately after the pre-defined UTF-8 slot to continue packing the next values defined in the format string.

UTF-8 strings

UTF-8 strings (type 'u') may use from 1 to 4 bytes per character. If the encoded string does not fit in the given size (for example, the repeat count number is 5, but the string uses 6 bytes), it will be trimmed at the last code point that fits and the remaining bytes will be filled with NULL:

// The string '美麗' uses 6 bytes and the slot has 6 bytes,
// so the string can be correctly packed:
btools.pack('6u', '美麗');
// will return [231, 190, 142,   233, 186, 151]

// The string uses 6 bytes, 3 for each character, but the slot only
// has 5 bytes; the last code point will be trimmed and any remaining
// space will be filled with NULL bytes:
btools.pack('5u', '美麗');
// will return [231, 190, 142,   0, 0]

If the slot size is greater than the string, then null bytes will be used to fit the size:

// The string '美麗' uses 6 bytes and the slot has 8 bytes,
// so the string can be correctly packed and the remaining bytes
// will be written as NULL
btools.pack('8u', '美麗');
// will return [231, 190, 142,   233, 186, 151,   0, 0]
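
The trimming and padding shown in the examples above can be reproduced with the standard TextEncoder API. This is only a sketch of the described behavior under the assumption that trimming happens on code point boundaries; utf8ToSlot is a hypothetical helper, not a binary-tools function.

```javascript
// Illustrative sketch only; utf8ToSlot is a hypothetical helper.
function utf8ToSlot(str, size) {
  var encoder = new TextEncoder();
  var bytes = [];
  // for...of iterates by code point, so surrogate pairs stay together.
  for (var ch of str) {
    var encoded = encoder.encode(ch);
    // Stop before a code point that would not fit entirely in the slot.
    if (bytes.length + encoded.length > size) break;
    for (var i = 0; i < encoded.length; i++) bytes.push(encoded[i]);
  }
  // Fill whatever is left of the slot with null bytes.
  while (bytes.length < size) bytes.push(0);
  return bytes;
}

var text = '\u7F8E\u9E97'; // '美麗'
utf8ToSlot(text, 6); // [231, 190, 142, 233, 186, 151]
utf8ToSlot(text, 5); // [231, 190, 142, 0, 0]
utf8ToSlot(text, 8); // [231, 190, 142, 233, 186, 151, 0, 0]
```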

If using packTo() with typed arrays, a Bad buffer length error will be thrown when the typed array is too small to fit all the bytes defined by the repeat count number:

var byteBuffer = new Uint8Array(5);
btools.packTo('6u', ['美麗'], byteBuffer);
// will throw a Bad buffer length error

When unpacking UTF-8, if the bytes covered by the size defined in the format string end in the middle of a character, the incomplete character will be considered invalid and the replacement character will be added to the string:

// size does not reach the last byte of the last character
btools.unpack('5u', [231, 190, 142,   233, 186, 151]);
// will return ['美�']

// size match the 2 characters
btools.unpack('6u', [231, 190, 142,   233, 186, 151]);
// will return ['美麗']

If the slot is greater than the string, then the Unicode character 'NULL' (U+0000) will be included in the resulting string for every null byte present in the slot:

// Reading a slot of 10 bytes, but the string encoded at the slot
// only uses the first 6 bytes:
btools.unpack('10u', [231, 190, 142,   233, 186, 151,   0, 0, 0, 0]);
// will return ['美麗\u0000\u0000\u0000\u0000']
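
The same two effects (a replacement character for a truncated sequence, NULL characters for padding bytes) can be observed with the standard TextDecoder API, which is used here only as a point of comparison. The stripTrailingNulls helper is hypothetical; it shows one way a caller might drop the padding after reading a fixed-size slot.

```javascript
// Illustrative sketch only, using the standard TextDecoder API.
var decoder = new TextDecoder('utf-8');

// Slot ends in the middle of the second character:
var truncated = decoder.decode(new Uint8Array([231, 190, 142, 233, 186]));
// '美' followed by the replacement character U+FFFD

// Slot is larger than the encoded string:
var padded = decoder.decode(
    new Uint8Array([231, 190, 142, 233, 186, 151, 0, 0, 0, 0]));
// '美麗\u0000\u0000\u0000\u0000'

// Hypothetical helper: remove the NULL padding after unpacking.
function stripTrailingNulls(str) {
  return str.replace(/\u0000+$/, '');
}

stripTrailingNulls(padded); // '美麗'
```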

Calculating the buffer size for a UTF-8 string

To find out how many bytes are needed for a given UTF-8 string, use the utf8BufferSize() method:

btools.utf8BufferSize('Hello, world!');
// will return 13; the fact that this is the same number of
// characters in the string is a mere coincidence

btools.utf8BufferSize('Having a romantic dinner with my 👩‍❤️‍👨.');
// will return 54 - the 👩‍❤️‍👨 emoji alone uses 20 bytes.
// It is actually 6 code points, 3 emojis joined by
// Zero Width Joiners that form a single emoji on most fonts.
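
The byte counts above follow directly from the UTF-8 encoding ranges: 1 byte up to U+007F, 2 bytes up to U+07FF, 3 bytes up to U+FFFF and 4 bytes beyond that. The sketch below counts bytes that way; utf8ByteLength is a hypothetical stand-in written for illustration and is not the implementation of utf8BufferSize().

```javascript
// Illustrative sketch only; utf8ByteLength is a hypothetical helper.
function utf8ByteLength(str) {
  var size = 0;
  // for...of iterates by code point, not by UTF-16 code unit.
  for (var ch of str) {
    var cp = ch.codePointAt(0);
    if (cp < 0x80) size += 1;
    else if (cp < 0x800) size += 2;
    else if (cp < 0x10000) size += 3;
    else size += 4;
  }
  return size;
}

utf8ByteLength('Hello, world!'); // 13

// The 6 code points of the couple emoji, written as escapes:
// woman, ZWJ, heart, variation selector, ZWJ, man
utf8ByteLength('\u{1F469}\u200D\u2764\uFE0F\u200D\u{1F468}'); // 20
```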

UTF-8 strings with write() and read()

UTF-8 strings are more commonly used with the writing and reading functions, which are also far simpler to use with them.

When using type 'u' with the write() function, the full string is always encoded regardless of its size:

btools.write('u', 'Hello, world!');
// will return
// [72, 101, 108, 108, 111,   44,   32,   119, 111, 114, 108, 100,   33]

When using type 'u' with the reading functions read() and readTo(), if no start index or end index are given, then the full string will be unpacked:

btools.read('u', [72, 101, 108, 108, 111,   44,   32,   119, 111, 114, 108, 100,   33]);
// will return ['Hello, world!']

A start or end index may be given when reading the byte buffer; in this case, only the bytes in the slice will be decoded. Other than that, the behavior is the same as using no indexes.

UTF-16 strings

The same rules used for UTF-8 also apply to UTF-16 strings (type 'U'). The differences are that a character may use 2 or 4 bytes and that UTF-16 has the notion of endianness, while UTF-8 does not. This adds the restriction that only even byte counts may be used in the format string for packing or unpacking, and that only arrays or array slices with even byte counts can be used for reading.

// The string '慈愛' uses 4 bytes, 2 bytes for each character;
// since the format defines 4 bytes, it can be correctly packed:
btools.pack('4U', '慈愛');
// will return [72, 97,   27, 97]

// Odd byte length; this size is not valid for UTF-16
// and will cause an error to be thrown
btools.pack('3U', '慈愛');
// will throw a Bad buffer length error

// The format string says 2 bytes, but the string needs 4 bytes
// to be encoded, 2 bytes per character; only the first character
// will be encoded.
btools.pack('2U', '慈愛');
// will return [72, 97]

// The format string says 6 bytes, but the string needs 8 bytes
// to be encoded, 4 bytes per character; only the first character
// will be encoded and remaining bytes will be filled with NULL.
// The '!' operator selects big-endian byte order:
btools.pack('!6U', '😀😀');
// will return [216, 61, 222, 0,   0, 0]

// The format string says 6 bytes, but the string only needs 4;
// the remaining bytes will be written as NULL
btools.pack('6U', '慈愛');
// will return [72, 97,   27, 97,   0, 0]

If using packTo() or writeTo() with typed arrays, a Bad buffer length error will be thrown when the typed array is too small to fit the size defined in the format string:

var byteBuffer = new Uint8Array(2);
btools.packTo('4U', ['慈愛'], byteBuffer);
// will throw a Bad buffer length error

var byteBuffer = new Uint8Array(2);
btools.writeTo('2U', ['慈愛'], byteBuffer);
// will throw a Bad buffer length error

If the size is greater than the string, then null bytes will be used to fit the size:

btools.pack('6U', '慈愛');
// will return [72, 97,   27, 97,   0, 0]

When unpacking UTF-16, if the last character uses more bytes than the size defined in the format string allows, it will be treated as an invalid character and the replacement character will be used in the resulting string:

// size does not reach the last byte of the last character
// last character is 😀, which uses 4 bytes
btools.unpack('4U', [72, 97,   61, 216, 0, 222]);
// will return ['慈�']

// size match the 2 characters
btools.unpack('6U', [72, 97,   61, 216, 0, 222]);
// will return ['慈😀']
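
The replacement character appears because the 4-byte slot ends after the high surrogate of 😀, leaving a surrogate without its pair. The sketch below decodes little-endian UTF-16 by hand to make that visible; decodeUtf16LE is a hypothetical helper and not the library's decoder.

```javascript
// Illustrative sketch only; decodeUtf16LE is a hypothetical helper.
function decodeUtf16LE(bytes) {
  var out = '';
  for (var i = 0; i + 1 < bytes.length; i += 2) {
    var unit = bytes[i] | (bytes[i + 1] << 8);
    var isHigh = unit >= 0xD800 && unit <= 0xDBFF;
    var isLow = unit >= 0xDC00 && unit <= 0xDFFF;
    if (isHigh && i + 3 < bytes.length) {
      var next = bytes[i + 2] | (bytes[i + 3] << 8);
      if (next >= 0xDC00 && next <= 0xDFFF) {
        // Valid surrogate pair: keep both code units.
        out += String.fromCharCode(unit, next);
        i += 2;
        continue;
      }
    }
    // A surrogate without its pair is replaced by U+FFFD.
    out += (isHigh || isLow) ? '\uFFFD' : String.fromCharCode(unit);
  }
  return out;
}

decodeUtf16LE([72, 97, 61, 216]);         // '慈\uFFFD'
decodeUtf16LE([72, 97, 61, 216, 0, 222]); // '慈😀'
```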

If the slot is greater than the string, then the Unicode character 'NULL' (U+0000) will be included in the resulting string for every null code point present in the slot:

// Reading a slot of 6 bytes, but the string encoded at the slot
// only uses the first 4 bytes; NULL characters will be included.
// In this case a single NULL character, since the remaining 2 bytes
// are a single code point
btools.unpack('6U', [72, 97,   27, 97,   0, 0]);
// will return ['慈愛\u0000']

When unpacking UTF-16, the byte count in the format string must always be even, since every character uses either 2 or 4 bytes:

// Odd byte count; will throw an error
btools.unpack('5U', [72, 97,   27, 97,   0, 0]);
// will throw an Error

UTF-16 and endianness

UTF-16 has the notion of endianness, so byte order operators in the format string will affect how strings are packed and unpacked.

When using the operators '<' or '>' to enforce endianness, if no BOM is present in the string, a BOM will be automatically added, and this must be considered in the repeat count number:

// Operator is '<' and the string does not have a BOM; a BOM will be added.
// Even if the characters only use 4 bytes, 6 bytes must be defined
// in the repeat count to make room for the BOM
btools.pack('<6U', '慈愛');
// will return [255, 254,   72, 97,   27, 97]

// If the repeat count number do not include space for the BOM,
// and there is not enough room for the last character, only
// the BOM and the first character will be encoded
btools.pack('<4U', '慈愛');
// will return [255, 254,   72, 97]

When using the operators '!' or '=' to represent endianness, no BOM will be added, so the repeat count number only needs to consider the number of bytes used by the characters:

// Use '=' operator to represent little endian; no BOM will be included
btools.pack('=4U', '慈愛');
// will return [72, 97,   27, 97]

// Use '!' operator to represent big endian; no BOM will be included
btools.pack('!4U', '慈愛');
// will return [97, 72,   97, 27]

// Use '=' operator to represent little endian with a string that has a BOM.
// In this case, the BOM will be packed like any other character. Notice that,
// like any other character, the BOM must be accounted for in the repeat count:
btools.pack('=6U', '\uFEFF慈愛');
// will return [255, 254,   72, 97,   27, 97]
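
The byte sequences in this section differ only in the order of the two bytes of each code unit and in whether U+FEFF is written first. A minimal sketch of both effects, assuming nothing about the library's internals; encodeUtf16 and its parameters are hypothetical names used for illustration.

```javascript
// Illustrative sketch only; encodeUtf16 is a hypothetical helper.
function encodeUtf16(str, littleEndian, addBOM) {
  // Only prepend a BOM if the string does not already start with one.
  if (addBOM && str.charCodeAt(0) !== 0xFEFF) {
    str = '\uFEFF' + str;
  }
  var bytes = [];
  for (var i = 0; i < str.length; i++) {
    var unit = str.charCodeAt(i);
    var low = unit & 255;
    var high = unit >> 8;
    if (littleEndian) {
      bytes.push(low, high);
    } else {
      bytes.push(high, low);
    }
  }
  return bytes;
}

var text = '\u6148\u611B'; // '慈愛'
encodeUtf16(text, true, false);  // [72, 97, 27, 97]
encodeUtf16(text, false, false); // [97, 72, 97, 27]
encodeUtf16(text, true, true);   // [255, 254, 72, 97, 27, 97]
```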

To find how many bytes are needed for a given UTF-16 string, use the utf16BufferSize() function.

let buffer_ = new Uint8Array(
    btools.utf16BufferSize('Rafael is ❤️‍🔥 about his work!'));
btools.writeTo('U', 'Rafael is ❤️‍🔥 about his work!', buffer_);
let unpacked = btools.read('U', buffer_);
// unpacked = ['Rafael is ❤️‍🔥 about his work!']

Calculating the buffer size for a UTF-16 string

To find out how many bytes are needed for a UTF-16 string, use the utf16BufferSize() method:

btools.utf16BufferSize('Hello, world!');
// will return 26

The method accepts an optional parameter forceBOM to indicate that a BOM should be considered in the size even if the original string does not have a BOM:

btools.utf16BufferSize('Hello, world!');
// will return 26

// forceBOM set to true, and there is no BOM in the string;
// in this case utf16BufferSize() will return the number of
// bytes needed to encode the string + 2 extra bytes to make
// room for the BOM:
btools.utf16BufferSize('Hello, world!', true); 
// will return 28; 26 bytes for characters + 2 bytes for the BOM

// BOM in the string, forceBOM true; in this case the
// BOM will be counted like any other character, and no
// extra bytes will be considered in the count
btools.utf16BufferSize('\uFEFFHello, world!', true);
// will also return 28

// BOM in the string, forceBOM false; in this case the
// BOM will be counted, too
btools.utf16BufferSize('\uFEFFHello, world!');
// will also return 28
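
The four results above can be reproduced with a few lines of plain JavaScript, because a string's length property already counts UTF-16 code units. utf16Size is a hypothetical stand-in written for illustration, not the implementation of utf16BufferSize().

```javascript
// Illustrative sketch only; utf16Size is a hypothetical helper.
function utf16Size(str, forceBOM) {
  // str.length counts UTF-16 code units, 2 bytes each.
  var size = str.length * 2;
  // Reserve room for a BOM only if the string does not start with one.
  if (forceBOM && str.charCodeAt(0) !== 0xFEFF) {
    size += 2;
  }
  return size;
}

utf16Size('Hello, world!');             // 26
utf16Size('Hello, world!', true);       // 28
utf16Size('\uFEFFHello, world!', true); // 28
utf16Size('\uFEFFHello, world!');       // 28
```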

UTF-16 strings with write() and read()

When using type 'U' with write() the full string will be encoded:

btools.write('U', 'Hello, world!');
// will return
// [72, 0, 101, 0, 108, 0, 108, 0, 111, 0, 44, 0, 32, 0,
//   119, 0, 111, 0, 114, 0, 108, 0, 100, 0, 33, 0];

When using type 'U' with the reading functions read() and readTo(), if no start or end index is given, then the full string will be decoded:

btools.read('U', [72, 0, 101, 0, 108, 0, 108, 0, 111, 0, 44, 0, 32, 0,
    119, 0, 111, 0, 114, 0, 108, 0, 100, 0, 33, 0]);
// will return ['Hello, world!']

A start or end index may be given when reading the byte buffer; in this case, only the bytes in the slice defined by the indexes will be decoded. Other than that, the behavior is the same as using no indexes.

Note that, as with other types, the read() and readTo() functions will ignore extra bytes at the end of the input buffer:

// Array does not have an even size; extra bytes will be ignored
btools.read('U', [72, 97,   27, 97,  0]);
// returns ['慈愛'], ignoring the extra byte

You may adjust the size using the start and end params to fit a valid UTF-16 buffer size.

// Adjust the size to create a valid slice
// of the buffer for UTF-16:
btools.read('U', [72, 97,   27, 97,  0], false, 0, 4);
// also returns ['慈愛']

UTF-32 strings

The same rules used for UTF-16 also apply to UTF-32 strings (type 'V'). The main difference is that it always uses 4 bytes per code point, so the byte count must always be a multiple of 4.

// This string has 3 code points (regardless of being rendered
// as 2 characters in some fonts). Since every code point uses
// 4 bytes, 12 bytes are needed to encode it:
btools.pack('12V', 'सुख');
// Will return [56, 9, 0, 0,   65, 9, 0, 0,   22, 9, 0, 0]

// Repeat count is not a multiple of 4; Not a valid count
// for UTF-32, and will cause an error to be thrown
btools.pack('10V', 'सुख');
// Will throw a Bad buffer length error

// Repeat count is smaller than the number of bytes needed to encode
// the string; only the first two characters will be encoded:
btools.pack('8V', 'सुख'); 
// Will return [56, 9, 0, 0,   65, 9, 0, 0]

If using packTo() or writeTo() with typed arrays, a Bad buffer length error will be thrown when the typed array is too small to fit the size defined in the format string:

var byteBuffer = new Uint8Array(6);
btools.packTo('20V', ['🚵🏻‍♂️'], byteBuffer);
// will throw a Bad buffer length error; this character
// needs 20 bytes, as it uses 5 code points

If the size is greater than the string, then null bytes will be used to fit the size:

btools.pack('24V', '🚵🏻‍♂️');
// will return [181, 246, 1, 0,   251, 243, 1, 0,   13, 32, 0, 0,
//              66, 38, 0, 0,   15, 254, 0, 0,   0, 0, 0, 0]

With UTF-32 the byte count in the format string must always be a multiple of 4, since every character always uses 4 bytes:

// size does not reach the last byte of the last character
btools.unpack('7V', [120, 243, 1, 0,   121, 243, 1, 0]);
// will throw an Error

// size match all the characters
btools.unpack('8V', [120, 243, 1, 0,   121, 243, 1, 0]);
// will return ['🍸🍹']

If the slot is greater than the string, then the Unicode character 'NULL' (U+0000) will be included in the resulting string for every null code point (4 null bytes) present in the slot:

// Reading a slot of 12 bytes, but the string encoded at the slot
// only uses the first 8 bytes:
btools.unpack('12V', [120, 243, 1, 0,   121, 243, 1, 0,   0, 0, 0, 0]);
// will return ['🍸🍹\u0000']

UTF-32 and endianness

When using the operators '<' or '>' to enforce endianness, if no BOM is present in the string, a BOM will be automatically added, and this must be considered in the repeat count number:

// Operator is '<' and the string does not have a BOM; a BOM will be added.
// Even if the characters only use 4 bytes each, 16 bytes must be defined
// in the repeat count to make room for the BOM
btools.pack('<16V', 'सुख');
// will return [255, 254, 0, 0,   56, 9, 0, 0,   65, 9, 0, 0,   22, 9, 0, 0]

// Number of bytes is a multiple of 4, but not enough to encode
// the full string; only the BOM and the first 2 characters will
// be encoded.
btools.pack('<12V', 'सुख');
// will return [255, 254, 0, 0,   56, 9, 0, 0,   65, 9, 0, 0]

When using the operators '!' or '=' to represent endianness, no BOM will be added, so the repeat count number only needs to consider the number of bytes used by the characters already present in the string:

// Use '=' operator to represent little endian; no BOM will be included
// this is the same as using no operator at all
btools.pack('=12V', 'सुख');
// will return [56, 9, 0, 0,   65, 9, 0, 0,   22, 9, 0, 0]

// Use '!' operator to represent big endian; no BOM will be included
btools.pack('!12V', 'सुख');
// will return [0, 0, 9, 56,   0, 0, 9, 65,   0, 0, 9, 22]

// Use '=' operator to represent little endian with a string that has a BOM.
// In this case, the BOM will be packed like any other character. Notice that,
// like any character, the size of the BOM must be considered in the repeat count:
btools.pack('=16V', '\uFEFFसुख');
// will return [255, 254, 0, 0,   56, 9, 0, 0,   65, 9, 0, 0,   22, 9, 0, 0]
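
As with UTF-16, the outputs in this section differ only in byte order and in the presence of a leading U+FEFF. The sketch below writes each code point as 4 bytes in either order; encodeUtf32 is a hypothetical helper for illustration and makes no claim about how binary-tools is implemented.

```javascript
// Illustrative sketch only; encodeUtf32 is a hypothetical helper.
function encodeUtf32(str, littleEndian) {
  var bytes = [];
  // for...of iterates by code point, so each step yields one UTF-32 unit.
  for (var ch of str) {
    var cp = ch.codePointAt(0);
    var unit = [
      cp & 255, (cp >> 8) & 255, (cp >> 16) & 255, (cp >>> 24) & 255
    ];
    if (!littleEndian) unit.reverse();
    for (var i = 0; i < 4; i++) bytes.push(unit[i]);
  }
  return bytes;
}

var text = '\u0938\u0941\u0916'; // 'सुख'
encodeUtf32(text, true);
// [56, 9, 0, 0, 65, 9, 0, 0, 22, 9, 0, 0]
encodeUtf32(text, false);
// [0, 0, 9, 56, 0, 0, 9, 65, 0, 0, 9, 22]
encodeUtf32('\uFEFF' + text, true).slice(0, 4);
// [255, 254, 0, 0], the little-endian BOM
```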

To determine how many bytes are needed for a given UTF-32 string, use the utf32BufferSize() function:

let buffer_ = new Uint8Array(
    btools.utf32BufferSize('Time to 🚵🏻‍♂️'));
btools.writeTo('V', 'Time to 🚵🏻‍♂️', buffer_);
let unpacked = btools.read('V', buffer_);
// unpacked = ['Time to 🚵🏻‍♂️']

Calculating the buffer size for a UTF-32 string

To find out how many bytes are needed for a UTF-32 string, use the utf32BufferSize() method:

btools.utf32BufferSize('Hello, world!');
// will return 52

The method accepts an optional parameter forceBOM to indicate that a BOM should be accounted for in the size even if the original string does not have a BOM:

btools.utf32BufferSize('Hello, world!');
// will return 52

// forceBOM set to true
btools.utf32BufferSize('Hello, world!', true);
// will return 56; 52 bytes for characters + 4 bytes for the BOM

// BOM in the string, forceBOM true
btools.utf32BufferSize('\uFEFFHello, world!', true);
// will also return 56

// BOM in the string, forceBOM false
btools.utf32BufferSize('\uFEFFHello, world!');
// will also return 56
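
For UTF-32 the size depends on the number of code points, not on the string's length property, which counts UTF-16 code units. A minimal sketch of that calculation follows; utf32Size is a hypothetical stand-in for illustration, not the implementation of utf32BufferSize().

```javascript
// Illustrative sketch only; utf32Size is a hypothetical helper.
function utf32Size(str, forceBOM) {
  // Array.from() splits a string by code point, 4 bytes each in UTF-32.
  var size = Array.from(str).length * 4;
  // Reserve room for a BOM only if the string does not start with one.
  if (forceBOM && str.charCodeAt(0) !== 0xFEFF) {
    size += 4;
  }
  return size;
}

utf32Size('Hello, world!');             // 52
utf32Size('Hello, world!', true);       // 56
utf32Size('\uFEFFHello, world!', true); // 56

// The 5 code points of the cyclist emoji, written as escapes:
utf32Size('\u{1F6B5}\u{1F3FB}\u200D\u2642\uFE0F'); // 20
```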

UTF-32 strings with write() and read()

When using type 'V' with write() the full string will be encoded:

btools.write('V', 'Hello, world!');
// will return
// [72, 0, 0, 0,   101, 0, 0, 0,   108, 0, 0, 0,   108, 0, 0, 0,
//   111, 0, 0, 0,   44, 0, 0, 0,   32, 0, 0, 0,
//   119, 0, 0, 0,   111, 0, 0, 0,   114, 0, 0, 0,
//   108, 0, 0, 0,   100, 0, 0, 0,   33, 0, 0, 0];

When using type 'V' with the reading functions read() or readTo(), if no start or end index is given, then the full string will be unpacked:

btools.read('V', [72, 0, 0, 0,   101, 0, 0, 0,   108, 0, 0, 0,   108, 0, 0, 0,
    111, 0, 0, 0,   44, 0, 0, 0,   32, 0, 0, 0,
    119, 0, 0, 0,   111, 0, 0, 0,   114, 0, 0, 0,
    108, 0, 0, 0,   100, 0, 0, 0,   33, 0, 0, 0]);
// will return ['Hello, world!']

Start and end indexes may be given when reading the byte buffer; in this case, only the bytes in the slice will be decoded. Other than that, the behavior is the same as using no indexes.

Note that, just as with other types, the read() and readTo() functions will ignore extra bytes at the end of the reading buffer:

// Byte count is not a multiple of 4
// extra bytes are ignored:
btools.read('V', [120, 243, 1, 0,   121, 243, 1, 0,  0]);
// will return ['🍸🍹'], ignoring the extra byte

You may adjust the size using the start and end params to fit a valid UTF-32 buffer size:

// Adjust the size to create a valid slice
// of the buffer for UTF-32:
btools.read('V', [120, 243, 1, 0,   121, 243, 1, 0,   0], false, 0, 8);
// also returns ['🍸🍹']

API

Packing/unpacking functions:

/**
 * Pack values to a byte buffer.
 * @param {string} format The struct format definition.
 * @param {!Array<*>} values The values to pack.
 * @param {!Array<number>} buffer The byte buffer to write on.
 * @param {boolean=} [clamp=false] True to clamp values on overflow.
 * @param {number=} [index=0] The buffer index to start writing.
 * @return {number} The next index to write on the buffer.
 * @throws {Error} On unsupported type on the format string.
 * @throws {Error} If the output buffer is a typed array and size is not valid.
 * @throws {RangeError} On overflow if clamp is set to false.
 * @throws {TypeError} If a value is not valid for its type.
 */
export function packTo(format, values, buffer, clamp=false, index=0) {}

/**
 * Unpack values from an array of bytes to an Array-like object.
 * Always start writing the output at the beginning of the output array.
 * @param {string} format The struct format definition.
 * @param {!Array<number>} buffer The byte buffer to unpack.
 * @param {!Array<number|string|bigint|boolean>} output The output Array.
 * @param {boolean=} [clamp=false] True to clamp values on overflow.
 * @param {number=} [index=0] The buffer index to read.
 * @throws {Error} On unsupported type on the format string.
 * @throws {Error} On bad input buffer length.
 */
export function unpackTo(format, buffer, output, clamp=false, index=0) {}

/**
 * Pack values as an array of bytes.
 * @param {string} format The struct format definition.
 * @param {!Array<*>} values The values to pack.
 * @param {boolean=} [clamp=false] True to clamp values on overflow.
 * @return {!Array<number>} The packed values.
 * @throws {Error} On unsupported type on the format string.
 * @throws {RangeError} On overflow if clamp is set to false.
 * @throws {TypeError} If a value is not valid for its type.
 */
export function pack(format, values, clamp=false) {}

/**
 * Unpack values from an array of bytes.
 * This method returns an Array even if only a single value is unpacked.
 * @param {string} format The struct format definition.
 * @param {!Array<number>} buffer The byte buffer to unpack.
 * @param {boolean=} [clamp=false] True to clamp values on overflow.
 * @param {number=} [index=0] The buffer index to read.
 * @return {!Array<number|string|bigint|boolean>} The unpacked values.
 * @throws {Error} On unsupported type on the format string.
 * @throws {Error} On bad input buffer length.
 */
export function unpack(format, buffer, clamp=false, index=0) {}

Writing/reading functions to handle long sequences of the same type:

/**
 * Write an array of values to a byte buffer.
 * @param {string} format The format definition.
 * @param {!Array<*>} values The values to write.
 * @param {!Array<number>} buffer The buffer to write on.
 * @param {boolean=} [clamp=false] True to clamp values on overflow.
 * @param {number=} [index=0] The buffer index to start writing.
 * @return {number} The next buffer index to write on.
 * @throws {Error} On unsupported type on the format string.
 * @throws {Error} If the output buffer is a typed array and size is not valid.
 * @throws {RangeError} On integer overflow if clamp is set to false.
 * @throws {TypeError} If a value is not valid for its type.
 */
export function writeTo(format, values, buffer, clamp=false, index=0) {}

/**
 * Read an array of values from a byte buffer to an array or a typed array.
 * @param {string} format The format definition.
 * @param {!Array<number>} buffer The byte buffer to read.
 * @param {!Array<number|string|bigint|boolean>} output The output array.
 * @param {boolean=} [clamp=false] True to clamp values on overflow.
 * @param {number=} [start=0] The input buffer index to start reading.
 * @param {number=} [end=buffer.length] The input buffer index to stop reading.
 * @throws {Error} On unsupported type on the format string.
 * @throws {Error} On bad input buffer length.
 */
export function readTo(
    format, buffer, output, clamp=false, start=0, end=buffer.length) {}

/**
 * Write an array of values as an array of bytes.
 * @param {string} format The format definition.
 * @param {!Array<*>} values The values to write.
 * @param {boolean=} [clamp=false] True to clamp values on overflow.
 * @return {!Array<number>} The packed values.
 * @throws {Error} On unsupported type on the format string.
 * @throws {RangeError} On overflow if clamp is set to false.
 * @throws {TypeError} If a value is not valid for its type.
 */
export function write(format, values, clamp=false) {}

/**
 * Read an array of values from a byte buffer.
 * @param {string} format The format definition.
 * @param {!Array<number>} buffer The byte buffer.
 * @param {boolean=} [clamp=false] True to clamp values on overflow.
 * @param {number=} [start=0] The buffer index to start reading.
 * @param {number=} [end=buffer.length] The buffer index to stop reading.
 * @return {!Array<number|string|bigint|boolean>}
 * @throws {Error} On unsupported type on the format string.
 * @throws {Error} On bad input buffer length.
 */
export function read(
    format, buffer, clamp=false, start=0, end=buffer.length) {}

Note that in versions 1.x and 2.x the end param for read and readTo is non-inclusive, so it must always be set to the last index to read + 1. For example, to read from array position 0 to position 7, call read('u', buffer, false, 0, 8).
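
A non-inclusive end index is the same convention used by the standard Array.prototype.slice() method, which may help as a mental model. The snippet below uses only built-in JavaScript and does not call binary-tools.

```javascript
// Illustrative only: a non-inclusive end index with Array.prototype.slice().
var buffer = [10, 11, 12, 13, 14, 15, 16, 17, 18, 19];
var start = 0;
var end = 8; // last position to read (7) + 1
var slice = buffer.slice(start, end);
// slice = [10, 11, 12, 13, 14, 15, 16, 17]; position 8 is not included
```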

Tools:

/**
 * Swap the byte ordering in a buffer. The buffer is modified in place.
 * @param {!Array<number>} buffer The bytes.
 * @param {number} offset The byte offset.
 * @param {number=} [start=0] The start index.
 * @param {number=} [end=buffer.length] The end index.
 * @function endianness
 * @memberOf module:binary-tools
 */
export function endianness(buffer, offset, start=0, end=buffer.length) {}

/**
 * Set a type in the types object.
 * @param {string} typeSymbol The type symbol.
 * @param {number} bits The type bits.
 * @param {boolean=} [signed=false] True for signed, false otherwise.
 * @throws {Error} On unsupported type.
 */
export function setType(typeSymbol, bits, signed=false) {}

/**
 * Calculate the buffer size based on a format string.
 * @param {string} format The format string.
 * @return {number}
 * @throws {Error} On unsupported type on the format string.
 */
export function calcSize(format) {}

/**
 * Returns how many bytes are needed to serialize a UTF-8 string.
 * @param {string} str The string to pack.
 * @return {number} The number of bytes needed for the string.
 */
export function utf8BufferSize(str) {}

/**
 * Returns how many bytes are needed to serialize a UTF-16 string.
 * @param {string} str The string.
 * @param {boolean=} [forceBOM=false] If BOM should be enforced or not.
 * If false (default), then it will only count the bytes according to
 * the characters on the string; if true and string have no BOM, then
 * it will count the bytes of the characters plus BOM (size + 2).
 * @return {number} The number of bytes needed for the string.
 */
export function utf16BufferSize(str, forceBOM=false) {}

/**
 * Returns how many bytes are needed to serialize a UTF-32 string.
 * @param {string} str The string.
 * @param {boolean=} [forceBOM=false] If BOM should be enforced or not.
 * If false (default), then it will only count the bytes according to
 * the characters on the string; if true and string have no BOM, then
 * it will count the bytes of the characters plus BOM (size + 4).
 * @return {number} The number of bytes needed for the string.
 */
export function utf32BufferSize(str, forceBOM=false) {}

/**
 * Format a byte array as a string of hex numbers.
 * @param {!Array<number>} bytes The bytes.
 * @return {string}
 * @throws {RangeError} If bytes contains invalid values.
 */
export function bytesToHex(bytes) {}

/**
 * Convert a hex string to a byte array.
 * @param {string} hexStr The hex string.
 * @return {!Array<number>}
 * @throws {RangeError} If string contains chars outside the 0..f range.
 * @throws {Error} If string length is not even.
 */
export function hexToBytes(hexStr) {}
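
To make the error conditions in the two hex signatures above concrete, here is a rough sketch of such a conversion. toHex and fromHex are hypothetical helpers; the lowercase, separator-free output format is an assumption and may differ from what bytesToHex() and hexToBytes() actually produce or accept.

```javascript
// Illustrative sketch only; toHex and fromHex are hypothetical helpers.
function toHex(bytes) {
  var hex = '';
  for (var i = 0; i < bytes.length; i++) {
    var b = bytes[i];
    if (!Number.isInteger(b) || b < 0 || b > 255) {
      throw new RangeError('Invalid byte: ' + b);
    }
    // Always emit two hex digits per byte.
    hex += (b < 16 ? '0' : '') + b.toString(16);
  }
  return hex;
}

function fromHex(hexStr) {
  if (hexStr.length % 2 !== 0) {
    throw new Error('Hex string length must be even');
  }
  if (/[^0-9a-fA-F]/.test(hexStr)) {
    throw new RangeError('Invalid hex character');
  }
  var bytes = [];
  for (var i = 0; i < hexStr.length; i += 2) {
    bytes.push(parseInt(hexStr.slice(i, i + 2), 16));
  }
  return bytes;
}

toHex([0, 15, 255]); // '000fff'
fromHex('000fff');   // [0, 15, 255]
```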

Note that in versions 1.x and 2.x the end param for endianness is non-inclusive, so it must always be set to the last index + 1.
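
For readers unfamiliar with in-place byte swapping, the sketch below shows the general technique with a non-inclusive end index. It assumes that the second argument means "bytes per item", which is an interpretation of the offset parameter and not something this document confirms; swapBytes is a hypothetical helper, not the library's endianness() function.

```javascript
// Illustrative sketch only; swapBytes is a hypothetical helper and
// `bytesPerItem` is an assumed meaning for the second argument.
function swapBytes(bytes, bytesPerItem, start, end) {
  start = start === undefined ? 0 : start;
  end = end === undefined ? bytes.length : end; // non-inclusive
  for (var i = start; i + bytesPerItem <= end; i += bytesPerItem) {
    // Reverse one group of bytes in place.
    for (var a = i, b = i + bytesPerItem - 1; a < b; a++, b--) {
      var tmp = bytes[a];
      bytes[a] = bytes[b];
      bytes[b] = tmp;
    }
  }
  return bytes;
}

swapBytes([1, 2, 3, 4], 2);             // [2, 1, 4, 3]
swapBytes([1, 2, 3, 4], 4);             // [4, 3, 2, 1]
swapBytes([1, 2, 3, 4, 5, 6], 2, 2, 4); // [1, 2, 4, 3, 5, 6]
```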


Building

Information in this section is only relevant if you are developing binary-tools.js yourself or need to expose binary-tools.js methods on your software's API.

Compilation

binary-tools dist files are created with Closure Compiler with compilation level set to ADVANCED, so properties that have not been exported will be renamed (and likely result in errors). During the build process the tests run against both source files and the compiled dist files to ensure all went OK during the compilation.

The externs file is in ./externs/binary-tools.js.

Style guide

binary-tools source code should follow the Google JavaScript Style Guide:
https://google.github.io/styleguide/jsguide.html


Additional files

The TypeScript declarations are in ./index.d.ts

The TypeScript declarations for the ES3 distribution are in ./dist/binary-tools.d.ts

The Closure Compiler externs are in ./externs/binary-tools.js


Reporting security issues

Report security issues to this e-mail: [email protected].


Buying the source code:

The unpaid version of this software is released under the CC-BY-NC-ND-4.0 License. You are free to use it under the terms of the CC-BY-NC-ND-4.0 License. View COPYING for more information.

You can buy a version of this software licensed under more permissive open source licenses at https://rochars.com. The complete source code, tests, documentation and all files related to the project are included for your convenience.


LICENSE

binary-tools: JavaScript binary parser for any browser or environment.
Copyright (c) 2023-2025 Rafael da Silva Rocha

See LICENSE, NOTICE and COPYING for more information.