Signatory

Differentiable computations of the signature and logsignature transforms, on both CPU and GPU.

The Signatory project is hosted on GitHub.

Introduction

This is the documentation for the Signatory package, which provides facilities for calculating the signature and logsignature transforms of streams of data.

If you want to get started on using the signature transform in your code then check out Simple example for a simple demonstration.

If you want to know more about the mathematics of the signature transform and how to use it then see What is the signature transform? for a very brief introduction. Further links to papers discussing the subject in more detail can also be found there.

If you have any comments or queries about signatures or about this package (for example, bug reports or feature requests) then open an issue on GitHub.

Installation

pip install signatory==<SIGNATORY_VERSION>.<TORCH_VERSION> --no-cache-dir --force-reinstall

where <SIGNATORY_VERSION> is the version of Signatory you would like to download (the most recent version is 1.2.7) and <TORCH_VERSION> is the version of PyTorch you are using.

Available for Python 3.7–3.9 on Linux and Windows. Requires PyTorch 1.8.0–1.11.0.

(If you need it, then previous versions of Signatory included support for older versions of Python, PyTorch, and MacOS, see here.)

After installation, just import signatory inside Python.

Take care not to run pip install signatory, as this will likely download the wrong version.

Example:

If you are using PyTorch 1.11.0 and want Signatory 1.2.7, then you should run:

pip install signatory==1.2.7.1.11.0 --no-cache-dir --force-reinstall

Why you need to specify all of this:

Yes, this looks a bit odd. This is needed to work around limitations of PyTorch and pip.

The --no-cache-dir --force-reinstall flags are because pip doesn’t expect to need to care about versions quite as much as this, so it will sometimes erroneously use inappropriate caches if not told otherwise.

If you have any problems with installation then check the FAQ. If that doesn’t help then feel free to open an issue.

Install from source

For most use-cases, the prebuilt binaries available as described above should be sufficient. However, installing from source is also perfectly feasible, and usually not too tricky.

You’ll need to have a C++ compiler installed and known to pip, and furthermore this must be the same compiler that PyTorch uses. (This is msvc on Windows, gcc on Linux, and clang on Macs.) You must have already installed PyTorch. (You don’t have to compile PyTorch itself from source, though!)

Then run either

pip install signatory==<SIGNATORY_VERSION>.<TORCH_VERSION> --no-binary signatory

(where <SIGNATORY_VERSION> and <TORCH_VERSION> are as above.)

or

git clone https://github.com/patrick-kidger/signatory.git
cd signatory
python setup.py install

If you choose the first option then you’ll get just the files necessary to run Signatory.

If you choose the second option then tests, benchmarking code, and code to build the documentation will also be provided.

Note

If on Linux then the commands stated above should probably work.

If on Windows then it is probably first necessary to run a command of the form

"C:/Program Files (x86)/Microsoft Visual Studio/2017/Enterprise/VC/Auxiliary/Build/vcvars64.bat"

(the exact command will depend on your operating system and version of Visual Studio).

If on a Mac then the installation command should instead look like either

MACOSX_DEPLOYMENT_TARGET=10.9 CC=clang CXX=clang++ pip install signatory==<SIGNATORY_VERSION>.<TORCH_VERSION> --no-binary signatory

or

MACOSX_DEPLOYMENT_TARGET=10.9 CC=clang CXX=clang++ python setup.py install

depending on the choice of installation method.

A helpful point of reference for getting this to work might be the official build scripts for Signatory.

Older versions

Older versions of Signatory supported earlier versions of Python and PyTorch. They also included support for MacOS, but this has since been dropped as it was difficult to maintain.

The full list of available combinations can be seen on PyPI.

Library API

For quick reference, this is a list of all provided functions, grouped by the reference page they appear on.

Signatures

signatory.signature

Applies the signature transform to a stream of data.

signatory.Signature

torch.nn.Module wrapper around the signatory.signature() function.

signatory.signature_channels

Computes the number of output channels from a signature call.

signatory.extract_signature_term

Extracts a particular term from a signature.

signatory.signature_combine

Combines two signatures into a single signature.

signatory.multi_signature_combine

Combines multiple signatures into a single signature.

Logsignatures

signatory.logsignature

Applies the logsignature transform to a stream of data.

signatory.LogSignature

torch.nn.Module wrapper around the signatory.logsignature() function.

signatory.logsignature_channels

Computes the number of output channels from a logsignature call with mode in ("words", "brackets").

signatory.signature_to_logsignature

Calculates the logsignature corresponding to a signature.

signatory.SignatureToLogSignature

torch.nn.Module wrapper around the signatory.signature_to_logsignature() function.

Path

signatory.Path

Calculates signatures and logsignatures on intervals of an input path.

Signature inversion

signatory.invert_signature

Invert the signature with the insertion algorithm: reconstruct a stream of data given its signature.

Utilities

signatory.Augment

Augmenting a stream of data before feeding it into a signature is often useful; the hope is to obtain higher-order information in the signature.

signatory.all_words

Computes the collection of all words up to length depth in an alphabet of size channels.

signatory.lyndon_words

Computes the collection of all Lyndon words up to length depth in an alphabet of size channels.

signatory.lyndon_brackets

Computes the collection of all Lyndon words, in their standard bracketing, up to length depth in an alphabet of size channels.

Signatures

At the heart of the package is the signatory.signature() function.

Note

It comes with quite a lot of optional arguments, but most of them won’t need to be used for most use cases. See Simple example for a straightforward look at how to use it.

signatory.signature(path: torch.Tensor, depth: int, stream: bool = False, basepoint: Union[bool, torch.Tensor] = False, inverse: bool = False, initial: Optional[torch.Tensor] = None, scalar_term: bool = False) → torch.Tensor

Applies the signature transform to a stream of data.

The input path is expected to be a three-dimensional tensor, with dimensions \((N, L, C)\), where \(N\) is the batch size, \(L\) is the length of the input sequence, and \(C\) denotes the number of channels. Thus each batch element is interpreted as a stream of data \((x_1, \ldots, x_L)\), where each \(x_i \in \mathbb{R}^C\).

Let \(f = (f_1, \ldots, f_C) \colon [0, 1] \to \mathbb{R}^C\) be the unique continuous piecewise linear path such that \(f(\tfrac{i - 1}{L - 1}) = x_i\). Then the signature transform of depth depth is computed, defined by

\[\mathrm{Sig}(\text{path}) = \left(\left( \,\underset{0 < t_1 < \cdots < t_k < 1}{\int\cdots\int} \prod_{j = 1}^k \frac{\mathrm d f_{i_j}}{\mathrm dt}(t_j) \mathrm dt_1 \cdots \mathrm dt_k \right)_{\!\!1 \leq i_1, \ldots, i_k \leq C}\right)_{\!\!1\leq k \leq \text{depth}}.\]

This gives a tensor of shape

\[(N, C + C^2 + \cdots + C^\text{depth}).\]
Parameters
  • path (torch.Tensor) – The batch of input paths to apply the signature transform to.

  • depth (int) – The depth to truncate the signature at.

  • stream (bool, optional) – Defaults to False. If False then the usual signature transform of the whole path is computed. If True then the signatures of all paths \((x_1, \ldots, x_j)\), for \(j=2, \ldots, L\), are returned. (Or \(j=1, \ldots, L\) if basepoint is passed, see below.)

  • basepoint (bool or torch.Tensor, optional) – Defaults to False. If basepoint is True then an additional point \(x_0 = 0 \in \mathbb{R}^C\) is prepended to the path before the signature transform is applied. (If this is False then the signature transform is invariant to translations of the path, which may or may not be desirable. Setting this to True removes this invariance.) Alternatively it may be a torch.Tensor specifying the value of \(x_0\), in which case it should have shape \((N, C)\).

  • inverse (bool, optional) – Defaults to False. If True then it is in fact the inverse signature that is computed. (Signatures form a group under the operation of the tensor product; the inverse is defined with respect to this operation.) From a machine learning perspective it does not particularly matter whether the signature or the inverse signature is computed - both represent essentially the same information as each other.

  • initial (None or torch.Tensor, optional) – Defaults to None. If it is a torch.Tensor then it must be of size \((N, C + C^2 + \cdots + C^\text{depth})\), corresponding to the signature of another path. Then this signature is pre-tensor-multiplied on to the signature of path. For a more thorough explanation, see this example.

  • scalar_term (bool, optional) – Defaults to False. If True then the first channel of the computed signature will be filled with the constant 1 (in accordance with the usual mathematical definition). If False then this channel is omitted (in accordance with useful machine learning practice).

Returns

A torch.Tensor. Given an input torch.Tensor of shape \((N, L, C)\), and input arguments depth, basepoint, stream, then the return value is, in pseudocode:

if stream:
    if basepoint is True or isinstance(basepoint, torch.Tensor):
        return torch.Tensor of shape (N, L, C + C^2 + ... + C^depth)
    else:
        return torch.Tensor of shape (N, L - 1, C + C^2 + ... + C^depth)
else:
    return torch.Tensor of shape (N, C + C^2 + ... + C^depth)

Note that the number of output channels may be calculated via the convenience function signatory.signature_channels().
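
For instance, a minimal sketch checking these shapes against the pseudocode above:

import torch
import signatory

path = torch.rand(4, 20, 3)  # batch N=4, length L=20, channels C=3

sig = signatory.signature(path, depth=2)
assert sig.shape == (4, 12)  # C + C^2 = 3 + 9 = 12 channels

# stream=True returns the signature of every partial path
sig_stream = signatory.signature(path, depth=2, stream=True)
assert sig_stream.shape == (4, 19, 12)  # length dimension is L - 1 = 19

# ...and with a basepoint the length dimension becomes L = 20
sig_basepointed = signatory.signature(path, depth=2, stream=True, basepoint=True)
assert sig_basepointed.shape == (4, 20, 12)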

class signatory.Signature(depth: int, stream: bool = False, inverse: bool = False, scalar_term: bool = False, **kwargs)

torch.nn.Module wrapper around the signatory.signature() function.

Parameters

As signatory.signature().

forward(path: torch.Tensor, basepoint: Union[bool, torch.Tensor] = False, initial: Optional[torch.Tensor] = None) → torch.Tensor

The forward operation.

Parameters

As signatory.signature().

Returns

As signatory.signature().

signatory.signature_channels(channels: int, depth: int, scalar_term: bool = False) → int

Computes the number of output channels from a signature call. Specifically, it computes

\[\text{channels} + \text{channels}^2 + \cdots + \text{channels}^\text{depth}.\]
Parameters
  • channels (int) – The number of channels in the input; that is, the dimension of the space that the input path resides in.

  • depth (int) – The depth of the signature that is being computed.

  • scalar_term (bool, optional) – Defaults to False. Whether to count the extra channel used for the constant scalar ‘1’ (see the scalar_term argument to signatory.signature()).

Returns

An int specifying the number of channels in the signature of the path.
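
For example, a quick check of this formula:

import signatory

# 5 + 5^2 + 5^3 = 155
assert signatory.signature_channels(channels=5, depth=3) == 155

# One extra channel when the constant scalar term is included
assert signatory.signature_channels(channels=5, depth=3, scalar_term=True) == 156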

signatory.extract_signature_term(sigtensor: torch.Tensor, channels: int, depth: int, scalar_term: bool = False) → torch.Tensor

Extracts a particular term from a signature.

The signature to depth \(d\) of a batch of paths in \(\mathbb{R}^C\) is a tensor with \(C + C^2 + \cdots + C^d\) channels. (See signatory.signature().) This function extracts the term corresponding to the given depth, returning a tensor with just \(C^\text{depth}\) channels.

Parameters
  • sigtensor (torch.Tensor) – The signature to extract the term from. Should be a result from the signatory.signature() function.

  • channels (int) – The number of input channels \(C\).

  • depth (int) – The depth of the term to be extracted from the signature.

  • scalar_term (bool, optional) – Whether the signature was called with scalar_term=True or not.

Returns

The torch.Tensor corresponding to the depth term of the signature.
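
A short sketch of its use:

import torch
import signatory

path = torch.rand(2, 10, 5)
sig = signatory.signature(path, depth=3)  # shape (2, 155)

# The depth-2 term occupies 5^2 = 25 channels
term = signatory.extract_signature_term(sig, channels=5, depth=2)
assert term.shape == (2, 25)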

signatory.signature_combine(sigtensor1: torch.Tensor, sigtensor2: torch.Tensor, input_channels: int, depth: int, inverse: bool = False, scalar_term: bool = False) → torch.Tensor

Combines two signatures into a single signature.

Usage is most clear by example. See Combining signatures.

See also signatory.multi_signature_combine() for a more general version.

Parameters
  • sigtensor1 (torch.Tensor) – The signature of a path, as returned by signatory.signature(). This should be a two-dimensional tensor.

  • sigtensor2 (torch.Tensor) – The signature of a second path, as returned by signatory.signature(), with the same shape as sigtensor1. Note that when the signature of the second path was created, it should have been called with basepoint set to the final value of the path that created sigtensor1. (See Combining signatures.)

  • input_channels (int) – The number of channels in the two paths that were used to compute sigtensor1 and sigtensor2. This must be the same for both sigtensor1 and sigtensor2.

  • depth (int) – The depth that sigtensor1 and sigtensor2 have been calculated to. This must be the same for both sigtensor1 and sigtensor2.

  • inverse (bool, optional) – Defaults to False. Whether sigtensor1 and sigtensor2 were created with inverse=True. This must be the same for both sigtensor1 and sigtensor2.

  • scalar_term (bool, optional) – Defaults to False. Whether sigtensor1 and sigtensor2 were created with scalar_term=True. This must be the same for both sigtensor1 and sigtensor2.

Returns

Let path1 be the path whose signature is sigtensor1. Let path2 be the path whose signature is sigtensor2. Then this function returns the signature of the concatenation of path1 and path2 along their stream dimension.

Danger

There is a subtle bug which can occur when using this function incautiously. Make sure that sigtensor2 is created with an appropriate basepoint, see Combining signatures.

If this is not done then the return value of this function will be essentially meaningless numbers.

signatory.multi_signature_combine(sigtensors: List[torch.Tensor], input_channels: int, depth: int, inverse: bool = False, scalar_term: bool = False) → torch.Tensor

Combines multiple signatures into a single signature.

See also signatory.signature_combine() for a simpler version.

Parameters

As signatory.signature_combine(), except that a single list of signature tensors, sigtensors, is passed in place of sigtensor1 and sigtensor2.

Returns

Let sigtensors be a list of tensors, call them \(\text{sigtensor}_i\) for \(i = 0, 1, \ldots, k\). Let \(\text{path}_i\) be the path whose signature is \(\text{sigtensor}_i\). Then this function returns the signature of the concatenation of all the \(\text{path}_i\) along their stream dimension.

Danger

Make sure that each element of sigtensors is created with an appropriate basepoint, as with signatory.signature_combine().
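
A minimal sketch combining three signatures, using the same basepoint pattern as signatory.signature_combine() (see Combining signatures):

import torch
import signatory

depth, channels = 3, 5
path1 = torch.rand(1, 10, channels)
path2 = torch.rand(1, 10, channels)
path3 = torch.rand(1, 10, channels)

sig1 = signatory.signature(path1, depth)
# Each later signature uses the previous path's endpoint as its basepoint
sig2 = signatory.signature(path2, depth, basepoint=path1[:, -1, :])
sig3 = signatory.signature(path3, depth, basepoint=path2[:, -1, :])

combined = signatory.multi_signature_combine([sig1, sig2, sig3], channels, depth)
# Equal, up to floating point error, to the signature of the concatenation
expected = signatory.signature(torch.cat([path1, path2, path3], dim=1), depth)
assert torch.allclose(combined, expected, rtol=1e-3, atol=1e-3)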

Logsignatures

signatory.logsignature(path: torch.Tensor, depth: int, stream: bool = False, basepoint: Union[bool, torch.Tensor] = False, inverse: bool = False, mode: str = 'words') → torch.Tensor

Applies the logsignature transform to a stream of data.

The mode argument determines how the logsignature is represented.

Note that if performing many logsignature calculations for the same depth and size of input, then you will see a performance boost (at the cost of using a little extra memory) by using signatory.LogSignature instead of signatory.logsignature().

Parameters

path, depth, stream, basepoint and inverse are as signatory.signature(). The mode argument is described below.

Returns

A torch.Tensor, of almost the same shape as the tensor returned from signatory.signature() called with the same arguments.

If mode == "expand" then it will be exactly the same shape as the returned tensor from signatory.signature().

If mode in ("brackets", "words") then the channel dimension will instead be of size signatory.logsignature_channels(path.size(-1), depth). (Where path.size(-1) is the number of input channels.)

The different modes correspond to different mathematical representations of the logsignature.

Tip

If you haven’t studied tensor algebras and free Lie algebras, and none of the following explanation makes sense to you, then you probably want to leave mode on its default value of "words" and it will all be fine!

If mode == "expand" then the logsignature is presented as a member of the tensor algebra; the numbers returned correspond to the coefficients of all words in the tensor algebra.

If mode == "brackets" then the logsignature is presented in terms of the coefficients of the Lyndon basis of the free Lie algebra.

If mode == "words" then the logsignature is presented in terms of the coefficients of a particular computationally efficient basis of the free Lie algebra (that is not a Hall basis). Every basis element is given as a sum of Lyndon brackets. When each bracket is expanded out and the sum computed, the sum will contain precisely one Lyndon word (and some collection of non-Lyndon words). Moreover every Lyndon word is represented uniquely in this way. We identify these basis elements with each corresponding Lyndon word. This is natural as the coefficients in this basis are found just by extracting the coefficients of all Lyndon words from the tensor algebra representation of the logsignature.

In all cases, the ordering corresponds to the ordering on words given by first ordering the words by length, and then ordering each length class lexicographically.
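
A small sketch illustrating how mode affects the output shape:

import torch
import signatory

path = torch.rand(2, 10, 5)
depth = 3

# "expand": same shape as the signature, 5 + 5^2 + 5^3 = 155 channels
expanded = signatory.logsignature(path, depth, mode="expand")
assert expanded.shape == (2, 155)

# "words" (and "brackets"): one channel per basis element
words = signatory.logsignature(path, depth, mode="words")
assert words.shape == (2, signatory.logsignature_channels(5, depth))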

class signatory.LogSignature(depth: int, stream: bool = False, inverse: bool = False, mode: str = 'words', **kwargs)

torch.nn.Module wrapper around the signatory.logsignature() function.

This torch.nn.Module performs certain optimisations to allow it to calculate multiple logsignatures faster than multiple calls to signatory.logsignature().

Specifically, these optimisations will apply if this torch.nn.Module is called with an input path with the same number of channels as the last input path it was called with, as is likely to be very common in machine learning set-ups. For larger depths or numbers of channels, this speedup will be substantial.

Parameters

As signatory.logsignature().

forward(path: torch.Tensor, basepoint: Union[bool, torch.Tensor] = False) → torch.Tensor

The forward operation.

Parameters

As signatory.logsignature().

Returns

As signatory.logsignature().

prepare(in_channels: int) → None

Prepares for computing logsignatures for paths with the specified number of channels. This is done automatically whenever this torch.nn.Module is called, if it hasn’t been done already; this method simply allows for it to be done earlier, for example when benchmarking.

Parameters

in_channels (int) – The number of input channels of the path that this instance will subsequently be called with (corresponding to path.size(-1)).

signatory.logsignature_channels(in_channels: int, depth: int) → int

Computes the number of output channels from a logsignature call with mode in ("words", "brackets").

Parameters
  • in_channels (int) – The number of channels in the input; that is, the dimension of the space that the input path resides in. If calling signatory.logsignature() with argument path then in_channels should be equal to path.size(-1).

  • depth (int) – The depth of the signature that is being computed.

Returns

An int specifying the number of channels in the logsignature of the path.
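
For example:

import signatory

# Lyndon words of length at most 3 in a 5-letter alphabet:
# 5 (length 1) + 10 (length 2) + 40 (length 3) = 55
assert signatory.logsignature_channels(in_channels=5, depth=3) == 55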

signatory.signature_to_logsignature(signature: torch.Tensor, channels: int, depth: int, stream: bool = False, mode: str = 'words', scalar_term: bool = False) → torch.Tensor

Calculates the logsignature corresponding to a signature.

Parameters

signature (torch.Tensor) – The result of a call to signatory.signature(). The channels, depth, stream and scalar_term arguments should take the values that were used in that call; mode is as signatory.logsignature().

Example

import signatory
import torch
batch, stream, channels = 8, 8, 8
depth = 3
path = torch.rand(batch, stream, channels)
signature = signatory.signature(path, depth)
logsignature = signatory.signature_to_logsignature(signature, channels, depth)
Returns

A torch.Tensor representing the logsignature corresponding to the given signature. See signatory.logsignature().

class signatory.SignatureToLogSignature(channels: int, depth: int, stream: bool = False, mode: str = 'words', scalar_term: bool = False, **kwargs)

torch.nn.Module wrapper around the signatory.signature_to_logsignature() function.

Calling this torch.nn.Module on an input signature with the same number of channels as the last signature it was called with will be faster than multiple calls to the signatory.signature_to_logsignature() function, in the same way that signatory.LogSignature will be faster than signatory.logsignature().

Parameters

As signatory.signature_to_logsignature().

forward(signature: torch.Tensor) → torch.Tensor

The forward operation.

Parameters

signature (torch.Tensor) – As signatory.signature_to_logsignature().

Returns

As signatory.signature_to_logsignature().
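
A short sketch of its use, mirroring the signatory.signature_to_logsignature() example above:

import torch
import signatory

batch, length, channels, depth = 8, 8, 8, 3
module = signatory.SignatureToLogSignature(channels, depth, mode="words")

path = torch.rand(batch, length, channels)
signature = signatory.signature(path, depth)
# Repeated calls with the same number of channels reuse the cached
# precomputation, which is where the speedup comes from
logsignature = module(signature)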

Path

class signatory.Path(path: torch.Tensor, depth: int, basepoint: Union[bool, torch.Tensor] = False, remember_path: bool = True, scalar_term: bool = False, **kwargs)

Calculates signatures and logsignatures on intervals of an input path.

By doing some precomputation, it can rapidly calculate the signature or logsignature over any slice of the input path. This is particularly useful if you need the signature or logsignature of a path over many different intervals: using this class will be much faster than computing the signature or logsignature of each sub-path each time.

May be efficiently sliced and indexed along its batch dimension via [] syntax. This will return a new Path without copying the underlying data.

Parameters
  • path (torch.Tensor) – As signatory.signature().

  • depth (int) – As signatory.signature().

  • basepoint (bool or torch.Tensor, optional) – As signatory.signature().

  • remember_path (bool, optional) – Defaults to True. Whether to record the path argument that this was called with. If True, then it will be accessible as the .path attribute. This argument may be set to False if it is known that the underlying path is no longer of interest and should not be kept in memory just because it was passed as an argument here.

  • scalar_term (bool, optional) – Defaults to False. Whether to include the scalar ‘1’ when calling the signatory.Path.signature() method; see also the equivalent argument for signatory.signature().

signature(start: Optional[int] = None, end: Optional[int] = None) → torch.Tensor

Returns the signature on a particular interval.

Parameters
  • start (int or None, optional) – Defaults to the start of the path. The start point of the interval to calculate the signature on.

  • end (int or None, optional) – Defaults to the end of the path. The end point of the interval to calculate the signature on.

Returns

The signature on the interval [start, end].

In the simplest case, when path and depth are the arguments that this class was initialised with (and basepoint was not passed), then this function returns a value equal to signatory.signature(path[:, start:end], depth).

In general, let p = torch.cat(self.path, dim=1), so that it is all given paths (including those paths from both initialisation and signatory.Path.update(), and any basepoint) concatenated together. Then this function will return a value equal to signatory.signature(p[:, start:end], depth).

logsignature(start: Optional[int] = None, end: Optional[int] = None, mode: str = 'words') → torch.Tensor

Returns the logsignature on a particular interval.

Parameters

start and end are as signatory.Path.signature(); mode is as signatory.logsignature().

Returns

The logsignature on the interval [start, end]. See the documentation for signatory.Path.signature().

update(path: torch.Tensor) → None

Concatenates the given path onto the path already stored.

This means that the signature of the new overall path can now be asked for via signatory.Path.signature(). Furthermore this will be dramatically faster than concatenating the two paths together and then creating a new Path object: the ‘concatenation’ occurs implicitly, without actually involving any recomputation or reallocation of memory.

Parameters

path (torch.Tensor) – The path to concatenate on. As signatory.signature().

property remember_path: bool

Whether this Path remembers the path argument it was called with.

property path: List[torch.Tensor]

The path(s) that this Path was created with.

property depth: int

The depth that Path has calculated the signature to.

size(index: Optional[int] = None) → Union[int, torch.Size]

The size of the input path. As torch.Tensor.size().

Parameters

index (int or None, optional) – As torch.Tensor.size().

Returns

As torch.Tensor.size().

property shape: torch.Size

The shape of the input path. As torch.Tensor.shape.

channels() → int

The number of channels of the input stream.

signature_size(index: Optional[int] = None) → Union[int, torch.Size]

The size of the signature of the path. As torch.Tensor.size().

Parameters

index (int or None, optional) – As torch.Tensor.size().

Returns

As torch.Tensor.size().

property signature_shape: torch.Size

The shape of the signature of the path. As torch.Tensor.shape.

signature_channels() → int

The number of signature channels; as signatory.signature_channels().

logsignature_size(index: Optional[int] = None) → Union[int, torch.Size]

The size of the logsignature of the path. As torch.Tensor.size().

Parameters

index (int or None, optional) – As torch.Tensor.size().

Returns

As torch.Tensor.size().

property logsignature_shape: torch.Size

The shape of the logsignature of the path. As torch.Tensor.shape.

logsignature_channels() → int

The number of logsignature channels; as signatory.logsignature_channels().

shuffle()

Randomly permutes the Path along its batch dimension, returning the result as a new Path. Returns a tuple of the new Path object and the random permutation that produced it.

shuffle_()

In place version of signatory.Path.shuffle().

Warning

If repeatedly making forward and backward passes (for example when training a neural network) and you have a learnt layer before the signatory.Path, then make sure to construct a new signatory.Path object for each forward pass.

Reusing the same object between forward passes will mean that signatures aren’t computed using the latest information, as the internal buffers will still correspond to the data passed in when the signatory.Path object was first constructed.

Signature inversion

signatory.invert_signature(signature: torch.Tensor, depth: int, channels: int, initial_position: Optional[torch.Tensor] = None) → torch.Tensor

Invert the signature with the insertion algorithm: reconstruct a stream of data given its signature. As the signature is invariant under translation, the initial position of the stream is not recovered.

The input signature is the signature transform of depth depth of a batch of paths: it should be a result from the signatory.signature() function. The output is a tensor of shape \((N, L, C)\), where \(N\) is the batch size, \(L\) is the length of the reconstructed stream of data, with \(L = \text{depth} + 1\), and \(C\) denotes the number of channels.

Parameters
  • signature (torch.Tensor) – The signature of a batch of paths, as returned by signatory.signature(). This should be a two-dimensional tensor.

  • depth (int) – The depth that signature has been calculated to.

  • channels (int) – The number of channels in the batch of paths that was used to compute signature.

  • initial_position (None or torch.Tensor, optional) – Defaults to None. If it is a torch.Tensor then it must be of size \((N, C)\), corresponding to the initial position of the paths. If None, the reconstructed paths are set to begin at zero.

Returns

The torch.Tensor corresponding to a batch of inverted paths.

Utilities

The following miscellaneous operations are provided as a convenience.


class signatory.Augment(in_channels: int, layer_sizes: typing.Tuple[int, ...], kernel_size: int, stride: int = 1, padding: int = 0, dilation: int = 1, bias: bool = True, activation: typing.Callable[[torch.Tensor], torch.Tensor] = <function relu>, include_original: bool = True, include_time: bool = True, **kwargs)

Augmenting a stream of data before feeding it into a signature is often useful; the hope is to obtain higher-order information in the signature. One way to do this in a data-dependent way is to apply a feedforward neural network to sections of the stream, so as to obtain another stream; the signature is then applied to this stream. That is what this torch.nn.Module does.

Thus this torch.nn.Module is essentially unrelated to signatures, but is provided as it is often useful in the same context. As described in Deep Signature Transforms – Bonnier et al. 2019, it is often advantageous to augment a path before taking the signature.

The input path is expected to be a three-dimensional tensor, with dimensions \((N, L, C)\), where \(N\) is the batch size, \(L\) is the length of the input sequence, and \(C\) denotes the number of channels. Thus each batch element is interpreted as a stream of data \((x_1, \ldots, x_L)\), where each \(x_i \in \mathbb{R}^C\).

Then this stream may be ‘augmented’ via some function

\[\phi \colon \mathbb{R}^{C \times k} \to \mathbb{R}^{\widehat{C}}\]

giving a stream of data

\[\left(\phi(x_1, \ldots, x_k), \ldots, \phi(x_{L - k + 1}, \ldots, x_L)\right),\]

which is essentially a three-dimensional tensor with dimensions \((N, L - k + 1, \widehat{C})\).

Thus this essentially operates as a one dimensional convolution, except that a whole network is swept across the input, rather than just a single convolutional layer.

Both the original stream and time can be specifically included in the augmentation. (This usually tends to give better empirical results.) For example, if both include_original is True and include_time is True, then each \(\phi(x_i, \ldots, x_{k + i - 1})\) is of the form

\[\left(\frac{i}{T}, x_i, \varphi(x_i, \ldots, x_{k + i - 1})\right).\]

where \(T\) is a constant appropriately chosen so that the first entry moves between \(0\) and \(1\) as \(i\) varies. (Specifically, \(T = L - k + 1 + 2 \times \text{padding}\).)

Parameters
  • in_channels (int) – Number of channels \(C\) in the input stream.

  • layer_sizes (tuple of int) – Specifies the sizes of the layers of the feedforward neural network to apply to the stream. The final value of this tuple specifies the number of channels in the augmented stream, corresponding to the value \(\widehat{C}\) in the preceding discussion.

  • kernel_size (int) – Specifies the size of the kernel to slide over the stream, corresponding to the value \(k\) in the preceding discussion.

  • stride (int, optional) –

    Defaults to 1. How far to move along the input stream before re-applying the feedforward neural network. Thus the output stream is given by

    \[(\phi(x_1, \ldots, x_k), \phi(x_{1 + \text{stride}}, \ldots, x_{k + \text{stride}}), \phi(x_{1 + 2 \times \text{stride}}, \ldots, x_{k + 2 \times \text{stride}}), \ldots)\]

  • padding (int, optional) – Defaults to 0. How much zero padding to add to either end of the input stream before sweeping the feedforward neural network along it.

  • dilation (int, optional) – The spacing between input elements given to the feedforward neural network. Defaults to 1. Harder to describe; see the equivalent argument for torch.nn.Conv1d.

  • bias (bool, optional) – Defaults to True. Whether to use biases in the neural network.

  • activation (callable, optional) – Defaults to ReLU. The activation function to use in the feedforward neural network.

  • include_original (bool, optional) – Defaults to True. Whether or not to include the original stream (pre-augmentation) in the augmented stream.

  • include_time (bool, optional) – Defaults to True. Whether or not to also augment the stream with a ‘time’ value. These are values in \([0, 1]\) corresponding to how far along the stream dimension the element is.

Note

Thus the resulting stream of data has shape \((N, L, \text{out_channels})\), where in pseudocode:

out_channels = layer_sizes[-1]
if include_original:
    out_channels += in_channels
if include_time:
    out_channels += 1
forward(x: torch.Tensor) → torch.Tensor

The forward operation.

Parameters

x (torch.Tensor) – The path to augment.

Returns

The augmented path.
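
A small sketch of its use; the expected output shape follows from the discussion and pseudocode above (assuming the default stride of 1 and no padding):

import torch
import signatory

augment = signatory.Augment(in_channels=3,
                            layer_sizes=(8, 8, 2),
                            kernel_size=4,
                            include_original=True,
                            include_time=True)
x = torch.rand(2, 10, 3)  # (batch, stream, channel)
y = augment(x)
# out_channels = 2 (layer_sizes[-1]) + 3 (original) + 1 (time) = 6;
# with no padding and stride 1 the stream length is 10 - 4 + 1 = 7
print(y.shape)  # expected: torch.Size([2, 7, 6])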


signatory.all_words(channels: int, depth: int) → List[List[int]]

Computes the collection of all words up to length depth in an alphabet of size channels. Each letter is represented by an integer \(i\) in the range \(0 \leq i < \text{channels}\).

Signatures may be thought of as a sum of coefficients of words. This gives the words in the order that they correspond to the values returned by signatory.signature().

Logsignatures may be thought of as a sum of coefficients of words. This gives the words in the order that they correspond to the values returned by signatory.logsignature() with mode="expand".

Parameters
  • channels (int) – The size of the alphabet.

  • depth (int) – The maximum word length.

Returns

A list of lists of integers. Each sub-list corresponds to one word. The words are ordered by length, and then ordered lexicographically within each length class.


signatory.lyndon_words(channels: int, depth: int) → List[List[int]]

Computes the collection of all Lyndon words up to length depth in an alphabet of size channels. Each letter is represented by an integer \(i\) in the range \(0 \leq i < \text{channels}\).

Logsignatures may be thought of as a sum of coefficients of Lyndon words. This gives the words in the order that they correspond to the values returned by signatory.logsignature() with mode="words".

Parameters
  • channels (int) – The size of the alphabet.

  • depth (int) – The maximum word length.

Returns

A list of lists of integers. Each sub-list corresponds to one Lyndon word. The words are ordered by length, and then ordered lexicographically within each length class.

signatory.lyndon_brackets(channels: int, depth: int) → List[Union[int, List]]

Computes the collection of all Lyndon words, in their standard bracketing, up to length depth in an alphabet of size channels. Each letter is represented by an integer \(i\) in the range \(0 \leq i < \text{channels}\).

Logsignatures may be thought of as a sum of coefficients of Lyndon brackets. This gives the brackets in the order that they correspond to the values returned by signatory.logsignature() with mode="brackets".

Parameters
  • channels (int) – The size of the alphabet.

  • depth (int) – The maximum word length.

Returns

A list. Each element corresponds to a single Lyndon word with its standard bracketing. The words are ordered by length, and then ordered lexicographically within each length class.
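
For example (the outputs in the comments are for illustration; in particular the exact nested-list representation of the brackets is our reading of the standard bracketing):

import signatory

signatory.all_words(channels=2, depth=2)
# [[0], [1], [0, 0], [0, 1], [1, 0], [1, 1]]

signatory.lyndon_words(channels=2, depth=3)
# [[0], [1], [0, 1], [0, 0, 1], [0, 1, 1]]

signatory.lyndon_brackets(channels=2, depth=3)
# [0, 1, [0, 1], [0, [0, 1]], [[0, 1], 1]]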

Examples

Simple example

Here’s a very simple example of using signatory.signature().

import torch
import signatory
# Create a tensor of shape (2, 10, 5)
# Recall that the order of dimensions is (batch, stream, channel)
path = torch.rand(2, 10, 5)
# Take the signature to depth 3
sig = signatory.signature(path, 3)
# sig is of shape (2, 155)

In this example, path is a three dimensional tensor, and the returned tensor is two dimensional. The first dimension of path corresponds to the batch dimension, and indeed we can see that this dimension is also in the shape of sig.

The second dimension of path corresponds to the ‘stream’ dimension, whilst the third dimension corresponds to channels. Mathematically speaking, that means that each batch element of path is interpreted as a sequence of points \(x_1, \ldots, x_{10}\), with each \(x_i \in \mathbb{R}^5\).

The output sig has batch dimension of size 2, just like the input. Its other dimension is of size 155. This is the number of terms in the depth-3 signature of a path with 5 channels. (This can also be computed with the helper function signatory.signature_channels().)

Computing the signature of an incoming stream of data

Suppose we have the signature of a stream of data \(x_1, \ldots, x_{1000}\). Subsequently some more data arrives, say \(x_{1001}, \ldots, x_{1007}\). It is possible to calculate the signature of the whole stream of data \(x_1, \ldots, x_{1007}\) with just this information. It is not necessary to compute the signature of the whole path from the beginning!

In code, this problem can be solved like this:

import torch
import signatory

# Generate a path X
# Recall that the order of dimensions is (batch, stream, channel)
X = torch.rand(1, 1000, 5)
# Calculate its signature to depth 3
sig_X = signatory.signature(X, 3)

# Generate some more data for the path
Y = torch.rand(1, 7, 5)
# Calculate the signature of the overall path
final_X = X[:, -1, :]
sig_XY = signatory.signature(Y, 3, basepoint=final_X, initial=sig_X)

# This is equivalent to
XY = torch.cat([X, Y], dim=1)
sig_XY = signatory.signature(XY, 3)

As can be seen, two pieces of information need to be provided: the final value of X along the stream dimension, and the signature of X. But not X itself.

The first method (using the initial argument) will be much quicker than the second (simpler) method. The first method efficiently uses just the new information Y, whilst the second method unnecessarily iterates over all of the old information X.

In particular note that we only needed the last value of X. If memory efficiency is a concern, then by using the first method we can discard the other 999 terms of X without an issue!

Note

If the signature of Y on its own was also of interest, then it is possible to compute this first, and then combine it with sig_X to compute sig_XY. See Combining signatures.

Combining signatures

Suppose we have two paths, and want to combine their signatures. That is, we know the signatures of the two paths, and would like to know the signature of the two paths concatenated together. This can be done with the signatory.signature_combine() function.

import torch
import signatory

depth = 3
input_channels = 5
path1 = torch.rand(1, 10, input_channels)
path2 = torch.rand(1, 5, input_channels)
sig_path1 = signatory.signature(path1, depth)
sig_path2 = signatory.signature(path2, depth,
                                basepoint=path1[:, -1, :])

### OPTION 1: efficient, using signature_combine
sig_combined = signatory.signature_combine(sig_path1, sig_path2,
                                           input_channels, depth)

### OPTION 2: inefficient, without using signature_combine
path_combined = torch.cat([path1, path2], dim=1)
sig_combined = signatory.signature(path_combined, depth)

### Both options will produce the same value for sig_combined

Danger

Note in particular that the end of path1 is used as the basepoint when calculating sig_path2 in Option 1. It is important that path2 starts from the same place that path1 finishes. Otherwise there will be a jump between the end of path1 and the start of path2 which the signature will not see.

If it is known that path1[:, -1, :] == path2[:, 0, :], so that in fact path1 does finish where path2 starts, then only in this case can the use of basepoint safely be skipped. (And if basepoint is set to this value then it will not change the result.)

With Option 2 it is clearest what is being computed. However this is also going to be slower: the signature of path1 is already known, but Option 2 does not use this information at all, and instead performs a lot of unnecessary computation. Furthermore its calculation requires holding all of path1 in memory, instead of just path1[:, -1, :].

Note how with Option 1, once sig_path1 has been computed, then the only thing that must now be held in memory is sig_path1 and path1[:, -1, :]. This means that the amount of memory required is independent of the length of path1. Thus if path1 is very long, or can grow to arbitrary length as time goes by, then the use of this option (over Option 2) is crucial.

Tip

Combining signatures in this way is the most sensible way to do things if the signature of path2 is actually desirable information on its own.

However if only the signature of the combined path is of interest, then this can be computed even more efficiently by

sig_path1 = signatory.signature(path1, depth)
sig_combined = signatory.signature(path2, depth,
                                   basepoint=path1[:, -1, :],
                                   initial=sig_path1)

For further examples of this nature, see Computing the signature of an incoming stream of data.

Computing signatures over multiple intervals of the same path efficiently

The basic signatory.signature() function computes the signature of a whole stream of data. Sometimes we have a whole stream of data, and then want to compute the signature of just the data sitting in some subinterval.

Naively, we could just slice it:

import torch
import signatory
# WARNING! THIS IS SLOW AND INEFFICIENT CODE
path = torch.rand(1, 1000, 5)
sig1 = signatory.signature(path[:, :40, :], 3)
sig2 = signatory.signature(path[:, 300:600, :], 3)
sig3 = signatory.signature(path[:, 400:990, :], 3)
sig4 = signatory.signature(path[:, 700:, :], 3)
sig5 = signatory.signature(path, 3)

However in this scenario it is possible to be much more efficient by doing some precomputation, which can then allow for computing such signatures very rapidly. This is done by the signatory.Path class.

import torch
import signatory

path = torch.rand(1, 1000, 5)
path_class = signatory.Path(path, 3)
sig1 = path_class.signature(0, 40)
sig2 = path_class.signature(300, 600)
sig3 = path_class.signature(400, 990)
sig4 = path_class.signature(700, None)
sig5 = path_class.signature()

In fact, the signatory.Path class supports adding data to it as well:

import torch
import signatory

path1 = torch.rand(1, 1000, 5)
path_class = signatory.Path(path1, 3)
# path_class is considering a path of length 1000
# calculate signatures as normal
sig1 = path_class.signature(40, None)
sig2 = path_class.signature(500, 600)
# more data arrives
path2 = torch.rand(1, 200, 5)
path_class.update(path2)
# path_class is now considering a path of length 1200
sig3 = path_class.signature(900, 1150)

Note

To be able to compute signatures over intervals like this, then of course signatory.Path must hold information about the whole stream of data in memory.

If only the signature of the whole path is of interest then the main signatory.signature() function will work fine.

If the signature of a path for which data continues to arrive (analogous to the use of signatory.Path.update() above) is of interest, then see Computing the signature of an incoming stream of data, which demonstrates how to efficiently use the signatory.signature() function in this way.

If the signatures on adjacent disjoint intervals are required, along with the signature on the union of those intervals, then see Combining signatures for how to compute the signature on each of these intervals, and how to efficiently combine them to find the signature on larger intervals. This avoids the overhead of the signatory.Path class.

Translation and sampling (reparameterisation) invariance of signatures

One of the big attractions of the signature transform is that it may optionally be invariant to two particular types of noise.

Translation invariance

The signature is translation invariant. That is, given some stream of data \(x_1, \ldots, x_n\) with \(x_i \in \mathbb{R}^c\), and some \(y \in \mathbb{R}^c\), then the signature of \(x_1, \ldots, x_n\) is equal to the signature of \(x_1 + y, \ldots, x_n + y\).

Sometimes this is desirable, sometimes it isn’t. If it isn’t desirable, then the simplest solution is to add a ‘basepoint’. That is, add a point \(0 \in \mathbb{R}^c\) to the start of the path. This will allow us to notice any translations, as the signature of \(0, x_1, \ldots, x_n\) and the signature of \(0, x_1 + y, \ldots, x_n + y\) will be different.

In code, this can be accomplished very easily by using the basepoint argument. Simply set it to True to add such a basepoint to the path before taking the signature:

import torch
import signatory
path = torch.rand(2, 10, 5)
sig = signatory.signature(path, 3, basepoint=True)
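
Both behaviours are easy to check directly:

import torch
import signatory

path = torch.rand(2, 10, 5)
translation = torch.rand(1, 1, 5)

# Without a basepoint, the signature cannot see the translation...
sig1 = signatory.signature(path, 3)
sig2 = signatory.signature(path + translation, 3)
assert torch.allclose(sig1, sig2, atol=1e-4)

# ...whilst with a basepoint, it can.
sig3 = signatory.signature(path, 3, basepoint=True)
sig4 = signatory.signature(path + translation, 3, basepoint=True)
assert not torch.allclose(sig3, sig4, atol=1e-4)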

Sampling (reparameterisation) invariance

The signature is sampling invariant. This has a precise mathematical description in terms of reparameterisation, but the intuition is that it doesn’t matter how many times you measure the underlying path; the signature transform may be applied regardless of how long the stream of data is, or how finely it is sampled. Increasing the number of samples does not require changing anything in the mathematics or in the code. It will simply increase how well the signature of the stream of the data approximates the signature of the underlying path.

Tip

This makes the signature transform an attractive tool when dealing with missing or irregularly-sampled data.

Let’s give an explicit example.

Suppose the underlying path looks like this:

[Figure 1: the underlying path]

And that we observe this at particular points (the underlying path is shown as well for clarity):

[Figure 2: the underlying path, observed at six points \(x_1, \ldots, x_6\)]

Alternatively, perhaps we observed this at some other set of points:

[Figure 3: the underlying path, observed at a different set of ten points \(y_1, \ldots, y_{10}\)]

Then the signature transform of \(x_1, \ldots, x_{6}\) and \(y_1, \ldots, y_{10}\) will be approximately the same, despite the fact that the two sequences are of different lengths, and sampled at different points.
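
This is easy to check numerically. The sketch below samples a half circle (an arbitrary stand-in for the underlying path) at two different resolutions:

import math
import torch
import signatory

def sample_half_circle(num_points):
    t = torch.linspace(0, 1, num_points)
    return torch.stack([torch.cos(math.pi * t),
                        torch.sin(math.pi * t)], dim=1).unsqueeze(0)

coarse = sample_half_circle(10)   # a stream of length 10
fine = sample_half_circle(100)    # a stream of length 100

sig_coarse = signatory.signature(coarse, 2)
sig_fine = signatory.signature(fine, 2)
# Approximately equal, despite the different stream lengths
print((sig_coarse - sig_fine).abs().max())  # small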

Important

The reason for this is that the index of an element in a sequence is not information that is used by the signature transform.

What this means is that if time (and things that depend on the passing of time, such as speed) is something which you expect your machine learning model to depend upon, then you must explicitly specify this in your stream of data. This is a great advantage of the signature transform: you can use your understanding of the problem at hand to decide whether or not time should be included. Contrast a recurrent neural network, where the passing of time is often implicitly specified by the index of an element in a sequence.

For example, if you want to do handwriting recognition, then you probably don’t care how fast someone wrote something: only the shape of what they wrote.

Using signatures in neural networks

In principle a simple augment-signature-linear model is enough to achieve universal approximation:

import signatory
import torch
from torch import nn


class SigNet(nn.Module):
    def __init__(self, in_channels, out_dimension, sig_depth):
        super(SigNet, self).__init__()
        self.augment = signatory.Augment(in_channels=in_channels,
                                         layer_sizes=(),
                                         kernel_size=1,
                                         include_original=True,
                                         include_time=True)
        self.signature = signatory.Signature(depth=sig_depth)
        # +1 because signatory.Augment is used to add time as well
        sig_channels = signatory.signature_channels(channels=in_channels + 1,
                                                    depth=sig_depth)
        self.linear = torch.nn.Linear(sig_channels,
                                      out_dimension)

    def forward(self, inp):
        # inp is a three dimensional tensor of shape (batch, stream, in_channels)
        x = self.augment(inp)
        if x.size(1) <= 1:
            raise RuntimeError("Given an input with too short a stream to take the"
                               " signature")
        # x is a three dimensional tensor of shape (batch, stream, in_channels + 1),
        # as time has been added as a value
        y = self.signature(x, basepoint=True)
        # y is a two dimensional tensor of shape (batch, terms), corresponding to
        # the terms of the signature
        z = self.linear(y)
        # z is a two dimensional tensor of shape (batch, out_dimension)
        return z

Whilst in principle this exhibits universal approximation, adding some learnt transformation before the signature transform tends to improve things. See Deep Signature Transforms – Bonnier et al. 2019. Thus we might improve our model:

import signatory
import torch
from torch import nn


class SigNet2(nn.Module):
    def __init__(self, in_channels, out_dimension, sig_depth):
        super(SigNet2, self).__init__()
        self.augment = signatory.Augment(in_channels=in_channels,
                                         layer_sizes=(8, 8, 2),
                                         kernel_size=4,
                                         include_original=True,
                                         include_time=True)
        self.signature = signatory.Signature(depth=sig_depth)
        # +3 because signatory.Augment is used to add time, and 2 other channels,
        # as well
        sig_channels = signatory.signature_channels(channels=in_channels + 3,
                                                    depth=sig_depth)
        self.linear = torch.nn.Linear(sig_channels,
                                      out_dimension)

    def forward(self, inp):
        # inp is a three dimensional tensor of shape (batch, stream, in_channels)
        x = self.augment(inp)
        if x.size(1) <= 1:
            raise RuntimeError("Given an input with too short a stream to take the"
                               " signature")
        # x is a three dimensional tensor of shape (batch, stream, in_channels + 3)
        y = self.signature(x, basepoint=True)
        # y is a two dimensional tensor of shape (batch, sig_channels),
        # corresponding to the terms of the signature
        z = self.linear(y)
        # z is a two dimensional tensor of shape (batch, out_dimension)
        return z

The signatory.Signature layer can be used multiple times in a neural network. In this next example the first signatory.Signature layer is created with stream=True, so that the stream dimension is preserved. This means that the signatures of all intermediate streams are returned as well. As we still have a stream dimension, it is reasonable to take the signature again.

import signatory
import torch
from torch import nn


class SigNet3(nn.Module):
    def __init__(self, in_channels, out_dimension, sig_depth):
        super(SigNet3, self).__init__()
        self.augment1 = signatory.Augment(in_channels=in_channels,
                                          layer_sizes=(8, 8, 4),
                                          kernel_size=4,
                                          include_original=True,
                                          include_time=True)
        self.signature1 = signatory.Signature(depth=sig_depth,
                                              stream=True)

        # +5 because self.augment1 is used to add time, and 4 other
        # channels, as well
        sig_channels1 = signatory.signature_channels(channels=in_channels + 5,
                                                     depth=sig_depth)
        self.augment2 = signatory.Augment(in_channels=sig_channels1,
                                          layer_sizes=(8, 8, 4),
                                          kernel_size=4,
                                          include_original=False,
                                          include_time=False)
        self.signature2 = signatory.Signature(depth=sig_depth,
                                              stream=False)

        # 4 because that's the final layer size in self.augment2
        sig_channels2 = signatory.signature_channels(channels=4,
                                                     depth=sig_depth)
        self.linear = torch.nn.Linear(sig_channels2, out_dimension)

    def forward(self, inp):
        # inp is a three dimensional tensor of shape (batch, stream, in_channels)
        a = self.augment1(inp)
        if a.size(1) <= 1:
            raise RuntimeError("Given an input with too short a stream to take the"
                               " signature")
        # a is a three dimensional tensor of shape (batch, stream, in_channels + 5)
        b = self.signature1(a, basepoint=True)
        # b is a three dimensional tensor of shape (batch, stream, sig_channels1)
        c = self.augment2(b)
        if c.size(1) <= 1:
            raise RuntimeError("Given an input with too short a stream to take the"
                               " signature")
        # c is a three dimensional tensor of shape (batch, stream, 4)
        d = self.signature2(c, basepoint=True)
        # d is a two dimensional tensor of shape (batch, sig_channels2)
        e = self.linear(d)
        # e is a two dimensional tensor of shape (batch, out_dimension)
        return e

Inversion of signatures

We show below a simple example of signature inversion. A crucial parameter is the depth chosen for the signature: the function signatory.invert_signature() reconstructs a piecewise linear path with depth + 1 points that approximates the original path the signature was computed on.

import math
import torch
import signatory

# Create a path consisting of a half circle
time = torch.linspace(0, 1, 10)
path = torch.stack([torch.cos(math.pi * time), torch.sin(math.pi * time)]).T.unsqueeze(0)

# Compute the signature
depth = 11
signature = signatory.signature(path, depth)

# Reconstruct the path by inverting the signature
reconstructed_path = signatory.invert_signature(signature, depth, path.shape[2], initial_position=path[:, 0, :])

Note that as the signature is translation invariant, we have passed the first position of the path as the initial_position argument to signatory.invert_signature(). Otherwise, reconstructed_path would begin at zero.

We show below the original path (blue curve) and its reconstruction (orange curve).

[Figure: the original half-circle path (blue) and its reconstruction (orange)]

Citation

If you found this library useful in your research, please consider citing the paper.

@inproceedings{kidger2021signatory,
  title={{S}ignatory: differentiable computations of the signature and logsignature transforms, on both {CPU} and {GPU}},
  author={Kidger, Patrick and Lyons, Terry},
  booktitle={International Conference on Learning Representations},
  year={2021},
  note={\url{https://github.com/patrick-kidger/signatory}}
}

FAQ and Known Issues

If you have a question and don’t find an answer here then do please open an issue.

Problems with importing or installing Signatory

  • I get an ImportError: DLL load failed: The specified procedure could not be found. when I try to import Signatory.

This appears to be caused by using old versions of Python, e.g. 3.6.6 instead of 3.6.9. Upgrading your version of Python seems to resolve the issue.

  • I get an Import Error: ... Symbol not found: ... when I try to import Signatory.

This occurs when the version of Python or PyTorch you have installed is different to the version of Python or PyTorch that your copy of Signatory is compiled for. Make sure that you have specified the correct version of PyTorch when downloading Signatory; see the installation instructions, and that you include the extra --no-cache-dir --force-reinstall flags as described there.

Everything else

  • What’s the difference between Signatory and iisignature?

The essential difference (and the reason for Signatory’s existence) is that iisignature is limited to the CPU, whilst Signatory is for both CPU and GPU. Signatory is also typically faster even on the CPU, thanks to parallelisation and algorithmic improvements. Other than that, iisignature is NumPy-based, whilst Signatory uses PyTorch. There are also a few differences in the provided functionality; each package provides some operations that the other doesn’t.

  • Exceptions messages aren’t very helpful on a Mac.

This isn’t an issue directly to do with Signatory. We use pybind11 to translate C++ exceptions to Python exceptions, and some part of this process breaks down when on a Mac. If you’re trying to debug your code then the best (somewhat unhelpful) advice is to try running the problematic code on either Windows or Linux to check what the error message is.

Advice on using signatures

What is the signature transform?

The signature transform is roughly analogous to the Fourier transform, in that it operates on a stream of data (often a time series). Whilst the Fourier transform extracts information about frequency, the signature transform extracts information about order and area. Furthermore (and unlike the Fourier transform), order and area represent all possible nonlinear effects: the signature transform is a universal nonlinearity, meaning that every continuous function of the input stream may be approximated arbitrarily well by a linear function of its signature. If you’re doing machine learning then you probably understand why this is such a desirable property!

Besides this, the signature transform has many other nice properties – robustness to missing or irregularly sampled data; optional translation invariance; optional sampling invariance. Furthermore it can be used to encode certain physical quantities, and may be used for data compression.

The definition of the signature transform can be a little bit intimidating:

Definition

Let \(\mathbf x = (x_1, \ldots, x_n)\), where \(x_i \in \mathbb R^d\). Linearly interpolate \(\mathbf x\) into a path \(f = (f^1, \ldots, f^d) \colon [0, 1] \to \mathbb R^d\). The signature transform to depth \(N\) of \(\mathbf x\) is defined as

\[\mathrm{Sig}(\mathbf x) = \left(\left( \,\underset{0 < t_1 < \cdots < t_k < 1}{\int\cdots\int} \prod_{j = 1}^k \frac{\mathrm d f^{i_j}}{\mathrm dt}(t_j) \mathrm dt_1 \cdots \mathrm dt_k \right)_{1 \leq i_1, \ldots, i_k \leq d}\right)_{1 \leq k \leq N}.\]

Really understanding the mathematics behind the signature transform is frankly pretty hard, but you probably don’t need to understand how it works – just how to use it.
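
One concrete consequence of the definition that is easy to verify in code: the depth-1 term of the signature is just the total increment of the stream.

import torch
import signatory

path = torch.rand(1, 100, 3)
sig = signatory.signature(path, depth=2)

# The depth-1 term is the total increment x_n - x_1
depth_one = signatory.extract_signature_term(sig, channels=3, depth=1)
increment = path[:, -1, :] - path[:, 0, :]
assert torch.allclose(depth_one, increment, atol=1e-5)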

Check out this for a primer on the use of the signature transform in machine learning, just as a feature transformation, and this for a more in-depth look at integrating the signature transform into neural networks.

Furthermore, efficient ways of computing it are somewhat nontrivial – but they do exist. Now if only someone had already written a package to compute it for you…

Note

Recall that the signature transform extracts information about both order and area. This is because order and area are actually (in some sense) the same concept. For a (very simplistic) example of this: consider the functions \(f(x) = x(1-x)\) and \(g(x) = x(x-1)\) for \(x \in [0, 1]\). Then the area of \(f\) is \(\int_0^1 f(x) \mathrm{d} x = \tfrac{1}{6}\) whilst the area of \(g\) is \(\int_0^1 g(x) \mathrm{d} x = \tfrac{-1}{6}\). Meanwhile, the graph of \(f\) goes up then down, whilst the graph of \(g\) goes down then up: the order of the ups and downs corresponds to the area.

Neural networks

The universal nonlinearity property (mentioned here) requires the whole, infinite, signature. This doesn’t fit in your computer’s memory. The solution is simple: truncate the signature to some finite collection of statistics, and then embed it within a nonlinear model, like a neural network. The signature transform now instead acts as a pooling function, doing a provably good job of extracting information.

Have a look at this for a more in-depth look at integrating it into neural networks.

As a general recommendation:

  • The number of terms in signatures can grow rapidly with depth and number of channels, so experiment with what is an acceptable amount of work.

  • Place small stream-preserving neural networks before the signature transform; these typically greatly enhance the power of the signature transform. This can be done easily with the signatory.Augment class.

  • It’s often worth augmenting the input stream with an extra ‘time’ dimension. This can be done easily with the signatory.Augment class. (Have a look at Appendix A of this for an understanding of what augmenting with time gives you, and when you may or may not want to do it.)

Kernels and Gaussian Processes

The signature may be used to define a universal kernel for sequentially ordered data.

See here for using signatures with kernels, and here for using signatures with Gaussian Processes.

Signatures vs. Logsignatures

Signatures can get quite large. This is in fact the whole point of them! They provide a way to linearise all possible functions of their input. In contrast logsignatures tend to be reasonably modestly sized.

If you know that you want to try and capture particularly high order interactions between your input channels then you may prefer to use logsignatures over signatures, as this will capture the same information, but in a more information-dense way. This comes with a price though, as the logsignature is somewhat slower to compute than the signature.

Note that as the logsignature is computed by going via the signature, it is not more memory-efficient to compute the logsignature than the signature.
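
The difference in size is easy to quantify:

import signatory

channels, depth = 5, 5
print(signatory.signature_channels(channels, depth))     # 3905
print(signatory.logsignature_channels(channels, depth))  # 829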

Source Code

The Signatory project is hosted on GitHub.

Acknowledgements

The Python bindings for the C++ code were written with the aid of pybind11.

For NumPy-based CPU-only signature calculations, you may also be interested in the iisignature package. The notes accompanying the iisignature project greatly helped with the implementation of Signatory.