Computing the signature of an incoming stream of data
Suppose we have the signature of a stream of data \(x_1, \ldots, x_{1000}\). Subsequently some more data arrives, say \(x_{1001}, \ldots, x_{1007}\). It is possible to calculate the signature of the whole stream of data \(x_1, \ldots, x_{1007}\) using only the existing signature, the last old value \(x_{1000}\), and the new data. It is not necessary to compute the signature of the whole path from the beginning!
In code, this problem can be solved like this:
import torch
import signatory
# Generate a path X
# Recall that the order of dimensions is (batch, stream, channel)
X = torch.rand(1, 1000, 5)
# Calculate its signature to depth 3
sig_X = signatory.signature(X, 3)
# Generate some more data for the path
Y = torch.rand(1, 7, 5)
# Calculate the signature of the overall path
final_X = X[:, -1, :]
sig_XY = signatory.signature(Y, 3, basepoint=final_X, initial=sig_X)
# This is equivalent to
XY = torch.cat([X, Y], dim=1)
sig_XY = signatory.signature(XY, 3)
As can be seen, two pieces of information need to be provided: the final value of X along the stream dimension, and the signature of X. X itself is not needed.
The first method (using the initial argument) will be much quicker than the second (simpler) method. The first method efficiently uses just the new information Y, whilst the second method unnecessarily iterates over all of the old information X.
In particular, note that we only needed the last value of X. If memory efficiency is a concern, then by using the first method we can discard the other 999 terms of X without an issue!
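To see why the old signature and the last old value are enough, the following is a minimal NumPy sketch of the same idea at depth 2 only, for a piecewise-linear path. It is not how Signatory is implemented; the helper names extend_signature and signature_depth2 are hypothetical, and the path here has layout (stream, channel) with no batch dimension. Each new point updates the running signature through Chen's identity, so earlier points are never revisited.

```python
import numpy as np


def extend_signature(sig, last_point, new_points):
    """Extend a depth-2 signature (level1, level2) with further points.

    Hypothetical helper for illustration; not part of Signatory.
    """
    level1, level2 = sig
    prev = last_point
    for point in new_points:
        delta = point - prev
        # Chen's identity for appending one linear segment:
        # new level 2 = old level 2 + (old level 1) x delta + (delta x delta) / 2
        level2 = level2 + np.outer(level1, delta) + 0.5 * np.outer(delta, delta)
        level1 = level1 + delta
        prev = point
    return (level1, level2), prev


def signature_depth2(points):
    """Depth-2 signature of a whole path, computed from scratch."""
    channels = points.shape[1]
    empty = (np.zeros(channels), np.zeros((channels, channels)))
    sig, _ = extend_signature(empty, points[0], points[1:])
    return sig


rng = np.random.default_rng(0)
X = rng.random((1000, 5))  # old data, shape (stream, channel)
Y = rng.random((7, 5))     # newly arrived data

sig_X = signature_depth2(X)
final_X = X[-1]            # the only part of X we still need

# Incremental update: touches only the 7 new points
sig_XY_incremental, _ = extend_signature(sig_X, final_X, Y)

# Reference: recompute over all 1007 points
sig_XY_full = signature_depth2(np.concatenate([X, Y]))
```

The incremental and full results agree, but the incremental update loops over 7 points rather than 1007.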
Note
If the signature of Y on its own is also of interest, then it is possible to compute this first, and then combine it with sig_X to compute sig_XY. See Combining signatures.