Replies: 14 comments
-
@robertsd Thank you for your feedback and for taking the time to submit this feature request! Please accept this response as an invitation for a longer-form discussion, as I'm just thinking out loud here and may not fully understand your request! It's been a while since I implemented "our flavor" of mSTAMP (see When I originally read the Matrix Profile VI paper (accompanying code by the original author is here), I did not pay much attention to the "constrained search" section because there weren't any explicit algorithms or code beyond mSTAMP (which we've essentially implemented in After quickly re-skimming the paper, it appears that the only detail provided regarding "inclusion" is:
With Algorithm 10 being (copy/pasted from the paper):
It sounds like we'd need to modify the step directly following the Do you happen to already know what one would need to add? I'd be happy to add it if you want to talk me through the process (and even better if you wanted to submit a pull request but absolutely no pressure to)?
-
I have some initial thoughts and questions:
-
Yes, I have been using stumpy.mstump for analysis of this data already, after reading the mentioned paper. I just happen to have a case where it is important to constrain the search for motifs to include an important dimension. While I am not super familiar with the algorithm's guts, I can perhaps review it to determine how to implement this; mostly I wanted to find out if it might already be in the plans.
-
Have you already tried Additionally, would you be comfortable with installing stumpy from source? It shouldn’t take much effort to add this feature but we probably won’t have an official release (on PyPI or Conda) until about a month from now as there is some work that we are focused on completing before the next release.
-
I understand. I intended to start with a small number of dimensions, and I also have a GPU available. I am OK with the official release being a month out! This would be fantastic!
-
Awesome! I'm sure you are already aware but, to be on the safe side, there is no GPU support (yet) for
-
Understood.
-
I did a little digging/asking around and I think the simplest/cleanest solution would be something like:
@robertsd Do you think this is sufficient?
-
I don't think I could have done it better myself!
…On Thu, May 21, 2020 at 8:11 PM Sean M. Law ***@***.***> wrote:
While I am not super familiar with the algorithm guts I can perhaps review
to determine how to implement it, mostly I wanted to find out if it might
already be in the plans
I did a little digging/asking around and I think the simplest/cleanest
solution would be something like:
if include is not None:
    # Swap the rows in `include` with the first `len(include)` rows
    tmp_swap = np.empty((len(include), n - m + 1))
    tmp_swap[:] = D[:len(include)]
    D[:len(include)] = D[include]
    D[include] = tmp_swap
    # Only sort the rows beyond the first `len(include)` rows
    D[len(include):].sort(axis=0)
else:
    # Column-wise ascending sort of the whole of D
    D.sort(axis=0)
@robertsd <https://github.com/robertsd> Do you think this is sufficient?
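For anyone following along, here is a minimal, self-contained sketch of the swap-then-sort idea quoted above, run on a toy array. The helper name `sort_with_include` and the toy values are made up for illustration, and it assumes every index in `include` is unique and lies at or beyond row `len(include)`, which is the only case a plain swap handles safely.

```python
import numpy as np


def sort_with_include(D, include=None):
    """Column-wise ascending sort of D that keeps the `include` rows on top.

    Hypothetical helper for illustration only. Assumes the indices in
    `include` are unique and all >= len(include).
    """
    D = D.copy()  # leave the caller's array untouched
    if include is None:
        D.sort(axis=0)  # unconstrained case: sort every column
        return D

    include = np.asarray(include)
    k = len(include)
    # Swap the rows in `include` with the first k rows
    tmp_swap = D[:k].copy()
    D[:k] = D[include]
    D[include] = tmp_swap
    # Sort only the rows below the included ones (in place, through a view)
    D[k:].sort(axis=0)
    return D


# Toy distance array: 4 dimensions (rows) by 2 subsequences (columns)
D = np.array([[5.0, 1.0],
              [3.0, 4.0],
              [9.0, 0.0],
              [2.0, 7.0]])

out = sort_with_include(D, include=[2])
print(out)
```

Row 2 ends up pinned as the first row, and only the remaining three rows are sorted column by column, so the included dimension always contributes no matter how large its distances are.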
—
You are receiving this because you were mentioned.
Reply to this email directly, view it on GitHub
<https://github.com/TDAmeritrade/stumpy/issues/180#issuecomment-632423976>,
or unsubscribe
<https://github.com/notifications/unsubscribe-auth/AAEMDFAFQA2GZXWMALLGYJDRSXGM3ANCNFSM4NGBZQVA>
.
--
“Do you pine for the days when men were men and wrote their own device
drivers?” - Linus Torvalds
-
@robertsd I didn't see any implementation details regarding "Guided Search" beyond the description:
So, I'm guessing that this is simply accomplished by computing the full M-dimensional matrix profile and then the user explicitly chooses
As for "Unconstrained Search", this also seems to involve computing the full M-dimensional matrix profile and then applying some elbow metric to choose
So, realistically, the only type of constrained search that needs to be modified at run-time is "inclusion" search.
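To make the distinction concrete, here is a rough sketch of how both could be done after the fact on an already computed multi-dimensional matrix profile. The array `P` is toy data standing in for such a profile (row `k - 1` holds the `k`-dimensional profile), the helper names are hypothetical, and the second-difference elbow heuristic is purely an assumption, since nothing in this thread says which elbow metric the paper uses.

```python
import numpy as np

# Toy stand-in for a multi-dimensional matrix profile (not real output):
# row k - 1 holds the k-dimensional profile, one column per subsequence.
P = np.array([
    [0.2, 0.9, 0.4, 0.8],  # k = 1
    [0.5, 1.1, 0.6, 1.0],  # k = 2
    [0.7, 1.3, 0.9, 1.2],  # k = 3
    [2.5, 2.9, 2.6, 3.0],  # k = 4
])


def guided_search(P, k):
    """Guided search: the user picks k, we read the motif off row k - 1."""
    row = P[k - 1]
    idx = int(np.argmin(row))
    return idx, float(row[idx])


def unconstrained_search(P):
    """Unconstrained search: pick k with a simple elbow heuristic.

    Assumed heuristic: take the smallest distance in each row and return
    the k where that curve bends upward most sharply (largest second
    difference).
    """
    mins = P.min(axis=1)
    if mins.shape[0] < 3:
        return int(mins.shape[0])
    bend = np.diff(mins, n=2)  # bend[i] is centred on row i + 1, i.e. k = i + 2
    return int(np.argmax(bend)) + 2


motif_idx, motif_dist = guided_search(P, k=2)
best_k = unconstrained_search(P)
print(motif_idx, motif_dist, best_k)
```

Neither function touches how the profile is computed, which is consistent with the point that only "inclusion" needs changes at run-time.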
-
@robertsd The feature has been added for
-
It turns out that we missed one crucial thing! When the user provides a list of indices to include, one has to account for cases where one or more of the indices is in/from one of the first few rows. So, let's say we have an array and indices as shown:
If we followed our simple procedure above and swapped the rows first, then we'd get the following wrong output (note that
Instead, what we really want is indices
To achieve this, we need to actually do some pre-preparation work to identify which indices are "restricted" (i.e., those in the first few rows) and which indices are "unrestricted" (i.e., outside of the first few rows) and can ultimately be selectively written to at the end:
This is what has been/will be implemented. Another thing that one needs to look out for is repeating indices in the input. This turned out to be a lot trickier than I had anticipated but I provide this here for completeness and transparency.
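The restricted/unrestricted bookkeeping described above can be wrapped in a small function, sketched here under some assumptions: the name `move_rows_to_front` is hypothetical, it works on a copy of the array, and repeated indices are simply dropped (keeping the first occurrence), which is one easy way to deal with duplicates and not necessarily what STUMPY itself does.

```python
import numpy as np


def move_rows_to_front(x, indices):
    """Return a copy of x with the rows in `indices` moved to the top.

    Hypothetical helper for illustration. Rows pushed out of the top are
    written into the slots vacated by the moved rows, so the result is
    always a permutation of the original rows.
    """
    x = x.copy()
    # Assumption: drop repeated indices, keeping the first occurrence
    indices = np.array(list(dict.fromkeys(int(i) for i in indices)),
                       dtype=np.int64)
    k = indices.shape[0]

    # "Restricted" indices already live in the first k rows;
    # "unrestricted" indices are the slots that become free below them.
    restricted = indices[indices < k]
    unrestricted = indices[indices >= k]

    # Rows of the original top block that still need a new home
    mask = np.ones(k, dtype=bool)
    mask[restricted] = False

    tmp = x[:k].copy()
    x[:k] = x[indices]
    x[unrestricted] = tmp[mask]
    return x


x = np.array([[0, 0],
              [1, 1],
              [2, 2],
              [3, 3],
              [4, 4],
              [5, 5]])

result = move_rows_to_front(x, [1, 2, 4])
print(result)
```

A quick sanity check for any variant of this is that the output rows, once sorted, match the input rows, which the naive swap fails because it duplicates one row and loses another.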
-
I see, nice catch! Thank you so much for working on this. I did NOT expect you to build this feature so quickly!
…On Fri, May 22, 2020 at 9:03 PM Sean M. Law ***@***.***> wrote:
I don't think I could have done it better myself!
It turns out that we missed one crucial thing! When you provide a list of
indices to include, one has to account for cases where one of the indices
is in one of the first few rows. So, let's say we have an array and indices
as shown:
import numpy as np
x = np.array([[0, 0],
              [1, 1],
              [2, 2],
              [3, 3],
              [4, 4],
              [5, 5]])
indices = np.array([1, 2, 4])
If we followed our simple procedure above and swapped the rows first, then we'd get:
[[1 1]
 [0 0]
 [1 1]
 [3 3]
 [2 2]
 [5 5]]
Instead, what we really want is indices [1, 2, 4] to be in the first three rows and for the first row to be moved to index 4:
[[1 1]
 [2 2]
 [4 4]
 [3 3]
 [0 0]
 [5 5]]
To achieve this, we need to actually do some pre-preparation work to identify which indices are "restricted" and which indices are "unrestricted" to be written to at the end:
import numpy as np
x = np.array([[0, 0],
              [1, 1],
              [2, 2],
              [3, 3],
              [4, 4],
              [5, 5]])
indices = np.array([1, 2, 4])
# pre-preparation: split the indices by whether they fall inside the
# first `len(indices)` rows ("restricted") or below them ("unrestricted")
restricted_indices = indices[indices < indices.shape[0]]
unrestricted_indices = indices[indices >= indices.shape[0]]
# mask marks the top rows that are not themselves being included
mask = np.ones(indices.shape[0], bool)
mask[restricted_indices] = False
# Same as before
tmp = x[:len(indices)].copy()
x[:len(indices)] = x[indices]
# x[indices] = tmp  # Replace this original step with the next one
x[unrestricted_indices] = tmp[mask]
This is what has been/will be implemented. Another thing that one needs to
look out for is repeating indices in the input.
-
Don’t mention it! Thanks again for submitting the feature request and please be sure to spread the word and share STUMPY with your network! 🙏
-
Thank you for developing and maintaining the stumpy project, as it has been very useful. I have a use case that requires a "constrained search" on multivariate time series: at least one variable must be included in the motif search. Constrained Search is one of three types of search discussed in the Matrix Profile VI paper; the others are Guided Search and Unconstrained Search.
What is the likelihood that these mSTAMP queries become possible in stumpy?