Replies: 5 comments
-
@apachaves I'll preface my response by saying that I had not heard of or used tsfresh before. Based on a cursory look at the tsfresh documentation, it appears to be squarely focused on time series feature engineering. I think what they offer is pretty cool and useful if your ultimate goal is to build a machine learning model. Essentially, they take a time series and extract a set of "typically useful" features from it.

However, the goals of tsfresh are fundamentally different from the goals of STUMPY. The outputs of STUMPY can certainly be used as features for a machine learning model (in fact, I think they'd make great ML input features), but the focus of STUMPY is to help you analyze and understand where you should be looking in your time series. A more detailed talk that explains the motivation behind STUMPY can be found here or you can read a summary here. To be clear, tsfresh does not provide the output (namely, a matrix profile) that STUMPY computes. STUMPY is well suited for exploratory time series data analysis, pattern/motif discovery, anomaly detection, time series chains, and semantic segmentation, to name only a few possibilities.

I think it would be interesting for tsfresh to incorporate the matrix profile (and matrix profile indices) into their list of features. On the other hand, this type of "automatic feature generation" may enable people to avoid understanding and actually looking at their data. This is more of a subtle observation than a criticism (I really think tsfresh is great and would use it myself!). Exploring time series data is hard and often overwhelming; in contrast, STUMPY makes it "easier" to know where to look and which parts of your data might be the most interesting.

I strongly suggest that everyone go through the work of the primary authors (not us) of the matrix profile research, as there are a ton of great examples and use cases, all built on top of the central concept of a matrix profile. Let me know if this helps or if you have additional questions and I will try my best to answer them.
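To make the "matrix profile" idea a bit more concrete, here is a minimal brute-force sketch in plain Python. It is not STUMPY's implementation or API (STUMPY computes the same kind of output with far faster algorithms); the function names, the toy series, and the exclusion zone width of `ceil(m / 4)` are all assumptions made for illustration. For every window of length `m`, it records the z-normalized Euclidean distance to the nearest non-overlapping window, plus where that neighbor is.

```python
import math
import random


def znorm(window):
    """Z-normalize a window; a constant window maps to all zeros."""
    n = len(window)
    mean = sum(window) / n
    std = math.sqrt(sum((x - mean) ** 2 for x in window) / n)
    if std == 0:
        return [0.0] * n
    return [(x - mean) / std for x in window]


def naive_matrix_profile(ts, m):
    """Return (profile, indices) for every length-m window of ts.

    profile[i] is the distance from window i to its nearest neighbor,
    indices[i] is the start position of that neighbor.
    """
    k = len(ts) - m + 1
    windows = [znorm(ts[i:i + m]) for i in range(k)]
    excl = math.ceil(m / 4)  # assumed exclusion zone to skip trivial self-matches
    profile, indices = [], []
    for i in range(k):
        best_d, best_j = math.inf, -1
        for j in range(k):
            if abs(i - j) <= excl:
                continue
            d = math.dist(windows[i], windows[j])
            if d < best_d:
                best_d, best_j = d, j
        profile.append(best_d)
        indices.append(best_j)
    return profile, indices


# Toy data (hypothetical): random noise with the same pattern planted twice.
rng = random.Random(0)
ts = [rng.uniform(-1.0, 1.0) for _ in range(60)]
pattern = [0.0, 1.0, 3.0, 1.0, 0.0, -2.0, -3.0, -2.0]
ts[10:18] = pattern
ts[40:48] = pattern

m = len(pattern)
profile, indices = naive_matrix_profile(ts, m)

# The lowest profile value marks a motif: a window with a near-identical twin.
motif_idx = min(range(len(profile)), key=profile.__getitem__)
print(motif_idx, indices[motif_idx], round(profile[motif_idx], 6))
```

Unlike a fixed-length feature vector, the result has one value per window, so the low points tell you where the repeated pattern (planted here at positions 10 and 40) lives in the series.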
-
Great comment and explanation, @seanlaw. Thank you very much for it and for the resources you shared. I'm going to go through them carefully to better understand the use cases for each. Thanks again!
-
@apachaves I am really curious as to what people are using STUMPY for or plan to use it for. Any chance that you could share some of your thoughts and use cases?
-
@seanlaw, I'm considering using it to develop an anomaly detection pipeline for steelmaking machines. Using the distance matrix and/or other features from the different sensors we have looks promising for building such a pipeline.
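As a rough illustration of that kind of pipeline, the sketch below scores each sensor by its largest matrix profile value (often called a "discord": the window whose nearest neighbor is farthest away). Everything here is an assumption for illustration: the sensor names and synthetic signals are hypothetical, the matrix profile is computed by brute force in plain Python rather than with STUMPY, and using the maximum profile value as the anomaly score is just one simple choice.

```python
import math


def matrix_profile(ts, m):
    """Brute-force z-normalized matrix profile (illustration only)."""
    def znorm(w):
        mean = sum(w) / len(w)
        std = math.sqrt(sum((x - mean) ** 2 for x in w) / len(w))
        return [(x - mean) / std if std else 0.0 for x in w]

    k = len(ts) - m + 1
    windows = [znorm(ts[i:i + m]) for i in range(k)]
    excl = math.ceil(m / 4)  # assumed exclusion zone around each window
    return [
        min(math.dist(windows[i], windows[j])
            for j in range(k) if abs(i - j) > excl)
        for i in range(k)
    ]


# Synthetic, hypothetical sensor readings: a periodic signal, and a copy of it
# with a short fault injected at time steps 60..64.
period, m = 20, 20
clean = [math.sin(2 * math.pi * t / period) for t in range(120)]
faulty = list(clean)
for t in range(60, 65):
    faulty[t] += 3.0

sensors = {"sensor_a": clean, "sensor_b": faulty}

# For each sensor, find the discord: the window with the largest profile value.
report = {}
for name, ts in sensors.items():
    mp = matrix_profile(ts, m)
    discord = max(range(len(mp)), key=mp.__getitem__)
    report[name] = (discord, mp[discord])

# Flag the sensor whose most unusual window is the most unusual overall.
flagged = max(report, key=lambda name: report[name][1])
print(flagged, report[flagged])
```

On real sensor data you would likely want a threshold on the discord score rather than always flagging the top sensor, and a faster matrix profile implementation for long series.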
-
Very cool! In case it matters, we just added (NVIDIA) GPU support so, depending on the size of your time series and your access to hardware resources, you may be able to get a boost. Currently, you'll need to install STUMPY from source, but I hope to release it in the coming months. I'm going to close this for now, but feel free to re-open it or file another issue if you have any questions!
-
Quick question: in your own words, how does STUMPY compare with other time series feature engineering modules like tsfresh? Which use cases fit better with one or the other?
Maybe someone here can help me understand the best use case for each of them. Thank you all in advance.