Generic time series data collector / exporter.
Currently, only Prometheus is available as a data source.
It offers the following features:
- Get time series from a start timestamp to an end point.
- Get the last X seconds of a time series.
- Search for a time series of a finite size anywhere in the past.
- Get time series considering only a start point.
- Export the time series in JSON or CSV files via a shared volume or web services.
One of the main advantages of these features is that they don't have any size limit for the time series. If the time series database has its own limit, TSDC will split the queries automatically and reconstruct the whole time series.
The solution is containerized.
First, configure your data source in resources/config.json:
- datasource.type: The datasource you want to use. Currently, only the prometheus value is available.
- datasource.srvaddress: The IP:port of the datasource.
- historytime: How far back in time, in seconds, the collector may go when fetching data.
- maxduration: How far forward in time, in seconds, the collector may go from a start timestamp.
- output.json: The output for the generated json files. If empty, the json files generation is disabled.
- output.csv: The output for the generated csv files. If empty, the csv files generation is disabled.
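As an illustration, the sketch below builds a config.json with the keys listed above. The nesting of the dotted keys into objects, the value types, the localhost:9090 address, the maxduration value, and the /opt/tsdc-data/json path are assumptions made for the example, not values taken from the project.

```python
import json

# Hypothetical configuration; nesting is inferred from the dotted key names.
config = {
    "datasource": {
        "type": "prometheus",            # only supported value for now
        "srvaddress": "localhost:9090",  # assumed IP:port of the Prometheus server
    },
    "historytime": 14400,  # look back at most 4 hours (seconds)
    "maxduration": 3600,   # assumed value: look forward at most 1 hour (seconds)
    "output": {
        "json": "/opt/tsdc-data/json",  # assumed path; empty disables JSON files
        "csv": "",                      # empty: CSV file generation disabled
    },
}


def render_config(cfg):
    """Serialize the configuration as pretty-printed JSON text."""
    return json.dumps(cfg, indent=2)


print(render_config(config))
```

Check the config.json shipped with the project for the actual structure before copying this layout.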
docker build --tag=time-series-data-collector .
Data file generation is enabled by default. To disable it, leave the output.json and output.csv fields empty in the config.json file.
docker run -v /host/tsdc-data:/opt/tsdc-data --net=host time-series-data-collector
The tsdc-data directory will be created under /host on the host machine, with 2 subdirectories: json and csv.
One json file will be generated for each time series, but only one csv file will be generated for a group of time series from the same query.
docker run --net=host time-series-data-collector
Method | Path | Description |
---|---|---|
GET | /collector/service/get_ts | Get one or more time series via a query (e.g. a Prometheus query) |
Parameter | Required | Description |
---|---|---|
query | YES | The data source query (e.g. a Prometheus query). URL-encode it before using it in a URL. |
id | NO | It's possible to set a custom ID in order to name the generated files. If not used, the ID will be auto-generated. |
start | NO | The start timestamp (in seconds) of the time series. If not used, the collector will get the last historytime seconds of data. |
end | NO | Works only if the start parameter is used. Gets the time series from start to end. If not used, the collector will get the data from start to start+maxduration. |
historytime | NO | Overrides the historytime parameter in the config.json. |
reducehttprequests | NO | Whether to limit the HTTP requests sent to the data source. Its usefulness depends on the use case: it has no effect on continuous time series, but when looking for an isolated time series in your data source (e.g. a build in a CI context), only the minimum necessary HTTP requests are sent. Enabled by default. |
Here's a Prometheus data source with 5 time series (figure: Prometheus data source).
In this example the Prometheus query is very simple: it just selects the 5 time series.
{__name__=~"cpu1|cpu2|memory|bandwidth|score"}
Same query, encoded for URLs:
%7B__name__%3D~%22cpu1%7Ccpu2%7Cmemory%7Cbandwidth%7Cscore%22%7D
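One possible way to produce the encoded form is Python's standard urllib.parse.quote, shown here as a minimal sketch (Python 3.7 or later, where `~` is left unescaped). Any other percent-encoding tool works as well.

```python
from urllib.parse import quote

# The Prometheus selector from the example above.
query = '{__name__=~"cpu1|cpu2|memory|bandwidth|score"}'

# safe="" forces every reserved character to be percent-encoded.
encoded = quote(query, safe="")
print(encoded)
# -> %7B__name__%3D~%22cpu1%7Ccpu2%7Cmemory%7Cbandwidth%7Cscore%22%7D
```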
To get all time series:
http://35.180.145.79/tsdc/api/get-ts?query=%7B__name__%3D~%22cpu1%7Ccpu2%7Cmemory%7Cbandwidth%7Cscore%22%7D
Here, the historytime parameter is set to 14400 in the config.json, in order to get the last 4 hours of data for each time series.
Same query, but to get only the last 200 seconds of data:
http://35.180.145.79/tsdc/api/get-ts?query=%7B__name__%3D~%22cpu1%7Ccpu2%7Cmemory%7Cbandwidth%7Cscore%22%7D&historytime=200
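Request URLs like the ones above can also be assembled programmatically. The following is a sketch only: build_get_ts_url is a hypothetical helper, not part of TSDC, and it leaves out reducehttprequests because the accepted values for that parameter are not described here.

```python
from urllib.parse import quote, urlencode


def build_get_ts_url(base_url, query, ts_id=None, start=None, end=None,
                     historytime=None):
    """Build a get-ts request URL (hypothetical helper, not shipped with TSDC)."""
    # The parameter table states that end only works together with start.
    if end is not None and start is None:
        raise ValueError("end requires start")
    params = {"query": query}
    optional = {"id": ts_id, "start": start, "end": end,
                "historytime": historytime}
    # Keep only the optional parameters that were actually provided.
    params.update({k: v for k, v in optional.items() if v is not None})
    # quote_via=quote percent-encodes every reserved character in the values.
    return base_url + "?" + urlencode(params, quote_via=quote)


url = build_get_ts_url(
    "http://35.180.145.79/tsdc/api/get-ts",
    '{__name__=~"cpu1|cpu2|memory|bandwidth|score"}',
    historytime=200,
)
print(url)
```

Omitted parameters are simply left out of the URL, so the service falls back to the values in config.json.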
TSDC can handle SSL connections.
To enable them:
- Generate your own keystore.jks
- Set the parameters in src/main/java/com/nokia/as/main/jetty/JettyConfig.java
- Add the port to expose in the Dockerfile (at the EXPOSE line)
TSDC is based on range queries.
If the range is too big, the queries will be split automatically.
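The splitting idea can be sketched as follows. This is an illustration of the general technique, not TSDC's actual code: split_range, merge_points, and the max_span value are hypothetical, and the real chunk size depends on the limits of the data source.

```python
def split_range(start, end, max_span):
    """Split [start, end] into consecutive sub-ranges of at most max_span seconds."""
    if max_span <= 0:
        raise ValueError("max_span must be positive")
    if end < start:
        raise ValueError("end must not precede start")
    chunks = []
    cursor = start
    while cursor < end:
        chunk_end = min(cursor + max_span, end)
        chunks.append((cursor, chunk_end))
        cursor = chunk_end
    return chunks


def merge_points(parts):
    """Rebuild one series from per-chunk results, dropping duplicate timestamps."""
    merged = {}
    for part in parts:
        for timestamp, value in part:
            merged[timestamp] = value
    return sorted(merged.items())


print(split_range(0, 10, 4))
# -> [(0, 4), (4, 8), (8, 10)]
```

Adjacent sub-ranges share a boundary timestamp, which is why the merge step deduplicates points by timestamp.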
If one or both edges of the range are missing, the behavior is as follows:
- Only the start time is set
  - The HTTP request optimizer is enabled: TSDC gets the data until the maximum value assigned in the configuration file (maxduration) is reached. If the time series seems to be finished (e.g. a build time series), the connections stop.
  - The HTTP request optimizer is disabled: TSDC gets the data until the maximum value assigned in the configuration file is reached, no matter what the time series looks like.
- No edge is set
  - The HTTP request optimizer is enabled: TSDC tries to get the data backward from the current timestamp. If there is no current data point, it searches for one until the history time assigned in the configuration file (historytime) is reached. If it finds one, it gets the data backward from there. If the time series seems to be finished (e.g. a build time series), the connections stop.
  - The HTTP request optimizer is disabled: Same behavior, but once a time series is found, TSDC continues to get the data backward, no matter what the time series looks like.
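The outer bounds of the range in each case can be sketched as below. resolve_range is a hypothetical function written for illustration; it only computes the widest window TSDC may cover and does not model the optimizer's early stop or the backward search for a data point.

```python
def resolve_range(now, historytime, maxduration, start=None, end=None):
    """Return the (from, to) window, in seconds, for the given parameters."""
    if start is None:
        if end is not None:
            raise ValueError("end only works when start is set")
        # No edge set: look backward from the current timestamp.
        return (now - historytime, now)
    if end is None:
        # Only start set: look forward up to maxduration.
        return (start, start + maxduration)
    if end < start:
        raise ValueError("end must not precede start")
    # Both edges set: use them as given.
    return (start, end)


print(resolve_range(now=1000, historytime=300, maxduration=600))
# -> (700, 1000)
print(resolve_range(now=1000, historytime=300, maxduration=600, start=100))
# -> (100, 700)
```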
Icons made by Smashicons, DinosoftLabs, and Pixel Buddha from www.flaticon.com are licensed under CC 3.0 BY.