This Lab is organised into the following 4 challenges:
Challenge | Description | Est. Time |
---|---|---|
Challenge 5 | Caching and retention policies | 30 Min |
Challenge 6 | Control commands | 30 Min |
Challenge 7 | Advanced KQL operators | 45 Min |
Challenge 8 | Visualization | 45 Min |
In order to receive the "ADX-In-A-Day" digital badge, you will need to complete the tasks marked with ✅ in Lab 1 & Lab 2. Submit your answers for Lab 1 and Lab 2 quizzes in order to receive the "ADX in a Day" digital badge. You may edit your answers after or try again.
ℹ️ Note |
---|
For Lab 2, please submit the results for the tasks marked with ✅ in the following link: Answer sheet - ADX in a Day Lab 2 |
ℹ️ Note |
---|
Please allow us 5 working days to issue the badge |
Among the different policies you can set to the ADX cluster, two policies are of particular importance:
- Retention policy (retention period)
- Cache policy (cache period)
First, a policy is used to enforce and control the properties of the cluster (or the database/table).
-
The retention policy: the time span, in days, for which it’s guaranteed that the data is kept available for querying. The time span is measured from the time that the records are ingested. When the period expires, the records will not be available for querying any more.
In other words, the retention policy defines the period during which data is retained and available to query, measured since ingestion time. Note that a large retention period may impact the cost.
-
The cache policy: the time span, in days, for which to keep recently ingested data (which is usually the frequently queried data) available in the hot cache rather than in long term storage (this is also known as cold tier. Specifically, it is Azure blob storage). Data stored in the hot cache is actually stored in local SSD or the RAM of the machine, very close to the compute nodes.
Therefore, more readily available for querying. The availability of data in the hot cache improves query performance but can potentially increase the cluster cost (as more data is being stored, more VMs are required to store it). In other words, the caching policy defines the period during which data is kept in the hot cache.
All the data is always persisted in the cold tier, for the duration defined in the retention policy. Any data whose age falls within the hot cache policy will also be stored in the hot cache. If you query data from cold cache, it’s recommended to target a small specific range in time (“point in time”) for the queries to be efficient.
Tasks:
Database policies can be overridden per table using a KQL control command. ADX cluster and database are Azure resources. A database is a sub-resource of the cluster, so it can be edited from the portal. Tables are not considered an Azure resource, so they cannot be managed in the portal but via a KQL command.
You can always use KQL commands to alter the policies of the entire Cluster/Database/tables. Table level policy takes precedence over database level which takes precedence over cluster level.
.alter table ingestionLogs policy retention
```
{
"SoftDeletePeriod": "10:12:00:00",
"Recoverability": "Enabled"
}
```
❓ Question 5.1 |
---|
How many total hours is the retention policy of ingestionLogs table after running the above query? |
References:
Tasks:
- Task 1: .show/diagnostic logs/Insights
- Task 2: Use .show queries✅
- Task 3: Use .journal commands✅
- Task 4: Use .show commands ✅
- Task 5: Table details and size✅
Control commands are requests to the service to retrieve information that is not necessarily data in the database tables, or to modify the service state, etc. In addition, they can be used to manage Azure Data Explorer. The first character of the KQL text determines if the request is a control command or a query. Control commands must start with the dot (.) character, and no query may start with that character.
- The
.show queries
command returns a list of queries that have reached a final state, and that the user invoking the command has access to see. - The
.show commands
command returns a table of the admin commands that have reached a final state. The TotalCpu column is the value of the total CPU clock time (User mode + Kernel mode) consumed by this command. - The
.show journal
command returns a table that contains information about metadata operations that are done on the Azure Data Explorer database. The metadata operations can result from a control command that a user executed, or internal control commands that the system executed, such as drop extents by retention - The
.show tables details
command returns a set that contains the specified table or all tables in the database with a detailed summary of each table's properties.
References:
As part of an incident investigation, you need to find out how many queries were executed in the past 3 hours.
Write a command to count the number of queries that were run, in the past 3 hours.
❓ Question 6.2 |
---|
Which column in .show queries has information related to user or app that has run the queries? |
🚩 Hint |
---|
The column shows email of users who ran the queries. |
References:
Write a command to show the details of the function that you created earlier?
❓ Question 6.3 |
---|
What is the Event column value for records which shows the details of function creation? |
🚩 Hint |
---|
You can either create a new function and check the latest .show journal entry or look for record that was created as a part of Challenge 4, Task 1 |
References:
Write a command to show the details of commands that you ran, in the past 4 hours.
❓ Question 6.4 |
---|
What is the "AuthorizationScheme" for commands issued by you? |
🚩 Hint |
---|
Authorization details are available in ClientRequestProperties column in .show commands output |
References:
Write a control command to show details on ingestionLogs tables in the database.
❓ Question 6.4 |
---|
How many days is the "DataHotSpan" for ingestionLogs table? |
🚩 Hint |
---|
Details about cache policy can be extracted from CachingPolicy column. |
References:
ℹ️ Note |
---|
Use ingestionLogs table for all the challenge 7 tasks. |
Tasks:
- Task 1: Declaring variables and using 'let' statements✅
- Task 2: Use the search operator✅
- Task 3: Parse Key-Value pairs strings into separate columns✅
- Task 4: Nulls are important in timeseries analysis (Compare summarize and make-series)✅
- Task 5: Anomaly detection✅
You can use the let
statement to set a variable name equal to an expression or a function, or to create views (virtual, temporary, tables based on the result-set of another KQL query).
let
statements are useful for:
- Breaking up a complex expression into multiple parts, each represented by a variable.
- Defining constants outside of the query body for readability.
- Defining a variable once and using it multiple times within a query.
For example, you can use 2 let
statements to create LogType
and TimeBucket
variables with the following values:
LogType = 'Warning'
TimeBucket = 1m
And then craft a query that performs a count of "Warning" by 1 minute Timestamp buckets (bins).
- Remember to include a
;
at the end of your let statement.
🚩 Hint |
---|
Try to fill in the blanks |
let LogType= .....;
let TimeBucket= .....;
ingestionLogs
| where Level==....
| summarize count() by bin(Timestamp,...)
❓ Question 7.1 |
---|
What is the count_ at 2014-03-08 00:00:00.0000 ? |
References:
You received an alert early in the morning regarding multiple Timeouts in your system. You want to quickly search the traces without using specific columns or table names.
❓ Question 7.2 |
---|
Write a query to "search" for Exception=System.Timeout string in the entire database. |
References:
As part of an incident investigation, you need to look at the INGESTOR_GATEWAY
records (Component == 'INGESTOR_GATEWAY'
).
You need to use the Message column, which contains the message of the trace, representing the information in a key/value form.
An example of a typical message would be:
$IngestionCommand table=scaleEvents format=json
You want to analyze all the message strings, by extracting the Message
text into 2 calculated separate columns: table
and format
.
Let's extract that to discover the number of records per format.
❓ Question 7.3 |
---|
What is the count of json format? |
🚩 Hint |
---|
Use summarize to count() by format |
References:
Many interesting use cases use machine learning algorithms and derive interesting insights from telemetry data. Often, these algorithms require a strictly structured dataset as their input. The raw log data usually doesn't match the required structure and size. We will see how we can use the make-series operator to create well curated data (time series).
Then, we can use built in functions like series_decompose_anomalies
. Anomalies/Outliers will be detected by the Kusto service and highlighted as red dots on the time series chart.
Time series - What is it?
A time series is a collection of observations of well-defined data items obtained through repeated measurements over time and listed in time order. Most commonly, the data points are consistently measured at equally spaced intervals. For example, measuring the temperature of the room each minute of the day would comprise a time series. Data collected irregularly is not a time series.
What is time series analysis?
Time series analysis comprises methods for analyzing time series data in order to extract meaningful statistics and other characteristics of the data. Time series forecasting, for example, is the use of a model to predict future values based on previously observed values.
What is time series decomposition?
Time series decomposition involves thinking of a series as a combination of 4 components:
- trends increasing or decreasing value in the series
- seasonality repeating short-term cycle in the series
- baseline the predicted value of the series, which is the sum of seasonal and trend components
- noise the residual random variation in the series.
We can use built in functions, that uses time series decomposition to forecast future metric values and/or detect anomalous values.
Why should you use make-series instead of the summarize operator?
The summarize operator does not add "null bins" — rows for time bin values for which there's no corresponding row in the table. It's a good idea to "pad" the table with those bins. Advanced built in ML capabilities like anomaly detection need the data points to be consistently measured at equally spaced intervals.
The make-series
operator can create such a “complete” series.
Challenge 7, Task 4: Nulls are important in timeseries analysis (Compare summarize and make-series) ✅
In this task, calculate the average size of data ingested per 30 min by the node 'Engine000000000378'. Use Component as 'INGESTOR_EXECUTER'. File size is available in the 'Properties' column. Render it as a timechart.
🚩 Hint |
---|
Complete the following query |
let TimeBuckets = ....;
ingestionLogs
| where Component == "INGESTOR_EXECUTER" and Node == "Engine000000000378"
| extend Size = ....
| make-series MySeries=round(avg(Size),2) on Timestamp step TimeBuckets by Level
| render ....
❓ Question 7.4 |
---|
What is the file size value (y axis) at 2014-03-08 02:30:00.000 ? |
Example Output:
Why should you use make-series instead of the summarize operator?
The summarize operator does not add "null bins" — rows for time bin values for which there's no corresponding row in the table. It's a good idea to "pad" the table with those bins. Advanced built in ML capabilities like anomaly detection need the data points to be consistently measured at equally spaced intervals. The make-series can create such a “complete” series.
References:
Anomaly detection lets you find outliers/anomalies in the data.
Let's find out any file size anomalies by summarizing the average of file sizes in 1-minute intervals.
Can you spot red dots indicating outliers/anomalies i.e.,spikes in file size on the chart?
🚩 Hints |
---|
Use series_decompose_anomalies to render anomaly chart. |
Fill in the blanks to complete the query. |
let TimeBuckets = 1m;
ingestionLogs
| extend Size = tolong(Properties.size)
| make-series ActualSize=round(avg(Size),2) on .... step ....
| extend anomaly = series_decompose_anomalies(....)
| render anomalychart with(anomalycolumns=...., title='Ingestion Anomalies')
❓ Question 7.5 |
---|
What is the anomaly value (y axis) at 2014-03-08 04:24:00:000? ? |
Example result:
References:
ℹ️ Note |
---|
The following explanation in this task is for your understanding and does not count for the challenge. |
How to display the anomalies in a tabular format?
The series_decompose_anomalies
function returns the following respective series:
ad_flag
: A three-part series containing the values (+1, -1, 0) marking up/down/no anomaly respectivelyad_score
: Anomaly score (using Tukey's fence test. Anomaly scores above 1.5 or below -1.5 indicate a mild anomaly rise or decline respectively. Anomaly scores above 3.0 or below -3.0 indicate a strong anomaly)baseline
: The predicted value of the series, according to the decomposition
To get a tabular format of the detected anomalies, you can use the mv-expand
operator to expand the multi-value dynamic array of the anomaly detection component (AnomalyFlags, AnomalyScore, PredictedUsage) into multiple match records, and then filter by positive and negative deviations from expected usage (where AnomalyFlags != 0).
Example:
ingestionLogs
| where Component == "INGESTOR_EXECUTER"
| extend fileSize=tolong(Properties.size)
| make-series ActualSize=avg(fileSize) on Timestamp step 1min // Creates the time series, listed by data type
| extend(AnomalyFlags, AnomalyScore, PredictedSize) = series_decompose_anomalies(ActualSize, -1) // Scores and extracts anomalies based on the output of make-series
| mv-expand ActualSize to typeof(double), Timestamp to typeof(datetime), AnomalyFlags to typeof(double),AnomalyScore to typeof(double), PredictedSize to typeof(long) // Expands the array created by series_decompose_anomalies()
| where AnomalyFlags != 0 // Returns all positive and negative deviations from expected usage
| project Timestamp,ActualSize = format_bytes(ActualSize, 2),PredictedSize = format_bytes(PredictedSize, 2), AnomalyScore, AnomalyFlags // Defines which columns to return
| sort by abs(AnomalyScore) desc // Sorts results by anomaly score in descending ordering
Looking at the query results, you can see that the query:
- Calculates an expected sum (of the file size) for each bucket.
- Compares actual size to expected size.
- Assigns an anomaly score to each data point, indicating the extent of the deviation of actual size from expected size.
- Identifies positive (1) and negative (-1) anomalies.
Using the Dashboard feature of Azure Data Explorer, build a dashboard using outputs of below 3 queries (on ingestionLogs table).
After you provide dashboard name and click "Next", click on "+ Add tile" next. You will be prompted to add a data source. Click on "+ Data source"
Use the cluster URI of your free cluster as the data source.
Try this!
Create a Timechart using following query. Observe that we used _startTime?
and _endTime
. These 2 are parameters from TimeRange filter in ADX Dashboard with which we can filter the minimum and maximum time of our data.
ingestionLogs
| where Timestamp between (todatetime(_startTime) .. todatetime(_endTime))
| summarize count() by bin(Timestamp, 10m), Component
- Use the above example query as reference to add Timestamp filter with
_startTime
and_endTime
filter to queries in task 1 and task 2. - The following 2 tasks use the timefilter between
2014-03-08T00:00:00
and2014-03-08T10:00:00
Tasks:
Parameterize (add Timefilter) and render an Anomaly chart using the following Anomaly detection query. The chart should show values between 2014-03-08T00:00:00 and 2014-03-08T10:00:00.
❓ Question 8.1 |
---|
What is the anomaly value (y axis) at exactly 04:28 on x axis? |
let TimeBuckets = 1m;
ingestionLogs
| <Add Timefilter parameters>
| make-series MySeries=count() on Timestamp step TimeBuckets
| extend anomaly = series_decompose_anomalies(MySeries)
Parameterize (add Timefilter) and render a Piechart using the following query. The chart should show values between 2014-03-08T00:00:00
and 2014-03-08T10:00:00
.
❓ Question 8.2 |
---|
What is the warning % on the piechart? |
ingestionLogs
| <Add Timefilter parameters>
| summarize count() by Level
You can directly add a query from query window to an existing dashboard.
🚩 Hint |
---|
In the query window, explore the Share menu. |
References:
ℹ️ Note |
---|
Below is just an example dashboard. |
Up for more challenges?
Visualize the outputs of any 2 queries in PowerBI using the DirectQuery mode.
There are multiple ways to connect ADX and PowerBI depending on the use case.
References:
- Visualize data using the Azure Data Explorer connector for Power BI
- Visualize data using a query imported into Power BI
Learn and get hands on experience with a full blown ADX cluster and advanced ADX topics like Materialized Views, scaling, security, geo mapping and more.
- Become a detective and solve some puzzles using Kusto Query Langugage! You can reuse the same free cluster that you have used to complete ADX-in-a-Day challenges. Kusto Detective Agency
🎉 Congrats! You've completed the ADX in a Day Lab 2 challenges!
To earn the digital badge, submit the results of the challenges marked with ✅: Answer sheet - ADX in a Day Lab 2
Contributing
This project welcomes contributions and suggestions. Most contributions require you to agree to a Contributor License Agreement (CLA) declaring that you have the right to, and actually do, grant us the rights to use your contribution. For details, visit https://cla.opensource.microsoft.com.
When you submit a pull request, a CLA bot will automatically determine whether you need to provide a CLA and decorate the PR appropriately (e.g., status check, comment). Simply follow the instructions provided by the bot. You will only need to do this once across all repos using our CLA.
This project has adopted the Microsoft Open Source Code of Conduct. For more information see the Code of Conduct FAQ or contact opencode@microsoft.com with any additional questions or comments.
Trademarks
This project may contain trademarks or logos for projects, products, or services. Authorized use of Microsoft trademarks or logos is subject to and must follow Microsoft's Trademark & Brand Guidelines. Use of Microsoft trademarks or logos in modified versions of this project must not cause confusion or imply Microsoft sponsorship. Any use of third-party trademarks or logos are subject to those third-party's policies.