When working with time series data in CrateDB, visualization is a key aspect that aids in understanding the patterns and trends in your data. One of the most popular tools for this purpose is Grafana, which provides a powerful and intuitive platform for creating and sharing dashboards. In combination with CrateDB's robust handling of time series data, Grafana allows you to create dynamic visualizations that can provide valuable insights, making it easier to analyze and interpret your data.
To visualize your data in Grafana, you'll first need to set up a data source. Select the PostgreSQL type as CrateDB is compatible with the PostgreSQL wire protocol. Next, enter the necessary parameters for your database. This includes information such as the host, database name, user, and password. This process connects Grafana to your database, allowing it to retrieve and display your data.
Now, let’s see how it works in practice:
Now, we are ready to explore an example dashboard and the queries used to create it.
This is the example dashboard for the weather data. The first panels tell us how many entries are in the dataset, how many cities are covered, and when the first and last measurements took place. If we go to inspect, query, and move to the query tab we can see which query is used to generate this information. In the first panel, we show the number of measurements as a count of all timestamps and the number of cities as a count of distinct locations.
The table on the dashboard tells us for each location what was the highest recorded temperature and when this temperature was recorded. Check the query that generates this data. In the SELECT
statement, we use the max
aggregation function to find the highest temperature and max_by
to find a timestamp for this value. The results are also grouped by location.
The graph next to the table tells us how the interpolated temperature compares to the real values. It is a common problem when dealing with timeseries data to handle missing values. Such occurrences are common, perhaps due to a sensor malfunction or disconnection. To address this, the missing values need to be filled in. CrateDB can employ another useful tool: window functions paired with the IGNORE NULLS
feature. Within a Common Table Expression, we utilize window functions to spot the next and prior non-null temperature recordings, and then compute the arithmetic mean to bridge the gap. You can see how the query looks in the query window.
The next graphs show rolling averages. They are crucial in time-series analysis because they help smooth out short-term fluctuations and reveal underlying trends by averaging data points over a specified period. If you again check the query window for this graph you can see the query for computing rolling averages. Rolling averages are computed for all metrics: temperature, humidity, and wind speed by using the window of the last 10 and 20 readings.
Last, but not least, in CrateDB you can effectively query geospatial data and this is shown in the last piece of this dashboard. The map shows all weather stations that are within a radius of 1000km from Berlin. Let’s check the query itself that produces the data: to show points on the map we need latitude and longitude information from each point, which is easy to get due to existing scalar functions in CrateDB. In the WHERE
clause, we use the distance function to filter only those points that satisfy our condition. The result illustrates four locations.
With this, we are closing our overview on time series visualization. You have learned how to use Grafana with CrateDB to query, visualize and further analyze your time series data.