This is Part 2 in a series on Thrift support in ScyllaDB. Part 1 is here.
KairosDB is an open source time series database written on top of Cassandra, which it accesses through the Hector client library. That means it is still using the Thrift API. KairosDB can easily be set up by following the getting started instructions. All the Cassandra-specific configuration options are valid for ScyllaDB as well. To have KairosDB use ScyllaDB, all we need to do is start ScyllaDB instead of Cassandra and then start KairosDB:
$ /usr/bin/scylla --smp 2 --memory 4GB
$ kairosdb/bin/kairosdb.sh run
Note that if you’re running both ScyllaDB and KairosDB on the same machine, you’ll want to constrain ScyllaDB’s CPU and memory usage, as in the command line above (preferably by editing the /etc/sysconfig/scylla-server configuration file).
We can use KairosDB to store a time series of how many clicks a given story receives. KairosDB allows pushing data via telnet, so we can use the following script to populate it with some data, assuming KairosDB is running on the local host:
#!/bin/bash
# Current time in milliseconds
now=$(($(date +%s%N)/1000000))
metric=story_views
story_id=$RANDOM
scrolled=$(($RANDOM % 3))
if [ $scrolled -eq 0 ]
then
scrolled="start"
elif [ $scrolled -eq 1 ]
then
scrolled="middle"
else
scrolled="end"
fi
echo "put $metric $now $story_id scrolled=$scrolled" \
| nc -w 30 localhost 4242
This script adds a data point for the story_views metric at the current time, assigning it an artificial story_id and tagging it with the scrolled tag, which acts as a heuristic about how the user engaged with the story. When pushing the first data point, KairosDB will create its schema in ScyllaDB.
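Since the script above sends a single data point per run, a loop is handy for populating the database with a larger sample. The following is a minimal sketch of one way to do that, not part of KairosDB itself: the helper names scrolled_label, put_line and send_batch are made up for illustration, and it assumes the same telnet endpoint (localhost:4242) and GNU date as the script above.

```shell
#!/bin/bash
# Sketch only: scrolled_label, put_line and send_batch are hypothetical helpers.

# Map 0, 1, 2 to the scrolled labels used by the script above.
scrolled_label() {
  case "$1" in
    0) echo "start" ;;
    1) echo "middle" ;;
    *) echo "end" ;;
  esac
}

# Build one telnet "put" line: metric, timestamp (ms), value, scrolled index.
put_line() {
  echo "put $1 $2 $3 scrolled=$(scrolled_label "$4")"
}

# Send $1 random data points over a single connection.
# Assumes KairosDB accepts telnet commands on localhost:4242, as above.
send_batch() {
  count=$1
  i=0
  while [ "$i" -lt "$count" ]; do
    now=$(($(date +%s%N) / 1000000))
    put_line story_views "$now" "$RANDOM" "$((RANDOM % 3))"
    i=$((i + 1))
  done | nc -w 30 localhost 4242
}
```

With KairosDB running, calling send_batch 1000 would push a thousand points through one nc connection instead of opening a connection per point.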
After inserting some data into KairosDB, we can inspect the CQL tables that ScyllaDB created using cqlsh. For example, we can describe the string_index table, which contains the metric names, tag names and tag values:
cqlsh> desc kairosdb.string_index;
CREATE TABLE kairosdb.string_index (
key blob,
column1 text,
value blob,
PRIMARY KEY (key, column1)
) WITH COMPACT STORAGE
AND CLUSTERING ORDER BY (column1 ASC);
We can see that this table was created from the definition of a dynamic column family, as it defines a clustering key.
We can also use cqlsh to query this table and see that it contains entries with the metric name and the different values for the scrolled tag:
cqlsh> select * from kairosdb.string_index where key = 0x7461675f76616c756573;
 key                    | column1       | value
------------------------+---------------+-------
 0x7461675f76616c756573 |           end |    0x
 0x7461675f76616c756573 |        middle |    0x
 0x7461675f76616c756573 |   story_views |    0x
 0x7461675f76616c756573 |         start |    0x
...
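The blob literal used as the key in this query is the hex encoding of the ASCII string tag_values. As a small sketch, a helper like the one below (to_blob_literal is a made-up name) can produce such literals from a readable string; it only relies on printf, od and tr.

```shell
# Sketch only: to_blob_literal is a hypothetical helper name.
# Encode an ASCII string as a CQL blob literal (0x followed by hex bytes).
to_blob_literal() {
  printf '0x%s\n' "$(printf '%s' "$1" | od -An -v -tx1 | tr -d ' \n')"
}

to_blob_literal tag_values   # prints 0x7461675f76616c756573
```

The same helper can be used to build the WHERE clause for any other row key whose name is known.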
All of the KairosDB API works well – and transparently – with ScyllaDB. We can, for example, query the 2 most recent stories (assuming sequentially incremented IDs) viewed in the last day:
$ curl -XPOST http://localhost:8080/api/v1/datapoints/query -d '{
  "start_relative": {
    "value": "1",
    "unit": "days"
  },
  "metrics": [
    {
      "name": "story_views",
      "order": "desc",
      "limit": "2"
    }
  ]
}'
> {
  "queries": [
    {
      "results": [
        {
          "group_by": [
            {
              "name": "type",
              "type": "number"
            }
          ],
          "name": "story_views",
          "tags": {
            "scrolled": ["end"]
          },
          "values": [
            [1469016283663, 12506],
            [1469016283636, 28348]
          ]
        }
      ],
      "sample_size": 2
    }
  ]
}
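Each entry in values is a pair of a timestamp in milliseconds, as pushed by the script earlier, and the stored value. To read those timestamps more easily, a helper along these lines can convert them to UTC. This is a sketch only: ms_to_utc is a made-up name, and it assumes GNU date (which the insertion script already depends on for %N).

```shell
# Sketch only: ms_to_utc is a hypothetical helper name; requires GNU date.
# Convert a millisecond epoch timestamp to a UTC date and time.
ms_to_utc() {
  date -u -d "@$(($1 / 1000))" '+%Y-%m-%d %H:%M:%S'
}

ms_to_utc 1469016283663   # prints 2016-07-20 12:04:43
```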
We can also issue a more complex query, such as counting how many stories were viewed and scrolled until the end during the last day:
$ curl -XPOST http://localhost:8080/api/v1/datapoints/query -d '{
  "start_relative": {
    "value": "1",
    "unit": "days"
  },
  "metrics": [
    {
      "name": "story_views",
      "tags": {
        "scrolled": ["end"]
      },
      "aggregators": [
        {
          "name": "count",
          "sampling": {
            "value": 1,
            "unit": "days"
          }
        }
      ]
    }
  ]
}'
> {
  "queries": [
    {
      "results": [
        {
          "group_by": [
            {
              "name": "type",
              "type": "number"
            }
          ],
          "name": "story_views",
          "tags": {
            "scrolled": ["end"]
          },
          "values": [
            [1469014929878, 455]
          ]
        }
      ],
      "sample_size": 455
    }
  ]
}
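To run the same count for every value of the scrolled tag, the request body can be generated rather than typed out each time. The sketch below shows one way to do it; count_query_body and count_all are made-up helper names, and the endpoint is the same local one used in the queries above.

```shell
# Sketch only: count_query_body and count_all are hypothetical helper names.

# Print the JSON body of a one-day count query for a metric ($1)
# restricted to a single scrolled tag value ($2).
count_query_body() {
  cat <<EOF
{
  "start_relative": {"value": "1", "unit": "days"},
  "metrics": [{
    "name": "$1",
    "tags": {"scrolled": ["$2"]},
    "aggregators": [{"name": "count", "sampling": {"value": 1, "unit": "days"}}]
  }]
}
EOF
}

# Issue the count query once per tag value.
# Only call this with KairosDB listening on localhost:8080.
count_all() {
  for value in start middle end; do
    count_query_body story_views "$value" \
      | curl -s -XPOST http://localhost:8080/api/v1/datapoints/query -d @-
    echo
  done
}
```

Calling count_all prints one response per tag value, each in the same shape as the response shown above.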
When we’re done, we can delete our metric and all of its data points, using cqlsh to verify data is actually being deleted:
cqlsh> select count(*) from kairosdb.data_points;
 count
-------
  1665
$ curl -XDELETE http://localhost:8080/api/v1/metric/story_views
cqlsh> select count(*) from kairosdb.data_points;
 count
-------
   681
Follow ScyllaDB on Twitter for updates.
Thrift and KairosDB at ScyllaDB Summit
Thrift is one of many topics to be covered at the upcoming ScyllaDB Summit. Come to ScyllaDB Summit on September 6th, in San Jose, California, to learn more about Thrift and other new and upcoming ScyllaDB features—along with info on how companies like IBM, Outbrain, Samsung SDS, Appnexus, Hulu, and Mogujie are using ScyllaDB for better performance and faster development. Meet ScyllaDB developers and devops users who will cover ScyllaDB design, best practices, advanced tooling and future roadmap items.
Going to Cassandra Summit? Add another day of NoSQL. ScyllaDB Summit takes place the day before Cassandra Summit begins, at the Hilton San Jose, adjacent to the San Jose Convention Center. Lunch and refreshments are provided.