There are many paths to adopting and mastering ScyllaDB, but the vast majority include a visit to ScyllaDB University. I want to update the ScyllaDB community about recent changes to ScyllaDB University, introduce the High Availability lab, and share some future events and plans.
What is ScyllaDB University?
For those new to ScyllaDB University, it’s an online learning and training center for ScyllaDB. The material is self-paced and includes theory, quizzes, and hands-on labs that you can run yourself. Access is free; to get started, just create an account.
ScyllaDB University has six courses and covers most ScyllaDB-related material.
As the product evolves, the training material is updated and enhanced.
In addition to the self-paced, on-demand material available on ScyllaDB University, we host live training events. These virtual and highly interactive events offer special opportunities for users to learn from the leading ScyllaDB engineers and experts, ask questions, and connect with their peers.
The last ScyllaDB University LIVE event took place in March. It had two tracks: an Essentials track covering basic content, as well as an Advanced track covering the latest features and more complex use cases. The next event in the US/EMEA time zones will be on June 18th.
Join us at ScyllaDB University LIVE
Also, if you’re in India, we’re hosting a special event for you! On May 29, 9:30am-11:30am IST, join us for an interactive workshop where we’ll go hands-on to build and interact with high-performance apps using ScyllaDB. This is a great way to discover the NoSQL strategies used by top teams and apply them in a guided, supportive environment.
Register for ScyllaDB Labs India
Example Lab: High Availability
Here’s a taste of the type of content you will find on ScyllaDB University. The High Availability lab is one of the most popular labs on the platform. It demonstrates what happens when we read and write data at different consistency levels while some of the replica nodes are unavailable.
ScyllaDB, at its core, is designed to be highly available. With the right settings, it will remain up and available at all times.
The database features a peer-to-peer architecture that is leaderless and uses the Gossip protocol for internode communication. This enhances resiliency and scalability. Upon startup and during topology changes such as node addition or removal, nodes automatically discover and communicate with each other, streaming data between them as needed.
Multiple copies of the data (replicas) are kept. This is determined by setting a replication factor (RF) to ensure data is stored on multiple nodes, thereby allowing for data redundancy and high availability. Even if a node fails, the data is still available on other nodes, reducing downtime risks.
Topology awareness is part of the design. ScyllaDB enhances its fault tolerance through rack and data center awareness, allowing distribution across multiple locations. It supports multi-datacenter replication with customizable replication factors for each site, ensuring data resiliency and availability even in case of significant failures of entire data centers.
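To make the per-site replication factors concrete, here is a minimal sketch of how such a keyspace definition could be generated. It is not part of the lab, which uses a single data center. The helper name keyspace_cql and the data center names dc_east and dc_west are hypothetical; in a real cluster the names must match the data centers reported by nodetool status.

```shell
# Build a CREATE KEYSPACE statement with one replication factor per data center.
# Usage: keyspace_cql KEYSPACE DC=RF [DC=RF ...]
# (keyspace_cql is a hypothetical helper, not a ScyllaDB tool.)
keyspace_cql() {
  keyspace="$1"; shift
  options="'class' : 'NetworkTopologyStrategy'"
  for pair in "$@"; do
    dc="${pair%%=*}"   # text before the first '='
    rf="${pair#*=}"    # text after the first '='
    options="$options, '$dc' : $rf"
  done
  echo "CREATE KEYSPACE $keyspace WITH REPLICATION = { $options };"
}

# Hypothetical data center names, used only for illustration.
keyspace_cql mykeyspace dc_east=3 dc_west=2
```

The printed statement keeps three replicas in one site and two in the other, so losing an entire data center still leaves copies of every row in the surviving one.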
First, we’ll bring up a 3-node ScyllaDB Docker cluster.
Set up a Docker Container with one node, called Node_X:
docker run --name Node_X -d scylladb/scylla:5.2.0 --overprovisioned 1 --smp 1
Create two more nodes, Node_Y and Node_Z, and add them to the cluster of Node_X. The command substitution "$(docker inspect --format='{{ .NetworkSettings.IPAddress }}' Node_X)" expands to the IP address of Node_X:
docker run --name Node_Y -d scylladb/scylla:5.2.0 --seeds="$(docker inspect --format='{{ .NetworkSettings.IPAddress }}' Node_X)" --overprovisioned 1 --smp 1
docker run --name Node_Z -d scylladb/scylla:5.2.0 --seeds="$(docker inspect --format='{{ .NetworkSettings.IPAddress }}' Node_X)" --overprovisioned 1 --smp 1
Wait a minute or so and check the node status:
docker exec -it Node_Z nodetool status
You’ll see that eventually all the nodes show UN as their status: U means up, and N means normal. The Nodetool Status documentation has more details.
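Instead of rerunning the status command by hand, the wait can be scripted. The sketch below is one possible approach, not part of the original lab: count_nodes_in_state and wait_for_cluster are hypothetical helper names, and the 5-second polling interval and 30-attempt limit are arbitrary assumptions. It relies only on the fact that each node row in the nodetool status output starts with its two-letter state code.

```shell
# Count nodes in a given two-letter state (for example UN or DN)
# in `nodetool status` output read from stdin.
count_nodes_in_state() {
  # Node rows start with the state code; header lines never match it exactly.
  awk -v state="$1" '$1 == state { n++ } END { print n + 0 }'
}

# Poll until the expected number of nodes report UN, or give up.
# Arguments: container name, expected node count, max attempts (default 30).
wait_for_cluster() {
  container="$1"; expected="$2"; attempts="${3:-30}"
  while [ "$attempts" -gt 0 ]; do
    up="$(docker exec "$container" nodetool status 2>/dev/null | count_nodes_in_state UN)"
    if [ "$up" -ge "$expected" ]; then
      echo "cluster ready: $up nodes are UN"
      return 0
    fi
    attempts=$((attempts - 1))
    sleep 5
  done
  echo "timed out waiting for $expected UN nodes" >&2
  return 1
}
```

With these functions defined, running wait_for_cluster Node_Z 3 would block until all three nodes are up and normal. The same counting helper can be called with DN later in the lab to confirm that a stopped node is reported as down.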
Once the nodes are up, and the cluster is set, we can use the CQL shell to create a table.
Run a CQL shell:
docker exec -it Node_Z cqlsh
Create a keyspace called “mykeyspace”, with a Replication Factor of three:
CREATE KEYSPACE mykeyspace WITH REPLICATION = { 'class' : 'NetworkTopologyStrategy', 'replication_factor' : 3};
Next, create a table with three columns: user id, first name, and last name:
use mykeyspace;
CREATE TABLE users ( user_id int, fname text, lname text, PRIMARY KEY((user_id)));
Insert two rows into the newly created table:
insert into users(user_id, fname, lname) values (1, 'rick', 'sanchez');
insert into users(user_id, fname, lname) values (4, 'rust', 'cohle');
Read the table contents:
select * from users;
Now we will see what happens when we try to read and write data to our table, with varying Consistency Levels and when some of the nodes are down.
Let’s make sure all our nodes are still up:
docker exec -it Node_Z nodetool status
Now, set the Consistency Level to QUORUM and perform a write:
docker exec -it Node_Z cqlsh
use mykeyspace;
CONSISTENCY QUORUM
insert into users (user_id, fname, lname) values (7, 'eric', 'cartman');
Read the data to see if the insert was successful:
select * from users;
The read and write operations were successful. What do you think would happen if we did the same thing with Consistency Level ALL?
CONSISTENCY ALL
insert into users (user_id, fname, lname) values (8, 'lorne', 'malvo');
select * from users;
The operations were successful again. CL = ALL means that every replica must respond to the read or write. With a Replication Factor of 3, that is all three nodes. Since all nodes are up, this works.
Next, we’ll take one node down and check read and write operations with Consistency Levels of QUORUM and ALL.
Take down Node_Y and check the status (it might take some time until the node is actually down):
exit
docker stop Node_Y
docker exec -it Node_Z nodetool status
Now, set the Consistency Level to QUORUM and perform a write:
docker exec -it Node_Z cqlsh
CONSISTENCY QUORUM
use mykeyspace;
insert into users (user_id, fname, lname) values (9, 'avon', 'barksdale');
Read the data to see if the insert was successful:
select * from users;
With CL = QUORUM, the read and write were successful. What will happen with CL = ALL?
CONSISTENCY ALL
insert into users (user_id, fname, lname) values (10, 'vm', 'varga');
select * from users;
Both read and write fail. CL = ALL requires that we read/write to three nodes (based on RF = 3), but only two nodes are up.
What happens if another node is down?
Take down Node_Z and check the status:
exit
docker stop Node_Z
docker exec -it Node_X nodetool status
Now, set the Consistency Level to QUORUM and perform a read and a write:
docker exec -it Node_X cqlsh
CONSISTENCY QUORUM
use mykeyspace;
insert into users (user_id, fname, lname) values (11, 'morty', 'smith');
select * from users;
With CL = QUORUM, the read and write fail. Since RF=3, QUORUM means at least two nodes need to be up. In our case, just one is. What will happen with CL = ONE?
CONSISTENCY ONE
insert into users (user_id, fname, lname) values (12, 'marlo', 'stanfield');
select * from users;
With CL = ONE, the read and write succeed. CL = ONE requires a response from only one replica, so the cluster remains available even with a single node up.
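The outcomes seen throughout the lab follow from simple arithmetic: ONE needs one replica, QUORUM needs floor(RF / 2) + 1, and ALL needs RF. The sketch below tabulates this for the three situations in the lab. The function names required_replicas and check_availability are hypothetical, and the model is deliberately simplified: it assumes every live node is a replica for the data, which holds here because the cluster has three nodes and RF = 3.

```shell
# Replicas that must respond for a given consistency level and replication factor.
required_replicas() {
  cl="$1"; rf="$2"
  case "$cl" in
    ONE)    echo 1 ;;
    QUORUM) echo $(( rf / 2 + 1 )) ;;   # integer division: floor(RF / 2) + 1
    ALL)    echo "$rf" ;;
    *)      echo "unsupported consistency level: $cl" >&2; return 1 ;;
  esac
}

# Report whether a request can succeed with the given number of live replicas.
check_availability() {
  cl="$1"; rf="$2"; live="$3"
  need="$(required_replicas "$cl" "$rf")" || return 1
  if [ "$live" -ge "$need" ]; then
    echo "CL=$cl RF=$rf live=$live: succeeds (needs $need)"
  else
    echo "CL=$cl RF=$rf live=$live: fails (needs $need)"
  fi
}

# The three situations from the lab: 3, 2, and 1 live nodes with RF = 3.
for n in 3 2 1; do
  for level in ONE QUORUM ALL; do
    check_availability "$level" 3 "$n"
  done
done
```

The loop prints nine lines, one per combination. They match the lab: everything succeeds with three nodes up, only ALL fails with two, and only ONE succeeds with one.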
Check out the full High Availability lesson and lab on ScyllaDB University, with expected outputs, full explanation of the theory, and an option for seamlessly running the lab within a browser, regardless of the platform you’re using.
Next Steps and Upcoming NoSQL Events
A good starting point on ScyllaDB University is the ScyllaDB Essentials – Overview of ScyllaDB and NoSQL Basics course. There, you can find more hands-on labs like the one above, and learn about distributed databases, NoSQL, and ScyllaDB.
ScyllaDB is backed by a vibrant community of users and developers. If you have any questions about this blog post or about a specific lesson, or want to discuss it, you can write on the ScyllaDB community forum. Say hello here.
The next live training events are:
- Asia-friendly time zones: May 29
- US and Europe, Middle East, and Africa (EMEA) time zones: June 18
Stay tuned for more details!