Cassandra with Ruby

Ruby Gem Installation for Cassandra

Install the Cassandra gem for Ruby

sudo gem install cassandra

Cassandra with Ruby

Load the Cassandra gem in a Ruby file

require 'cassandra'

Connect to a keyspace on a Cassandra server

client = Cassandra.new('Twitter')                    # uses the gem's default server
client = Cassandra.new('Twitter', '127.0.0.1:9160')  # explicit host:port
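
The server string is often taken from configuration rather than hard-coded. A minimal sketch, assuming two hypothetical environment variables, CASSANDRA_HOST and CASSANDRA_PORT (the gem does not read them itself), and falling back to the address used above:

```ruby
# Build a "host:port" server string for Cassandra.new.
# CASSANDRA_HOST and CASSANDRA_PORT are hypothetical variable names chosen
# for this sketch; the gem does not read them on its own.
def cassandra_server(env = ENV)
  host = env.fetch('CASSANDRA_HOST', '127.0.0.1')
  port = Integer(env.fetch('CASSANDRA_PORT', '9160'))
  "#{host}:#{port}"
end

puts cassandra_server({})  # prints 127.0.0.1:9160
# With the gem loaded and a server running:
#   client = Cassandra.new('Twitter', cassandra_server)
```

Passing the environment as an argument keeps the helper easy to check without touching the real ENV.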

Authenticate yourself if required

client.login!('username','password')

cassandra-cli script for creating the keyspaces for Twissandra

create keyspace Twitter with
  placement_strategy = 'org.apache.cassandra.locator.SimpleStrategy' AND
  replication_factor = 1;
use Twitter;
create column family Users with comparator = 'UTF8Type';
create column family UserAudits with comparator = 'UTF8Type';
create column family UserRelationships with
  comparator = 'UTF8Type' and
  column_type = 'Super' and
  subcomparator = 'TimeUUIDType';
create column family Usernames with comparator = 'UTF8Type';
create column family Statuses with comparator = 'UTF8Type';
create column family StatusAudits with comparator = 'UTF8Type';
create column family StatusRelationships with
  comparator = 'UTF8Type' and
  column_type = 'Super' and
  subcomparator = 'TimeUUIDType';
create column family Index with
  comparator = 'UTF8Type' and
  column_type = 'Super';
create column family TimelinishThings with
  comparator = 'BytesType';

create keyspace Multiblog with
  placement_strategy = 'org.apache.cassandra.locator.SimpleStrategy' AND
  replication_factor = 1;
use Multiblog;
create column family Blogs with comparator = 'TimeUUIDType';
create column family Comments with comparator = 'TimeUUIDType';


create keyspace MultiblogLong with
  placement_strategy = 'org.apache.cassandra.locator.SimpleStrategy' AND
  replication_factor = 1;
use MultiblogLong;
create column family Blogs with comparator = 'LongType';
create column family Comments with comparator = 'LongType';

create keyspace CassandraObject with
  placement_strategy = 'org.apache.cassandra.locator.SimpleStrategy' AND
  replication_factor = 1;
use CassandraObject;
create column family Customers with comparator = 'UTF8Type';
create column family CustomerRelationships with
  comparator = 'UTF8Type' AND
  column_type = 'Super' AND
  subcomparator = 'TimeUUIDType';
create column family CustomersByLastName with comparator = 'TimeUUIDType';
create column family Invoices with comparator = 'UTF8Type';
create column family InvoiceRelationships with
  comparator = 'UTF8Type' AND
  column_type = 'Super' AND
  subcomparator = 'TimeUUIDType';
create column family InvoicesByNumber with comparator = 'UTF8Type';
create column family Payments with comparator = 'UTF8Type';
create column family Appointments with comparator = 'UTF8Type';

Cassandra Twissandra JSON Schema

{"Twitter":{
    "Users":{
      "comparator_type":"org.apache.cassandra.db.marshal.UTF8Type",
      "column_type":"Standard"},
    "UserAudits":{
      "comparator_type":"org.apache.cassandra.db.marshal.UTF8Type",
      "column_type":"Standard"},
    "UserRelationships":{
      "subcomparator_type":"org.apache.cassandra.db.marshal.TimeUUIDType",
      "comparator_type":"org.apache.cassandra.db.marshal.UTF8Type",
      "column_type":"Super"},
    "Usernames":{
      "comparator_type":"org.apache.cassandra.db.marshal.UTF8Type",
      "column_type":"Standard"},
    "Statuses":{
      "comparator_type":"org.apache.cassandra.db.marshal.UTF8Type",
      "column_type":"Standard"},
    "StatusAudits":{
      "comparator_type":"org.apache.cassandra.db.marshal.UTF8Type",
      "column_type":"Standard"},
    "StatusRelationships":{
      "subcomparator_type":"org.apache.cassandra.db.marshal.TimeUUIDType",
      "comparator_type":"org.apache.cassandra.db.marshal.UTF8Type",
      "column_type":"Super"},
    "Index":{
      "comparator_type":"org.apache.cassandra.db.marshal.UTF8Type",
      "column_type":"Super"},
    "TimelinishThings":{
      "comparator_type":"org.apache.cassandra.db.marshal.BytesType",
      "column_type":"Standard"}
  },
"Multiblog":{
    "Blogs":{
      "comparator_type":"org.apache.cassandra.db.marshal.TimeUUIDType",
      "column_type":"Standard"},
    "Comments":{
      "comparator_type":"org.apache.cassandra.db.marshal.TimeUUIDType",
      "column_type":"Standard"}
  },
"MultiblogLong":{
    "Blogs":{
      "comparator_type":"org.apache.cassandra.db.marshal.LongType",
      "column_type":"Standard"},
    "Comments":{
      "comparator_type":"org.apache.cassandra.db.marshal.LongType",
      "column_type":"Standard"}
  }
}
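
The JSON schema and the cassandra-cli script describe the same column families: the JSON uses fully qualified marshal class names, while the script uses the short names. The sketch below, using only Ruby's standard json library, shows that mapping for a small excerpt. The helper names short_type and column_family_statement are made up for this illustration and are not part of any Cassandra tool.

```ruby
require 'json'

# "org.apache.cassandra.db.marshal.UTF8Type" -> "UTF8Type"
def short_type(class_name)
  class_name.split('.').last
end

# Render one column family definition as a cassandra-cli statement.
def column_family_statement(name, attrs)
  options = ["comparator = '#{short_type(attrs['comparator_type'])}'"]
  options << "column_type = 'Super'" if attrs['column_type'] == 'Super'
  if attrs['subcomparator_type']
    options << "subcomparator = '#{short_type(attrs['subcomparator_type'])}'"
  end
  "create column family #{name} with #{options.join(' and ')};"
end

# Excerpt of the schema above.
schema = JSON.parse(<<~SCHEMA)
  {"Twitter": {
    "Users": {
      "comparator_type": "org.apache.cassandra.db.marshal.UTF8Type",
      "column_type": "Standard"},
    "UserRelationships": {
      "subcomparator_type": "org.apache.cassandra.db.marshal.TimeUUIDType",
      "comparator_type": "org.apache.cassandra.db.marshal.UTF8Type",
      "column_type": "Super"}
  }}
SCHEMA

schema.each do |keyspace, families|
  puts "use #{keyspace};"
  families.each { |name, attrs| puts column_family_statement(name, attrs) }
end
```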

Create the schema with a JSON definition

bin/cassandra-cli --host 10.10.10.1 --batch < twitter.json

Cassandra Insert with Ruby

Insert a row into a column family Users

client.insert(:Users, "10", {'screen_name' => "john"})
  • "10" is the key to the row

Insert a row with a time to live (TTL), given in seconds

client.insert(:Users, "10", {'screen_name' => "john"}, {:ttl=>10})
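
The :ttl value is a number of seconds; once that interval has passed, the column is no longer returned. The toy model below is plain Ruby, not the gem, and the ExpiringColumn name is invented here. It illustrates the intended behavior with explicit timestamps; the exact boundary handling inside Cassandra may differ.

```ruby
# Hypothetical model of a column written with an optional TTL in seconds.
ExpiringColumn = Struct.new(:value, :written_at, :ttl) do
  # A column without a TTL never expires; otherwise it is readable
  # until written_at + ttl.
  def live?(now)
    ttl.nil? || now < written_at + ttl
  end
end

written = Time.at(0)
column  = ExpiringColumn.new('john', written, 10)

puts column.live?(written + 9)   # true
puts column.live?(written + 10)  # false
```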

Post a tweet and a reply

t1 = {'text' => 'My first tweet', 'user_id' => '10'}
client.insert(:Statuses, '1', t1)

t2 = {'text' => '@paul welcome', 'user_id' => '10', 'reply_to_id' => '20'}
client.insert(:Statuses, '2', t2)

Insert a row into a super column family

client.insert(:UserRelationships, "10", {"user_timeline" => {UUID.new => "1"}})
client.insert(:UserRelationships, "10", {"user_timeline" => {UUID.new => "2"}})
  • UUID.new generates a unique, time-based column name, matching the TimeUUIDType subcomparator

Cassandra Query with Ruby

Syntax for retrieving a column

client.get("column_family", mykey, "mycolumn")

Syntax for retrieving a column inside a super column

client.get("column_family", mykey, "super_column", "column")

Query a super column

timeline = client.get(:UserRelationships, "5", "user_timeline")

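Query all the tweets in a timeline and return them as an array

# Timeline entries (time UUID => status id), newest first
tweets = client.get(:UserRelationships, '10', 'user_timeline', :reversed => true)
# Array of tweet texts, one lookup per status id
texts = tweets.map { |t, id| client.get(:Statuses, id, 'text') }
# The timeline itself as an array of [uuid, id] pairs
tweets = client.get(:UserRelationships, '10', 'user_timeline', :reversed => true).to_a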
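
To see how the timeline lookup fits together without a running server, here is a small in-memory stand-in. It is a sketch only: InMemoryClient is an invented class that mimics the call shapes used above, integer column names stand in for time UUIDs, and insertion order stands in for the TimeUUIDType ordering.

```ruby
# Toy in-memory stand-in for the client; NOT the cassandra gem.
class InMemoryClient
  def initialize
    @data = Hash.new { |cfs, cf| cfs[cf] = Hash.new { |rows, key| rows[key] = {} } }
  end

  # Merge columns into the row: an insert adds or overwrites columns.
  def insert(column_family, key, columns)
    row = @data[column_family][key]
    columns.each do |name, value|
      if value.is_a?(Hash)
        (row[name] ||= {}).merge!(value) # super column: merge sub-columns
      else
        row[name] = value
      end
    end
  end

  # Return a row, a column, or a sub-column; :reversed flips column order.
  def get(column_family, key, *path)
    options = path.last.is_a?(Hash) ? path.pop : {}
    result = path.inject(@data[column_family][key]) { |node, name| node && node[name] }
    result.is_a?(Hash) && options[:reversed] ? result.to_a.reverse.to_h : result
  end
end

client = InMemoryClient.new
client.insert(:Statuses, '1', { 'text' => 'My first tweet', 'user_id' => '10' })
client.insert(:Statuses, '2', { 'text' => '@paul welcome', 'user_id' => '10' })
client.insert(:UserRelationships, '10', { 'user_timeline' => { 1 => '1' } })
client.insert(:UserRelationships, '10', { 'user_timeline' => { 2 => '2' } })

timeline = client.get(:UserRelationships, '10', 'user_timeline', :reversed => true)
texts = timeline.map { |_column_name, id| client.get(:Statuses, id, 'text') }
puts texts.inspect # ["@paul welcome", "My first tweet"]
```

The timeline row stores only status ids, so each tweet text costs one extra lookup in Statuses.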

Cassandra Secondary Index with Ruby

Create and delete a Cassandra secondary index

client.create_index("Twitter", "Users", "revenue_generating_units", "LongType")
client.delete_index("Twitter", "Users", "revenue_generating_units")

Create an index clause and query an indexed column family

expr   = client.create_idx_expr("revenue_generating_units", 100, ">")
clause = client.create_idx_clause([expr])
client.get_indexed_slices(:Users, clause)
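
Conceptually, an index clause is a list of column comparisons that a row must all satisfy. The plain-Ruby sketch below models that filtering on an in-memory hash. IndexExpression and indexed_slices are invented names for this illustration, the sample rows are made up, and the real query is evaluated on the server against the secondary index rather than by scanning rows.

```ruby
# Comparison operators, keyed by the operator strings used in this sketch.
OPERATORS = {
  '==' => :==, '>' => :>, '>=' => :>=, '<' => :<, '<=' => :<=
}.freeze

IndexExpression = Struct.new(:column, :value, :op) do
  # True when the row has the column and the comparison holds.
  def match?(row)
    actual = row[column]
    !actual.nil? && actual.public_send(OPERATORS.fetch(op), value)
  end
end

# Keep the rows that satisfy every expression in the clause.
def indexed_slices(rows, expressions)
  rows.select { |_key, row| expressions.all? { |expr| expr.match?(row) } }
end

# Made-up sample rows, keyed by row key.
users = {
  '10' => { 'screen_name' => 'john', 'revenue_generating_units' => 250 },
  '20' => { 'screen_name' => 'paul', 'revenue_generating_units' => 40 }
}
expr = IndexExpression.new('revenue_generating_units', 100, '>')
puts indexed_slices(users, [expr]).keys.inspect # ["10"]
```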