Riak: Full-text search with Ripple

One of the major disadvantages of Riak's key-value store is that objects can normally be looked up only by their bucket key; Active Record style searches on attribute values are not available.

For example, a search like

User.where(:active => true, :role => "admin")

is not available in Riak. There is a simple workaround for this: Riak full-text search.

Riak Search is a distributed, easily-scalable, failure-tolerant, real-time, full-text search engine built around Riak Core and tightly integrated with Riak KV. Riak Search allows you to find and retrieve your Riak objects using the objects’ values. When a Riak KV bucket has been enabled for Search integration (by installing the Search pre-commit hook), any objects stored in that bucket are also indexed seamlessly in Riak Search.

The Ripple documentation on using Riak Search isn't straightforward for an Active Record newbie, but after a little digging around I got it working.

1. Enable Riak search in app.config.

By default, Riak Search is disabled. Enabling it is as simple as changing a false to true in app.config:

<pre>{riak_search, [{enabled, true}]},</pre>

2. Index the bucket for which you want to use Riak Search

By default, none of the buckets are indexed; you have to tell Riak which buckets to index. From the console, run:

BucketName.bucket.enable_index!

Here BucketName stands for your Ripple model class (for example, User). From then on, every new entry written to the bucket is indexed. Please note that data already present in the bucket is not indexed by this call.
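
Since old entries are not picked up automatically, one way to get them indexed is to write them again so the pre-commit hook fires for each of them. The following is only a minimal sketch: `reindex_all` is a hypothetical helper of my own, not part of Ripple, and it assumes each document responds to `save` and returns a truthy value on success, as Ripple documents do.

```ruby
# Hypothetical helper (not part of Ripple): re-save every document so
# that the Search pre-commit hook runs again and indexes it.
# Returns the number of documents whose save reported success.
def reindex_all(documents)
  documents.count { |doc| doc.save }
end

# Assumed usage, if your Ripple version offers a finder that loads
# every document of a model (listing all keys is expensive on large
# buckets, so run this sparingly):
#   reindex_all(User.all)
```

The helper only needs something enumerable, so you can also feed it documents in smaller batches instead of the whole bucket at once.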

3. Searching using a Lucene query

Now we can use a Lucene query to search the indexed data.

client = Ripple.client

This returns the Riak client object that Ripple is configured with.

query = "role:admin AND active:1"

A Lucene query that matches entries where role is admin and active is true.

client.search("users", query)

We use the search method to run the Lucene query on the users bucket. This returns a JSON result.
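
Writing the query string by hand gets tedious once you have more than a couple of conditions. As a rough sketch, a small helper can build it from a hash much like the one you would pass to an Active Record where call. `lucene_query` is a hypothetical name of my own, and the helper only covers simple field:value terms joined with AND; it does not escape Lucene special characters inside values.

```ruby
# Hypothetical helper: turn a hash of field => value pairs into a
# Lucene query string, joining the individual terms with AND.
def lucene_query(conditions)
  terms = conditions.map do |field, value|
    term = value.to_s
    # Quote values containing whitespace so they are matched as one phrase.
    term = %("#{term}") if term =~ /\s/
    "#{field}:#{term}"
  end
  terms.join(" AND ")
end

query = lucene_query(:role => "admin", :active => 1)
puts query
```

The resulting string can be passed to the search method in place of the hand-written query.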

4. Params for search

The search method takes an optional extra argument, a hash in which we can specify certain options:

  • df: the default field to search in
  • 'q.op': the default operator between terms ("or", "and")
  • wt ("json"): the response type; "json" and "xml" are valid
  • sort ('none'): the field and direction to sort by, e.g. "name asc"
  • start (0): the offset into the results to start from, e.g. for pagination
  • rows (10): the number of results to return

start and rows are neat little options that let you paginate over your search results.
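
To make the pagination idea concrete, here is a minimal sketch that turns a page number into the start and rows options. `page_options` is a hypothetical helper of my own, and I am assuming the options hash is passed with symbol keys; check this against the client version you use.

```ruby
# Hypothetical helper: translate a 1-based page number into the
# start/rows options for a search call.
def page_options(page, per_page = 10)
  page = 1 if page < 1 # treat anything below the first page as page 1
  { :start => (page - 1) * per_page, :rows => per_page }
end

# Assumed usage, with client and query set up as in step 3:
#   client.search("users", query, page_options(3))
```

With the default of 10 rows per page, page 3 starts at offset 20, so the call skips the first 20 matches and returns the next 10.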
