[Openais] Add a graph hash table to libqb

Steven Dake sdake at redhat.com
Thu Apr 15 10:00:55 PDT 2010


On Thu, 2010-04-15 at 05:37 +0000, Dietmar Maurer wrote:
> > The long term goal of this work is to enable a replicated structured
> > memory-based key-value storage that maintains consistency after a merge
> > from a network partition.  This allows IPC speed reads, and network
> > speed writes of key/value pairs with full availability of key/value
> > data on all nodes in the network.
> 
> Do you plan to make that key-value store persistent? I have a
> glib-based implementation of such a thing which uses Berkeley DB to
> store data on each node (some code already posted on the openais list).
> 
> The advantage of the 'in memory' architecture is that it is easy
> to calculate diffs to do the sync (snapshots/transactions are
> simply a copy in memory). The drawback is memory consumption, which
> can be huge if there is a large number of key/value pairs. So
> a pure database implementation (db4.7 supports snapshots and 
> transactions) would be even better?
> 

For now we plan to address only small-scale (32 nodes or fewer),
consistent, replicated hashing in memory (not persistent).  Adding a
database backend for hash key/value writes seems pretty easy.
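
As a rough illustration of why a backend for writes looks easy to bolt
on, here is a minimal sketch of an in-memory table that forwards every
write to an optional hook.  All names (kv_store, kv_put, kv_get,
kv_write_hook, kv_counting_hook) are hypothetical and are not libqb
API; the counting hook only stands in for a real database call.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

#define KV_BUCKETS 64 /* hypothetical fixed size, chosen for brevity */

/* Hypothetical callback that a persistent backend would implement. */
typedef int (*kv_write_hook)(const char *key, const char *value, void *ctx);

struct kv_entry {
    char *key;
    char *value;
    struct kv_entry *next;
};

struct kv_store {
    struct kv_entry *buckets[KV_BUCKETS];
    kv_write_hook hook; /* NULL means memory only */
    void *hook_ctx;
};

static unsigned kv_hash(const char *s)
{
    unsigned h = 5381; /* djb2 string hash */
    while (*s)
        h = h * 33 + (unsigned char)*s++;
    return h % KV_BUCKETS;
}

static char *kv_strdup(const char *s)
{
    size_t n = strlen(s) + 1;
    char *p = malloc(n);
    if (p)
        memcpy(p, s, n);
    return p;
}

/* Store key/value in memory, then forward the write to the hook.
 * Returns -1 on allocation failure, otherwise the hook's return
 * value (0 when no hook is installed). */
static int kv_put(struct kv_store *st, const char *key, const char *value)
{
    unsigned b;
    struct kv_entry *e;
    char *copy;

    assert(st && key && value);
    b = kv_hash(key);
    copy = kv_strdup(value);
    if (!copy)
        return -1;

    for (e = st->buckets[b]; e; e = e->next) {
        if (strcmp(e->key, key) == 0) {
            free(e->value); /* overwrite an existing key */
            e->value = copy;
            break;
        }
    }
    if (!e) {
        e = malloc(sizeof(*e));
        if (!e) {
            free(copy);
            return -1;
        }
        e->key = kv_strdup(key);
        if (!e->key) {
            free(e);
            free(copy);
            return -1;
        }
        e->value = copy;
        e->next = st->buckets[b];
        st->buckets[b] = e;
    }
    return st->hook ? st->hook(key, value, st->hook_ctx) : 0;
}

/* Local lookup only: no hook and no network involved. */
static const char *kv_get(const struct kv_store *st, const char *key)
{
    const struct kv_entry *e;
    for (e = st->buckets[kv_hash(key)]; e; e = e->next)
        if (strcmp(e->key, key) == 0)
            return e->value;
    return NULL;
}

/* Stand-in backend that only counts writes. */
static int kv_counting_hook(const char *key, const char *value, void *ctx)
{
    (void)key;
    (void)value;
    ++*(int *)ctx;
    return 0;
}
```

Reads never touch the hook, so lookups stay at memory speed; only
writes pay for the backend.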

Longer term, the properties this type of technology should evolve
toward are:

1) key/value store with hash-table-speed lookup in the majority of
cases, without requiring a network round-trip request/response to
retrieve the key and value
2) scale to thousands of nodes
3) key/values are replicated, but perhaps not everywhere
4) key/values are persistently stored
5) API to access the key/value data store
6) assume that nodes fail often, and keep working in a nonblocking
fashion when those node failures occur
7) keys can be structured, so that an organization can be created that
identifies keys related to other keys
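
To make property 7 concrete, here is a minimal sketch of one possible
way to relate keys: treat them as dotted paths and call two keys
related when one lies below the other.  The '.' separator, the example
key names, and the functions key_is_under and keys_under are
assumptions made for illustration only; nothing here is a decided
design or existing libqb API.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Returns 1 if key is prefix itself or lies below it ("a.b" is below
 * "a"), 0 otherwise.  The '.' separator is an assumption of this
 * sketch. */
static int key_is_under(const char *key, const char *prefix)
{
    size_t n;

    assert(key && prefix);
    n = strlen(prefix);
    if (strncmp(key, prefix, n) != 0)
        return 0;
    /* Require a component boundary so "nodes.n10" is not under "nodes.n1". */
    return key[n] == '\0' || key[n] == '.';
}

/* Counts the keys in keys[0..count) that are related to prefix. */
static size_t keys_under(const char *const *keys, size_t count,
                         const char *prefix)
{
    size_t i, hits = 0;

    for (i = 0; i < count; i++)
        hits += (size_t)key_is_under(keys[i], prefix);
    return hits;
}
```

With keys such as "nodes.n1.state" and "nodes.n1.addr", asking for
everything under "nodes.n1" finds both while leaving "nodes.n10.state"
out.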

To achieve #2 and #6, we may need to investigate alternatives to
corosync.  The guarantees corosync provides make scaling difficult.
Relaxing those guarantees could allow us to scale much further for this
scenario.

Regards
-steve


