Many to many relationships and RavenDB models
I've recently come across this question in StackOverflow titled "Many to many design for NoSql (RavenDB)", and I thought it is a really good case of a very frequently asked question worth featuring here:
I have two simple objects, 'user' and 'events'. A user can enter many events and an event can be entered by many users - a standard many to many relationship. I'm trying to get out of the relational database mindset!
Here are the queries/actions that I'd like to run against the database:
- Get me all the events that a user has not entered
- Get me all the events that a user has entered
- Update all the events (properties like remaining spaces) very frequently (data polled from various external data sources).
- Removing events when they've expired
The OP then lists 3 options of solving this:
- A "relational approach": Create a new object, that links user and events. For example, a "booking" object, that stores the userId, eventId
- Denormalise the events data within the user object, so there is a list of eventIds on the user.
- Don't use RavenDB for this, instead use a relational database.
Relational vs RavenDB modeling questions are very common, and the question about many to many relations is one in particular that I'm getting very frequently. That is, when people are aware design decisions differ between relational databases and document databases (and between document databases and RavenDB, and basically between every type of non-relational databases). Its also very common for people to find it hard to escape the relational thinking, as can be evident from the 3 options the OP listed.
I was just finishing writing chapter 5 of RavenDB in Action which discusses document-oriented modeling, so I grabbed my pen (well, keyboard) and wrote down the following answer. It was only a coincidence that I previously discussed almost the exact same domain in the past.
The event-registration domain is a great one for discussing RavenDB modeling, especially because there may be various use cases in play, and use cases dictate the approach to take when deciding on the model. This is pretty much what I tried explaining in my answer, using some nice indexing technique to solve one less-trivial query that was required:
Actually, RavenDB is a perfect fit for that. To properly do that, ask yourself what are the main entities in your model? each one of those will be a document type in RavenDB.
So in your scenario, you'd have Event
and User
. Then, an Event
can have a list of User
IDs which you can then easily index and query on. There are more ways to do that, and I actually discussed this in my blog some time in the past with some further considerations that might come up.
The only non-trivial bit is probably the index for answering queries like "all events user has not entered", but that's quite easily done as well:
public class Events_Registrations : AbstractIndexCreationTask<Event>
{
public Events_Registrations()
{
Map = events => from e in events
select new { EventId = e.Id, UserId = e.Registrations.SelectMany(x => x.UserId) });
}
}
Once you have such an index in place, you can do a query like below to get all events a specified user has no registrations to:
var events = RavenSession.Advanced.LuceneQuery<Event, Events_Registrations>()
.Where("EventId:*")
.AndAlso()
.Not
.WhereEquals("UserId", userId).ToList();
Then handling things like expiring events etc is very easily done. Definitely don't denormalize event data in the User object, it will make your life living hell.
There's some data missing though - for example how many registrations are allow per event. If there are too many you may want to break it out of the actual Event
object, or to revert to Booking objects as you mention. I discuss this in length in my book RavenDB in Action.