SisoDB: The wrong solution to the wrong problems

May 15th, 2011 .NET, English posts, RavenDB, SisoDB


Data structures are the cornerstone of computing. Get them right, and you will most probably succeed in delivering an application that uses its resources wisely and performs well. In modern computing, most data is stored in and retrieved from databases. Databases are data structures' big brothers - they serve the same purpose, but with added value. Choosing one wisely can help you in so many ways; going with the wrong one can cost you dearly. This is why the decision of which database solution to use should not be taken lightly.

Dealing with data explosion

Since the 70's, whenever data had to be persisted, RDBMSes were the most effective and trusted tools to use. Once OOP became dominant, developers found it quite irritating to stuff their hierarchical entities into the flat structure of tables and rows, which is the ABC of RDBMSes. This is how ORMs came to life. In retrospect, ORMs were never the solution; they just made the problem less irritating. In practice, your data still had to go through an awful lot of processing before it was persisted to, or loaded from, the store. But as long as this was transparent to the developer, who knew that loads of optimization were happening under the hood, it seemed like there was nothing to worry about.
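To make the mismatch concrete, here is a minimal sketch using Python's built-in sqlite3 and json modules. The `order` entity and all table and function names are made up for illustration; this is not how any particular ORM or document database is implemented. It only contrasts splitting one nested object across flat tables with storing it as a single document.

```python
import json
import sqlite3

# Hypothetical nested entity; the field names are illustrative only.
order = {
    "id": 1,
    "customer": "Alice",
    "lines": [
        {"sku": "A-1", "qty": 2},
        {"sku": "B-7", "qty": 1},
    ],
}

db = sqlite3.connect(":memory:")

# Relational route: the hierarchy is split across two flat tables.
db.execute("CREATE TABLE orders (id INTEGER PRIMARY KEY, customer TEXT)")
db.execute(
    "CREATE TABLE order_lines ("
    "order_id INTEGER REFERENCES orders(id), sku TEXT, qty INTEGER)"
)
db.execute("INSERT INTO orders VALUES (?, ?)", (order["id"], order["customer"]))
db.executemany(
    "INSERT INTO order_lines VALUES (?, ?, ?)",
    [(order["id"], line["sku"], line["qty"]) for line in order["lines"]],
)


def load_relational(order_id):
    """Reassemble the object from its rows: the work an ORM hides from you."""
    oid, customer = db.execute(
        "SELECT id, customer FROM orders WHERE id = ?", (order_id,)
    ).fetchone()
    lines = [
        {"sku": sku, "qty": qty}
        for sku, qty in db.execute(
            "SELECT sku, qty FROM order_lines WHERE order_id = ? ORDER BY rowid",
            (order_id,),
        )
    ]
    return {"id": oid, "customer": customer, "lines": lines}


# Document route: the whole hierarchy is stored and loaded as one value.
db.execute("CREATE TABLE documents (id INTEGER PRIMARY KEY, body TEXT)")
db.execute("INSERT INTO documents VALUES (?, ?)", (order["id"], json.dumps(order)))


def load_document(order_id):
    """Load the serialized object back in a single read."""
    (body,) = db.execute(
        "SELECT body FROM documents WHERE id = ?", (order_id,)
    ).fetchone()
    return json.loads(body)
```

Both loaders return the same object; the difference is how much splitting and reassembling happens on the way in and out.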

Although the concepts had been known since the 80's, it was not until recent years that real object, document, and graph databases came to life. It took big players like Facebook and Twitter for those ideas to mature and become production ready. Someone (or a handful of people) realized that a shift in thinking was essential, and real-world problems like replication and sharding suddenly seemed a lot less complicated. As a result, the NoSQL movement (or whatever it has become) is now going full steam ahead, and data-access best practices are being rewritten.

Each NoSQL brand introduces some cool, unique features never seen in RDBMSes before. Document-oriented databases introduced the "schema-less" concept: unlike in traditional RDBMSes, defining a data schema is no longer required. The DocDB either figures it out on its own, or doesn't bother to at all. RDBMSes need data schemas to define the table structure and allow for efficient indexing; DocDBs take a different approach - Map/Reduce.
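As a rough illustration of the idea, the following sketch builds a Map/Reduce-style index over schema-less documents in plain Python. The documents and the names `map_tags`, `reduce_counts` and `build_index` are hypothetical, and real document databases maintain such indexes incrementally and persistently, which this toy version does not attempt.

```python
from collections import defaultdict

# Schema-less documents: no two need to share the same fields.
docs = [
    {"type": "post", "author": "ann", "tags": ["nosql", "ravendb"]},
    {"type": "post", "author": "bob", "tags": ["nosql"]},
    {"type": "comment", "author": "ann", "text": "Nice post"},
]


def map_tags(doc):
    """Map step: emit (tag, 1) per tag; documents without tags emit nothing."""
    for tag in doc.get("tags", []):
        yield tag, 1


def reduce_counts(values):
    """Reduce step: collapse all values emitted for one key into one result."""
    return sum(values)


def build_index(documents, map_fn, reduce_fn):
    """Group the mapped values by key, then reduce each group."""
    grouped = defaultdict(list)
    for doc in documents:
        for key, value in map_fn(doc):
            grouped[key].append(value)
    return {key: reduce_fn(values) for key, values in grouped.items()}


tag_index = build_index(docs, map_tags, reduce_counts)
```

Note that no schema was declared anywhere: the map function simply skips documents that lack the field it is interested in.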

SisoDB choosing the wrong battle

SisoDB is the new face in town, but it looks like it is choosing the wrong battle. The problems it tries to solve are not real problems. Let me explain.

The SisoDB website explains the motivation behind SisoDB: the need for a real schema-less solution for data storage, while making sure the powerful tools offered by SQL Server are still available. ORMs are deemed evil because they require mappings, which contradicts the notion of schema-less, and non-MSSQL backends are probably deemed irrelevant too. This is probably why there are no providers for Oracle or MySQL.

So, in SisoDB data is now schema-less, but it spans 3 tables per entity. This is how it looks (taken from the SisoDB site):

And the question arises: if a real schema-less database is what you're after, a direct POCO-to-storage-and-back-again solution, why would you use SisoDB with SQL Server in the first place? You can just use a schema-less NoSQL database, and if you treasure MSSQL's reporting tools that much, just find a way to keep using them! By not using a NoSQL database, you lose ALL the sweet spots such products have to offer - none of which MSSQL offers. And there are so many of them.

Nowadays it doesn't make sense to use SisoDB, either in new development or in existing applications. It may feel schema-less, but its foundations are too deep in the RDBMS world, and it shows. To name a few:

Deep hierarchies and enumerables are not supported
Entity ids have to be named SisoDb, making it harder to integrate with existing code
You can't specify string ids for entities (ids have to be ints or Guids)
You have to CREATE your databases
For every model change you have to tell SisoDB to update the model; changes are not detected automatically, and a schema update is still required
Various common SQL faults, like SELECT N+1 or not batching where possible
Sharding and replication, other strong characteristics of NoSQL databases, are by definition a mile behind
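For readers unfamiliar with the SELECT N+1 fault mentioned in the list above, here is a generic sketch of it using Python's built-in sqlite3 module. The `authors` and `posts` tables and the helper names are invented for the example, and the code says nothing about SisoDB's internals; it only shows the general pattern of one query per parent row versus a single joined query.

```python
import sqlite3

db = sqlite3.connect(":memory:")
db.executescript("""
CREATE TABLE authors (id INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE posts (id INTEGER PRIMARY KEY, author_id INTEGER, title TEXT);
INSERT INTO authors VALUES (1, 'ann'), (2, 'bob'), (3, 'cy');
INSERT INTO posts VALUES (1, 1, 'a'), (2, 1, 'b'), (3, 2, 'c');
""")


def make_runner():
    """Return a query function plus the list of statements it has executed."""
    calls = []

    def run(sql, params=()):
        calls.append(sql)
        return db.execute(sql, params).fetchall()

    return run, calls


def titles_n_plus_one():
    """One query for the authors, then one more query per author."""
    run, calls = make_runner()
    result = {}
    for author_id, name in run("SELECT id, name FROM authors ORDER BY id"):
        rows = run(
            "SELECT title FROM posts WHERE author_id = ? ORDER BY id",
            (author_id,),
        )
        result[name] = [title for (title,) in rows]
    return result, len(calls)


def titles_single_query():
    """The same data fetched with a single joined query."""
    run, calls = make_runner()
    result = {}
    rows = run(
        "SELECT a.name, p.title FROM authors a "
        "LEFT JOIN posts p ON p.author_id = a.id ORDER BY a.id, p.id"
    )
    for name, title in rows:
        result.setdefault(name, [])
        if title is not None:  # authors without posts yield a NULL title
            result[name].append(title)
    return result, len(calls)
```

With three authors the first version issues four statements and the second issues one; the gap grows linearly with the number of parent rows, and each extra statement is typically a network round trip.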

The author posted some performance numbers comparing SisoDB with other ORMs for inserts. But queries are what you should really care about, and there you are going to be disappointed. The most extensive indexing feature SQL has - relations between tables - is not used in SisoDB, by design. SisoDB doesn't define FKs and doesn't perform JOINs. Put simply, this means that SisoDB harms lookup performance by design, and lookups are hands down the most crucial part of your application. You don't want this.

Just for comparison: RavenDB is a document database written in .NET. It is schema-less too, works with POCOs or raw JSON with no mapping whatsoever, and uses Linq for querying. But it is real NoSQL, and as such it offers much more natural replication and sharding. Other features include full-text search out of the box, entity versioning, a REST API, complex and super-fast indexes, an embedded mode, Silverlight support, and much more. RavenDB also comes with the ability to replicate its indexes to MSSQL, so the reporting tools can still be used even though you're in NoSQL land.

If you were able to convince your bosses to use NoSQL, go with a real NoSQL solution. If not, try again. If you still fail, just keep using your favorite ORM, and if mapping annoys you, find ways to automate that process instead.

I took a quick look at SISO and agree that it misses the boat. What's the point if you still need to build out your objects? Just so you don't have to worry about your ORM? These days most ORMs are smart enough that automapping takes care of 99% of your problem. And the overhead of the ORM is smallish compared to the database queries themselves.

Where I do think SISO could make a name for itself is in those edge cases where you don't know the schema in advance - i.e. users can build out their own schema - and want the strength/stability of MSSQL. They had added some hooks for dynamic objects but it never seems to have gone anywhere. Too bad.

If you are looking for schemaless, go the NoSQL route. If you want the stability/history of an RDBMS, use an ORM with automapping. I don't really see the point of SISO.

Aaron B. May 16th, 2011

[...] some critisicm to SisoDb: http://www.code972.com/blog/?p=201 It means that someone has looked at SisoDb and taken a stand wheter they like it or not. For me [...]

Finally some critisicm to SisoDb! « Daniel Wertheim May 16th, 2011

Hi,

Thanks for the criticism. Great, since I think it's important that people think over their data access strategies and that they select them not based on being cool or new. Tried to answer some of your opinions here:

http://daniel.wertheim.se/2011/05/16/finally-some-critisicm-to-sisodb/

Regards Daniel

Daniel May 16th, 2011
