Storing hierarchical data in relational databases, such as MySQL, is an age old question. There have been some solutions such as Adjacency Model or, to a certain extent, Entity Relationship Model. Oracle, on the other hand, although being a relational database has a good support for hierarchical data with its connect by clause but with an express edition being somewhat limited, and the price tag for a standard edition is a hefty $5′800 (enterprise going for $47′500), for an average open source developer its not an option.
The problem with Adjacency Model is that its very resource intensive. It may not be so bad for querying but for inserting new data having to update left and right columns for every one of the potentially thousands of records may not sound so good. You can also keep a separate table to track relationships but the queries can get very complex very fast and, again, take time if lots of data.
Finally, what happens if a child is a part of more than one parent? Take an example of a database of languages and GUI frameworks used by them where Qt, python and JavaScript are in multiple unrelated categories:
client side/c++/Qt
client side/python/Qt
client side/Java/Swing
client side/python/GTK
server side/PHP/Java Script
server side/python/Java Script
Hierarchal data manipulation, such as making a tree view or extracting a piece of chain, can be very complex.
So why continue struggling with relational DB when there is a technology geared specifically to do what we need: LDAP lightweight directory access protocol.
There is however a catch. LDAP is so different from RDBMS that you will have to relearn everything you knew about databases.
A typical relational table to outline above languages and corresponding GUIs could be something like:

Yes, I know I broke the fundamental rule of normalization. The example is purely to illustrate a point about linear relationships you get from tables. It is simple enough now but what if we want to extend it like this:
client side/compiled/c++/Qt
client side/interpreted/python/Qt
client side/compiled/Java/Swing
client side/interpreted/python/GTK
server side/interpreted/PHP/Java Script/jQuery
server side/interpreted/python/Java Script/jQuery
Basic tables will be more difficult to design around that data.
LDAP solves the problems of linear tables by storing all data like this:

Yep, thats exactly the way its stored. But it does not end there. This layout still does not help with Qt, JS and python being members of more than one category. What comes into rescue is users and groupsrelationships. Yes, the same ones that dictate what user is a member of what department on your network. Whats more, is that a user can be a part of multiple groups which in term can be a part of other groups. Thus, you do not only have a “physical” storage in tree format but you can extend it with even more detailed “virtual” relationships that transcend the parent categories the record is in. Here’s how the principle will be applied. To better illustrate the point I will shift towards the more complex example I proposed earlier:
client side/compiled/c++/Qt
client side/interpreted/python/Qt
client side/compiled/Java/Swing
client side/interpreted/python/GTK
server side/interpreted/PHP/Java Script/jQuery
server side/interpreted/python/Java Script/jQuery
In this example individual languages are not going to be holding their own branch but will rather be groups. GUI toolkits will then be members of those groups.

OK, so you may not agree with my way of characterising these langauges but the point is illustrated nevertheless. Since GUI toolkits can be both loaded at runtime and built into the program, I decided to leave them out of any specific character except for client/server. So what you see is how a toolkint can be a member of one “main” category while having relations to others. Keep in mind that the diagram is not some logical representation of a more complex layout: this is the way data is stored.
But whats up with having JavaScript both as a group and as a user? Web developers would know that JS can be used both as a language to program client side interfaces manually as well as a backbone for more higher level toolkits such as AJAX frameworks, in this case jQuery. Since a user cannot be a member of another user a technique used to solve this problem is creating a group and a user with the same name. Don’t worry, they are two independent types of structures so they can have the same name. This allows us to use JavaScript both as a language or as a backbone when necessary. In fact, in practice I would have done the same for all languages listed: create a group and a user with the same name and then link them. That way I am never in danger of hitting a wall if for whatever reason I need to add some unique details to C++.
Finally, do not forget that each “user” is able to have its own set of data. So to a certain extent you can think of each “user” icon in that tree as a record.
Hopefully, this introduction sparked some interest. Stay tuned for part 2 were we look at ldap syntax used to create this example.
| < Prev | Next > |
|---|
