Dynamic-Sql Act IV
For those who’ve continued interest in my mini series on SQL, thanks for reading. The saga continues below.
Our RDB now contains two paths from Baby (B) to Deceased (D), one including the Person (P) entity. We also presume, though our system does not outright, that there are roughly 10:1 paths from B->P->D than B->D. We’ve been trying to solve one of the old problems of a User-generated SQL query: power and risk versus ease and safety. These are not necessarily exclusive, and I have argued why. We have the technology, now lets put it to good use!
The semantics, as stated, is the easy part: We will store them in our foreign key properties in the code. When the user requests a path be made, we’ll retrieve and display the semantics that relate to the links we’ve chosen.
Now comes the tricky part: choosing which links are best. By default, “average” queries follow a consistent path. Trying to determine if Alice and Bob work for Acme generally means we’re looking for the same link. As is the case if we’re looking for anyone who works for Acme and Borax. In our second query, we’d presumably cross the same link twice. This isn’t a coincidence, it’s a natural state of the human expectation.
Let’s put this into practice. Suppose we cache a rough estimate as to how many unique foreign keys are in one table for its neighbors. (This can be achieved through indexing and is a very fast query in modern RDBMS’s.) Traditionally, when attempting to discern a path from one entity to another, shortest-distance was applied. However, given our RDB situation above, this is probably flawed: What we really wanted was from B->P->D, two hops instead of one. Given that we have a fast lookup that informs us we have an alternate path out of B, let’s try that.
Graph theorists recognize this as a max-flow/min-cut problem. We are looking to maximize our probability that the path chosen is correct; likely the path with the most foreign keys. If I had the time and energy, (maybe I will later), I’d make up a diagram or code snippet to illustrate; but you get my point. Some times the shortest path isn’t always the right one, and with the proper tool set in place we can offer a much more elegant – and noticeably more user-friendly – interface.
In the next part, (which I don’t expect to have finished anytime soon), I’ll be addressing some caveats I think would hinder our progress – most notably worst-case scenarios and impossible Source-Sink choices. It’s the conclusion to a thought constructed more than a month ago, and possibly one of the funnest thought experiments I’ve had in quite some time. I hope you enjoy thinking outside the box!
No comments
Jump to comment form | comments rss | trackback uri