First Weapon – Understanding the Relational Model

For years I have been and will continue to recommend that anyone working with databases and queries read the following articles

The Fundamentals of Relational Database Design by Paul Litwin

and

Understanding SQL Joins @ devshed.com

These two articles helped solidify my formal education in databases and for years I thought I understood the Relational Model and some of the more advanced SQL Joins…which I did, to a point. However, to be Platypus Hunters, we must delve deeper and understand how databases are represented by the relational model.

Over the past few weeks I have been reading Exam 70-461: Querying MS SQL Server 2012 Training Kit preparing to become a Microsoft Certified Solutions Associate: SQL Server 2012. After reading Chapter 1 – Foundations of Querying I feel I have an even better understanding of the relational model and what makes something relational. So, my Hunters-to-be, let’s discover Cobb and his relations.

There were several “aha” moments reading this first chapter. The first came from reading this:

A relation in the relational model is what SQL calls a table. The two are not synonymous. You could say that a table is an attempt by SQL to represent a relation….

That made me re-think some of my preconceived notions about what I knew about databases, tables and the relational model.

So going forward with an open mind, I read:

Some of the most important principals to understand about T-SQL stem from the relational model’s core foundations-set theory and predicate logic.

Remember that the heading of a relation is a set of attributes, and the body a set of tuples. So what is a set? According to the creator of mathematical set theory, Georg Cantor, a set is described as follows:

“By a ‘set’ we mean any collection of M into a whole of definite, distinct objects m (which are called the ‘elements’ of M) of our perception or of our thought.”

Set Theory

Reading this chapter exposed me to concepts that I had knowledge of but no understanding. I had heard “real” database people refer to tuples but had never really “gotten it”. I had to go look up “tuple” on the All-Knowing Wikipedia. As it relates to math it

is an ordered list of elements. Tuples are usually written by listing the elements within parentheses ‘()’ and separated by commas; for example, (2,7,4,1,7) denotes a 5-tuple.

That was another “aha” for me. I could see that this was a set of data, like what is returned from a query. That made sense to me.

Looking into set theory I found that a relation has a heading and a body. The heading is a set of attributes. These attributes are referred to by name and type . The body is a set of tuples. These concepts become transformed by SQL into something recognizable. The relation is a table showing the heading as column names and tuples as rows in the table.

(So, remember in the last episode when we said that thinking of a database like a spreadsheet was ok for non-Hunters, this concept is the real deal.)

Do you remember back in elementary school when you learned about VENN diagrams? You had to draw all those circles that represented MAMMALS and EGG LAYING CREATURES and come up with the INTERSECTION showing Mammals that are Egg Laying Creatures?

Venn Diagram

That’s Set Theory – see you know this – you just learned it so long ago, you’ve forgotten!

Back to Cantor’s definition of a set. It should be considered as a single, whole thing….a SET of something, in this case tuples. It should be a set with no duplicates (there are exceptions to this in the real world, but theoretically…) and while not explicitly stated, no order to the set – as one of the tek-tip gurus would say, think of a result set as a bag of marbles, all the records jumbled up together – (when, in a query, you add sorting you have created a sequence, not a relation – a topic for another day!)

Predicate Logic

A predicate is defined as an expression that when applied to an object makes a proposition evaluate to either true or false. For instance, ‘hire date greater than 1/1/2012’ is an example of a predicate. For each employee you can evaluate this predicate and create a proposition that returns true if that employee was hired after that date. Any expression that you can write to evaluate criteria to true or false is a predicate. Is it greater than another number, does a name match, any other criteria.

Predicate Logic is a core element of the relational model. Using predicates you can enforce data integrity, filter data, and even define the data model.

While I didn’t study these concepts formally, I did have an intuitive understanding of them. Understanding these concepts will be the first weapon in your arsenal to fight the Platypus Database.

Developer’s Golden Egg

I have decided that when I have some tidbit that might be helpful to a certain group of my readership, the Platypus will give you a Golden Egg – the first one goes to the Developers! In fact, for a while, most of them will probably be for developers…that is what I did for a long time!

One thing that I tried to do as an application developer was iterate through the results of my queries in my application and not on the database. Remember, think of the results of your query as a set. So, if you need to get all the rows that meet some criteria try to figure out a way using a query to just return to your program the information in which you are interested. Conversely, try not to return ALL the rows and have your application determine which ones meet your criteria, let the database do what databases are good at doing.

Also. try to use aggregate functions when possible. Don’t bring back a whole record set to your application and count the number of rows, if you just need to know the total number of rows use the COUNT function and just return what you need.

The Platypus Database

A database with a design that appears to have had as much thought put into it as….the Platypus