# Relational Model and Why It Matters

The relational model has been immensely influential in computer science and programming. There are many papers, books, and classes related to the theory of relational database systems. We can be successful SQL developers without a thorough understanding of the relational model, but we should at least have an introductory understanding.

## History

The relational model is the theoretical foundation for the Oracle database. It was first described by E.F. Codd’s 1970 paper “A Relational Model of Data for Large Shared Data Banks.” I recommend you read that paper; it’s surprisingly accessible and still relevant. Most of this section is based on that original paper, although E.F. Codd and others have expanded on the relational model in other works.

## Terminology

Understanding the relational model can at least help us understand other people. There’s rarely a good reason to use the theoretical words relation, tuple, and attribute, instead of the more common table, row, and column. Even E.F. Codd’s paper uses those common words in several places. But since some people are going to insist on using the fancy words, we might as well learn them.

Table 1 is copied straight from Codd’s original paper. There’s a relation (table) named `SUPPLY`. It has three tuples (arrays or rows) and four simple domains (attributes, fields, or columns), meaning it has a degree of four. The primary key uniquely identifies the row and is the combination of `SUPPLIER`, `PART`, and `PROJECT`, each of which is also a foreign key to another relation. Foreign keys ensure that lookup values actually exist in the tables they refer to.

Table 1 Supply

 Supplier Part Project Quantity 1 2 5 17 1 3 5 23 2 3 7 9

## Simplicity

The relational model is all about simplicity. It’s not more powerful than other systems, just easier to use. (So don’t worry, there won’t be any proofs or formulas in this section.)

Simplicity is achieved by removing redundancy and nonsimple domains through normalization with primary keys. In practice those concepts translate into two rules: do not store lists of values and do not repeat a column. In the preceding `SUPPLY` table, it would be a huge mistake to add columns like `DELIVERY_DATES` or `SUPPLIER_NAME`, even though those columns may seem convenient at first. `DELIVERY_DATES`, possibly a comma-separated list of values, would be easy to read but difficult to filter or join. `SUPPLIER_NAME` might look good right next to the supplier number, but it would be a duplicate of the value already in the `SUPPLIER` table.

The rules of the relational model can be summarized as follows: make the schema smart but keep the tables dumb. It’s the relationships between the tables that matter. Those relationships take place through constraints and joins, but those constraints and joins can’t work if the data is not simple.

## Sets and Tables

Thinking about the database in terms of simple relationships, or just as simple sets, is one of the keys to writing great Oracle SQL. We use the relational model to retrieve sets, join them into new sets, join those sets together, etc. Like with user interface design, two simple choices are better than one difficult choice. Forming a good mental model of database sets and tables is going to take a while, don’t feel bad if you don’t get it yet. The topic is discussed a lot throughout the book.

It’s easy to take tables for granted because they’re so common now. We use them not just in databases but in spreadsheets, HTML, and many other places. We shouldn’t avoid important data structures just because they appear too simple at first glance. When we run into a difficult problem putting data into a table, the answer is to create more tables.

## Problems Implementing a Relational Model

Codd’s paper predicted difficulties implementing a relational model. His fears have proved to be well founded.

Abstracting the storage mechanisms gives the database system a lot of responsibilities. The database must create and maintain the data structures and choose algorithms to access those data structures. If Niklaus Wirth’s book title Algorithms + Data Structures = Programs is true, databases have a lot of work to do. This is a problem, but a problem the Oracle database has already solved for us. We can use the relational model and let Oracle do the heaving lifting for us. For example, when we update an indexed column, we don’t need to know exactly how the index is maintained. Index maintenance algorithms are a problem for Oracle Corporation programmers and the reason why we pay them for their software.

Unfortunately Oracle cannot automate everything. Oracle provides many storage options like indexes, caches, in-memory column stores, etc. And it provides ways to gather data about those relationships through things like multicolumn statistics. There are many complicated trade-offs between performance and storage, and we must help decide when a trade-off is worthwhile. When dealing with N columns, there are at least N! permutations of how to store them.

Despite Codd’s insistence on normalization in his paper, Codd also knew that redundant data was inevitable. Inconsistency of redundant data is hard to prevent because inconsistency is a state, not a specific action. E.F. Codd predicted the use of a metadata system to identify redundancies, and those redundancies could be prevented either in real time or as a batch job. As far as I know, an automated system to prevent redundancies does not exist. Perhaps some future version of Oracle will have an `ASSERT` functionality to fill this gap in our relational model implementation. For now, it is up to us to be aware of inconsistency problems and use a disciplined approach to mitigate them. Simply stated: if we create a wrong column, it’s our problem.

## Вас заинтересует / Intresting for you:

753 views Александров Попков Tue, 21 Jul 2020, 19:36:45
1202 views Ирина Светлова Thu, 17 May 2018, 18:18:17
1549 views Aida Wed, 25 Apr 2018, 11:12:33
3251 views Aida Mon, 18 Jun 2018, 10:58:54