Passionate about data

Data and its implications on software design and development.

To Allow NULLs or NOT

In any greenfield application (existing production application is a topic for another post). When you design table(s) lets say you have Item and Manufacturer table as shown below

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
CREATE TABLE Item
(ItemID NUMBER NOT NULL,
ManufacturerID NUMBER,
Name VARCHAR2(128),
Rate NUMBER,
CONSTRAINT PK_ITEM
              PRIMARY KEY (ItemID)
);

CREATE TABLE Manufacturer
(ManufacturerID NUMBER NOT NULL,
Name VARCAHR2(128),
CONSTRAINT PK_MANUFACTURER
              PRIMARY KEY (ManufacturerID)
);

Database designs I have seen tend to not constrain the data in the database. For example make the Item.ManufacturerID non-nullable and make it a foreign key to the Manufacturer table. Similarly Manufacturer.Name and Item.Rate as non-nullable columns.

For now let’s talk about NOT NULL constraint. Many say they don’t make columns non-nullable, because they don’t know the requirements at design time, or they don’t want their tests to create elaborate sets of data, or that the application enforces the constraint then why enforce the constraint on the database?

When this design goes into production, bad data gets into the database over time. Since databases are used as integration points, users or other applications put data in to the database without using the application, you don’t have data entry screens for many setup kind of tables so data has to be put into the database without using the application. When this happens your data in no longer protected by the constraints on the application side and you are forced to counter the effects of bad data with more application changes like making joins with child tables as outer joins.

1
2
SELECT item.Name,Manufacturer.Name FROM Item, Manufacturer
	WHERE Item.ManufacturerID = Manufacturer.ManufacturerID(+)

Having null checks in the application code like

1
2
3
if (item.getManufacturer() != null) {
		return item.getManufacturer().getName();
}

You also have the very difficult task of fixing data later on. Like what Item should have which ManufacturerID

So why not make the columns non-nullable to begin with. If you don’t have the requirements when desiging the tables, then you should make them non-nullable when the requirement becomes clear later on or when you start enforcing constraints on the application side. Unit tests can be changed or made to work with not-null constraints on the database.

Making good decisions about Data Quality in your applications database design will better serve you in the long term maintainability of the application.