Normalization

Other topics

First normal form (1NF)

A relation(or relation schema) in a given database is in first normal form, if the domain of all attributes of that relation is atomic. A domain is atomic if all the elements of that domain are considered to indivisible units. Suppose a relation employee, has attribute name, then the relation is not in first normal form, because the elements of domain of attribute name , can be divided into first name and last name.

In a nutshell, if a relation has composite attributes, then it is not in first normal form. Suppose we have the following relation:

EmpIdfirst namelast namesalaryposition
Deptx-101Johnsmith12000intermediate
Depty-201CarolyneWilliams18900manager

The EmpId of first row can be broken into : Deptx (which is used to identiy department) and 101, is a unique number assigned within the organization. Clearly, the domain of attribute EmpId is not atomic and hence our relation is not in first normal form.

Disadvantes faced:

  1. When such employee id's are used, the department of an employee can be found by writing code that breaks up the structure of EmpId into Deptx and 101, which requires extra programming. Also information gets encoded in program rather than in database.
  2. Suppose a particular employee has to change department, then the attribute EmpId, would have to be updated everywhere it is used.

We can make our relation satisfy first normal form, by splitting it into following two relations:

Relation 1

EmpIdfirst namelast namedepartment
101JohnsmithDeptx
201CarolyneWilliamsDepty

Relation 2

EmpIdsalaryposition
10112000intermediate
20118900manager

Now, if we have to change the department, we have to do it only once in the relation 1, also determining department is easier now.

Second Normal Form (2NF)

To normalize the database in the second form, there must not be any partial dependency of any column on primary key.

Let's consider the following example:

idnamedobsubject
1Mark1-1-1981Physics
2Jack2-2-1982Math
2Jack2-2-1982Biology
3John3-3-1983Math

This table is considered to have a composite primary key (id and subject), but the *name and dob columns only depends on the id, not the subject, so they have partial dependency on the primary key. As a result, we can see the redundancy of information in the table. To normalize the database in the second form, we must split this table into two tables like this:

Students table

idnamedob
1Mark1-1-1981
2Jack2-2-1982
3John3-3-1983

Subjects table

student_idsubject
1Physics
2Math
2Biology
3Math

The student_id column of the Subjects table is a foreign-key that references the primary key id of the Students table.

Contributors

Topic Id: 5111

Example Ids: 18055,18056

This site is not affiliated with any of the contributors.