๐Ÿ—ƒ๏ธ Database Normalization: 1NF, 2NF, 3NF, BCNF, 4NF, 5NF | ๐Ÿš€ Exam Prep

Exam PrepDatabaseDBNormalizationDenormalization
Read in about 4 min read
Published: 2025-07-12
Last modified: 2025-07-12
View count: 57

Summary

Master database normalization, a key concept for data processing exams. This guide easily explains everything from 1NF to 5NF, functional dependencies, and denormalization with examples and practice questions.

๐Ÿ’ก Normalization is a frequent topic on data processing certification exams! Master it with this single post. A thorough understanding of each normal form, examples, and the final practice questions will bring you one step closer to passing.

๐Ÿ“ˆ What is Database Normalization?

Normalization is the process of systematically decomposing tables to minimize data redundancy and ensure data integrity in a relational database. Normalization helps prevent anomalies and improves the stability and maintainability of the data structure.

  • Anomalies: Problems that can occur due to data redundancy.
    • Insertion Anomaly: The inability to insert desired data without some other unnecessary data.
    • Deletion Anomaly: The loss of unintended data when certain information is deleted.
    • Update Anomaly: Data inconsistency that occurs when only some of the redundant data is updated.

๐Ÿ”‘ Functional Dependency

To understand normalization, you must first grasp the concept of Functional Dependency.

In a relation, if the value of attribute A uniquely determines the value of attribute B, we say "attribute B is functionally dependent on attribute A," denoted as A โ†’ B. Here, A is the Determinant, and B is the Dependent.

  • Full Functional Dependency: The dependent is determined by the entire primary key, not by any part of it. (Key to 2NF)
  • Partial Functional Dependency: The dependent is determined by a part of the primary key.
  • Transitive Functional Dependency: If A โ†’ B and B โ†’ C, then A โ†’ C holds. (Key to 3NF)

๐Ÿ”ข Summary of Normalization Stages

Normal FormCore ConditionTarget for Elimination
1NF (First Normal Form)All attribute values must be atomic.Multi-valued attributes
2NF (Second Normal Form)Must be in 1NF and eliminate partial functional dependencies. (Full functional dependency)Partial functional dependencies
3NF (Third Normal Form)Must be in 2NF and eliminate transitive functional dependencies.Transitive functional dependencies
BCNF (Boyce-Codd NF)Must be in 3NF and every determinant must be a candidate key.Determinants that are not candidate keys
4NF (Fourth Normal Form)Must be in BCNF and eliminate multi-valued dependencies.Multi-valued dependencies
5NF (Fifth Normal Form)Must be in 4NF and uses join dependencies.Join dependencies

1๏ธโƒฃ First Normal Form (1NF)

First Normal Form requires that all attribute values in a relation be atomic. This means a single attribute cannot hold multiple values.

  • Example: If you store the courses a student is taking, putting multiple values like 'Data Structures, Algorithms' in a Courses attribute violates 1NF.

[Before Normalization]

StudentIDNameCourses
1001John DoeData Structures, Algorithms
1002Jane SmithOperating Systems

[After 1NF Normalization]

StudentIDNameCourse
1001John DoeData Structures
1001John DoeAlgorithms
1002Jane SmithOperating Systems

2๏ธโƒฃ Second Normal Form (2NF)

Second Normal Form requires that the relation is in 1NF and all partial functional dependencies are removed. This means all non-key attributes must be fully functionally dependent on the entire primary key. (If the primary key is a single attribute, the relation is always in 2NF).

  • Example: In a course registration relation with (StudentID, CourseID) as the primary key, Grade depends on both (StudentID, CourseID), but ProfessorName depends only on CourseID (a partial functional dependency).

[Before Normalization] (Primary Key: StudentID, CourseID)

StudentIDCourseIDProfessorNameGrade
1001CS101Prof. KimA+
1001CS202Prof. LeeA0
1002CS101Prof. KimB+

[After 2NF Normalization]

[Registration Table]

StudentIDCourseIDGrade
1001CS101A+
1001CS202A0
1002CS101B+

[Course Table]

CourseIDProfessorName
CS101Prof. Kim
CS202Prof. Lee

3๏ธโƒฃ Third Normal Form (3NF)

Third Normal Form requires that the relation is in 2NF and all transitive functional dependencies are removed. A transitive dependency exists when A โ†’ B and B โ†’ C, which leads to A โ†’ C.

  • Example: If StudentID โ†’ DepartmentID and DepartmentID โ†’ DepartmentName, then there is a transitive dependency StudentID โ†’ DepartmentName.

[Before Normalization] (Primary Key: StudentID)

StudentIDNameDepartmentIDDepartmentName
1001John DoeCSComputer Science
1002Jane SmithEEElectrical Engineering
1003Peter PanCSComputer Science

[After 3NF Normalization]

[Student Table]

StudentIDNameDepartmentID
1001John DoeCS
1002Jane SmithEE
1003Peter PanCS

[Department Table]

DepartmentIDDepartmentName
CSComputer Science
EEElectrical Engineering

BCNF (Boyce-Codd Normal Form)

BCNF is a stricter version of 3NF. It requires that for every functional dependency, the determinant must be a candidate key. It addresses certain rare anomalies not handled by 3NF.

  • Example: In a relation where (StudentID, Course) is a candidate key, if there's a constraint that a professor can only teach one course (Professor โ†’ Course), then Professor is a determinant but not a candidate key. This violates BCNF.

[Before Normalization] (Candidate Key: (StudentID, Course))

StudentIDCourseProfessor
1001Data StructuresProf. Kim
1002Data StructuresProf. Kim
1001AlgorithmsProf. Lee

[After BCNF Normalization]

[Enrollment Table]

StudentIDProfessor
1001Prof. Kim
1002Prof. Kim
1001Prof. Lee

[ProfessorCourse Table]

ProfessorCourse
Prof. KimData Structures
Prof. LeeAlgorithms

4๏ธโƒฃ Fourth Normal Form (4NF) & 5๏ธโƒฃ Fifth Normal Form (5NF)

4NF and 5NF are rarely encountered in practice, but it's good to know the concepts.

  • Fourth Normal Form (4NF): Eliminates multi-valued dependencies (MVD). This dependency arises in 1:N relationships.
  • Fifth Normal Form (5NF): Deals with join dependencies (JD). It requires that every join dependency is implied by the candidate keys.

๐Ÿ”„ Denormalization

Denormalization is the process of intentionally violating normalization rules by combining tables or allowing redundancy to improve database performance. It is used to avoid costly joins, but it must be applied cautiously as it can risk data inconsistency.

  • Common Techniques: Table merging, table splitting (horizontal/vertical), adding redundant columns.

๐Ÿ“ Exam Prep Practice Questions

ProblemWhich normal form ensures that all attribute values in a relation are atomic?
Your Answer
Correct AnswerReveal Answer
ProblemWhich normal form aims to eliminate partial functional dependencies, ensuring all attributes are fully dependent on the primary key?
Your Answer
Correct AnswerReveal Answer
ProblemWhat is the normal form that focuses on removing transitive functional dependencies?
Your Answer
Correct AnswerReveal Answer
ProblemWhich normal form strengthens 3NF by requiring that every determinant in the relation must be a candidate key?
Your Answer
Correct AnswerReveal Answer
ProblemWhat is the process of decomposing table structures to minimize data redundancy and ensure integrity called?
Your Answer
Correct AnswerReveal Answer
ProblemWhat is the practice of intentionally adding redundancy or merging tables in a normalized database to improve system performance called?
Your Answer
Correct AnswerReveal Answer