Pre-Requisite: Transaction Management
The Checkpoint is used to declare a point before which the DBMS was in a consistent state, and all transactions were committed. During transaction execution, such checkpoints are traced. After execution, transaction log files will be created. Upon reaching the savepoint/checkpoint, the log file is destroyed by saving its update to the database. Then a new log is created with upcoming execution operations of the transaction and it will be updated until the next checkpoint and the process continues.
Why do We Need Checkpoints?
Whenever transaction logs are created in a real-time environment, it eats up lots of storage space. Also keeping track of every update and its maintenance may increase the physical space of the system. Eventually, the transaction log file may not be handled as the size keeps growing. This can be addressed with checkpoints. The methodology utilized for removing all previous transaction logs and storing them in permanent storage is called a Checkpoint.
Steps to Use Checkpoints in the Database
- Write the begin_checkpoint record into a log.
- Collect checkpoint data in stable storage.
- Write the end_checkpoint record into a log.
The behavior when the system crashes and recovers when concurrent transactions are executed is shown below:

Understanding Checkpoints in multiple Transactions
Transactions and operations of the above diagram:
Transaction 1 (T1) |
Transaction 2 (T2) |
Transaction 3 (T3) |
Transaction 4 (T4) |
START |
|
|
|
|
START |
|
|
|
COMMIT |
|
|
|
|
START |
|
|
|
COMMIT |
|
|
|
|
START |
|
|
|
FAILURE |
- The recovery system reads the logs backward from the end to the last checkpoint i.e. from T4 to T1.
- It will keep track of two lists – Undo and Redo.
- Whenever there is a log with instructions <Tn, start>and <Tn, commit> or only <Tn, commit> then it will put that transaction in Redo List. T2 and T3 contain <Tn, Start> and <Tn, Commit> whereas T1 will have only <Tn, Commit>. Here, T1, T2, and T3 are in the redo list.
- Whenever a log record with no instruction of commit or abort is found, that transaction is put to Undo List <Here, T4 has <Tn, Start> but no <Tn, commit> as it is an ongoing transaction. T4 will be put on the undo list.
All the transactions in the redo list are deleted with their previous logs and then redone before saving their logs. All the transactions in the undo list are undone and their logs are deleted.
Types of Checkpoints
There are basically two main types of Checkpoints:
- Automatic Checkpoint
- Manual Checkpoint
1. Automatic Checkpoint: These checkpoints occur very frequently like every hour or every day. These intervals are set by the database administrator. They are generally used by heavy databases as they are frequently updated, and we can recover the data easily in case of failure.
2. Manual Checkpoint: These are the checkpoints that are manually set by the database administrator. Manual checkpoints are generally used for smaller databases. They are updated very less frequently only when they are set by the database administrator.
Relevance of Checkpoints
A checkpoint is a feature that adds a value of C in ACID-compliant to RDBMS. A checkpoint is used for recovery if there is an unexpected shutdown in the database. Checkpoints work on some intervals and write all dirty pages (modified pages) from logs relay to data file from i.e from a buffer to a physical disk. It is also known as the hardening of dirty pages. It is a dedicated process and runs automatically by SQL Server at specific intervals. The synchronization point between the database and transaction log is served with a checkpoint.
Advantages of Checkpoints
- Checkpoints help us in recovering the transaction of the database in case of a random shutdown of the database.
- It enhancing the consistency of the database in case when multiple transactions are executing in the database simultaneously.
- It increasing the data recovery process.
- Checkpoints work as a synchronization point between the database and the transaction log file in the database.
- Checkpoint records in the log file are used to prevent unnecessary redo operations.
- Since dirty pages are flushed out continuously in the background, it has a very low overhead and can be done frequently.
- Checkpoints provide the baseline information needed for the restoration of the lost state in the event of a system failure.
- A database checkpoint keeps track of change information and enables incremental database backup.
- A database storage checkpoint can be mounted, allowing regular file system operations to be performed.
- Database checkpoints can be used for application solutions which include backup, recovery or database modifications.
Disadvantages of Checkpoints
1. Database storage checkpoints can only be used to restore from logical errors (E.g. a human error).
2. Because all the data blocks are on the same physical device, database storage checkpoints cannot be used to restore files due to a media failure.
Real-Time Applications of Checkpoints
- Backup and Recovery
- Performance Optimization
- Auditing
1. Checkpoint and Recovery
A checkpoint is one of the key tools which helps in the recovery process of the database. In case of a system failure, DBMS can find the information stored in the checkpoint to recover the database till its last known stage.
The speed of recovery in case of a system failure depends on the duration of the checkpoint set by the database administrator. For Example, if the checkpoint interval is set to a shorter duration, it helps in faster recovery and vice-versa. If more frequent checkpoint has to be written to disk, it can also impact the performance.
2. Importance of Checkpoint in Performance Optimization
Checkpoint plays an essential role in the Recovery of the database. Still, it also plays a vital role in improving the performance of DBMS, and this can be done by reducing the amount of work that should be done during recovery. It can discard any unnecessary information which helps to keep the database clean and better for optimization purposes.
Another way in which checkpoint is used to improve the performance of the database is by reducing the amount of data that is to be read from the disk in case of recovery. Analyzing the checkpoints clearly helps in minimizing the data that is to be read from the disk, which improves the recovery time. and in that way, it helps in Performance Optimization.
3. Checkpoints and Auditing
Checkpoints can be used for different purposes like Performance Optimization, it can also be used for Auditing Purposes. Checkpoints help view the database’s history and identify any problem that had happened at any particular time.
In case of any type of failure, database administrators can use the checkpoint to determine when it has happened and what amount of data has been affected.
Conclusion
Checkpoints are one of the essential elements of a Database Management System. The major contribution of Checkpoints in DBMS is, it helps in faster recovery of the system in case of system failure or crash of the system. Checkpoints mark the position till the consistency of the transaction is maintained. It is a useful component in the recovery of the database.
Similar Reads
Consistency in DBMS
Data integrity and reliability are key in the domain of Database Management Systems (DBMS). Consistency, one of the core principles in DBMS, ensures that every transaction is made according to predefined rules and limits thus preserving the accuracy and authenticity of data kept within. The change t
6 min read
Concurrency Control in DBMS
In a database management system (DBMS), allowing transactions to run concurrently has significant advantages, such as better system resource utilization and higher throughput. However, it is crucial that these transactions do not conflict with each other. The ultimate goal is to ensure that the data
7 min read
Cascadeless in DBMS
In a Database Management System (DBMS), maintaining data consistency and avoiding unnecessary complications during transaction execution are critical. A cascadeless schedule is a type of transaction schedule that ensures reliability by preventing cascading rollbacks. Cascading rollbacks occur when t
5 min read
Commit Protocol in DBMS
This article covers topics related to the database management system. In this article, we will learn about the commit protocols that are in the subject of database management systems. This article then describes the types of Commit protocols in database management systems. It then talks about the ad
5 min read
RDBMS Architecture
RDBMS stands for Relational Database Management System and it implements SQL. In the real-world scenario, people use the Relational Database Management System to collect information and process it, to provide service. E.g. In a ticket processing system, details about us (e.g. age, gender) and our jo
3 min read
Blind Write in DBMS
Blind writing is like what its name suggests means writing data blindly in the database. This refers to a feature in the database in which the write operation of the database system can be performed without any confirmation or verification of existing data. This is highly suitable in work-heavy appl
6 min read
Database Buffer in DBMS
An information base cushion is a transitory stockpiling territory in primary memory. It permits putting away information incidentally while moving to start with one spot and then onto the next. An information base buffer stores a duplicate of plate blocks. However, a variant of square duplicates on
3 min read
Transaction Control in DBMS
The transaction is a single logical unit that accesses and modifies the contents of the database. Transactions access data using read and write operations. Transaction is a single operation of processing that can have many operations. Transaction is needed when more than one user wants to access sam
5 min read
Codd's Rules in DBMS
Codd's rules are proposed by a computer scientist named Dr. Edgar F. Codd and he also invent the relational model for database management. These rules are made to ensure data integrity, consistency, and usability. This set of rules basically signifies the characteristics and requirements of a relati
3 min read
Failure Classification in DBMS
Failure in terms of a database can be defined as its inability to execute the specified transaction or loss of data from the database. A DBMS is vulnerable to several kinds of failures and each of these failures needs to be managed differently. There are many reasons that can cause database failures
2 min read