Hibernate

Hibernate Batch Processing

Suppose you have to insert 1,000,000 records into the database in a single run. What should you do in this situation?

Naive Solution in Hibernate

Session session = sessionFactory.openSession();
Transaction tx = session.beginTransaction();
for ( int i=0; i<1000000; i++ )
{
    Student student = new Student(.....);
    session.save(student);
}
tx.commit();
session.close();
This approach fails because, by default, Hibernate caches every persisted object in the session-level cache, so the application would ultimately fall over with an OutOfMemoryError somewhere around the 50,000th row. You can resolve this problem by using batch processing with Hibernate.
To use the batch processing feature, first set hibernate.jdbc.batch_size to a number such as 20 or 50, depending on object size. This tells Hibernate to send every X inserted rows to the database as one batch. Your code then needs a small modification, as follows:
Session session = sessionFactory.openSession();
Transaction tx = session.beginTransaction();
for ( int i=0; i<1000000; i++ ) 
{
    Student student = new Student(.....);
    session.save(student);
    if( i % 50 == 0 ) // Same as the JDBC batch size
    { 
        //flush a batch of inserts and release memory:
        session.flush();
        session.clear();
    }
}
tx.commit();
session.close();
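One detail of the flush check is worth seeing with numbers. Because the loop index starts at 0, a check of the form i % 50 == 0 is already true on the very first iteration, so the first flush carries a single insert and every later flush carries 50. The following is a minimal sketch in plain Java (no Hibernate involved; the class and method names are hypothetical) that only counts how many saves are pending at each flush:

```java
import java.util.ArrayList;
import java.util.List;

final class FlushIntervalDemo {

    /**
     * Returns the number of pending saves at each flush for a loop of `total` saves.
     * countFromOne = false models "i % batchSize == 0",
     * countFromOne = true models "(i + 1) % batchSize == 0".
     */
    static List<Integer> flushSizes(int total, int batchSize, boolean countFromOne) {
        List<Integer> sizes = new ArrayList<>();
        int pending = 0;
        for (int i = 0; i < total; i++) {
            pending++; // stands in for session.save(...)
            int n = countFromOne ? i + 1 : i;
            if (n % batchSize == 0) {
                sizes.add(pending); // stands in for session.flush(); session.clear();
                pending = 0;
            }
        }
        if (pending > 0) {
            sizes.add(pending); // the remainder is flushed when the transaction commits
        }
        return sizes;
    }

    public static void main(String[] args) {
        System.out.println(flushSizes(120, 50, false)); // prints [1, 50, 50, 19]
        System.out.println(flushSizes(120, 50, true));  // prints [50, 50, 20]
    }
}
```

Both variants keep memory bounded. Counting from 1 simply keeps each flush aligned with the configured batch size.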
The code above works fine for INSERT operations. For UPDATE operations, you can use the following code:
Session session = sessionFactory.openSession();
Transaction tx = session.beginTransaction();
ScrollableResults studentCursor = session.createQuery("FROM Student").scroll(); // HQL uses the entity name, not the table name
int count = 0;
while (studentCursor.next())
 {
   Student student = (Student) studentCursor.get(0);
   student.setStudentName("DEV");
   session.update(student);
   if ( ++count % 50 == 0 ) {
      session.flush();
      session.clear();
   }
}
tx.commit();
session.close();

Batch Processing Solution in Hibernate

If you are undertaking batch processing you will need to enable the use of JDBC batching. This is absolutely essential if you want to achieve optimal performance. Set the JDBC batch size to a reasonable number (10-50).
hibernate.jdbc.batch_size 50
You can also run this kind of work with interaction with the second-level cache completely disabled:
hibernate.cache.use_second_level_cache false
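If you prefer to set these two properties from code rather than from a configuration file, one option is to collect them in a standard java.util.Properties object first. The sketch below uses only the JDK; the class and method names are hypothetical, and handing the result to Hibernate (for example through Configuration.addProperties) is left out as an assumption about your setup:

```java
import java.util.Properties;

final class BatchSettings {

    /** Builds the two settings discussed above as plain key/value pairs. */
    static Properties batchProperties(int batchSize) {
        Properties props = new Properties();
        // Number of statements Hibernate groups into one JDBC batch.
        props.setProperty("hibernate.jdbc.batch_size", Integer.toString(batchSize));
        // Keep the batch job from interacting with the second-level cache.
        props.setProperty("hibernate.cache.use_second_level_cache", "false");
        return props;
    }

    public static void main(String[] args) {
        Properties props = batchProperties(50);
        System.out.println(props.getProperty("hibernate.jdbc.batch_size")); // prints 50
        System.out.println(props.getProperty("hibernate.cache.use_second_level_cache")); // prints false
    }
}
```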

hibernate.cfg.xml

<hibernate-configuration> 
 <session-factory> 
  
   <property name="connection.driver_class">com.mysql.jdbc.Driver</property> 
   <property name="connection.url">jdbc:mysql://localhost:3306/hibernateDB2</property> 
   <property name="connection.username">root</property> 
   <property name="connection.password">root</property> 

  
   <property name="connection.pool_size">1</property> 
   
   
   <property name="hibernate.jdbc.batch_size">50</property>

  
   <property name="dialect">org.hibernate.dialect.MySQLDialect</property> 

  
    <property name="current_session_context_class">thread</property> 
  
  
   <property name="hibernate.cache.use_second_level_cache">false</property>
   <property name="cache.provider_class">org.hibernate.cache.EhCacheProvider</property> 

  
   <property name="show_sql">true</property> 
  
  
   <property name="hbm2ddl.auto">update</property> 
   
   
   <mapping class="com.sdnext.hibernate.tutorial.dto.Student"/>

 </session-factory> 
 </hibernate-configuration>

Student.java

package com.sdnext.hibernate.tutorial.dto;

import java.io.Serializable;

import javax.persistence.Column;
import javax.persistence.Entity;
import javax.persistence.GeneratedValue;
import javax.persistence.GenerationType;
import javax.persistence.Id;
import javax.persistence.Table;

@Entity
@Table(name="STUDENT")
public class Student implements Serializable 
{
 /**
  * serialVersionUID
  */
 private static final long serialVersionUID = 8633415090390966715L;
 @Id
 @Column(name="ID")
 @GeneratedValue(strategy=GenerationType.AUTO)
 private int id;
 @Column(name="STUDENT_NAME")
 private String studentName;
 @Column(name="ROLL_NUMBER")
 private int rollNumber;
 @Column(name="COURSE")
 private String course;
 public int getId() {
  return id;
 }
 public void setId(int id) {
  this.id = id;
 }
 public String getStudentName() {
  return studentName;
 }
 public void setStudentName(String studentName) {
  this.studentName = studentName;
 }
 public int getRollNumber() {
  return rollNumber;
 }
 public void setRollNumber(int rollNumber) {
  this.rollNumber = rollNumber;
 }
 public String getCourse() {
  return course;
 }
 public void setCourse(String course) {
  this.course = course;
 }
 public String toString()
 {
  return "ROLL Number: "+rollNumber+"| Name: "+studentName+"| Course: "+course;
 }
}

HibernateTestDemo.java

package com.sdnext.hibernate.tutorial;

import org.hibernate.Session;
import org.hibernate.SessionFactory;
import org.hibernate.Transaction;
import org.hibernate.cfg.AnnotationConfiguration;

import com.sdnext.hibernate.tutorial.dto.Student;


public class HibernateTestDemo {

 /**
  * @param args
  */
 public static void main(String[] args) 
 {
  SessionFactory sessionFactory = new AnnotationConfiguration().configure().buildSessionFactory();
  Session session = sessionFactory.openSession();
  Transaction transaction = session.beginTransaction();
  
  for ( int i=0; i<100000; i++ )
  {
   String studentName = "DINESH " + i;
   int rollNumber = 9 + i;
   String course = "MCA " + i;
   Student student = new Student();
   student.setStudentName(studentName);
   student.setRollNumber(rollNumber);
   student.setCourse(course);
   session.save(student);
   if( i % 50 == 0 ) // Same as the JDBC batch size
   {
    // flush a batch of inserts and release memory
    session.flush();
    session.clear();
   }
  }
  transaction.commit();
  session.close();
 }

}
Output:
……………………………
Hibernate: insert into STUDENT (COURSE, ROLL_NUMBER, STUDENT_NAME) values (?, ?, ?)
Hibernate: insert into STUDENT (COURSE, ROLL_NUMBER, STUDENT_NAME) values (?, ?, ?)
Hibernate: insert into STUDENT (COURSE, ROLL_NUMBER, STUDENT_NAME) values (?, ?, ?)
Hibernate: insert into STUDENT (COURSE, ROLL_NUMBER, STUDENT_NAME) values (?, ?, ?)
Hibernate: insert into STUDENT (COURSE, ROLL_NUMBER, STUDENT_NAME) values (?, ?, ?)
Hibernate: insert into STUDENT (COURSE, ROLL_NUMBER, STUDENT_NAME) values (?, ?, ?)
Hibernate: insert into STUDENT (COURSE, ROLL_NUMBER, STUDENT_NAME) values (?, ?, ?)
Hibernate: insert into STUDENT (COURSE, ROLL_NUMBER, STUDENT_NAME) values (?, ?, ?)
Hibernate: insert into STUDENT (COURSE, ROLL_NUMBER, STUDENT_NAME) values (?, ?, ?)
Hibernate: insert into STUDENT (COURSE, ROLL_NUMBER, STUDENT_NAME) values (?, ?, ?)
Hibernate: insert into STUDENT (COURSE, ROLL_NUMBER, STUDENT_NAME) values (?, ?, ?)
Hibernate: insert into STUDENT (COURSE, ROLL_NUMBER, STUDENT_NAME) values (?, ?, ?)
Hibernate: insert into STUDENT (COURSE, ROLL_NUMBER, STUDENT_NAME) values (?, ?, ?)
Hibernate: insert into STUDENT (COURSE, ROLL_NUMBER, STUDENT_NAME) values (?, ?, ?)
……………………………

This creates 100,000 records in the STUDENT table.
Hibernate batch processing is powerful, but it has pitfalls that developers must be aware of in order to use it properly and efficiently. Most people who use batching probably discover it the hard way: they attempt a large operation and run out of memory. Once that is resolved, they assume batching is working properly. The problem is that even if you are flushing your first-level cache, you may not be batching your SQL statements.
By default, Hibernate flushes the session at the following points:

1. Before some queries
2. When commit() is executed
3. When session.flush() is executed

The thing to note here is that every persistent object stays in the first-level cache (your JVM's memory) until the session is cleared or closed; flushing on its own only writes pending changes to the database. So if you are iterating over a million objects, you will have at least a million objects in memory.

To avoid this problem, call flush() and then clear() on the session at regular intervals. The Hibernate documentation recommends flushing every n records, where n is equal to the hibernate.jdbc.batch_size parameter. The batch examples earlier in this chapter follow this pattern.
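The memory effect of clearing at regular intervals can be illustrated without Hibernate. In the sketch below, a plain HashMap stands in for the session's first-level cache; this is only a model of the behaviour described above, and the class and method names are hypothetical:

```java
import java.util.HashMap;
import java.util.Map;

final class SessionCacheSim {

    /**
     * Returns the largest number of entries held at any time while "saving"
     * `total` objects, clearing the map every `clearEvery` saves (0 = never clear).
     */
    static int peakCacheSize(int total, int clearEvery) {
        Map<Integer, String> cache = new HashMap<>(); // stand-in for the first-level cache
        int peak = 0;
        for (int i = 1; i <= total; i++) {
            cache.put(i, "student-" + i); // stands in for session.save(student)
            peak = Math.max(peak, cache.size());
            if (clearEvery > 0 && i % clearEvery == 0) {
                cache.clear(); // stands in for session.flush(); session.clear();
            }
        }
        return peak;
    }

    public static void main(String[] args) {
        System.out.println(peakCacheSize(100000, 0));  // prints 100000
        System.out.println(peakCacheSize(100000, 50)); // prints 50
    }
}
```

Without clearing, the number of objects held grows with the size of the job. With a clear every 50 saves, it never exceeds 50.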
There are two reasons for batching your Hibernate database interactions. The first is to keep the first-level cache at a reasonable size so that you do not run out of memory. The second is to batch the inserts and updates so that the database executes them efficiently. The flush-and-clear loop shown earlier accomplishes the first goal but not necessarily the second. Consider a student that is saved together with an associated address:

Student student = new Student();
Address address = new Address();
student.setName("DINESH RAJPUT");
address.setCity("DELHI");
student.setAddress(address);
session.save(student);

The problem is that Hibernate compares each SQL statement with the previously executed one. If they are the same and the batch has not yet reached batch_size, Hibernate adds the statement to the current JDBC batch. However, if your statements look like the example above, Hibernate sees alternating insert statements (one for the student, one for the address) and flushes an individual insert for each record processed. So 1 million new students would result in 2 million insert statements, each executed on its own. This is extremely bad for performance.
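To put numbers on this, here is a minimal simulation of the rule just described: consecutive identical statements share a batch until it reaches the batch size, and any change of statement starts a new batch. It is a sketch in plain Java, not Hibernate's actual implementation; the class name, the method names, and the ADDRESS insert statement are assumptions made for illustration.

```java
import java.util.ArrayList;
import java.util.List;

final class StatementBatchSim {

    static final String STUDENT_SQL = "insert into STUDENT (...) values (...)";
    static final String ADDRESS_SQL = "insert into ADDRESS (...) values (...)"; // hypothetical table

    /** Counts how many batches are sent for the given statement sequence. */
    static int countBatches(List<String> statements, int batchSize) {
        int batches = 0;
        String current = null;
        int inBatch = 0;
        for (String sql : statements) {
            // A different statement, or a full batch, forces a new batch.
            if (!sql.equals(current) || inBatch == batchSize) {
                batches++;
                current = sql;
                inBatch = 0;
            }
            inBatch++;
        }
        return batches;
    }

    /** n students, each followed immediately by its address. */
    static List<String> alternating(int n) {
        List<String> out = new ArrayList<>();
        for (int i = 0; i < n; i++) {
            out.add(STUDENT_SQL);
            out.add(ADDRESS_SQL);
        }
        return out;
    }

    /** The same 2n statements, grouped by statement. */
    static List<String> ordered(int n) {
        List<String> out = new ArrayList<>();
        for (int i = 0; i < n; i++) out.add(STUDENT_SQL);
        for (int i = 0; i < n; i++) out.add(ADDRESS_SQL);
        return out;
    }

    public static void main(String[] args) {
        System.out.println(countBatches(alternating(100), 50)); // prints 200
        System.out.println(countBatches(ordered(100), 50));     // prints 4
    }
}
```

With alternating statements, every insert ends up in a batch of its own. Grouping the same statements together brings 200 round trips down to 4. Hibernate provides the hibernate.order_inserts and hibernate.order_updates settings for grouping statements in this way.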

Published by
Dinesh Rajput
