
Spring Batch ItemReader and ItemWriter Example

In this tutorial we will discuss the three most important interfaces of Spring Batch and give an overview of the Spring Batch item reader and writer with a sample application. One of the important goals of a batch processing framework is to read large amounts of data, perform some business processing or transformation and write out the result. The Spring Batch framework supports this bulk reading, processing and writing through three key interfaces: ItemReader, ItemProcessor and ItemWriter.

1. ItemReader

ItemReader is the means of providing data from many different types of input in a bulk processing system. There are many different implementations of the ItemReader interface. All implementations are expected to be stateful and will be called multiple times for each batch, with each call to read() returning a different value and finally returning null when all input data is exhausted. Below are a few frequently used implementations of ItemReader.

ItemReader implementations and their descriptions:

  • FlatFileItemReader: Reads lines of data from an input file. Typically, each line describes a record whose fields are defined by fixed positions in the file or delimited by a special character (e.g. comma (,) or pipe (|)).
  • JdbcCursorItemReader: Opens a JDBC cursor and continually retrieves the next row in the ResultSet.
  • StoredProcedureItemReader: Executes a stored procedure, then reads the returned cursor, continually retrieving the next row in the ResultSet.

All the above implementations override the read() method from the ItemReader interface. The read method defines the most essential contract of the ItemReader. It returns one item or null if no more items are left. An item might represent a line in a file, a row in a database and so on.
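The read() contract described above can be sketched in a few lines. This is a minimal illustration, not Spring Batch code: SimpleItemReader is a simplified stand-in for the real ItemReader interface, declared locally so the sketch compiles without the library, and ListBackedReader and ReadLoopDemo are hypothetical names.

```java
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;

// Simplified stand-in for Spring Batch's ItemReader (an assumption for this
// sketch), declared locally so the example compiles without the library.
interface SimpleItemReader<T> {
    T read() throws Exception;
}

// Hypothetical stateful reader that serves items from an in-memory list.
class ListBackedReader implements SimpleItemReader<String> {
    private final Iterator<String> iterator;

    ListBackedReader(List<String> items) {
        this.iterator = items.iterator();
    }

    @Override
    public String read() {
        // One item per call; null signals that the input is exhausted.
        return iterator.hasNext() ? iterator.next() : null;
    }
}

class ReadLoopDemo {
    // Drains a reader the way a caller is expected to: call read() until null.
    static int drain(SimpleItemReader<String> reader) throws Exception {
        int count = 0;
        String item;
        while ((item = reader.read()) != null) {
            System.out.println("read: " + item);
            count++;
        }
        return count;
    }

    public static void main(String[] args) throws Exception {
        drain(new ListBackedReader(Arrays.asList("Effective Java", "Refactoring")));
    }
}
```

The reader keeps its position between calls, which is what "stateful" means here: every call advances the input, and once null has been returned every further call keeps returning null.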

2. ItemWriter

ItemWriter is similar in functionality to an ItemReader, but with inverse operations. ItemWriter is an interface for generic output operations. The implementing class is responsible for serializing objects as necessary. Resources still need to be located, opened and closed, but an ItemWriter writes out rather than reading in. For databases these writes may be inserts or updates.

The write method defines the most essential contract of the ItemWriter. It will attempt to write out the list of items passed in as long as it is open. As it is expected that items will be ‘batched’ together into a chunk and then output, the interface accepts a list of items rather than a single item. Once the items are written out, any flushing that may be necessary can be performed before returning from the write method.
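To make the chunk-based write contract concrete, here is a minimal sketch. SimpleItemWriter is a simplified stand-in for the list-based ItemWriter interface of the Spring Batch version used in this tutorial, declared locally so the code runs without the library; InMemoryLineWriter and WriteChunkDemo are hypothetical names, and the in-memory "flush" only imitates what a file or database writer would do.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Simplified stand-in for the list-based ItemWriter interface (an assumption
// for this sketch), declared locally so the example compiles on its own.
interface SimpleItemWriter<T> {
    void write(List<? extends T> items) throws Exception;
}

// Hypothetical writer that buffers a chunk and flushes it once per call.
class InMemoryLineWriter implements SimpleItemWriter<String> {
    private final List<String> buffer = new ArrayList<String>();
    private final List<String> output = new ArrayList<String>();
    private int flushCount = 0;

    @Override
    public void write(List<? extends String> items) {
        // Serialize every item of the chunk first ...
        for (String item : items) {
            buffer.add(item);
        }
        // ... then flush once for the whole chunk before returning.
        output.addAll(buffer);
        buffer.clear();
        flushCount++;
    }

    List<String> getOutput() {
        return output;
    }

    int getFlushCount() {
        return flushCount;
    }
}

class WriteChunkDemo {
    public static void main(String[] args) {
        InMemoryLineWriter writer = new InMemoryLineWriter();
        writer.write(Arrays.asList("Effective Java", "Refactoring"));
        writer.write(Arrays.asList("Thinking in Java"));
        System.out.println(writer.getOutput() + " written in "
                + writer.getFlushCount() + " flushes");
    }
}
```

Accepting a list lets the writer pay the cost of flushing once per chunk instead of once per item.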

 

ItemWriter implementations and their descriptions:

  • FlatFileItemWriter: Writes data to a file or stream. Uses a buffered writer to improve performance.
  • StaxEventItemWriter: An implementation of ItemWriter which uses StAX and a Marshaller to serialize objects to XML.

3. ItemProcessor

The ItemReader and ItemWriter interfaces are both very useful for their specific tasks, but what if you want to insert business logic before writing? An ItemProcessor is a very simple interface for item transformation: given one object, transform it and return another. Any business or transformation logic can be plugged into this component. Assume an ItemReader provides objects of type User that need to be converted to type Employee before being written out. An ItemProcessor can be written that performs the conversion. Another typical use for an item processor is to filter out records before they are passed to the ItemWriter. Filtering simply indicates that a record should not be written.
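Both uses, transformation and filtering, can be sketched together. In Spring Batch a processor filters a record by returning null, and the sketch below imitates that convention. SimpleItemProcessor is a simplified stand-in for the real ItemProcessor interface; the User and Employee classes, their fields and the "active" rule are hypothetical and exist only for illustration.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Simplified stand-in for Spring Batch's ItemProcessor (an assumption for
// this sketch), declared locally so the example compiles on its own.
interface SimpleItemProcessor<I, O> {
    O process(I item) throws Exception;
}

// Hypothetical input type.
class User {
    final String name;
    final boolean active;

    User(String name, boolean active) {
        this.name = name;
        this.active = active;
    }
}

// Hypothetical output type.
class Employee {
    final String displayName;

    Employee(String displayName) {
        this.displayName = displayName;
    }
}

class UserToEmployeeProcessor implements SimpleItemProcessor<User, Employee> {
    @Override
    public Employee process(User user) {
        if (!user.active) {
            // Returning null filters the record: it never reaches the writer.
            return null;
        }
        // Transformation: convert the input type into the output type.
        return new Employee("EMP-" + user.name);
    }
}

class ProcessorDemo {
    // Applies the processor to every user and keeps only unfiltered results.
    static List<Employee> processAll(List<User> users) {
        UserToEmployeeProcessor processor = new UserToEmployeeProcessor();
        List<Employee> result = new ArrayList<Employee>();
        for (User user : users) {
            Employee employee = processor.process(user);
            if (employee != null) {
                result.add(employee);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        List<User> users = Arrays.asList(new User("alice", true), new User("bob", false));
        for (Employee employee : processAll(users)) {
            System.out.println(employee.displayName);
        }
    }
}
```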

Create Sample Application Example-

In this sample application we will implement all three interfaces.
We require the following technologies:

  • Spring Tool Suite (STS)
  • JDK 1.6
  • Spring Core 3.2.2.RELEASE
  • Spring Batch 2.2.0.RELEASE

Project Directory Structure-

Review the final project structure, a standard java project.

Create Custom Item Reader-

Below is our custom item reader. Each time it is called, it returns the next element from the list and returns null if the list is exhausted.

CustomItemReader.java

package com.doj.batch.reader;

import java.util.List;

import org.springframework.batch.item.ItemReader;
import org.springframework.batch.item.ParseException;
import org.springframework.batch.item.UnexpectedInputException;

/**
 * @author Dinesh Rajput
 *
 */
public class CustomItemReader implements ItemReader<String> {

    // Source data, injected from the job configuration
    private List<String> bookNameList;
    // Index of the next book name to return
    private int bookCount = 0;

    @Override
    public String read() throws Exception, UnexpectedInputException,
            ParseException {
        if (bookCount < bookNameList.size()) {
            return bookNameList.get(bookCount++);
        } else {
            // null tells Spring Batch that the input is exhausted
            return null;
        }
    }

    public List<String> getBookNameList() {
        return bookNameList;
    }

    public void setBookNameList(List<String> bookNameList) {
        this.bookNameList = bookNameList;
    }

}

Create Custom Item Processor-

CustomItemProcessor is a simple custom ItemProcessor which transforms every element returned by the ItemReader. Here, each book name is returned together with its author's name.

CustomItemProcessor.java

package com.doj.batch.processor;

import org.springframework.batch.item.ItemProcessor;

/**
 * @author Dinesh Rajput
 *
 */
public class CustomItemProcessor implements ItemProcessor<String, String> {

 @Override
 public String process(String bookNameWithoutAuthor) throws Exception {
  String bookNameWithAuthor = "Book Name - "+bookNameWithoutAuthor+" | Author Name - ";
  if("Effective Java".equalsIgnoreCase(bookNameWithoutAuthor)){
   bookNameWithAuthor += "Joshua Bloch";
  }else if("Design Patterns".equalsIgnoreCase(bookNameWithoutAuthor)){
   bookNameWithAuthor += "Erich Gamma";
  }else if("Refactoring".equalsIgnoreCase(bookNameWithoutAuthor)){
   bookNameWithAuthor += "Martin Fowler";
  }else if("Head First Java".equalsIgnoreCase(bookNameWithoutAuthor)){
   bookNameWithAuthor += "Kathy Sierra";
  }else if("Thinking in Java".equalsIgnoreCase(bookNameWithoutAuthor)){
   bookNameWithAuthor += "Bruce Eckel";
  }
  return bookNameWithAuthor;
 }

}

Create Custom Item Writer-

CustomItemWriter is a custom ItemWriter which prints the transformed item(s) returned by our CustomItemProcessor to the console.

CustomItemWriter.java

package com.doj.batch.writer;

import java.util.List;

import org.springframework.batch.item.ItemWriter;

/**
 * @author Dinesh Rajput
 *
 */
public class CustomItemWriter implements ItemWriter<String> {

 @Override
 public void write(List<? extends String> bookNameWithAuthor) throws Exception {
  System.out.println(bookNameWithAuthor);
 }

}

Create Application Context XML-

Below is the applicationContext.xml, which creates the JobRepository, JobLauncher and TransactionManager.

JobRepository

The repository is responsible for persisting batch meta-data. SimpleJobRepository is an implementation of JobRepository that stores JobInstance, JobExecution and StepExecution information using the DAOs injected via constructor arguments. Spring Batch supports two implementations of these DAOs: Map-based (in-memory) and JDBC-based. In real enterprise applications the JDBC variants are preferred, but in this example we will use the simpler in-memory alternatives (MapJobInstanceDao, MapJobExecutionDao, MapStepExecutionDao, MapExecutionContextDao), which MapJobRepositoryFactoryBean sets up for us.

JobLauncher

As the name suggests, it is responsible for launching a batch job. We use the SimpleJobLauncher implementation, which requires only one dependency, a JobRepository. The JobRepository is used to obtain a valid JobExecution. The repository must be used because the provided Job could be a restart of an existing JobInstance, and only the repository can reliably recreate it.

TransactionManager

As this example does not deal with transactional data, we use ResourcelessTransactionManager, which is mainly intended for testing purposes.

applicationContext.xml

<?xml version="1.0" encoding="UTF-8"?>
<beans xmlns="http://www.springframework.org/schema/beans"
 xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
 xsi:schemaLocation="http://www.springframework.org/schema/beans
http://www.springframework.org/schema/beans/spring-beans.xsd">

 <bean id="transactionManager" class="org.springframework.batch.support.transaction.ResourcelessTransactionManager"/>
 
    <bean id="jobLauncher" class="org.springframework.batch.core.launch.support.SimpleJobLauncher">
        <property name="jobRepository" ref="jobRepository"/>
    </bean>
 
    <bean id="jobRepository" class="org.springframework.batch.core.repository.support.MapJobRepositoryFactoryBean">
        <property name="transactionManager" ref="transactionManager"/>
    </bean>
 
    <bean id="simpleJob" class="org.springframework.batch.core.job.SimpleJob" abstract="true">
        <property name="jobRepository" ref="jobRepository" />
    </bean>
 
</beans>

Create Job configuration XML-

It’s time to wire the above 3 components together into a job which will perform the reading, processing and writing work for us.

simple-job.xml

<?xml version="1.0" encoding="UTF-8"?>
<beans xmlns="http://www.springframework.org/schema/beans"
 xmlns:batch="http://www.springframework.org/schema/batch"
 xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
 xsi:schemaLocation="http://www.springframework.org/schema/beans
http://www.springframework.org/schema/beans/spring-beans.xsd
http://www.springframework.org/schema/batch
http://www.springframework.org/schema/batch/spring-batch-2.0.xsd">

 <import resource="applicationContext.xml"/>
 
 <bean id="customReader" class="com.doj.batch.reader.CustomItemReader" >
        <property name="bookNameList" >
            <list>
                <value>Effective Java</value>
                <value>Design Patterns</value>
                <value>Refactoring</value>
                <value>Thinking in Java</value>
                <value>Head First Java</value>
            </list>
        </property>
    </bean>
 
    <bean id="customProcessor" class="com.doj.batch.processor.CustomItemProcessor" />
 
    <bean id="customWriter" class="com.doj.batch.writer.CustomItemWriter" /> 
    
    <batch:job id="simpleDojJob" job-repository="jobRepository" parent="simpleJob">
     <batch:step id="step1">
      <batch:tasklet transaction-manager="transactionManager">
       <batch:chunk reader="customReader" processor="customProcessor" writer="customWriter" commit-interval="1"/>
      </batch:tasklet>
     </batch:step>
    </batch:job>   
</beans>

First we create three beans (customReader, customWriter, customProcessor) corresponding to CustomItemReader, CustomItemWriter and CustomItemProcessor. Note that the customReader is injected with the list of book names. This list is the source of data for the customReader bean.

Next we define the job simpleDojJob using the batch namespace. The job references the jobRepository, and its single step, step1, is a chunk-oriented tasklet that references the transactionManager, customReader, customProcessor and customWriter beans. Note the commit-interval attribute, which is set to 1. This tells Spring Batch to commit after every item, i.e. the writer will write 1 item at a time.
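The effect of the commit interval can be illustrated with a simplified model of a chunk-oriented step. This is only a sketch of the idea, not Spring Batch's implementation: ChunkLoopDemo and runStep are hypothetical names, the "reader" is a plain iterator, the "processor" is a string prefix, and transactions are left out entirely.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;

class ChunkLoopDemo {

    // Returns the chunks that would be handed to the writer, in order.
    static List<List<String>> runStep(List<String> input, int commitInterval) {
        Iterator<String> reader = input.iterator();
        List<List<String>> writtenChunks = new ArrayList<List<String>>();

        while (true) {
            // 1. Read up to commitInterval items, one read at a time.
            List<String> items = new ArrayList<String>();
            while (items.size() < commitInterval && reader.hasNext()) {
                items.add(reader.next());
            }
            if (items.isEmpty()) {
                break; // input exhausted, the step is complete
            }

            // 2. Process every item that was read.
            List<String> processed = new ArrayList<String>();
            for (String item : items) {
                processed.add("Book Name - " + item);
            }

            // 3. Hand the whole chunk to the writer in one call; the real
            //    framework would commit the transaction at this point.
            writtenChunks.add(processed);
        }
        return writtenChunks;
    }

    public static void main(String[] args) {
        List<String> books = Arrays.asList("Effective Java", "Design Patterns", "Refactoring");
        System.out.println(runStep(books, 1));
        System.out.println(runStep(books, 2));
    }
}
```

With a commit interval of 1 every chunk holds a single item, which is why the sample job prints one-element lists. A larger interval means fewer write calls and commits, at the cost of redoing more work if a chunk fails.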

Launching Batch Job-

Spring Batch comes with a simple utility class called CommandLineJobRunner, whose main() method accepts two arguments. The first argument is the Spring application context file containing the job definition and the second is the name of the job to be executed.

Now run CommandLineJobRunner as a Java application, passing both arguments:

org.springframework.batch.core.launch.support.CommandLineJobRunner simple-job.xml simpleDojJob

The job prints the following output:

[Book Name - Effective Java | Author Name - Joshua Bloch]
[Book Name - Design Patterns | Author Name - Erich Gamma]
[Book Name - Refactoring | Author Name - Martin Fowler]
[Book Name - Thinking in Java | Author Name - Bruce Eckel]
[Book Name - Head First Java | Author Name - Kathy Sierra]

Download Source Code with Libs
SpringBatchReaderWriterExample.zip

 

Published by
Dinesh Rajput
