In Spring Batch, we often need to read data from a CSV file and write it into a relational database or a NoSQL store, or convert it into another format such as XML or JSON. Some source systems, such as mainframes, generate flat files, and we must read the data from the file, process or transform it, and save it to its intended destination.
Let's begin. In this tutorial we'll use MySQL as the persistent store.
pom.xml – This file lists the dependencies required to run the project.
<?xml version="1.0" encoding="UTF-8"?>
<project xmlns="http://maven.apache.org/POM/4.0.0" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 http://maven.apache.org/xsd/maven-4.0.0.xsd">
<modelVersion>4.0.0</modelVersion>
<groupId>io.spring.batch</groupId>
<artifactId>database-output</artifactId>
<version>0.0.1-SNAPSHOT</version>
<packaging>jar</packaging>
<name>databaseOutput</name>
<description>JDBC based item writing</description>
<parent>
<groupId>org.springframework.boot</groupId>
<artifactId>spring-boot-starter-parent</artifactId>
<version>2.1.1.RELEASE</version>
<relativePath/> <!-- lookup parent from repository -->
</parent>
<properties>
<project.build.sourceEncoding>UTF-8</project.build.sourceEncoding>
<java.version>1.8</java.version>
</properties>
<dependencies>
<!-- Spring Batch -->
<dependency>
<groupId>org.springframework.boot</groupId>
<artifactId>spring-boot-starter-batch</artifactId>
</dependency>
<!-- JDBC -->
<dependency>
<groupId>org.springframework.boot</groupId>
<artifactId>spring-boot-starter-jdbc</artifactId>
</dependency>
<!-- MYSQL DB -->
<dependency>
<groupId>mysql</groupId>
<artifactId>mysql-connector-java</artifactId>
</dependency>
<!-- Test -->
<dependency>
<groupId>org.springframework.boot</groupId>
<artifactId>spring-boot-starter-test</artifactId>
<scope>test</scope>
</dependency>
<!-- Lombok -->
<dependency>
<groupId>org.projectlombok</groupId>
<artifactId>lombok</artifactId>
</dependency>
</dependencies>
<build>
<plugins>
<plugin>
<groupId>org.springframework.boot</groupId>
<artifactId>spring-boot-maven-plugin</artifactId>
</plugin>
</plugins>
</build>
</project>
Person.java – A domain class that holds a person's details. The Lombok annotations generate the getters/setters, constructors, and a builder.
@Data
@AllArgsConstructor
@NoArgsConstructor
@Builder
public class Person {
private long id;
private String firstName;
private String lastName;
private Date birthdate;
}
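Lombok's @Builder generates a fluent builder at compile time. The sketch below is a simplified, hand-written equivalent of what @Data and @Builder provide for Person (plain JDK, no Lombok; the class name PersonSketch is illustrative, and real Lombok output differs in detail):

```java
import java.util.Date;

// Simplified hand-written equivalent of Lombok's @Data + @Builder for Person.
public class PersonSketch {
    private long id;
    private String firstName;
    private String lastName;
    private Date birthdate;

    public static Builder builder() { return new Builder(); }

    public long getId() { return id; }
    public String getFirstName() { return firstName; }
    public String getLastName() { return lastName; }
    public Date getBirthdate() { return birthdate; }

    public static class Builder {
        private final PersonSketch p = new PersonSketch();
        public Builder id(long id) { p.id = id; return this; }
        public Builder firstName(String v) { p.firstName = v; return this; }
        public Builder lastName(String v) { p.lastName = v; return this; }
        public Builder birthdate(Date v) { p.birthdate = v; return this; }
        public PersonSketch build() { return p; }
    }

    public static void main(String[] args) {
        PersonSketch p = PersonSketch.builder().id(1L).firstName("Neha").lastName("Limay").build();
        System.out.println(p.getId() + " " + p.getFirstName() + " " + p.getLastName());
    }
}
```

Using Lombok in the real project means none of this boilerplate has to be written or maintained by hand.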
PersonFieldSetMapper.java – An implementation of the FieldSetMapper interface, which maps the data obtained from a FieldSet into an object.
public class PersonFieldSetMapper implements FieldSetMapper<Person> {
@Override
public Person mapFieldSet(FieldSet fieldSet) throws BindException {
return new Person(fieldSet.readLong("id"),
fieldSet.readString("firstName"),
fieldSet.readString("lastName"),
fieldSet.readDate("birthdate", "yyyy-MM-dd HH:mm:ss"));
}
}
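Conceptually, the mapper turns one tokenized CSV line into a Person: read each field by name and convert it to the target type. The JDK-only sketch below mirrors that parsing logic without any Spring classes (the class and the parseLine helper are illustrative, not part of the project):

```java
import java.text.ParseException;
import java.text.SimpleDateFormat;
import java.util.Date;

public class FieldSetMappingSketch {

    // Plain holder mirroring the Person fields used in this tutorial.
    static class PersonRecord {
        final long id;
        final String firstName;
        final String lastName;
        final Date birthdate;

        PersonRecord(long id, String firstName, String lastName, Date birthdate) {
            this.id = id;
            this.firstName = firstName;
            this.lastName = lastName;
            this.birthdate = birthdate;
        }
    }

    // Illustrative stand-in for DefaultLineMapper + PersonFieldSetMapper:
    // tokenize a comma-delimited line, then convert each token to its target type.
    static PersonRecord parseLine(String line) throws ParseException {
        String[] tokens = line.split(",");
        SimpleDateFormat format = new SimpleDateFormat("yyyy-MM-dd HH:mm:ss");
        return new PersonRecord(
                Long.parseLong(tokens[0]),
                tokens[1],
                tokens[2],
                format.parse(tokens[3]));
    }

    public static void main(String[] args) throws ParseException {
        PersonRecord p = parseLine("1,Neha,Limay,1964-10-19 14:11:03");
        System.out.println(p.id + " " + p.firstName + " " + p.lastName);
    }
}
```

In the real job, the tokenizing step is handled by DelimitedLineTokenizer and the type conversion by the FieldSet read methods, so the mapper itself stays a one-liner per field.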
application.properties – This file holds the database connection details that Spring Boot uses to create the DataSource.
spring.batch.job.enabled=false disables automatic job execution when the application context starts; we launch the job ourselves from the CommandLineRunner.
spring.batch.initialize-schema=always tells Spring Batch to create its metadata tables (such as BATCH_JOB_INSTANCE, BATCH_JOB_EXECUTION, and BATCH_STEP_EXECUTION), which hold information about batch job instances, executions, and parameters.
spring.datasource.driverClassName=com.mysql.cj.jdbc.Driver
spring.datasource.url=jdbc:mysql://localhost:3306/test
spring.datasource.username=root
spring.datasource.password=root
spring.datasource.platform=mysql
spring.datasource.continueOnError=false
spring.batch.initialize-schema=always
spring.batch.job.enabled=false
JobConfiguration.java
- FlatFileItemReader – Restartable ItemReader that reads lines from the input set via setResource(Resource). A line is defined by setRecordSeparatorPolicy(RecordSeparatorPolicy) and mapped to an item using setLineMapper(LineMapper). If an exception is thrown during line mapping, it is rethrown as a FlatFileParseException that carries the problematic line and its line number.
- JdbcBatchItemWriter – ItemWriter that uses the batching features of NamedParameterJdbcTemplate to execute a batch of statements for all items provided. The writer is thread-safe once its properties are set (normal singleton behavior), so it can write in multiple concurrent transactions. The user must provide an SQL query and a callback, either an ItemPreparedStatementSetter or an ItemSqlParameterSourceProvider; you can use either named parameters or the traditional '?' placeholders.
- Step – Batch domain interface representing the configuration of a step. As with Job, a Step explicitly represents a developer's configuration of a step, as well as the ability to execute it.
- Job – Batch domain object representing a job; an explicit abstraction for the configuration of a job specified by a developer. Note that the restart policy is applied to the job as a whole, not to an individual step.
import javax.sql.DataSource;
import io.spring.batch.domain.Person;
import io.spring.batch.domain.PersonFieldSetMapper;
import org.springframework.batch.core.Job;
import org.springframework.batch.core.Step;
import org.springframework.batch.core.configuration.annotation.JobBuilderFactory;
import org.springframework.batch.core.configuration.annotation.StepBuilderFactory;
import org.springframework.batch.item.database.BeanPropertyItemSqlParameterSourceProvider;
import org.springframework.batch.item.database.JdbcBatchItemWriter;
import org.springframework.batch.item.file.FlatFileItemReader;
import org.springframework.batch.item.file.mapping.DefaultLineMapper;
import org.springframework.batch.item.file.transform.DelimitedLineTokenizer;
import org.springframework.beans.factory.annotation.Autowired;
import org.springframework.context.annotation.Bean;
import org.springframework.context.annotation.Configuration;
import org.springframework.core.io.ClassPathResource;
@Configuration
public class JobConfiguration {
@Autowired
public JobBuilderFactory jobBuilderFactory;
@Autowired
public StepBuilderFactory stepBuilderFactory;
@Autowired
public DataSource dataSource;
@Bean
public FlatFileItemReader<Person> personItemReader() {
FlatFileItemReader<Person> reader = new FlatFileItemReader<>();
reader.setLinesToSkip(1);
reader.setResource(new ClassPathResource("/data/person.csv"));
DefaultLineMapper<Person> customerLineMapper = new DefaultLineMapper<>();
DelimitedLineTokenizer tokenizer = new DelimitedLineTokenizer();
tokenizer.setNames(new String[] {"id", "firstName", "lastName", "birthdate"});
customerLineMapper.setLineTokenizer(tokenizer);
customerLineMapper.setFieldSetMapper(new PersonFieldSetMapper());
customerLineMapper.afterPropertiesSet();
reader.setLineMapper(customerLineMapper);
return reader;
}
@Bean
public JdbcBatchItemWriter<Person> personItemWriter() {
JdbcBatchItemWriter<Person> itemWriter = new JdbcBatchItemWriter<>();
itemWriter.setDataSource(this.dataSource);
itemWriter.setSql("INSERT INTO PERSON VALUES (:id, :firstName, :lastName, :birthdate)");
itemWriter.setItemSqlParameterSourceProvider(new BeanPropertyItemSqlParameterSourceProvider<>());
itemWriter.afterPropertiesSet();
return itemWriter;
}
@Bean
public Step step1() {
return stepBuilderFactory.get("step1")
.<Person, Person>chunk(10)
.reader(personItemReader())
.writer(personItemWriter())
.build();
}
@Bean
public Job job() {
return jobBuilderFactory.get("job")
.start(step1())
.build();
}
}
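As noted above, JdbcBatchItemWriter also supports the traditional '?' placeholders together with an ItemPreparedStatementSetter, instead of named parameters. A sketch of that alternative (not used in this tutorial's code; the bean name positionalPersonItemWriter is illustrative):

```java
@Bean
public JdbcBatchItemWriter<Person> positionalPersonItemWriter() {
    JdbcBatchItemWriter<Person> itemWriter = new JdbcBatchItemWriter<>();
    itemWriter.setDataSource(this.dataSource);
    itemWriter.setSql("INSERT INTO PERSON VALUES (?, ?, ?, ?)");
    // The ItemPreparedStatementSetter binds each Person to the '?' placeholders by position.
    itemWriter.setItemPreparedStatementSetter((person, ps) -> {
        ps.setLong(1, person.getId());
        ps.setString(2, person.getFirstName());
        ps.setString(3, person.getLastName());
        ps.setTimestamp(4, new java.sql.Timestamp(person.getBirthdate().getTime()));
    });
    itemWriter.afterPropertiesSet();
    return itemWriter;
}
```

Named parameters with BeanPropertyItemSqlParameterSourceProvider, as used in the tutorial, avoid positional bugs and require no per-field binding code, which is why they are usually the better default.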
DatabaseOutputApplication.java – This is the Spring Boot main class; simply run it as a Java application.
- JobLauncher – Simple interface for controlling jobs, including possible ad hoc executions, based on different runtime identifiers. It is important to note that this interface makes no guarantee about whether calls to it are executed synchronously or asynchronously; check the Javadoc of the specific implementation to understand how the job will be run.
import java.util.UUID;
import org.springframework.batch.core.Job;
import org.springframework.batch.core.JobExecution;
import org.springframework.batch.core.JobParameters;
import org.springframework.batch.core.JobParametersBuilder;
import org.springframework.batch.core.configuration.annotation.EnableBatchProcessing;
import org.springframework.batch.core.launch.JobLauncher;
import org.springframework.beans.factory.annotation.Autowired;
import org.springframework.boot.CommandLineRunner;
import org.springframework.boot.SpringApplication;
import org.springframework.boot.autoconfigure.SpringBootApplication;
@SpringBootApplication
@EnableBatchProcessing
public class DatabaseOutputApplication implements CommandLineRunner {
@Autowired
private JobLauncher jobLauncher;
@Autowired
private Job job;
public static void main(String[] args) {
SpringApplication.run(DatabaseOutputApplication.class, args);
}
@Override
public void run(String... args) throws Exception {
JobParameters jobParameters = new JobParametersBuilder()
.addString("date", UUID.randomUUID().toString())
.addLong("JobId",System.currentTimeMillis())
.addLong("time",System.currentTimeMillis()).toJobParameters();
JobExecution execution = jobLauncher.run(job, jobParameters);
System.out.println("STATUS :: "+execution.getStatus());
}
}
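Because JobLauncher makes no guarantee about synchronous or asynchronous execution, SimpleJobLauncher can be given a TaskExecutor to launch jobs asynchronously. A minimal sketch, assuming the auto-configured JobRepository is available (this bean is not part of the tutorial project):

```java
@Bean
public JobLauncher asyncJobLauncher(JobRepository jobRepository) throws Exception {
    SimpleJobLauncher jobLauncher = new SimpleJobLauncher();
    jobLauncher.setJobRepository(jobRepository);
    // With SimpleAsyncTaskExecutor, run() returns immediately and the
    // returned JobExecution starts in status STARTING rather than COMPLETED.
    jobLauncher.setTaskExecutor(new SimpleAsyncTaskExecutor());
    jobLauncher.afterPropertiesSet();
    return jobLauncher;
}
```

With the default synchronous executor, as in this tutorial, jobLauncher.run() blocks until the job finishes, which is why the printed status is COMPLETED.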
person.csv – This file holds the sample data to be read. Create it inside a data folder under src/main/resources.
id,firstName,lastName,birthdate
1,Neha,Limay,1964-10-19 14:11:03
2,Gaurav,Nene,1977-12-11 21:44:30
3,Sandeep,Joshi,1986-12-25 11:54:28
4,Parag,Rane,1959-07-24 06:00:16
5,Sachit,Patil,1956-09-14 06:49:28
6,Suchita,Vinchurkar,1984-08-30 04:18:10
7,Aditi,Nerkar,1973-02-04 05:26:05
8,Shankar,Parate,1953-04-26 11:16:26
9,Tania,Datta,1951-06-24 14:56:51
10,Sandhya,Potthuri,1953-08-27 13:15:08
11,Santosh,Pande,1957-09-05 21:36:47
12,Ajinkya,Deshpande,1979-01-21 18:31:27
13,Kiran,Giradkar,1965-07-18 15:05:22
14,Deepak,Patil,1990-09-11 15:52:54
15,Suraj,Bhamre,1979-06-01 06:58:54
16,Anup,Tarone,1990-07-02 17:36:35
17,Pradeep,Rodge,1959-12-19 20:23:12
18,Pawan,Bawankar,1984-12-27 10:36:49
19,Sheetal,Kale,1962-06-23 20:03:40
20,Reetisha,Hedau,1988-11-12 19:05:13
schema.sql – Please make sure to create the table before you run the app. Note that birthdate is stored as a varchar here for simplicity; a DATETIME column would be the more natural choice.
CREATE TABLE `person` (
`id` mediumint(8) unsigned NOT NULL,
`firstName` varchar(255) default NULL,
`lastName` varchar(255) default NULL,
`birthdate` varchar(255),
PRIMARY KEY (`id`)
);
Console – You should see logs similar to the following, which show that the batch job executed successfully.
. ____ _ __ _ _
/\\ / ___'_ __ _ _(_)_ __ __ _ \ \ \ \
( ( )\___ | '_ | '_| | '_ \/ _` | \ \ \ \
\\/ ___)| |_)| | | | | || (_| | ) ) ) )
' |____| .__|_| |_|_| |_\__, | / / / /
=========|_|==============|___/=/_/_/_/
:: Spring Boot :: (v2.1.1.RELEASE)
2021-01-04 14:31:30.519 INFO 21148 --- [ main] i.s.batch.DatabaseOutputApplication : Starting DatabaseOutputApplication on with PID 21148 ()
2021-01-04 14:31:30.522 INFO 21148 --- [ main] i.s.batch.DatabaseOutputApplication : No active profile set, falling back to default profiles: default
2021-01-04 14:31:31.095 INFO 21148 --- [ main] com.zaxxer.hikari.HikariDataSource : HikariPool-1 - Starting...
2021-01-04 14:31:31.595 INFO 21148 --- [ main] com.zaxxer.hikari.HikariDataSource : HikariPool-1 - Start completed.
2021-01-04 14:31:31.725 INFO 21148 --- [ main] o.s.b.c.r.s.JobRepositoryFactoryBean : No database type set, using meta data indicating: MYSQL
2021-01-04 14:31:31.784 INFO 21148 --- [ main] o.s.b.c.l.support.SimpleJobLauncher : No TaskExecutor has been set, defaulting to synchronous executor.
2021-01-04 14:31:31.885 INFO 21148 --- [ main] i.s.batch.DatabaseOutputApplication : Started DatabaseOutputApplication in 1.574 seconds (JVM running for 2.627)
2021-01-04 14:31:31.998 INFO 21148 --- [ main] o.s.b.c.l.support.SimpleJobLauncher : Job: [SimpleJob: [name=job]] launched with the following parameters: [{date=874ffd69-f6d4-4f9c-b27a-c7dd692e969d, JobId=1609750891887, time=1609750891887}]
2021-01-04 14:31:32.034 INFO 21148 --- [ main] o.s.batch.core.job.SimpleStepHandler : Executing step: [step1]
2021-01-04 14:31:32.129 INFO 21148 --- [ main] o.s.b.c.l.support.SimpleJobLauncher : Job: [SimpleJob: [name=job]] completed with the following parameters: [{date=874ffd69-f6d4-4f9c-b27a-c7dd692e969d, JobId=1609750891887, time=1609750891887}] and the following status: [COMPLETED]
STATUS :: COMPLETED
2021-01-04 14:31:32.132 INFO 21148 --- [ Thread-3] com.zaxxer.hikari.HikariDataSource : HikariPool-1 - Shutdown initiated...
2021-01-04 14:31:32.135 INFO 21148 --- [ Thread-3] com.zaxxer.hikari.HikariDataSource : HikariPool-1 - Shutdown completed.
Result from MySQL DB – The query output below confirms that Spring Batch read the data from the CSV file and loaded it into MySQL.
mysql> use test;
Database changed
mysql> select * from person;
+----+-----------+------------+-----------------------+
| id | firstName | lastName | birthdate |
+----+-----------+------------+-----------------------+
| 1 | Neha | Limay | 1964-10-19 14:11:03.0 |
| 2 | Gaurav | Nene | 1977-12-11 21:44:30.0 |
| 3 | Sandeep | Joshi | 1986-12-25 11:54:28.0 |
| 4 | Parag | Rane | 1959-07-24 06:00:16.0 |
| 5 | Sachit | Patil | 1956-09-14 06:49:28.0 |
| 6 | Suchita | Vinchurkar | 1984-08-30 04:18:10.0 |
| 7 | Aditi | Nerkar | 1973-02-04 05:26:05.0 |
| 8 | Shankar | Parate | 1953-04-26 11:16:26.0 |
| 9 | Tania | Datta | 1951-06-24 14:56:51.0 |
| 10 | Sandhya | Potthuri | 1953-08-27 13:15:08.0 |
| 11 | Santosh | Pande | 1957-09-05 21:36:47.0 |
| 12 | Ajinkya | Deshpande | 1979-01-21 18:31:27.0 |
| 13 | Kiran | Giradkar | 1965-07-18 15:05:22.0 |
| 14 | Deepak | Patil | 1990-09-11 15:52:54.0 |
| 15 | Suraj | Bhamre | 1979-06-01 06:58:54.0 |
| 16 | Anup | Tarone | 1990-07-02 17:36:35.0 |
| 17 | Pradeep | Rodge | 1959-12-19 20:23:12.0 |
| 18 | Pawan | Bawankar | 1984-12-27 10:36:49.0 |
| 19 | Sheetal | Kale | 1962-06-23 20:03:40.0 |
| 20 | Reetisha | Hedau | 1988-11-12 19:05:13.0 |
+----+-----------+------------+-----------------------+
20 rows in set (0.00 sec)