Get started with Java and Spring
Part II. Data
You can find all the sources for this chapter here
This article series is aimed mostly at software engineers without a Java or Spring background who want to get started with the technology quickly.
It doesn’t cover the most important principles of development, migration and deployment, but it can serve as a good introductory overview.
I won’t explain every step of this guide in depth, but I’ll try to provide useful links where you can find much more information.
- Bootstrap a new modular project with Maven
- Read and write data. Interact with YouTube API ← this article
Throughout our journey we’ll build a simple Tube application that crawls some videos from YouTube and presents them on our website.
Introduction
Spring Data JPA is a convenient way to bootstrap database interaction in your application.
JPA itself lets you describe entities, their relations and much more, while Spring Data JPA makes it easy to work with those entities through repositories.
Action
Defining entities
Let’s start by defining the Video entity.
To begin with, it will store basic information about our videos, such as title, thumbnail and duration.
We also need to store the external video ID so that we don’t store the same video a second time.
./library/src/main/java/com/company/library/domain/Video.java
package com.company.library.domain;

import org.hibernate.annotations.CreationTimestamp;
import org.hibernate.annotations.UpdateTimestamp;

import javax.persistence.*;
import java.util.Date;

@Entity
public class Video {

    @Id
    @GeneratedValue(strategy = GenerationType.IDENTITY)
    private Integer id;

    @Column(nullable = false)
    private String title;

    @Column(unique = true, nullable = false)
    private String externalId;

    @Column(nullable = false)
    private String imageUri;

    @Column(nullable = false)
    private String duration;

    @CreationTimestamp
    @Column(updatable = false)
    private Date createdDate;

    @UpdateTimestamp
    @Column
    private Date modifiedDate;

    public Integer getId() {
        return id;
    }

    public String getTitle() {
        return title;
    }

    public void setTitle(String title) {
        this.title = title;
    }

    public String getExternalId() {
        return externalId;
    }

    public void setExternalId(String externalId) {
        this.externalId = externalId;
    }

    public String getImageUri() {
        return imageUri;
    }

    public void setImageUri(String imageUri) {
        this.imageUri = imageUri;
    }

    public String getDuration() {
        return duration;
    }

    public void setDuration(String duration) {
        this.duration = duration;
    }

    public Date getCreatedDate() {
        return createdDate;
    }

    public void setCreatedDate(Date createdDate) {
        this.createdDate = createdDate;
    }

    public Date getModifiedDate() {
        return modifiedDate;
    }

    public void setModifiedDate(Date modifiedDate) {
        this.modifiedDate = modifiedDate;
    }
}
Here we have defined some key features of our main entity, such as a generated ID and creation/update timestamps.
We also have a unique external ID that we will use to avoid duplicates in our data storage.
Configuring the Scheduler
First of all, we have to fetch some videos to store in our database.
We need a periodic task that runs at one-minute intervals and watches for updates on the external website.
We’ll start by defining a Spring Boot application that finds any scheduled tasks and runs them as configured.
./scheduler/src/main/java/com/company/scheduler/Scheduler.java
package com.company.scheduler;

import org.springframework.boot.SpringApplication;
import org.springframework.boot.autoconfigure.SpringBootApplication;

@SpringBootApplication
public class Scheduler {

    public static void main(String[] args) {
        SpringApplication.run(Scheduler.class, args);
    }
}
Here we have defined our Spring Boot application and made it runnable through the Scheduler class.
But first we have to define our database connection in the application.properties file, the main Spring Boot configuration file.
By the way, you can find more about the available properties here.
./scheduler/src/main/resources/application.properties
spring.jpa.properties.hibernate.dialect=org.hibernate.dialect.MySQL5InnoDBDialect
spring.datasource.url=jdbc:mysql://localhost:3306/project?autoReconnect=true&useSSL=false
spring.datasource.username=project
spring.datasource.password=project
Here we have defined our connection URL (DSN) with auto-reconnection enabled and SSL disabled, just to suppress the insecure connection warning.
We also specified the SQL dialect used by Hibernate.
That’s almost all, but we need a running database, so we can start one using Docker with Docker Compose like this.
If you are a macOS user you will probably want to use Docker for Mac, especially its Edge version.
./docker-compose.yml
version: "3"

services:
  db:
    image: mysql:latest
    ports:
      - "3306:3306"
    environment:
      MYSQL_ROOT_PASSWORD: toor
      MYSQL_DATABASE: project
      MYSQL_USER: project
      MYSQL_PASSWORD: project
Start the database container by running the docker-compose up command from the directory where docker-compose.yml is located, and we’ll have our shiny new database server running and listening on port 3306 of our host.
Of course, we need to include the MySQL connector dependency to be able to interact with the MySQL database.
We can do it in our main POM file to share the connector between modules.
./pom.xml
<?xml version="1.0" encoding="UTF-8"?>
<project xmlns="http://maven.apache.org/POM/4.0.0"
         xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
         xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 http://maven.apache.org/xsd/maven-4.0.0.xsd">
    <modelVersion>4.0.0</modelVersion>

    <!-- ... -->

    <dependencies>
        <!-- ... -->
        <dependency>
            <groupId>mysql</groupId>
            <artifactId>mysql-connector-java</artifactId>
        </dependency>
    </dependencies>
</project>
And now we are ready to run our Scheduler application.
Preparations
Right now our application doesn’t do anything useful: it starts and exits without achieving any goal.
So we have to schedule our operations.
Let’s write a basic crawler interface and service that will implement the crawling method.
Then we just call this service from our scheduled task and actually write data to the database.
./scheduler/src/main/java/com/company/scheduler/crawler/Crawler.java
package com.company.scheduler.crawler;

public interface Crawler {
    void crawl();
}
Now we need our crawlers to be able to do just one thing - crawl. No remorse.
./scheduler/src/main/java/com/company/scheduler/crawler/VideoCrawler.java
package com.company.scheduler.crawler;

import org.springframework.stereotype.Component;

@Component
public class VideoCrawler implements Crawler {

    @Override
    public void crawl() {
        System.out.println("New videos crawling started");
        System.out.println("New videos crawling ended");
    }
}
This is the place where all the hard work will be done.
Our crawler is a Component that will be found by Spring Boot Component Scan and can be Autowired in other components.
./scheduler/src/main/java/com/company/scheduler/schedule/CrawlSchedules.java
package com.company.scheduler.schedule;

import com.company.scheduler.crawler.VideoCrawler;
import org.springframework.beans.factory.annotation.Autowired;
import org.springframework.scheduling.annotation.EnableScheduling;
import org.springframework.scheduling.annotation.Scheduled;
import org.springframework.stereotype.Component;

@Component
@EnableScheduling
public class CrawlSchedules {

    private final VideoCrawler videoCrawler;

    @Autowired
    public CrawlSchedules(VideoCrawler videoCrawler) {
        this.videoCrawler = videoCrawler;
    }

    @Scheduled(fixedDelay = 1 * 60 * 1000)
    public void scheduleNewVideosCrawl() {
        videoCrawler.crawl();
    }
}
And this is the schedule that calls our almighty Video Crawler: once at application start, and then again each time 60 seconds have passed since the previous call finished.
Now we can run our Scheduler application and see our Video Crawler running.
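To make the fixedDelay behaviour concrete, here is a minimal sketch that uses only the JDK’s ScheduledExecutorService instead of Spring’s scheduler. The class and method names (FixedDelayDemo, runWithFixedDelay) are made up for illustration and the delay is shortened to milliseconds. The point it tries to show is that the first run starts immediately and that the delay is counted from the end of one run to the start of the next.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical demo class, not part of the Tube project.
class FixedDelayDemo {

    // Runs `task` `runs` times: the first run starts immediately, each
    // following run starts `delayMillis` after the previous one finished.
    // Returns the number of completed runs.
    static int runWithFixedDelay(Runnable task, long delayMillis, int runs) {
        ScheduledExecutorService executor = Executors.newSingleThreadScheduledExecutor();
        AtomicInteger completed = new AtomicInteger();
        CountDownLatch done = new CountDownLatch(runs);

        executor.scheduleWithFixedDelay(() -> {
            if (completed.get() >= runs) {
                return; // ignore a run that sneaks in before shutdown
            }
            task.run();
            completed.incrementAndGet();
            done.countDown();
        }, 0, delayMillis, TimeUnit.MILLISECONDS);

        try {
            done.await(); // block until the requested number of runs is reached
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        } finally {
            executor.shutdownNow();
        }
        return completed.get();
    }

    public static void main(String[] args) {
        long start = System.currentTimeMillis();
        int runs = runWithFixedDelay(
                () -> System.out.println("crawl at +" + (System.currentTimeMillis() - start) + " ms"),
                200, 3);
        System.out.println("completed runs: " + runs);
    }
}
```

With a fixed delay, a slow crawl simply pushes the next one back, so two crawls never overlap on the same schedule.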
Fetching data
Now we can start to actually crawl videos from the external source and store them in our object layer.
Since we are starting with YouTube, we need an API key to be able to fetch video updates.
Once we have obtained the API key we can start implementing our fetcher, but the key has to be stored somewhere outside the source code.
So we will put it in our application.properties file.
./scheduler/src/main/resources/application.properties
spring.jpa.properties.hibernate.dialect=org.hibernate.dialect.MySQL5InnoDBDialect
spring.datasource.url=jdbc:mysql://localhost:3306/project?autoReconnect=true&useSSL=false
spring.datasource.username=project
spring.datasource.password=project
com.company.scheduler.you-tube-api-key=YOUR_GOOGLE_YOUTUBE_API_KEY_WITHOUT_QUOTES
Now we need to forward this property to our application layer. We will make this happen by defining a Configuration Properties component.
./scheduler/src/main/java/com/company/scheduler/properties/SchedulerProperties.java
package com.company.scheduler.properties;

import org.springframework.boot.context.properties.ConfigurationProperties;
import org.springframework.context.annotation.Configuration;

@Configuration
@ConfigurationProperties("com.company.scheduler")
public class SchedulerProperties {

    private String youTubeApiKey = "";

    public String getYouTubeApiKey() {
        return youTubeApiKey;
    }

    public void setYouTubeApiKey(String youTubeApiKey) {
        this.youTubeApiKey = youTubeApiKey;
    }
}
Now our YouTube API key is available through the SchedulerProperties configuration component.
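Note that the key in application.properties is written in kebab case (you-tube-api-key) while the Java property is camel case (youTubeApiKey); Spring Boot’s relaxed binding matches the two. The sketch below is only a simplified illustration of that naming correspondence, not Spring Boot’s actual binding code, and the class and method names are hypothetical.

```java
// Hypothetical helper, only illustrating how the two spellings correspond.
class RelaxedNameSketch {

    // Converts a kebab-case key segment to the camelCase property name.
    static String kebabToCamel(String key) {
        StringBuilder out = new StringBuilder();
        boolean upperNext = false;
        for (char c : key.toCharArray()) {
            if (c == '-') {
                upperNext = true;            // drop the dash, capitalise what follows
            } else if (upperNext) {
                out.append(Character.toUpperCase(c));
                upperNext = false;
            } else {
                out.append(c);
            }
        }
        return out.toString();
    }

    public static void main(String[] args) {
        System.out.println(kebabToCamel("you-tube-api-key")); // prints youTubeApiKey
    }
}
```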
We will fetch our videos with a self-written YouTube API provider.
./scheduler/src/main/java/com/company/scheduler/provider/YouTubeVideoProvider.java
package com.company.scheduler.provider;

import com.company.scheduler.properties.SchedulerProperties;
import com.google.api.client.http.HttpRequest;
import com.google.api.client.http.HttpRequestInitializer;
import com.google.api.client.http.javanet.NetHttpTransport;
import com.google.api.client.json.jackson2.JacksonFactory;
import com.google.api.services.youtube.YouTube;
import com.google.api.services.youtube.YouTubeRequestInitializer;
import com.google.api.services.youtube.model.Video;
import com.google.api.services.youtube.model.VideoListResponse;
import org.springframework.beans.factory.annotation.Autowired;
import org.springframework.stereotype.Component;

import java.io.IOException;
import java.util.ArrayList;
import java.util.List;

@Component
public class YouTubeVideoProvider {

    private final YouTube youTube;

    @Autowired
    public YouTubeVideoProvider(SchedulerProperties schedulerProperties) {
        youTube = new YouTube.Builder(
            new NetHttpTransport(),
            new JacksonFactory(),
            new HttpRequestInitializer() {
                @Override
                public void initialize(HttpRequest httpRequest) throws IOException {
                    // Nothing to initialize: the API key is attached by the request initializer below
                }
            }
        ).setYouTubeRequestInitializer(new YouTubeRequestInitializer(schedulerProperties.getYouTubeApiKey())).build();
    }

    public List<Video> getRecentTrendingVideoList() {
        try {
            YouTube.Videos.List youTubeVideosList = youTube.videos().list("contentDetails,snippet");
            youTubeVideosList.setChart("mostPopular");

            VideoListResponse youtubeVideosListResponse = youTubeVideosList.execute();

            return youtubeVideosListResponse.getItems();
        } catch (IOException e) {
            e.printStackTrace();

            return new ArrayList<>();
        }
    }
}
Here we have defined our YouTube video provider, which creates a YouTube client instance and implements a method for loading videos.
We also have to create a Video converter that converts a YouTube video into our Video entity object.
./scheduler/src/main/java/com/company/scheduler/converter/VideoConverter.java
package com.company.scheduler.converter;

import com.company.library.domain.Video;
import com.google.api.services.youtube.model.Thumbnail;
import com.google.api.services.youtube.model.ThumbnailDetails;
import com.google.api.services.youtube.model.VideoContentDetails;
import com.google.api.services.youtube.model.VideoSnippet;
import org.springframework.stereotype.Component;

@Component
public class VideoConverter {

    public static class ThumbnailUrlResolver {

        // Returns the URL of the largest available thumbnail, trying sizes from largest to smallest
        static String resolveThumbnailUrl(ThumbnailDetails thumbnailDetails) {
            Thumbnail thumbnail = thumbnailDetails.getMaxres();
            if (null != thumbnail) {
                return thumbnail.getUrl();
            }

            thumbnail = thumbnailDetails.getStandard();
            if (null != thumbnail) {
                return thumbnail.getUrl();
            }

            thumbnail = thumbnailDetails.getHigh();
            if (null != thumbnail) {
                return thumbnail.getUrl();
            }

            thumbnail = thumbnailDetails.getMedium();
            if (null != thumbnail) {
                return thumbnail.getUrl();
            }

            thumbnail = thumbnailDetails.getDefault();
            if (null != thumbnail) {
                return thumbnail.getUrl();
            }

            return null;
        }
    }

    public Video createFromYoutubeVideo(com.google.api.services.youtube.model.Video youtubeVideo) {
        VideoSnippet youtubeVideoSnippet = youtubeVideo.getSnippet();
        VideoContentDetails youtubeVideoContentDetails = youtubeVideo.getContentDetails();

        Video video = new Video();
        video.setExternalId(youtubeVideo.getId());

        if (null != youtubeVideoSnippet) {
            video.setTitle(youtubeVideoSnippet.getTitle());

            // The snippet may come without thumbnails, so guard against null as well as empty
            ThumbnailDetails youtubeVideoThumbnailDetails = youtubeVideoSnippet.getThumbnails();
            if (null != youtubeVideoThumbnailDetails && youtubeVideoThumbnailDetails.size() > 0) {
                video.setImageUri(ThumbnailUrlResolver.resolveThumbnailUrl(youtubeVideoThumbnailDetails));
            }
        }

        if (null != youtubeVideoContentDetails) {
            video.setDuration(youtubeVideoContentDetails.getDuration());
        }

        return video;
    }
}
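The converter stores the duration exactly as the API returns it. Assuming the value is an ISO 8601 duration such as PT4M13S (the format the YouTube Data API documents for contentDetails.duration), java.time.Duration from the JDK can turn it into seconds or a display string when needed. The helper names below are hypothetical and the sketch is not part of the project sources.

```java
import java.time.Duration;

// Hypothetical helper for working with the stored duration string.
class DurationSketch {

    // Parses an ISO 8601 duration like "PT4M13S" into whole seconds.
    static long toSeconds(String isoDuration) {
        return Duration.parse(isoDuration).getSeconds();
    }

    // Formats the duration as m:ss, or h:mm:ss when it is an hour or longer.
    static String toClock(String isoDuration) {
        long total = toSeconds(isoDuration);
        long hours = total / 3600;
        long minutes = (total % 3600) / 60;
        long seconds = total % 60;
        return hours > 0
                ? String.format("%d:%02d:%02d", hours, minutes, seconds)
                : String.format("%d:%02d", minutes, seconds);
    }

    public static void main(String[] args) {
        System.out.println(toSeconds("PT4M13S")); // prints 253
        System.out.println(toClock("PT4M13S"));   // prints 4:13
        System.out.println(toClock("PT1H2M3S"));  // prints 1:02:03
    }
}
```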
And now we can assemble all the written components and reach our primary goal - crawl the videos.
./scheduler/src/main/java/com/company/scheduler/crawler/VideoCrawler.java
package com.company.scheduler.crawler;

import com.company.library.domain.Video;
import com.company.library.domain.VideoRepository;
import com.company.scheduler.converter.VideoConverter;
import com.company.scheduler.provider.YouTubeVideoProvider;
import org.springframework.beans.factory.annotation.Autowired;
import org.springframework.stereotype.Component;

import java.util.List;

@Component
public class VideoCrawler implements Crawler {

    private final VideoConverter videoConverter;
    private final YouTubeVideoProvider youTubeVideoProvider;
    private final VideoRepository videoRepository;

    @Autowired
    public VideoCrawler(
        VideoConverter videoConverter,
        YouTubeVideoProvider youTubeVideoProvider,
        VideoRepository videoRepository
    ) {
        this.videoConverter = videoConverter;
        this.youTubeVideoProvider = youTubeVideoProvider;
        this.videoRepository = videoRepository;
    }

    @Override
    public void crawl() {
        System.out.println("New videos crawling started");

        List<com.google.api.services.youtube.model.Video> recentTrendingVideoList =
            youTubeVideoProvider.getRecentTrendingVideoList();

        for (com.google.api.services.youtube.model.Video youtubeVideo : recentTrendingVideoList) {
            // Skip videos we already have, but keep checking the rest of the list:
            // the trending chart is not ordered by date, so new videos may follow known ones
            if (null != videoRepository.findByExternalId(youtubeVideo.getId())) {
                continue;
            }

            Video video = videoConverter.createFromYoutubeVideo(youtubeVideo);
            videoRepository.save(video);

            System.out.println(String.format("Video %s (%s) saved", video.getTitle(), video.getExternalId()));
        }

        System.out.println("New videos crawling ended");
    }
}
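The VideoRepository used by the crawler is not listed in this chapter; it is presumably a Spring Data repository interface in the library module that exposes findByExternalId. To show the de-duplication idea on its own, the sketch below replaces it with a hypothetical in-memory store (InMemoryVideoStore, saveNew and the sample IDs are all made up). It skips videos whose external ID is already known and keeps going through the rest of the fetched list.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical stand-alone sketch; the real project uses a database-backed repository.
class DedupSketch {

    // In-memory stand-in for the repository, keyed by external ID.
    static final class InMemoryVideoStore {
        private final Map<String, String> titleByExternalId = new HashMap<>();

        String findByExternalId(String externalId) {
            return titleByExternalId.get(externalId); // null when unknown
        }

        void save(String externalId, String title) {
            titleByExternalId.put(externalId, title);
        }

        int count() {
            return titleByExternalId.size();
        }
    }

    // Saves only videos whose external ID is not stored yet; returns the saved IDs in order.
    static List<String> saveNew(InMemoryVideoStore store, Map<String, String> fetchedTitlesById) {
        List<String> saved = new ArrayList<>();
        for (Map.Entry<String, String> fetched : fetchedTitlesById.entrySet()) {
            if (store.findByExternalId(fetched.getKey()) != null) {
                continue; // already stored, move on to the next fetched video
            }
            store.save(fetched.getKey(), fetched.getValue());
            saved.add(fetched.getKey());
        }
        return saved;
    }

    public static void main(String[] args) {
        InMemoryVideoStore store = new InMemoryVideoStore();
        store.save("b", "Already stored");

        Map<String, String> fetched = new LinkedHashMap<>();
        fetched.put("a", "First");
        fetched.put("b", "Second");
        fetched.put("c", "Third");

        System.out.println(saveNew(store, fetched)); // prints [a, c]
    }
}
```

The unique constraint on the external ID column remains the last line of defence; the lookup just avoids triggering it on every crawl.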
Everything is ready to start crawling except the database schema.
Hibernate has a very handy feature for development environments: automatic DDL update.
./scheduler/src/main/resources/application.properties
spring.jpa.properties.hibernate.dialect=org.hibernate.dialect.MySQL5InnoDBDialect
spring.jpa.hibernate.ddl-auto=update
spring.datasource.url=jdbc:mysql://localhost:3306/project?autoReconnect=true&useSSL=false
spring.datasource.username=project
spring.datasource.password=project
com.company.scheduler.you-tube-api-key=YOUR_GOOGLE_YOUTUBE_API_KEY_WITHOUT_QUOTES
Finally we can start our Scheduler and fetch our first five videos.
In the next chapter we will learn how to use Spring with Jedis to store and retrieve data quickly.