Overview of JPA [video]

This course provides you with a deep dive into the various JDK features for accessing different resources when developing with Java. We’ll cover areas such as JDBC, Annotations, CDI, and JPA.

After completing this course, why not do our Accessing Resources using Java Annotations, CDI, JDBC, and JPA lab to put your knowledge into practice?

Learning Objectives

  • Understand what the JDBC API is and how to use it to access databases
  • Understand what Annotations are and the benefits of working with them to access resources
  • Understand what CDI is and how you can use it for dependency injection and interception
  • Understand what JPA is and how you can use it for object-relational mapping and as a persistence framework

Prerequisites

  • A basic understanding of the Java programming language
  • A basic understanding of software development
  • A basic understanding of the software development life cycle

Intended Audience

  • Software Engineers interested in advancing their Java skills
  • Software Architects interested in using advanced features of Java to design and build both applications and frameworks
  • Anyone interested in advanced Java application development and associated tooling
  • Anyone interested in understanding the advanced areas and features of the Java SDK

Okay, welcome back. In this lecture, we'll explore the Java Persistence API in detail and how it can and should be used to perform object-relational mapping and provide persistence services. In particular, we'll review the following topics. We'll provide a discussion on object-relational mapping. We'll explore the Java Persistence API. We'll then explain the ORM framework configuration. We'll demonstrate mapping a simple entity to a database table. And finally, we'll examine how to read, write and search for entities.

At the conclusion of this lecture, you should be able to perform each of the items listed above. Take a moment to rate yourself on each of these items on a scale of one through to five. At the conclusion of this lecture, these objectives will be reviewed. You should rate yourself again to see how much benefit you've received.

To start with, at runtime, typical Java-based applications consist of thousands of objects which carry state, and for most applications, the state of these objects will be persisted into a relational database. If you have ever implemented the persistence layer of an application using JDBC, you know how much time it can take for you to write the code for persisting, updating and querying the state of a single object type, let alone writing it for all entities in your application.

On top of the amount of time it takes to write the code, the data structure used in the database is often very different from the OO model used in the application. Object-relational mapping frameworks, or ORM frameworks for short, attempt to address the impedance mismatch between the two models. Over the years, several object-relational mapping frameworks have been developed to simplify the implementation of the persistence layer. In 2006, Sun released the Java Persistence API specification, defining a standard API for ORM frameworks. JPA became part of the Java Enterprise specification with Java EE 5, but can also be used outside of a Java EE container. JPA takes care of the mapping of object state to tables and columns within a database. At the same time, JPA performs the mapping between the Java data types and the data types that are used in the underlying data store.

Naturally, a persistence framework must be capable of doing more than just reading and writing data to and from a database. It must also provide an API that allows developers to query the data in the database. More importantly, it must provide a query language that developers can use without the need to know everything about the underlying data store. Developers need to be able to query for objects using properties on those objects. It will be JPA's responsibility to translate these into SQL statements that are executed against the database. Since JPA is only a specification, an implementation is needed at runtime.

Even though several implementations are available, the one known most to Java developers is Hibernate. Hibernate's goal is to ease the development of the persistence layer. Even when the framework can only replace 95% of the JDBC code that we would normally have to write, this will result in a huge amount of code that does not need to be written and maintained. When writing an application, you want to maintain focus on solving a particular business problem and not have to spend time solving data persistence and storage issues.

In most cases, our object structures will differ entirely from our back-end data storage structures. Translating between the different structures takes time and effort away from our main need, that of implementing solutions to business problems. Another common issue when dealing with data is the need to share it concurrently. Applications often require access to the same set of data multiple times when performing a task. Since the data we are working with is often shared among multiple processes within the application, we want to make sure that this data is available to all processes all the time. We simply cannot lock a set of data away for an extended period of time.

Even though JDBC has given developers a standard API to communicate with a variety of data sources, writing the persistence layer still requires writing considerable boilerplate code. Even though newer Java language features like the try-with-resources code construct introduced in Java 7 and the RowSet implementation provide some relief, writing the persistence layer without the use of an ORM framework is still very cumbersome. When trying to map your object model to a relational database, you will soon discover that it is not a perfect match. While a single object might have a certain state, this state does not necessarily have to be stored in a single database table. Also, while in the OO world, it is not uncommon to define a bi-directional relationship between two objects, this is not something that you will find in many relational data models.
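To give a feel for the boilerplate being described, here is a minimal sketch of a hand-written JDBC data access class for a single object type. The class name JdbcEmployeeDao, the EMPLOYEES table, and its ID and LAST_NAME columns are assumptions made up for illustration; they are not part of the lecture material.

```java
import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.ResultSet;
import java.sql.SQLException;
import java.util.Optional;
import javax.sql.DataSource;

// Hypothetical DAO: the table and column names are assumptions for illustration.
final class JdbcEmployeeDao {

    static final String INSERT_SQL =
            "INSERT INTO EMPLOYEES (ID, LAST_NAME) VALUES (?, ?)";
    static final String SELECT_SQL =
            "SELECT LAST_NAME FROM EMPLOYEES WHERE ID = ?";

    private final DataSource dataSource;

    JdbcEmployeeDao(DataSource dataSource) {
        this.dataSource = dataSource;
    }

    // Writing one row: open a connection, bind every parameter by position, execute.
    void insert(long id, String lastName) throws SQLException {
        try (Connection connection = dataSource.getConnection();
             PreparedStatement statement = connection.prepareStatement(INSERT_SQL)) {
            statement.setLong(1, id);
            statement.setString(2, lastName);
            statement.executeUpdate();
        } // try-with-resources closes the statement and the connection
    }

    // Reading one value back: the same ceremony, plus manual result extraction.
    Optional<String> findLastName(long id) throws SQLException {
        try (Connection connection = dataSource.getConnection();
             PreparedStatement statement = connection.prepareStatement(SELECT_SQL)) {
            statement.setLong(1, id);
            try (ResultSet resultSet = statement.executeQuery()) {
                return resultSet.next()
                        ? Optional.ofNullable(resultSet.getString(1))
                        : Optional.empty();
            }
        }
    }
}
```

Even with try-with-resources, every additional column means another positional parameter and another line of manual extraction, and update, delete, and search methods would each repeat the same pattern. Running the class requires a JDBC driver and a configured DataSource, neither of which is shown here.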

Naturally, we could make our OO model more relational to make it a better fit to the relational data model, but this means that we sacrifice OO modeling just because we are using a relational data model to persist the data. Even though the goal should be to work with a pure OO model while using a relational database at the backend, this will often result in a complex mapping model. In the end, we must conclude that the consequence of using a relational database is that our OO model is not completely what we would like it to be. Maybe we have to sacrifice some of the OO model to simplify the mapping model. Here we should strive for a good, rather than perfect, OO model with a maintainable mapping.

JPA is a specification that defines object-relational mapping. In other words, JPA implementations can automatically synchronize the state of objects with the database. In order for our framework to be useful, it must be able to deal with all the techniques that we use to define our OO model. In the OO world, we rely on interfaces, associations, compositions, and inheritance to define the complete OO model. The mapping of the OO model to the relational model must allow for all of these to be mapped. One of the problems with OO in relation to databases is the mistaken belief that one could take an object model and use a straightforward class-to-table mapping approach. With JPA, the mapping between the database and the object model can be managed in such a way that neither the OO model nor the database model needs to influence the other.

An existing normalized or flat relational model can be mapped to an OO model, allowing flexibility in your OO design. Tools do exist to build your object model from an existing database model, but this risks producing a poor OO model and is not recommended. JPA does not constrain your OO model, allowing you to map your pure OO model to an existing database. In general, persistable entities are just plain old Java objects, or POJOs. However, the JPA specification defines several rules that must be followed in order for the persistence framework to properly manage the state of these objects throughout their lifecycle.

All entities must be annotated with the @Entity annotation and must have a public or protected no-argument constructor. In addition to this, only real classes can be persisted. The JPA specification does not cover the persistence of interfaces or enumerations. In order for the framework to be able to proxy and cache entities, the entity should not be declared as final and should implement the Serializable interface. The persistence framework can manage the state of an entity either through property accessors (getters and setters) or through direct access to the instance variables.

When managing the state of the entity through instance variables, these variables should be declared with a private, protected, or default package access modifier. When using property-based access, the accessor methods should be declared as either public or protected and should follow the JavaBean naming convention. Properties that are of a collection type should always be declared by the interface type, allowing the persistence framework to replace their implementation by proxies. The JavaBean convention defines very detailed rules for the creation of beans. All properties of a JavaBean are accessed through accessor getter and setter methods.

Read-only properties only have a getter method. Writeable properties also contain a setter method. The naming convention also defines that the names of the methods should start with get and set, followed by the property name. It is important to realize that when accessing the state of a JavaBean through the accessor methods, the property names of the bean are inferred from the method and not the name of the instance variable. To obtain the property name, the get or set prefix is dropped from the method name and the first letter of the remainder is put in lowercase. However, there is one exception to this naming rule. When the first two letters after the get or set are both in uppercase, the first letter is not put in lowercase when inferring the property name.
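The naming rule above, including the uppercase exception, matches the behavior of the JDK's java.beans.Introspector.decapitalize method. The following is a small sketch of how a property name could be inferred from an accessor name; the PropertyNames class and its fromAccessor method are hypothetical helpers written for this illustration and are not part of JPA.

```java
import java.beans.Introspector;

// Hypothetical helper that applies the JavaBean naming rule to accessor names.
final class PropertyNames {

    // Drops the get/set prefix and decapitalizes the remainder.
    static String fromAccessor(String methodName) {
        boolean hasPrefix = methodName.startsWith("get") || methodName.startsWith("set");
        if (!hasPrefix || methodName.length() <= 3) {
            throw new IllegalArgumentException("Not a get/set accessor: " + methodName);
        }
        // Introspector.decapitalize leaves the name untouched when the first
        // two characters are both uppercase, e.g. "URL" stays "URL".
        return Introspector.decapitalize(methodName.substring(3));
    }

    public static void main(String[] args) {
        System.out.println(fromAccessor("getFirstName")); // prints "firstName"
        System.out.println(fromAccessor("setURL"));       // prints "URL"
    }
}
```

Note that the instance variable behind getFirstName could be called anything; with property-based access only the method name determines the property name.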

There are a couple of things that we need in order to use JPA. First, we need an implementation of the specification. We need a vendor that provides an entity manager. Since JPA is part of the Java EE specification, Java EE application servers will have a persistence manager available. JBoss comes with Hibernate as the implementation. Other application servers might use EclipseLink, OpenJPA, or one of the other implementations that are available. We also have to define persistence units. This allows users to define which entities are stored in which data sources. Naturally, we also need to specify the actual mapping between the object and relational database. For example, what table the state is stored in, which properties are put into which columns, etc.

In a JPA environment, the configuration of the framework must be done in a file called persistence.xml, which must be located in the META-INF folder. One advantage of using the persistence.xml configuration file is that it allows for the definition of multiple persistence units within this single file. Each persistence unit must have a unique name. Each persistence unit can reference the same or a different database, can have its own settings, and manages its own entities. Within the persistence.xml file, the persistence provider used must explicitly be defined using the provider element.

When using Hibernate as your persistence provider, the provider class used is org.hibernate.jpa.HibernatePersistenceProvider. When a different persistence provider is used, you will have to look in the documentation of that provider for the implementation class that must be configured. As you can see, several implementations of JPA are available. You will see that different application servers use different implementations by default. Each persistence unit requires a connection to a database. Within a managed environment like a Java EE container, the database connection is most often obtained from a connection pool that is registered in JNDI.

When running the persistence provider outside of a managed container, persistence unit properties must be used to define the driver, URL, username, and password that are used to make a connection. Until the introduction of JPA 2.1, vendor-specific properties had to be used to define information about the schema maintenance and generation. Since JPA 2.1, these properties have become standardized. A lot of these properties can be very useful during development, for example, the javax.persistence.schema-generation.database.action property can be used to specify that the table should be dropped and recreated each time the application is started or deployed.

The possible values for this property are none, create, drop-and-create, and drop. The property javax.persistence.schema-generation.create-source is used to define how the framework should determine the layout of the database, for example, its tables and columns. In the example shown here, we tell the framework to use the metadata, that is, the annotations and/or mapping files, for the creation. Additional properties can be used to have the framework generate DDL (data definition language) scripts containing the statements to create and drop the database. JPA 2.1 even defines a property, javax.persistence.sql-load-script-source, that can be used to reference a SQL script that is executed once the database has been created, allowing for the initial test data to be imported into the database.
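As a rough sketch of how the settings described so far might fit together, here is what a persistence.xml for a Java SE setup could look like. The unit name examplePU, the com.example.Employee class, the H2 driver and URL, the credentials, and the load script path are all assumptions made for illustration; they do not come from the lecture.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<persistence xmlns="http://xmlns.jcp.org/xml/ns/persistence"
             xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
             xsi:schemaLocation="http://xmlns.jcp.org/xml/ns/persistence
                                 http://xmlns.jcp.org/xml/ns/persistence/persistence_2_1.xsd"
             version="2.1">

  <!-- Hypothetical unit name; RESOURCE_LOCAL because this runs outside a container -->
  <persistence-unit name="examplePU" transaction-type="RESOURCE_LOCAL">
    <provider>org.hibernate.jpa.HibernatePersistenceProvider</provider>

    <!-- Entities managed by this unit (hypothetical class) -->
    <class>com.example.Employee</class>

    <properties>
      <!-- Connection settings; driver, URL, and credentials are assumptions -->
      <property name="javax.persistence.jdbc.driver" value="org.h2.Driver"/>
      <property name="javax.persistence.jdbc.url" value="jdbc:h2:mem:example"/>
      <property name="javax.persistence.jdbc.user" value="sa"/>
      <property name="javax.persistence.jdbc.password" value=""/>

      <!-- Standardized schema generation settings (JPA 2.1) -->
      <property name="javax.persistence.schema-generation.database.action"
                value="drop-and-create"/>
      <property name="javax.persistence.schema-generation.create-source"
                value="metadata"/>
      <property name="javax.persistence.sql-load-script-source"
                value="META-INF/load.sql"/>
    </properties>
  </persistence-unit>
</persistence>
```

Inside a Java EE container, the JDBC properties would typically be replaced by a jta-data-source element pointing at a JNDI name, and the transaction type would be JTA.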

Several other properties are available to further fine-tune the persistence framework. The JPA specification itself defines a relatively small set of configuration properties. On top of these, each JPA implementation, like Hibernate, comes with its own vendor-specific configuration properties to optimize the ORM framework. These can range from properties that allow developers to obtain more debugging information during development to properties that deal with caching configurations to optimize the performance of the framework.

Hibernate provides properties that can be used to make all the SQL statements that are used or generated by the framework visible on the console. In the example shown here, we make use of a few Hibernate-specific properties to define that all SQL statements generated and executed on behalf of the Hibernate framework must be made visible in the log and/or console. Each entity must be mapped to a database table and its properties must be mapped to columns within that table. JPA uses the concept of configuration by exception. Default mapping strategies have been defined to make sure that a minimum amount of initial configuration is required.

Introspection is used to determine the name of the class and its properties. This information is then used to come up with a default name of both the table and its columns. Also, the type of each column is determined from the Java type. Only when a different name is required for a table or column, or a datatype altered, is explicit configuration required. To define the mapping between the state of the object and the columns in a database table, two approaches can be used. Until Java 5, the only available option was the use of an XML configuration file, which was used to define which database column each property of the entity was to be mapped to.
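The idea of deriving default names through introspection can be illustrated with plain reflection. This is a toy sketch only, under the assumption that the table name defaults to the simple class name and each non-transient field becomes a column of the same name; it is not how any particular JPA provider is actually implemented, and the DefaultMappingSketch class is hypothetical.

```java
import java.lang.reflect.Field;
import java.lang.reflect.Modifier;
import java.util.Map;
import java.util.TreeMap;

// Toy illustration of "configuration by exception"; not a real JPA provider.
final class DefaultMappingSketch {

    // Sample class to inspect; its fields are made up for this example.
    static class Employee {
        private Long id;
        private String firstName;
        private transient String cachedDisplayName; // should not become a column
    }

    // Default table name: the unqualified class name.
    static String defaultTableName(Class<?> type) {
        return type.getSimpleName();
    }

    // Default columns: one per persistent field, keyed by field name,
    // with the Java type from which a column type would be derived.
    static Map<String, Class<?>> defaultColumns(Class<?> type) {
        Map<String, Class<?>> columns = new TreeMap<>();
        for (Field field : type.getDeclaredFields()) {
            int modifiers = field.getModifiers();
            if (Modifier.isStatic(modifiers) || Modifier.isTransient(modifiers)
                    || field.isSynthetic()) {
                continue; // skip fields that are not persistent state
            }
            columns.put(field.getName(), field.getType());
        }
        return columns;
    }

    public static void main(String[] args) {
        System.out.println(defaultTableName(Employee.class)); // prints "Employee"
        System.out.println(defaultColumns(Employee.class).keySet());
    }
}
```

Explicit configuration only enters the picture when one of these derived defaults is not what the database schema requires.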

With the introduction of annotations in Java 5, it has now become possible to define the mapping using annotations directly within the class definition. Both approaches have their advantages. The use of annotations is often preferred by developers because this way they only have to maintain a single file. The use of an XML configuration file is often preferred from an architectural standpoint. It provides a much cleaner separation between code and configuration, but also allows for entities to be reused in different services where each service uses its own underlying data store. In other words, by using XML, it is possible to map the same object to different database schemas.

The javax.persistence.Entity annotation is used to mark a class as an entity, indicating that a persistence manager should be used to manage its life cycle. In other words, it is up to the persistence manager to create instances of this class and set their state according to the information that was read from the database. The @Entity annotation can also be used to define the name of the entity. This name can be used when defining queries for this entity. When the name is not defined, the unqualified name of the class, i.e. the class name without its package name, is used as the default. The javax.persistence.Table annotation can be used to define the table in which the state must be stored. By default, the entity name is used as the table name, but when the entity is to be mapped to a different table or to a specific database schema, the @Table annotation can be used.

An entity must have a primary key, so a property of the entity must be annotated using the @Id annotation. When the primary key is made up of multiple fields, different annotations and different strategies can be used to map the identifier. In addition to defining the primary key field, you should also define how this identifier is created by adding the @GeneratedValue annotation to the ID field. This way, we can define how a new entity will obtain its identifier. Four strategies are available to us. With AUTO, the persistence provider should pick an appropriate strategy for the particular database. With IDENTITY, the persistence provider must assign primary keys for the entity using a database identity column. With SEQUENCE, the persistence provider must assign primary keys for the entity using a database sequence. And with TABLE, the persistence provider must assign primary keys for the entity using an underlying database table to ensure uniqueness.

In the example shown here, the @SequenceGenerator annotation is used to define a particular sequence generation strategy. This generator is then referenced by the @GeneratedValue annotation. All entity fields that are not explicitly marked as transient, either by using the @Transient annotation or by using the transient Java keyword, are considered to be persistent fields. As a result, the value of each non-transient field will end up in a column in the database table. The name and type of each column in the database are determined using introspection, and a default mapping strategy is used. When the name of the column needs to be changed, or when additional column configuration is required, the @Column annotation is used.
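Since the slide itself is not part of this transcript, the following is a sketch of what such a mapped entity might look like, using the javax.persistence annotations discussed above. The Employee class, the EMPLOYEES table, the EMPLOYEE_SEQ sequence, and the column names are assumptions chosen for illustration. Compiling it requires the JPA API on the classpath.

```java
import java.io.Serializable;
import javax.persistence.Column;
import javax.persistence.Entity;
import javax.persistence.GeneratedValue;
import javax.persistence.GenerationType;
import javax.persistence.Id;
import javax.persistence.SequenceGenerator;
import javax.persistence.Table;
import javax.persistence.Transient;

// Hypothetical entity; table, sequence, and column names are illustrative only.
@Entity(name = "Employee")
@Table(name = "EMPLOYEES")
public class Employee implements Serializable {

    private static final long serialVersionUID = 1L;

    // Primary key, filled from a database sequence via the named generator.
    @Id
    @SequenceGenerator(name = "employeeSeq", sequenceName = "EMPLOYEE_SEQ",
                       allocationSize = 1)
    @GeneratedValue(strategy = GenerationType.SEQUENCE, generator = "employeeSeq")
    private Long id;

    // Explicit column configuration overrides the defaults.
    @Column(name = "FIRST_NAME", nullable = false, length = 50)
    private String firstName;

    // No @Column: the column name is derived from the field name by default.
    private String lastName;

    // Not persisted: derived state that lives in memory only.
    @Transient
    private String displayName;

    // No-argument constructor required by the JPA specification.
    protected Employee() {
    }

    public Employee(String firstName, String lastName) {
        this.firstName = firstName;
        this.lastName = lastName;
    }

    public Long getId() {
        return id;
    }

    public String getFirstName() {
        return firstName;
    }

    public String getLastName() {
        return lastName;
    }
}
```

Because @Id is placed on the field rather than on a getter, this sketch uses field-based access, so the instance variables can stay private.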

The JPA specification defines that all entities that are to be managed must be declared in the persistence unit. By doing so, each persistence unit will be responsible for managing its own set of entities. Vendor extensions like those provided by Hibernate do not require the explicit definition of the entities, as long as only a single persistence unit is used. Even so, it is still recommended to define all entities within their appropriate persistence unit in order to stay independent of the persistence implementation used. At the same time, this allows for easy definition of additional persistence units in the future.

Defining entities can be done by either defining each individual class, referencing a jar file containing one or more entities, or by referencing an additional mapping file in which the mapping for the entity has been done. An instance of EntityManager is used to obtain and control the lifecycle of entities. Within a Java EE container, an instance of the entity manager can be obtained using dependency injection. The @PersistenceContext annotation is used to inform the container that an instance of a particular entity manager is needed. The unitName attribute of the annotation references one of the persistence units defined in the persistence.xml configuration file, so the entity manager that is injected knows what entities are to be managed and which data source is to be used to read and write entity data. The Persistence class, located in the javax.persistence package, is used to bootstrap JPA within Java SE environments.

Getting access to an entity manager instance is performed via an entity manager factory instance. Once you have access to the entity manager factory, you can then use it to create an entity manager object. Working with data stores most often requires working with transactions. We want to make sure that several changes are made to the database within a single transaction or that all the changes are rolled back when something goes wrong. When running an ORM framework, we must also configure how transactions are managed. How these transactions are managed depends on the type of the environment in which you will be running the application.

When running within a Java EE environment, the transaction type should be set to JTA, indicating that the container will manage the transactions. When running in a Java SE environment, the persistence units should be configured using a transaction type of RESOURCE_LOCAL. The entity manager's persist method is used to persist an entity instance. When the instance you are trying to persist already exists in the database, an EntityExistsException is thrown, although depending on the provider this may only surface when the changes are flushed or committed. The entity manager is also used to read entities.

Two methods exist to read an entity using its identifier. The find method can be used to read an entity that might or might not exist under the identifier provided. When no entity can be found for the given identifier, null is returned. The getReference method, on the other hand, throws an EntityNotFoundException when no entity data can be found for the given identifier, although the provider is allowed to defer this until the state of the entity is first accessed. As can be seen here, the entity manager uses generics to avoid type casting. As a result, the return type of these methods is determined by the class type specified as their first parameter.

The remove method is used to remove data from the underlying data store. Naturally, this does not mean that the entity in memory is removed. By invoking the remove method, you only delete the row from the database table. When entities need to be read from the database using properties other than the primary key, or when the database is to be queried to search for entities, the Java Persistence Query Language, or JPQL, can be used. By using JPQL, you do not need to know anything about the tables and columns in the database. All queries are written against entity names and their properties. JPQL is a query language that looks similar to SQL and provides a lot of functionality that is also available in SQL. JPQL can not only be used to select data from the database, it can also be used to perform bulk update and delete statements.
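The read, write, and search operations described above can be sketched end to end for a Java SE environment as follows. The persistence unit name examplePU, the Employee entity, and the sample data are assumptions for illustration. Running it requires the JPA API, a provider such as Hibernate, a JDBC driver, and a matching RESOURCE_LOCAL persistence unit that lists the entity.

```java
import java.util.List;
import javax.persistence.Entity;
import javax.persistence.EntityManager;
import javax.persistence.EntityManagerFactory;
import javax.persistence.GeneratedValue;
import javax.persistence.Id;
import javax.persistence.Persistence;

// Hypothetical minimal entity; in a real project it would be a public class
// in its own file.
@Entity(name = "Employee")
class Employee {
    @Id
    @GeneratedValue
    private Long id;
    private String lastName;

    protected Employee() {
    }

    Employee(String lastName) {
        this.lastName = lastName;
    }

    Long getId() {
        return id;
    }

    String getLastName() {
        return lastName;
    }
}

class EmployeeCrudSketch {
    public static void main(String[] args) {
        // "examplePU" is an assumed unit name from persistence.xml.
        EntityManagerFactory factory = Persistence.createEntityManagerFactory("examplePU");
        EntityManager em = factory.createEntityManager();
        try {
            // Write: persist inside a resource-local transaction.
            em.getTransaction().begin();
            Employee employee = new Employee("Lovelace");
            em.persist(employee);
            em.getTransaction().commit();

            // Read by identifier: no cast needed, and null when nothing is found.
            Employee found = em.find(Employee.class, employee.getId());
            Employee missing = em.find(Employee.class, -1L);
            System.out.println(found.getLastName() + ", missing is null: " + (missing == null));

            // Search: JPQL is written against the entity name and its properties.
            List<Employee> matches = em.createQuery(
                            "SELECT e FROM Employee e WHERE e.lastName = :lastName",
                            Employee.class)
                    .setParameter("lastName", "Lovelace")
                    .getResultList();
            System.out.println("Matches: " + matches.size());

            // Delete: removes the row; the Java object itself stays in memory.
            em.getTransaction().begin();
            em.remove(found);
            em.getTransaction().commit();
        } finally {
            em.close();
            factory.close();
        }
    }
}
```

Inside a Java EE container, the factory and the explicit transaction calls would normally disappear: the entity manager would be injected with @PersistenceContext and the container would manage the JTA transaction.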

Okay, before we complete this lecture, pause this video and consider the following questions to test yourself on the content that we have just reviewed. Write down your answers for each question and then resume the video to compare answers. Okay, the answers to the above questions are: One, an ORM framework can be relatively transparent while optimizing performance through such features as lazy loading, optimized locking mechanisms, and caching. Two, ORM mappings can be established through annotations in the code and through XML elements in a separate configuration file. Three, it configures one or more persistence units. Four, a lazy loading strategy specifies that the persistence provider runtime only fetches the data when it is accessed. As an example, when a department contains a list of employees, when using lazy loading, the department is only loaded and not the employees. Loading of the employees is done only when first requested. Five, the persist method.

About the Author

Jeremy is a Content Lead Architect and DevOps SME here at Cloud Academy where he specializes in developing DevOps technical training documentation.

He has a strong background in software engineering, and has been coding with various languages, frameworks, and systems for the past 25+ years. In recent times, Jeremy has been focused on DevOps, Cloud (AWS, Azure, GCP), Security, Kubernetes, and Machine Learning.

Jeremy holds professional certifications for AWS, Azure, GCP, Terraform, Kubernetes (CKA, CKAD, CKS).
