Thursday, May 14, 2015

Queries in the Java Persistence Query Language - an Overview

In this message I want to give an overview of the Java Persistence Query Language, by presenting examples that make use of all of the six available clauses. The general structure of a SELECT query is the following (I recommend this url for details):

SELECT ... FROM ...
[WHERE ...]
[GROUP BY ... [HAVING ...]]
[ORDER BY ...]

SELECT and FROM are mandatory in retrieval queries, the remaining ones are optional (although the SELECT might be left implicit in the queries, as we shall see).

Setup

Before we can run any queries, we need to have persisted data. For the sake of exemplifying, I will use two classes related by a many-to-many relation (see this other message for a different example): a Professor and a Student class. Let us start by the professor. We should notice that we defined a named query, called "orderProfs", which we will use ahead. We need a @Temporal annotation to convert from java.util.Date to the database date format and vice-versa. The other annotations are sort of standard, @Id for the primary key, @GeneratedValue, to let the database generate ids automatically, and @ManyToMany to define the number of the relation with the students. This means that a professor may have any number of students and vice-versa, i.e., each student may have any number of professors (including zero).

package data;

import java.util.Date;
import java.util.List;

import javax.persistence.Entity;
import javax.persistence.GeneratedValue;
import javax.persistence.GenerationType;
import javax.persistence.Id;
import javax.persistence.ManyToMany;
import javax.persistence.NamedQueries;
import javax.persistence.NamedQuery;
import javax.persistence.Temporal;
import javax.persistence.TemporalType;

@Entity
@NamedQueries({
@NamedQuery(name="orderProfs", query="select p from Professor p order by p.name")
}
)
public class Professor {
@Id @GeneratedValue(strategy=GenerationType.AUTO)
private int id;
@Temporal(TemporalType.DATE)
private Date birthdate;
private float salary;
private String name;
@ManyToMany(mappedBy="profs")
private List<Student> students;


public Professor() {}

public Professor(String name, Date birthdate, float salary) {
super();
this.name = name;
this.birthdate = birthdate;
this.salary = salary;
}

public List<Student> getStudents() {
return students;
}
public void setStudents(List<Student> students) {
this.students = students;
}
public int getId() {
return id;
}
public void setId(int id) {
this.id = id;
}

public Date getBirthdate() {
return birthdate;
}

public void setBirthdate(Date birthdate) {
this.birthdate = birthdate;
}

public float getSalary() {
return salary;
}

public void setSalary(float salary) {
this.salary = salary;
}

public String getName() {
return name;
}
public void setName(String name) {
this.name = name;
}

@Override
public String toString() {
return this.name;
}

}


Now the Student class, which is simpler. We just have a name and an average (grade), which we will use for a few more complex queries. 

package data;

import java.util.ArrayList;
import java.util.List;

import javax.persistence.Entity;
import javax.persistence.GeneratedValue;
import javax.persistence.GenerationType;
import javax.persistence.Id;
import javax.persistence.ManyToMany;

@Entity
public class Student {
@Id @GeneratedValue(strategy=GenerationType.AUTO)
private int id;
private String name;
private int average;
@ManyToMany
private List<Professor> profs;
public Student() {}
public Student(String name, int average) {
super();
this.name = name;
this.average = average;
}
public int getId() {
return id;
}
public void setId(int id) {
this.id = id;
}
public String getName() {
return name;
}
public void setName(String name) {
this.name = name;
}

public List<Professor> getProfs() {
return profs;
}

public void setProfs(List<Professor> profs) {
this.profs = profs;
}
public void addProf(Professor prof) {
if (this.profs == null)
this.profs = new ArrayList<>();
this.profs.add(prof);
}
public int getAverage() {
return average;
}
public void setAverage(int average) {
this.average = average;
}
@Override
public String toString() {
return this.name;
}
}


Time for some management now. I used EclipseLink as my persistence engine, with the following configuration (in the persistence.xml file). Note that this will not create the database, only the tables, you need to do the database creation manually. The name of the database is PlayJPQL. You also need to set your username and password properly:

<?xml version="1.0" encoding="UTF-8"?>
<persistence version="2.1"
xmlns="http://xmlns.jcp.org/xml/ns/persistence" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
xsi:schemaLocation="http://xmlns.jcp.org/xml/ns/persistence http://xmlns.jcp.org/xml/ns/persistence/persistence_2_1.xsd">
<persistence-unit name="PlayJPA-Aula"
transaction-type="RESOURCE_LOCAL">
<class>data.Professor</class>
<class>data.Student</class>
<properties>
<property name="javax.persistence.jdbc.user" value="youruser" />
<property name="javax.persistence.jdbc.password" value="yourpassword" />
<property name="javax.persistence.jdbc.url" value="jdbc:mysql://localhost:3306/PlayJPQL" />
<property name="javax.persistence.jdbc.driver" value="com.mysql.jdbc.Driver" />

<property name="eclipselink.ddl-generation" value="create-tables" />
<property name="eclipselink.ddl-generation.output-mode"
value="database" />
</properties>
</persistence-unit>
</persistence>


This said, let us start by populating the database. Since we use the "mappedBy" keyword on the Professor side, the Student class owns the relation. This means that we need to add professors to the students. Adding students to the professors could result in data losses:

package data;

import java.text.ParseException;
import java.text.SimpleDateFormat;

import javax.persistence.EntityManager;
import javax.persistence.EntityManagerFactory;
import javax.persistence.EntityTransaction;
import javax.persistence.Persistence;

public class TestStudents {
public static void main(String[] args) throws ParseException {
SimpleDateFormat ft = new SimpleDateFormat ("yyyy-MM-dd"); 
Professor [] professors = { 
new Professor("Paula Queiroz", ft.parse("1969-11-06"), 35000), 
new Professor("Antonio Fagundes", ft.parse("1953-02-16"), 41000), 
new Professor("Silvia Matos", ft.parse("1976-01-04"), 28000),
new Professor("Alder Dante", ft.parse("1984-11-24"), 48344),
new Professor("Paula Alves", ft.parse("1971-12-31"), 19836),
new Professor("Mario Castro", ft.parse("1975-06-09"), 21236),
};
Student [] students5 = {
new Student("Paulo Silva", 16),
new Student("Artur Bastos", 13),
new Student("Carlos Manuel", 11),
new Student("Antonia Lopes", 19),
new Student("Tiago Andre", 15),
new Student("Ricardo Santos", 14),
new Student("Andreia Lopes", 17),
new Student("Luis Fonseca", 16),
new Student("Mar Maduke", 12),
new Student("Al Cid", 11) };

EntityManagerFactory emf = Persistence.createEntityManagerFactory("PlayJPA-Aula");
EntityManager em = emf.createEntityManager();
EntityTransaction transaction = em.getTransaction();
transaction.begin();
students5[0].addProf(professors[0]);
students5[0].addProf(professors[1]);
students5[1].addProf(professors[1]);
students5[1].addProf(professors[2]);
students5[1].addProf(professors[3]);
students5[1].addProf(professors[4]);
students5[2].addProf(professors[2]);
students5[3].addProf(professors[3]);
students5[4].addProf(professors[0]);
students5[5].addProf(professors[4]);
students5[6].addProf(professors[0]);
students5[6].addProf(professors[1]);
students5[6].addProf(professors[2]);
students5[6].addProf(professors[3]);
students5[7].addProf(professors[0]);
students5[8].addProf(professors[1]);
students5[9].addProf(professors[3]);
students5[9].addProf(professors[4]);
for (Student s : students5)
em.persist(s);

for (Professor p : professors)
em.persist(p);

transaction.commit();

}
}



To give us a little help, I will add immediately the output of a query that lists all the professors and students (note that Mario Castro has no students, hence a 'null' is showing up):

Professor Paula Queiroz Student Paulo Silva
Professor Paula Queiroz Student Tiago Andre
Professor Paula Queiroz Student Andreia Lopes
Professor Paula Queiroz Student Luis Fonseca
Professor Antonio Fagundes Student Paulo Silva
Professor Antonio Fagundes Student Artur Bastos
Professor Antonio Fagundes Student Andreia Lopes
Professor Antonio Fagundes Student Mar Maduke
Professor Silvia Matos Student Artur Bastos
Professor Silvia Matos Student Carlos Manuel
Professor Silvia Matos Student Andreia Lopes
Professor Alder Dante Student Artur Bastos
Professor Alder Dante Student Antonia Lopes
Professor Alder Dante Student Andreia Lopes
Professor Alder Dante Student Al Cid
Professor Paula Alves Student Artur Bastos
Professor Paula Alves Student Ricardo Santos
Professor Paula Alves Student Al Cid
Professor Mario Castro Student null


The Queries

Once we successfully run the TestStudents program we can start querying the database. This is the beginning of our Queries program. Afterwards, we show one query at a time :

package data;

import java.text.ParseException;
import java.text.SimpleDateFormat;
import java.util.Date;
import java.util.List;

import javax.persistence.EntityManager;
import javax.persistence.EntityManagerFactory;
import javax.persistence.Persistence;
import javax.persistence.Query;

public class Queries {
@SuppressWarnings("unchecked")
public static void main(String[] args) throws ParseException {
int firstid = -1;
EntityManagerFactory emf = Persistence.createEntityManagerFactory("PlayJPA-Aula");
EntityManager em = emf.createEntityManager();
//EntityTransaction transaction = em.getTransaction();


All professors

First, we list all the professors (note that we omit the select clause, but it is implicit):

System.out.println("\n\nAll professors");
Query q = em.createQuery("from Professor p");
List<Professor> list = q.getResultList();
for (Professor p : list) {
if (firstid == - 1)
firstid = p.getId();
System.out.println("Professor " + p.getName());
}


With the following result

All professors
Professor Paula Queiroz
Professor Antonio Fagundes
Professor Silvia Matos
Professor Alder Dante
Professor Paula Alves
Professor Mario Castro


Limit the number of professors

Next, only the three first professors
System.out.println("\n\nThe first 3 professors");
q = em.createQuery("from Professor p");
q.setMaxResults(3);
list = q.getResultList();
for (Professor p : list)
System.out.println("Professor " + p.getName());

With the following result:


The first 3 professors
Professor Paula Queiroz
Professor Antonio Fagundes
Professor Silvia Matos

A named query with a sort

Next, we use the named query "orderProfs", which me mentioned before. The advantage of using a named query is that we can use it multiple times in the code having only one definition. 

We don't need the firstid manipulation here. We get the id of the first professor here, for a future query.
System.out.println("\n\nSorted professors");
q = em.createNamedQuery("orderProfs");
list = q.getResultList();
for (Professor p : list) {
if (firstid == - 1)
firstid = p.getId();
System.out.println("Professor " + p.getName());
}

The result is now:

Sorted professors
Professor Alder Dante
Professor Antonio Fagundes
Professor Mario Castro
Professor Paula Alves
Professor Paula Queiroz
Professor Silvia Matos



A WHERE clause with a parameter

We now want to list the professors whose name starts with an 'A'. We will use a parameter to specify the pattern we are looking for. The use of this parameter, lets the JDBC driver sanitize the input (not that there is any in this case, because we are using a fixed 'A%'), thus reducing or even eliminating the possibility of SQL injection attacks:
System.out.println("\n\nProfessors whose name starts with letter A");
q = em.createQuery("from Professor p where p.name like ?1");
q.setParameter(1, "A%");
list = q.getResultList();
for (Professor p : list)
System.out.println("Professor " + p.getName());


Which results in:

Professors whose name starts with letter A
Professor Antonio Fagundes
Professor Alder Dante


Another way of doing a similar query is by means of a named parameter:
System.out.println("\n\nProfessor whose name starts with letters Ant");
q = em.createQuery("from Professor p where p.name like :name");
q.setParameter("name", "Ant%");
list = q.getResultList();
for (Professor p : list)
System.out.println("Professor " + p.getName());


This gives us a single professor:

Professor whose name starts with letters Ant
Professor Antonio Fagundes

Native SQL Query

Using SQL queries is still possible. In this case, we count the number of results that we get from the 'FROM' clause
System.out.println("\n\nNumber of professors");
q = em.createNativeQuery("select count(*) from Professor");
System.out.println(q.getSingleResult());

This will give us a six:

Number of professors
6


Counting (revisited)

However, we can stick to JPQL to do the counting:
System.out.println("\n\nCount the professors (revisited)");
q = em.createQuery("select count(p) from Professor p");
System.out.println(q.getSingleResult());


A six, again:

Count the professors (revisited)
6



Finding by primary key

We can retrieve an entity from its primary key without resorting to any query (remember the firstid that we use a few queries ago? Here it is):

System.out.println("\n\nUtilization of find()");
Professor pp = em.find(Professor.class, firstid);

System.out.println("First professor: " + pp);


And the result is:

Utilization of find()
First professor: Paula Queiroz


Another WHERE involving comparison of Dates

We set the date 1-1-1970 and look for all birthdates prior to that one:

SimpleDateFormat ft = new SimpleDateFormat ("yyyy-MM-dd");
Date threshold = ft.parse("1970-01-01");
System.out.println("\n\nProfessors born before 1970");
q = em.createQuery("from Professor p where p.birthdate < ?1");
q.setParameter(1, threshold);
list = q.getResultList();
for (Professor p : list)

System.out.println("Professor " + p.getName());


Which gives us two professors:

Professors born before 1970
Professor Paula Queiroz
Professor Antonio Fagundes



A WHERE clause with a 'size()' operator

We now want to list professors that have more than 3 students:

System.out.println("\n\nProfessors with more than 3 students");
q = em.createQuery("from Professor p where size(p.students) > 3");
list = q.getResultList();
for (Professor p : list)
System.out.println("Professor " + p);


We get three:

Professors with more than 3 students
Professor Paula Queiroz
Professor Antonio Fagundes
Professor Alder Dante


The 'avg()' operator

Instead of listing all the professors, we want to compute their average salary instead:
System.out.println("\n\nThe average salary per professor");
q = em.createQuery("select avg(p.salary) from Professor p");
System.out.println(q.getSingleResult());


This gives us:

The average salary per professor
32236.0


A similar example that gives us the average number of students per professor:

System.out.println("\n\nThe average number of students per professor");
q = em.createQuery("select avg(size(p.students)) from Professor p");
System.out.println(q.getSingleResult());


This gives us:

The average number of students per professor
3.0



The 'min()' operator

A similar example, but to compute the minimum salary:

System.out.println("\n\nMinimum salary");
q = em.createQuery("select min(p.salary) from Professor p");
System.out.println(q.getSingleResult());


The minimum salary is:

Minimum salary
19836.0


Two objects in the SELECT

This changes the return type a little bit. We get an array of Objects. In this case, both are float numbers, and can be cast as such, because we defined the salary to be a float:

System.out.println("\n\nMinimum and maximum salary of the professors");
q = em.createQuery("select min(p.salary), max(p.salary) from Professor p");
Object[] res = (Object[]) q.getSingleResult();
System.out.println("Minimum = " + (float) res[0]);
System.out.println("Maximum = " + (float) res[1]);


Here they are:

Minimum and maximum salary of the professors
Minimum = 19836.0
Maximum = 48344.0



An INNER JOIN

This example is, I am afraid, far more difficult to understand. We are looking for all the combinations of professors and students:

System.out.println("\n\nList of professors and students");
q = em.createQuery("select p, s from Professor p inner join p.students s");
List<Object[]> listobjs = q.getResultList();
for (Object[] o : listobjs)
System.out.println("Professor " + (Professor) o[0] + " Student " + (Student) o[1]);


This query works as if we had two for() cycles, like this:

for p in Professor:
   for s in p.students

i.e., for each professor p we have another for each for the students of p. Note the definition of s inside the join, which we use in the select. The result is a list of professors and their students:

List of professors and students
Professor Paula Queiroz Student Paulo Silva
Professor Paula Queiroz Student Tiago Andre
Professor Paula Queiroz Student Andreia Lopes
Professor Paula Queiroz Student Luis Fonseca
Professor Antonio Fagundes Student Paulo Silva
Professor Antonio Fagundes Student Artur Bastos
Professor Antonio Fagundes Student Andreia Lopes
Professor Antonio Fagundes Student Mar Maduke
Professor Silvia Matos Student Artur Bastos
Professor Silvia Matos Student Carlos Manuel
Professor Silvia Matos Student Andreia Lopes
Professor Alder Dante Student Artur Bastos
Professor Alder Dante Student Antonia Lopes
Professor Alder Dante Student Andreia Lopes
Professor Alder Dante Student Al Cid
Professor Paula Alves Student Artur Bastos
Professor Paula Alves Student Ricardo Santos
Professor Paula Alves Student Al Cid

We could add the student average to this query:


System.out.println("\n\nList of professors, students and averages");
q = em.createQuery("select p, s, s.average from Professor p inner join p.students s");
listobjs = q.getResultList();
for (Object[] o : listobjs)

System.out.println("Professor " + (Professor) o[0] + " Student " + (Student) o[1] + " average = " + (int) o[2]);


To get this:

List of professors, students and averages
Professor Paula Queiroz Student Paulo Silva average = 16
Professor Paula Queiroz Student Tiago Andre average = 15
Professor Paula Queiroz Student Andreia Lopes average = 17
Professor Paula Queiroz Student Luis Fonseca average = 16
Professor Antonio Fagundes Student Paulo Silva average = 16
Professor Antonio Fagundes Student Artur Bastos average = 13
Professor Antonio Fagundes Student Andreia Lopes average = 17
Professor Antonio Fagundes Student Mar Maduke average = 12
Professor Silvia Matos Student Artur Bastos average = 13
Professor Silvia Matos Student Carlos Manuel average = 11
Professor Silvia Matos Student Andreia Lopes average = 17
Professor Alder Dante Student Artur Bastos average = 13
Professor Alder Dante Student Antonia Lopes average = 19
Professor Alder Dante Student Andreia Lopes average = 17
Professor Alder Dante Student Al Cid average = 11
Professor Paula Alves Student Artur Bastos average = 13
Professor Paula Alves Student Ricardo Santos average = 14

Professor Paula Alves Student Al Cid average = 11


A LEFT OUTER JOIN

You may have noticed that professor Mario Castro didn't show up in the former query, because he has no students. To make all the professors show up, even those without students, we may use a left outer join:

System.out.println("\n\nList of professors and students");
q = em.createQuery("select p, s from Professor p left join p.students s");
listobjs = q.getResultList();
for (Object[] o : listobjs)

System.out.println("Professor " + (Professor) o[0] + " Student " + (Student) o[1]);


We now have all the professors. Pay special attention to the last line, where the student shows up as 'null':

List of professors and students
Professor Paula Queiroz Student Paulo Silva
Professor Paula Queiroz Student Tiago Andre
Professor Paula Queiroz Student Andreia Lopes
Professor Paula Queiroz Student Luis Fonseca
Professor Antonio Fagundes Student Paulo Silva
Professor Antonio Fagundes Student Artur Bastos
Professor Antonio Fagundes Student Andreia Lopes
Professor Antonio Fagundes Student Mar Maduke
Professor Silvia Matos Student Artur Bastos
Professor Silvia Matos Student Carlos Manuel
Professor Silvia Matos Student Andreia Lopes
Professor Alder Dante Student Artur Bastos
Professor Alder Dante Student Antonia Lopes
Professor Alder Dante Student Andreia Lopes
Professor Alder Dante Student Al Cid
Professor Paula Alves Student Artur Bastos
Professor Paula Alves Student Ricardo Santos
Professor Paula Alves Student Al Cid

Professor Mario Castro Student null


An INNER JOIN with a WHERE

We can also define a condition over the inner cycle, e.g., to look for the names of the professors teaching a student named 'Paulo something' (or simply 'Paulo'):
System.out.println("\n\nProfessors of Paulo something");
q = em.createQuery("select p from Professor p inner join p.students s where s.name like 'Paulo%'");
list = q.getResultList();
for (Professor p : list)
System.out.println("Professor " + p);


The variable s allows the search for the name 'Paulo'. The result is a list of professors:

Professors of Paulo something
Professor Paula Queiroz
Professor Antonio Fagundes



MEMBER OF and NOT MEMBER OF

Something similar could be achieved with the member of, which checks whether some element belongs to a list. For instance, we could look for the professors of Andreia Lopes or for the professors that do not teach her.


q = em.createQuery("from Student s where s.name like 'Andreia Lopes'");
Student s = (Student) q.getSingleResult();
System.out.println("\n\nProfessors of " + s);
q = em.createQuery("select p from Professor p where :student member of p.students");
q.setParameter("student",s);
proflist = q.getResultList();
for (Professor p : proflist)

System.out.println(p);


And the result is:
Professors of Andreia Lopes
Paula Queiroz
Antonio Fagundes
Silvia Matos

Alder Dante


If we use a not:

System.out.println("\n\nProfessors not teaching " + s);
q = em.createQuery("select p from Professor p where :student not member of p.students");
q.setParameter("student",s);
proflist = q.getResultList();
for (Professor p : proflist)
System.out.println(p);


And the result is:


Professors not teaching Andreia Lopes
Paula Alves
Mario Castro


DISTINCT

Let us continue. Now, instead of a name, we will look for students averaging more then 15 in their course grades:

System.out.println("\n\nProfessors with students having average above 15");
q = em.createQuery("select p from Professor p join p.students s where s.average > 15");
list = q.getResultList();
for (Professor p : list)

System.out.println("Professor " + p);


There is the problem:

Professors with students having average above 15
Professor Paula Queiroz
Professor Paula Queiroz
Professor Paula Queiroz
Professor Antonio Fagundes
Professor Antonio Fagundes
Professor Silvia Matos
Professor Alder Dante

Professor Alder Dante


Lots of students have more than 15 of average, and this will cause professors to show more than once. To overcome this problem we may do:


System.out.println("\n\nProfessors with students having average above 15 (with distinct)");
q = em.createQuery("select distinct p from Professor p join p.students s where s.average > 15");
list = q.getResultList();
for (Professor p : list)

System.out.println("Professor " + p);



And this solves the issue:

Professors with students having average above 15 (with distinct)
Professor Paula Queiroz
Professor Antonio Fagundes
Professor Silvia Matos

Professor Alder Dante


GROUP BY

It is also possible to solve the aforementioned problem using GROUP BY, although this clause is typically intended for a different purpose:

System.out.println("\n\nProfessors with students having average above 15 (group by)");
q = em.createQuery("select p from Professor p join p.students s where s.average > 15 group by p");
list = q.getResultList();
for (Professor p : list)

System.out.println("Professor " + p);


And the same result:

Professors with students having average above 15 (group by)
Professor Paula Queiroz
Professor Antonio Fagundes
Professor Silvia Matos

Professor Alder Dante


But DISTINCT is still the intended way of solving the problem. GROUP BY nonetheless can give us some help in slightly different cases. For example to use an operator over the aggregation of GROUP BY:

System.out.println("\n\nProfessors and their students average");
q = em.createQuery("select p, avg(s.average) from Professor p join p.students s group by p");
listobjs = q.getResultList();
for (Object[] o : listobjs)

System.out.println("Professor " + (Professor) o[0] + " Students average " + (double) o[1]);


This gives us the professors and the averages of their students:

Professors and their students average
Professor Paula Queiroz Students average 16.0
Professor Antonio Fagundes Students average 14.5
Professor Silvia Matos Students average 13.6667
Professor Alder Dante Students average 15.0

Professor Paula Alves Students average 12.6667



GROUP BY HAVING

And what if we wanted the list of professors that have at least two students with an average above 15?

System.out.println("\n\nProfessors with two or more students having average above 15");
q = em.createQuery("select p from Professor p join p.students s where s.average > 15 group by p having count(p) >= 2");
list = q.getResultList();
for (Professor p : list)

System.out.println("Professor " + p);


This gives the following result:

Professors with two or more students having average above 15
Professor Paula Queiroz
Professor Antonio Fagundes

Professor Alder Dante



This is example is quite long, so I provide the complete version here in a .zip Eclipse project. It will not run on your computer unless you setup the database, the persistence.xml and download the MySQL (or other) JDBC driver.

1 comment:

  1. This comment has been removed by a blog administrator.

    ReplyDelete