nth largest value from group in PostgreSQL
In this post I will show you how to use window function in PostgreSQL to find the nth largest value from every group. You can use pgAdmin tool to create the database. First open the pgAdmin tool from start menu. After that from the object browser on the left side under Servers select ‘localhost’. Right click on it and click connect. You may be prompted to enter the password, enter it and submit. Then you will see the databases list under it. Right click on the databases and select ‘New databases’. You will see a dialog box. Under properties tab enter the name of the database and press ok button. In my example I will create the database as ‘mydb’.
Next create another column with name ‘project_name’ and Data type as ‘character varying’. Now right click on the ‘project’ table and click constraints tab and add primary key.
Similarly create table ‘milestone’ with columns milestone_id (serial) and milestone_name (character varying). Create table ‘project_milestone’ with columns id (serial), project_id (integer), milestone_id (integer), completed_date (timestamp without time zone). Column id is the primary key and Columns project_id and milestone_id in table ‘project_milestone’ are foreign keys. You can add foreign keys by right clicking on the ‘project_milestone’ table and clicking constraints tab and add foreign key. Then from the actions tab add on update and on delete constraints.
Here I will show you how to find the nth largest completed_date from project_milestone. Now we can run queries to insert values into the tables. First right click on the mydb from the menu and select create script. Right below query to insert values into project table.
insert into project(project_name)values('Project A'); insert into project(project_name)values('Project B');
Next we will insert values into milestone.
insert into milestone(milestone_name)values('Milestone 1'); insert into milestone(milestone_name)values('Milestone 2'); insert into milestone(milestone_name)values('Milestone 3'); insert into milestone(milestone_name)values('Milestone 4'); insert into milestone(milestone_name)values('Milestone 5'); insert into milestone(milestone_name)values('Milestone 6'); insert into milestone(milestone_name)values('Milestone 7'); insert into milestone(milestone_name)values('Milestone 8'); insert into milestone(milestone_name)values('Milestone 9'); insert into milestone(milestone_name)values('Milestone 10');
Next we will insert values into project_milestone.
insert into project_milestone(project_id,milestone_id,completed_date)values(1,1,'2015-01-01'); insert into project_milestone(project_id,milestone_id,completed_date)values(1,2,'2015-02-01'); insert into project_milestone(project_id,milestone_id,completed_date)values(1,3,'2015-03-01'); insert into project_milestone(project_id,milestone_id,completed_date)values(1,4,'2015-04-01'); insert into project_milestone(project_id,milestone_id,completed_date)values(1,5,'2015-05-01'); insert into project_milestone(project_id,milestone_id,completed_date)values(1,6,'2015-06-01'); insert into project_milestone(project_id,milestone_id,completed_date)values(1,7,'2015-07-01'); insert into project_milestone(project_id,milestone_id,completed_date)values(1,8,'2015-08-01'); insert into project_milestone(project_id,milestone_id,completed_date)values(1,9,'2015-09-01'); insert into project_milestone(project_id,milestone_id,completed_date)values(1,10,'2015-10-01'); insert into project_milestone(project_id,milestone_id,completed_date)values(2,1,'2015-01-15'); insert into project_milestone(project_id,milestone_id,completed_date)values(2,2,'2015-02-15'); insert into project_milestone(project_id,milestone_id,completed_date)values(2,3,'2015-03-15'); insert into project_milestone(project_id,milestone_id,completed_date)values(2,4,'2015-04-15'); insert into project_milestone(project_id,milestone_id,completed_date)values(2,5,'2015-05-15'); insert into project_milestone(project_id,milestone_id,completed_date)values(2,6,'2015-06-15'); insert into project_milestone(project_id,milestone_id,completed_date)values(2,7,'2015-07-15'); insert into project_milestone(project_id,milestone_id,completed_date)values(2,8,'2015-08-15'); insert into project_milestone(project_id,milestone_id,completed_date)values(2,9,'2015-09-15'); insert into project_milestone(project_id,milestone_id,completed_date)values(2,10,'2015-10-15');
If you want find the nth largest date value grouped by project, you can use the below query. Replace pm.seqnum = 1 by the nth value required. For example to find the 2nd largest date use pm.seqnum = 2
select p.project_id, pm.milestone_id, pm.completed_date from project p left join (select pm.*, row_number() over (partition by project_id order by completed_date desc) as seqnum from project_milestone pm where pm.completed_date is not null ) pm on p.project_id = pm.project_id and pm.seqnum = 1;