proc sql;
create table public_school as
select ID,time_period,score,class from data
where ID = (select ID from data
where time_period = (select max(time_period )from data))
group by class
order by time_period
;quit;
ID | time_period | section | percentage |
281 | 0.25 | a_section | 83 |
281 | 0.5 | a_section | 82 |
281 | 0.75 | a_section | 98 |
281 | 1 | a_section | 88 |
281 | 1.25 | a_section | 81 |
282 | 0.25 | a_section | 80 |
282 | 0.5 | a_section | 82 |
282 | 0.75 | a_section | 84 |
282 | 1 | a_section | 86 |
282 | 1.06 | a_section | 54 |
282 | 1.75 | a_section | 80 |
282 | 2 | a_section | 82 |
282 | 2.25 | a_section | 84 |
282 | 2.5 | a_section | 86 |
282 | 2.75 | a_section | 88 |
282 | 3.0 | a_section | 90 |
282 | 3.25 | a_section | 92 |
301 | 0.25 | b_section | 70 |
301 | 0.5 | b_section | 76 |
301 | 0.75 | b_section | 82 |
301 | 1 | b_section | 88 |
302 | 0.25 | b_section | 90 |
302 | 0.75 | b_section | 100 |
303 | 0.25 | b_section | 84 |
In Sample table shown above, I have different no. of students in each section(a,b,c,d) and school_type( public, private) and their final % obtained for each quarterly held exam. Objective is to select student ID with maximum number of time period from each section. Output table should should look like(including score also) :
ID | time_period | section |
282 | 0.25 | a_section |
282 | 0.5 | a_section |
282 | 0.75 | a_section |
282 | 1 | a_section |
282 | 1.25 | a_section |
282 | 1.5 | a_section |
282 | 1.75 | a_section |
282 | 2 | a_section |
282 | 2.25 | a_section |
282 | 2.5 | a_section |
282 | 2.75 | a_section |
301 | 0.25 | b_section |
301 | 0.5 | b_section |
301 | 0.75 | b_section |
301 | 1 | b_section |
: | : | : |
I have tried code displayed above , but getting error as:
A GROUP BY clause has been discarded because neither the SELECT clause nor the optional HAVING clause of the associated table-expression referenced a summary function
Also, if u can tell how to create macro on this code for school type, it will be quite helpful
The remerging message is intended and expected, it's just a feature of SAS/SQL. When I run this test, I get 16 rows into table Public_School:
data have;
input ID time_period section $ percentage;
datalines;
281 0.25 a_section 83
281 0.5 a_section 82
281 0.75 a_section 98
281 1 a_section 88
281 1.25 a_section 81
282 0.25 a_section 80
282 0.5 a_section 82
282 0.75 a_section 84
282 1 a_section 86
282 1.06 a_section 54
282 1.75 a_section 80
282 2 a_section 82
282 2.25 a_section 84
282 2.5 a_section 86
282 2.75 a_section 88
282 3.0 a_section 90
282 3.25 a_section 92
301 0.25 b_section 70
301 0.5 b_section 76
301 0.75 b_section 82
301 1 b_section 88
302 0.25 b_section 90
302 0.75 b_section 100
303 0.25 b_section 84
;
proc sql;
create table public_school as
select a.*
from
have as a inner join
(select section, id, count(time_period) as total_period
from have
group by section, id) as b on a.section=b.section and a.id=b.id
group by a.section
having total_period = max(total_period)
;
select * from public_school;
quit;
To get all the data from the "student ID with maximum number of time period from each section" :
proc sql;
create table public_school as
select a.*
from
data as a inner join
(select section, id, count(time_period) as total_period
from data
group by section, id) as b on a.section=b.section and a.id=b.id
group by a.section
having total_period = max(total_period)
;
quit;
(untested)
or maybe you'l want to replace count(time_period) with sum(time_period).
I have got this note: The query requires remerging summary statistics back with the original data & dataset created with 0 rows.
The remerging message is intended and expected, it's just a feature of SAS/SQL. When I run this test, I get 16 rows into table Public_School:
data have;
input ID time_period section $ percentage;
datalines;
281 0.25 a_section 83
281 0.5 a_section 82
281 0.75 a_section 98
281 1 a_section 88
281 1.25 a_section 81
282 0.25 a_section 80
282 0.5 a_section 82
282 0.75 a_section 84
282 1 a_section 86
282 1.06 a_section 54
282 1.75 a_section 80
282 2 a_section 82
282 2.25 a_section 84
282 2.5 a_section 86
282 2.75 a_section 88
282 3.0 a_section 90
282 3.25 a_section 92
301 0.25 b_section 70
301 0.5 b_section 76
301 0.75 b_section 82
301 1 b_section 88
302 0.25 b_section 90
302 0.75 b_section 100
303 0.25 b_section 84
;
proc sql;
create table public_school as
select a.*
from
have as a inner join
(select section, id, count(time_period) as total_period
from have
group by section, id) as b on a.section=b.section and a.id=b.id
group by a.section
having total_period = max(total_period)
;
select * from public_school;
quit;
thankyou!
Join us for SAS Innovate 2025, our biggest and most exciting global event of the year, in Orlando, FL, from May 6-9. Sign up by March 14 for just $795.
SAS' Charu Shankar shares her PROC SQL expertise by showing you how to master the WHERE clause using real winter weather data.
Find more tutorials on the SAS Users YouTube channel.