proc sql;
create table public_school as
select ID,time_period,score,class from data
where ID = (select ID from data
where time_period = (select max(time_period )from data))
group by class
order by time_period
;quit;
ID | time_period | section | percentage |
281 | 0.25 | a_section | 83 |
281 | 0.5 | a_section | 82 |
281 | 0.75 | a_section | 98 |
281 | 1 | a_section | 88 |
281 | 1.25 | a_section | 81 |
282 | 0.25 | a_section | 80 |
282 | 0.5 | a_section | 82 |
282 | 0.75 | a_section | 84 |
282 | 1 | a_section | 86 |
282 | 1.06 | a_section | 54 |
282 | 1.75 | a_section | 80 |
282 | 2 | a_section | 82 |
282 | 2.25 | a_section | 84 |
282 | 2.5 | a_section | 86 |
282 | 2.75 | a_section | 88 |
282 | 3.0 | a_section | 90 |
282 | 3.25 | a_section | 92 |
301 | 0.25 | b_section | 70 |
301 | 0.5 | b_section | 76 |
301 | 0.75 | b_section | 82 |
301 | 1 | b_section | 88 |
302 | 0.25 | b_section | 90 |
302 | 0.75 | b_section | 100 |
303 | 0.25 | b_section | 84 |
In Sample table shown above, I have different no. of students in each section(a,b,c,d) and school_type( public, private) and their final % obtained for each quarterly held exam. Objective is to select student ID with maximum number of time period from each section. Output table should should look like(including score also) :
ID | time_period | section |
282 | 0.25 | a_section |
282 | 0.5 | a_section |
282 | 0.75 | a_section |
282 | 1 | a_section |
282 | 1.25 | a_section |
282 | 1.5 | a_section |
282 | 1.75 | a_section |
282 | 2 | a_section |
282 | 2.25 | a_section |
282 | 2.5 | a_section |
282 | 2.75 | a_section |
301 | 0.25 | b_section |
301 | 0.5 | b_section |
301 | 0.75 | b_section |
301 | 1 | b_section |
: | : | : |
I have tried code displayed above , but getting error as:
A GROUP BY clause has been discarded because neither the SELECT clause nor the optional HAVING clause of the associated table-expression referenced a summary function
Also, if u can tell how to create macro on this code for school type, it will be quite helpful
The remerging message is intended and expected, it's just a feature of SAS/SQL. When I run this test, I get 16 rows into table Public_School:
data have;
input ID time_period section $ percentage;
datalines;
281 0.25 a_section 83
281 0.5 a_section 82
281 0.75 a_section 98
281 1 a_section 88
281 1.25 a_section 81
282 0.25 a_section 80
282 0.5 a_section 82
282 0.75 a_section 84
282 1 a_section 86
282 1.06 a_section 54
282 1.75 a_section 80
282 2 a_section 82
282 2.25 a_section 84
282 2.5 a_section 86
282 2.75 a_section 88
282 3.0 a_section 90
282 3.25 a_section 92
301 0.25 b_section 70
301 0.5 b_section 76
301 0.75 b_section 82
301 1 b_section 88
302 0.25 b_section 90
302 0.75 b_section 100
303 0.25 b_section 84
;
proc sql;
create table public_school as
select a.*
from
have as a inner join
(select section, id, count(time_period) as total_period
from have
group by section, id) as b on a.section=b.section and a.id=b.id
group by a.section
having total_period = max(total_period)
;
select * from public_school;
quit;
To get all the data from the "student ID with maximum number of time period from each section" :
proc sql;
create table public_school as
select a.*
from
data as a inner join
(select section, id, count(time_period) as total_period
from data
group by section, id) as b on a.section=b.section and a.id=b.id
group by a.section
having total_period = max(total_period)
;
quit;
(untested)
or maybe you'l want to replace count(time_period) with sum(time_period).
I have got this note: The query requires remerging summary statistics back with the original data & dataset created with 0 rows.
The remerging message is intended and expected, it's just a feature of SAS/SQL. When I run this test, I get 16 rows into table Public_School:
data have;
input ID time_period section $ percentage;
datalines;
281 0.25 a_section 83
281 0.5 a_section 82
281 0.75 a_section 98
281 1 a_section 88
281 1.25 a_section 81
282 0.25 a_section 80
282 0.5 a_section 82
282 0.75 a_section 84
282 1 a_section 86
282 1.06 a_section 54
282 1.75 a_section 80
282 2 a_section 82
282 2.25 a_section 84
282 2.5 a_section 86
282 2.75 a_section 88
282 3.0 a_section 90
282 3.25 a_section 92
301 0.25 b_section 70
301 0.5 b_section 76
301 0.75 b_section 82
301 1 b_section 88
302 0.25 b_section 90
302 0.75 b_section 100
303 0.25 b_section 84
;
proc sql;
create table public_school as
select a.*
from
have as a inner join
(select section, id, count(time_period) as total_period
from have
group by section, id) as b on a.section=b.section and a.id=b.id
group by a.section
having total_period = max(total_period)
;
select * from public_school;
quit;
thankyou!
Are you ready for the spotlight? We're accepting content ideas for SAS Innovate 2025 to be held May 6-9 in Orlando, FL. The call is open until September 25. Read more here about why you should contribute and what is in it for you!
SAS' Charu Shankar shares her PROC SQL expertise by showing you how to master the WHERE clause using real winter weather data.
Find more tutorials on the SAS Users YouTube channel.