Hello all,
I want to write proc sql for find first and last row for some operations with group by two variable. I found the SAS example in the https://blogs.sas.com/content/iml/2018/02/26/how-to-use-first-variable-and-last-variable-in-a-by-gro... . But I couldn't find the proc sql version.
How would I approach this?
Thank you
Looks like you are just aggregating three variables over groups defined by the values of ID and TYPE.
So the normal SAS syntax would be something like:
proc summary data=group_sort nway;
by id type;
var credit_before_semester credit_after_semester semester_credit;
output out=group_1
n=type_count
sum=sum_credit_before_semester sum_credit_after_semester sum_semester_credit
;
run;
Or you could use PROC SQL.
proc sql;
create table group_1 as
select id,type
, count(*) as type_count
, sum(credit_before_semester) as sum_credit_before_semester
, sum(credit_after_semester) as sum_credit_after_semester
, sum(semester_credit) as sum_semester_credit
group by id,type
;
quit;
There is no such thing as FIRST.variable and LAST.variable in PROC SQL. You should do this in a DATA step.
And I will repeat ... the easiest way and the best way to do this is in a DATA step.
PROC SQL does not guarantee the order of the observations. Therefore, which one is first within a group (and which one is last within a group) is subject to change, and not reliable.
If you really must use SQL, you will have to supply a more specific transpose problem. Wide to long? Long to wide? How many variables involved? Give a specific example of what you would like to accomplish. But as you have already been told, SQL is the wrong tool for the job.
As Paige said, the best tool is data step,NOT sql.
Anyway, there is some sql code could get first last. But I don't like it.
proc sort data=sashelp.class out=have;by sex;run; ods select none; ods output sql_results=sql_results; proc sql number; select * from have; quit; ods select all; proc sql; create table want as select * from sql_results group by sex having row=min(row) or row=max(row); quit;
My Table has 12k columns and 72 rows table sorted by id and type.
If table grouped by single variable, i can be use monotonic for min and max row. But table grouped by two variable. So, i can not use monotonic for min and max for groups.
My table like:
ID TYPE NAME GPA CREDIT FACULTY DEPARTMANT SCHOLARSHIP CLUB_FLAG …
13416 T1 Micheal 3.00 93.00 ENG. FAC. COMP.ENG. 1 0 …
13416 T3 Micheal 3.00 93.00 ENG. FAC. COMP.ENG. 0 0 …
13417 T2 Sam 1.68 123.00 ENG. FAC. COMP.ENG. 1 0 …
13418 T1 Keanu 3.51 135.00 ENG. FAC. COMP.ENG. 1 0 …
13418 T2 Keanu 3.51 135.00 ENG. FAC. COMP.ENG. 0 1 …
13418 T4 Keanu 3.51 135.00 ENG. FAC. COMP.ENG. 0 1 …
13419 T3 Ann 1.85 84.00 ENG. FAC. COMP.ENG. 0 0 …
13420 T4 George 2.10 27.00 ENG. FAC. COMP.ENG. 0 0 …
13421 T1 Jennifer 0.41 112.00 ENG. FAC. COMP.ENG. 1 1 …
13421 T2 Jennifer 0.41 112.00 ENG. FAC. COMP.ENG. 0 1 …
13422 T1 Ann 2.13 37.00 ENG. FAC. COMP.ENG. 0 1 …
13422 T2 Ann 2.13 37.00 ENG. FAC. COMP.ENG. 0 0 …
13422 T3 Ann 2.13 37.00 ENG. FAC. COMP.ENG. 0 0 …
13422 T4 Ann 2.13 37.00 ENG. FAC. COMP.ENG. 1 1 …
… … … … … … … … … …
Thank you for your answers @PaigeMiller @Astounding @Ksharp . I have to writed by PROC SQL.
Why do you have to use PROC SQL?
Is it that you are actually pushing the SQL code into some external database? If so then you should check if the external database's implementation of SQL has features that might help.
Is it that you are using some tool only can generate SQL code? What tool?
If it is because someone told you to use PROC SQL, then educate them that it is best to use the right tool for the job and SQL is NOT the tool for any job that is dependent on an ordering of observations that is not based on the values of the variables in the data.
Add the sequence number first using a data step ( or data step view) and then you can use the sequence number to detect FIRST (seq=1) and LAST (seq=max(seq).
data for_sql_step;
set have;
by id1 id2;
if first.id2 then seq=0;
seq+1;
run;
OK. It is not a big deal. If you have two or more variables need sorted ,try this one :
ods select none;
ods output sql_results=sql_results;
proc sql number;
select * from sashelp.heart /*the dataset you have*/
order by status,sex ; /*two variables need to sorted*/
quit;
ods select all;
proc sql;
create table want as
select * from sql_results
group by status,sex
having row=min(row) or row=max(row);
quit;
I'm using Sas Ondemand. Actually my task is kind of like a homework, which is to convert SAS to PROC SQL. The outputs of the SAS must match the outputs of the SQL.
You can see the SAS block below. I am new in Community so I apologize if I can't articulate very well what I need in this question page - thanks in advance for all the help!
data Group_1;
set group_sort;
by id type;
if first.type then do;
sum_credit_before_semester=0;
sum_credit_after_semester=0;
sum_semester_credit;
type_COUNT = 0;
end;
sum_credit_before_semester + credit_before_semester ;
sum_credit_after_semester+credit_after_semester;
sum_semester_credit + semester_credit;
type_COUNT+1;
if last.type;
drop
credit_before_semester credit_after_semester semester_credit
run;
Looks like you are just aggregating three variables over groups defined by the values of ID and TYPE.
So the normal SAS syntax would be something like:
proc summary data=group_sort nway;
by id type;
var credit_before_semester credit_after_semester semester_credit;
output out=group_1
n=type_count
sum=sum_credit_before_semester sum_credit_after_semester sum_semester_credit
;
run;
Or you could use PROC SQL.
proc sql;
create table group_1 as
select id,type
, count(*) as type_count
, sum(credit_before_semester) as sum_credit_before_semester
, sum(credit_after_semester) as sum_credit_after_semester
, sum(semester_credit) as sum_semester_credit
group by id,type
;
quit;
Registration is now open for SAS Innovate 2025 , our biggest and most exciting global event of the year! Join us in Orlando, FL, May 6-9.
Sign up by Dec. 31 to get the 2024 rate of just $495.
Register now!
Learn the difference between classical and Bayesian statistical approaches and see a few PROC examples to perform Bayesian analysis in this video.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.