Hi All,
I have the following table. can anyone help me with the code so that I can get student list (and their course) who has taken English 1 or English 2 first.
I have written the following code, but t's not working.
data have;
set out;
if first.Course='English 2' or if first.course='English 1';
run;
Student_ID Course
101 English 2
101 Math 1
101 Math 3
102 English 1
102 Math 2
102 Math 2
103 Math 2
103 English 1
104 English 1
104 Math 3
My expected output would be:
Student_ID Course
101 English 2
101 Math 1
101 Math 3
102 English 1
102 Math 2
102 Math 2
104 English 1
104 Math 3
There might be a better way to do this, but this should work:
data have;
set out;
by Student_ID;
retain write;
if first.Student_ID then do;
if course in ('English 1', 'English 2') then write=1;
else write=0;
end;
if write then output;
drop write;
run;
Thanks for the help. It worked!!!
Hi WendyCzika,
Would you please describe me a bit how the following part of your code worked, as I want to learn so that it help me in future.
Thanks,
data have;
set out;
by Student_ID;
retain write;
if first.Student_ID then do;
if course in ('English 1', 'English 2') then write=1;
else write=0;
end;
if write then output;
drop write;
run;
Adding comments to the code, hope this helps!
data have;
set out;
/****************************************************************************************************************************
Use the BY statement to identify Student_ID as a BY variable so that first.student_id and last.student_id
temporary variables are created:
SAS sets the value of FIRST.variable to 1 when it reads the first observation in a BY group, and sets the value of
LAST.variable to 1 when it reads the last observation in a BY group. These temporary variables are
available for DATA step programming but are not added to the output data set.
****************************************************************************************************************************/
by Student_ID;
/****************************************************************************************************************************
Specify a variable in the RETAIN statement to keep the same value from observation to the next, instead of resetting to missing at each iteration of the
DATA step - so the WRITE variable will keep the same value for all observations with the same student_ID since it is only being changed for the first obs
of each student_id
****************************************************************************************************************************/
retain write;
/****************************************************************************************************************************
So at the first observation in each BY group (first obs for each student_ID), set the variable WRITE to 1 if the course is English 1 or English 2, otherwise
set WRITE to 0, and keep this value for the rest of the obs with this same student_ID.
****************************************************************************************************************************/
if first.Student_ID then do;
if course in ('English 1', 'English 2') then write=1;
else write=0;
end;
/****************************************************************************************************************************
Output all the obs that have WRITE=1.
****************************************************************************************************************************/
if write then output;
drop write;
run;
Hi WendyCzika,
I need one more help here. Can you please tell me the code to find out the students who has taken "English 1" and "Math 3"?
output would be:
Student_ID Course
104 English 1
104 Math 3
proc sql;
select * from have group by student_id having course='English 1' and course='Math 3';
quit;
data have;
input Student_ID Course & $20.;
cards;
101 English 2
101 Math 1
101 Math 3
102 English 1
102 Math 2
102 Math 2
103 Math 2
103 English 1
104 English 1
104 Math 3
;
run;
data want;
do until(last.Student_ID);
set have;
by Student_ID;
if Course='English 1' then a=1;
if Course='Math 3' then b=1;
end;
do until(last.Student_ID);
set have;
by Student_ID;
if a and b then output;
end;
drop a b;
run;
Hi Xia,
Thanks for your code, it's working perfect!!! Would you please answer the following two things:
1. describe the following highlighted section how it works? It will help me understanding you code.
data want;
do until(last.Student_ID);
set have;
by Student_ID;
if Course='English 1' then a=1;
if Course='Math 3' then b=1;
end;
do until(last.Student_ID);
set have;
by Student_ID;
if a and b then output;
end;
drop a b;
run;
2. If I want student who has taken 'English I' or 'Math 3' should I just write 'if a or b then output' instead of 'if a and b then output'?
1. describe the following highlighted section how it works? It will help me understanding you code.
data want;
do until(last.Student_ID);
set have;
by Student_ID;
if Course='English 1' then a=1; /*Mark this Student_ID group has English 1 */
if Course='Math 3' then b=1; /*Mark this Student_ID group has Math 3 */
end;
do until(last.Student_ID);
set have;
by Student_ID;
if a and b then output; /*If these two conditions are satisfied then output. NOTE: it will output the whole Student_ID group */
end;
drop a b;
run;
2. If I want student who has taken 'English I' or 'Math 3' should I just write 'if a or b then output' instead of 'if a and b then output'?
Yes. You can do that .
Xia Keshan
Are you ready for the spotlight? We're accepting content ideas for SAS Innovate 2025 to be held May 6-9 in Orlando, FL. The call is open until September 25. Read more here about why you should contribute and what is in it for you!
Use this tutorial as a handy guide to weigh the pros and cons of these commonly used machine learning algorithms.
Find more tutorials on the SAS Users YouTube channel.