Hello All
I am working with student data, and want to categorize students on the bases of whether they are enrolled in all English as a Second Language (ESL) courses or not. I have gotten so far as to categorizing their individual courses as ESL or not. Please see the table below, I think it will help:
StudentID Course ESL_Filter
1234 ESL Keep
1234 Non-ESL Keep
1234 ESL Keep
5678 Non-ESL Keep
5678 Non-ESL Keep
5678 Non-ESL Keep
9012 ESL Remove
9012 ESL Remove
9012 ESL Remove
Please notice a few things:
- The goal is to create the ESL_Filter column
- Each student has multiple observations (i.e., they're enrolled in multiple courses)
- I want to assign "Remove" only the students who are enrolled in all ESL courses
I have the table sorted on StudentID.
Thanks in advance!
proc sql;
create table want as
select *,ifc(count(distinct course)=1 and course='ESL','REMOVE',"KEEP") as ESL_Filter
from have
group by StudentID;
quit;
StudentID | Course | ESL_Filter |
---|---|---|
1234 | ESL | KEEP |
1234 | Non-ESL | KEEP |
1234 | ESL | KEEP |
5678 | Non-ESL | KEEP |
5678 | Non-ESL | KEEP |
5678 | Non-ESL | KEEP |
9012 | ESL | REMOVE |
9012 | ESL | REMOVE |
9012 | ESL | REMOVE |
data have;
input (StudentID Course ) ($);* ESL_Filter;
cards;
1234 ESL Keep
1234 Non-ESL Keep
1234 ESL Keep
5678 Non-ESL Keep
5678 Non-ESL Keep
5678 Non-ESL Keep
9012 ESL Remove
9012 ESL Remove
9012 ESL Remove
;
proc sql;
create table want as
select *
from have
group by StudentID
having not (count(distinct course)=1 and course='ESL');
quit;
StudentID | Course |
---|---|
1234 | ESL |
1234 | Non-ESL |
1234 | ESL |
5678 | Non-ESL |
5678 | Non-ESL |
5678 | Non-ESL |
proc sql;
create table want as
select *,ifc(count(distinct course)=1 and course='ESL','REMOVE',"KEEP") as ESL_Filter
from have
group by StudentID;
quit;
StudentID | Course | ESL_Filter |
---|---|---|
1234 | ESL | KEEP |
1234 | Non-ESL | KEEP |
1234 | ESL | KEEP |
5678 | Non-ESL | KEEP |
5678 | Non-ESL | KEEP |
5678 | Non-ESL | KEEP |
9012 | ESL | REMOVE |
9012 | ESL | REMOVE |
9012 | ESL | REMOVE |
proc sql;
create table want as
select *
from StudentID
where studentID in (select studentID from have having max(course)=min(course) and min(course) = "ESL");
quit;
Another possible option,
proc sort data=have;
by studentID desc course;
run;
data want;
set have;
by studentID desc Course;
retain flag_delete 0;
if first.studentID and Course = 'ESL' then flag_delete=1;
if flag_delete = 0;
run;
@bourdeax wrote:
Hello All
I am working with student data, and want to categorize students on the bases of whether they are enrolled in all English as a Second Language (ESL) courses or not. I have gotten so far as to categorizing their individual courses as ESL or not. Please see the table below, I think it will help:
StudentID Course ESL_Filter
1234 ESL Keep
1234 Non-ESL Keep
1234 ESL Keep
5678 Non-ESL Keep
5678 Non-ESL Keep
5678 Non-ESL Keep
9012 ESL Remove
9012 ESL Remove
9012 ESL Remove
Please notice a few things:
- The goal is to create the ESL_Filter column
- Each student has multiple observations (i.e., they're enrolled in multiple courses)
- I want to assign "Remove" only the students who are enrolled in all ESL courses
I have the table sorted on StudentID.
Thanks in advance!
Are you ready for the spotlight? We're accepting content ideas for SAS Innovate 2025 to be held May 6-9 in Orlando, FL. The call is open until September 25. Read more here about why you should contribute and what is in it for you!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.