Hello All
I am working with student data, and want to categorize students on the bases of whether they are enrolled in all English as a Second Language (ESL) courses or not. I have gotten so far as to categorizing their individual courses as ESL or not. Please see the table below, I think it will help:
StudentID Course ESL_Filter
1234 ESL Keep
1234 Non-ESL Keep
1234 ESL Keep
5678 Non-ESL Keep
5678 Non-ESL Keep
5678 Non-ESL Keep
9012 ESL Remove
9012 ESL Remove
9012 ESL Remove
Please notice a few things:
- The goal is to create the ESL_Filter column
- Each student has multiple observations (i.e., they're enrolled in multiple courses)
- I want to assign "Remove" only the students who are enrolled in all ESL courses
I have the table sorted on StudentID.
Thanks in advance!
proc sql;
create table want as
select *,ifc(count(distinct course)=1 and course='ESL','REMOVE',"KEEP") as ESL_Filter
from have
group by StudentID;
quit;
| StudentID | Course | ESL_Filter |
|---|---|---|
| 1234 | ESL | KEEP |
| 1234 | Non-ESL | KEEP |
| 1234 | ESL | KEEP |
| 5678 | Non-ESL | KEEP |
| 5678 | Non-ESL | KEEP |
| 5678 | Non-ESL | KEEP |
| 9012 | ESL | REMOVE |
| 9012 | ESL | REMOVE |
| 9012 | ESL | REMOVE |
data have;
input (StudentID Course ) ($);* ESL_Filter;
cards;
1234 ESL Keep
1234 Non-ESL Keep
1234 ESL Keep
5678 Non-ESL Keep
5678 Non-ESL Keep
5678 Non-ESL Keep
9012 ESL Remove
9012 ESL Remove
9012 ESL Remove
;
proc sql;
create table want as
select *
from have
group by StudentID
having not (count(distinct course)=1 and course='ESL');
quit;
| StudentID | Course |
|---|---|
| 1234 | ESL |
| 1234 | Non-ESL |
| 1234 | ESL |
| 5678 | Non-ESL |
| 5678 | Non-ESL |
| 5678 | Non-ESL |
proc sql;
create table want as
select *,ifc(count(distinct course)=1 and course='ESL','REMOVE',"KEEP") as ESL_Filter
from have
group by StudentID;
quit;
| StudentID | Course | ESL_Filter |
|---|---|---|
| 1234 | ESL | KEEP |
| 1234 | Non-ESL | KEEP |
| 1234 | ESL | KEEP |
| 5678 | Non-ESL | KEEP |
| 5678 | Non-ESL | KEEP |
| 5678 | Non-ESL | KEEP |
| 9012 | ESL | REMOVE |
| 9012 | ESL | REMOVE |
| 9012 | ESL | REMOVE |
proc sql;
create table want as
select *
from StudentID
where studentID in (select studentID from have having max(course)=min(course) and min(course) = "ESL");
quit;
Another possible option,
proc sort data=have;
by studentID desc course;
run;
data want;
set have;
by studentID desc Course;
retain flag_delete 0;
if first.studentID and Course = 'ESL' then flag_delete=1;
if flag_delete = 0;
run;
@bourdeax wrote:
Hello All
I am working with student data, and want to categorize students on the bases of whether they are enrolled in all English as a Second Language (ESL) courses or not. I have gotten so far as to categorizing their individual courses as ESL or not. Please see the table below, I think it will help:
StudentID Course ESL_Filter
1234 ESL Keep
1234 Non-ESL Keep
1234 ESL Keep
5678 Non-ESL Keep
5678 Non-ESL Keep
5678 Non-ESL Keep
9012 ESL Remove
9012 ESL Remove
9012 ESL Remove
Please notice a few things:
- The goal is to create the ESL_Filter column
- Each student has multiple observations (i.e., they're enrolled in multiple courses)
- I want to assign "Remove" only the students who are enrolled in all ESL courses
I have the table sorted on StudentID.
Thanks in advance!
April 27 – 30 | Gaylord Texan | Grapevine, Texas
Walk in ready to learn. Walk out ready to deliver. This is the data and AI conference you can't afford to miss.
Register now and lock in 2025 pricing—just $495!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.