BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
Buzzy_Bee
Quartz | Level 8

I feel like this should be a fairly simple problem, but I can't get it right unfortunately.

Say I have got some schools with a few students in each, and some students belong to group A and others belong to group B. The first school ID is 12345, and all three students are in the A group, so I want this school and all three of its students to go into the a_students output.

The next school, with school ID 99888, has three students also, but Jane is in group A and the other two are in group B. Since not every student is in A group, that school doesn't meet the criteria and so the school and its three students get put in the 'other' output. The last school has on B group students so goes in the other output.

For my first part of code, I've tried to flag it if first.SchoolID is group A. But for the second part I just can't get it right so I have put ****** in the part that I can't fix. Thanks for your help.

data have;
	infile datalines dlm=',' dsd truncover;
	input SchoolID $ StudentID :$4. Name :$7. Group :$10. ;
	datalines;
12345,1336,Li,A
12345,2337,Amy,A
12345,3338,Rahul,A
99888,2222,Tim,B
99888,2222,Jane,A
99888,2222,Mary,B
11777,1123,George,B
11777,1124,Xi,B
;
DATA A_students 
		other;
do until (last.SchoolID);
     set have;
     by SchoolID notsorted;
     if first.SchoolID then _check_A=Group;
     if Group='A' then _flag=1;
end;
do until (last.SchoolID);
     set have;
     by SchoolID notsorted;
     if ******** then output A_students;
     else output other;
end;
run;
1 ACCEPTED SOLUTION

Accepted Solutions
Kurt_Bremser
Super User

You only need a single flag variable that holds the group and can be used to determine 'other' status:

data
  a_students
  b_students
  other
;
do until (last.schoolid);
  set have;
  by schoolid notsorted;
  if first.schoolid then _group = group;
  if _group ne group then _group = 'O';
end;
do until (last.schoolid);
  set have;
  by schoolid notsorted;
  select (_group);
    when ('A') output a_students;
    when ('B') output b_students;
    when ('O') output other;
  end;
end;
drop _group;
run;

View solution in original post

11 REPLIES 11
Buzzy_Bee
Quartz | Level 8
Sorry this line has a typo. Should say "The last school has ONLY B group students so goes in the other output."
Buzzy_Bee
Quartz | Level 8

Hi Kurt,

Any technique at all that solves it would make me happy. When I did the SAS programming course a while back, I thought the first and last variables looked so handy, and yet some problems I just can't get right using them. When I tried this, it outputted any occurrence of a student in group A into the output. I just can't solve it for the case of every student must be in group A or that school and its students don't go into the A_students output.

 

But any quick and dirty technique is welcome. It's a simplified version of something I'm trying to do at work, but due to lockdown I'm working alone with no experts to ask. 

Buzzy_Bee
Quartz | Level 8

Sorry - I don't think I answered your question properly. Yes - they'll always either be flagged as A or B. There won't be any other options in there and there is no chance of a group C or D etc in the future. Group A means the student met a certain criteria, so B is just the 'else' condition for students that didn't fit into group A.

Kurt_Bremser
Super User

Then you can simplify the code I posted to:

data
  a_students
  other
;
_group = 'A';
do until (last.schoolid);
  set have;
  by schoolid notsorted;
  if group ne 'A' then _group = 'O';
end;
do until (last.schoolid);
  set have;
  by schoolid notsorted;
  if _group = 'A'
  then output a_students;
  else output other;
end;
drop _group;
run;
Kurt_Bremser
Super User

And if you only need to find the schoolid, it's even simpler:

data
  a_students
  other
;
_group = 'A';
do until (last.schoolid);
  set have;
  by schoolid notsorted;
  if group ne 'A' then _group = 'O';
end;
if _group = 'A'
then output a_students;
else output other;
keep schoolid;
run;
Shmuel
Garnet | Level 18

Are you trying to create a separate dataset per group for those rows that fit other condition(s) too?

Something like:

data stud_a
        stud_b
       other;
set have;
      if group="A" and <other condition> then output stud_a; else
      if group="B" and <other condition> then output stud_b;
      else output other;
run;
Buzzy_Bee
Quartz | Level 8

Hi Shmuel, 

No, I don't need an output for 'b_students'. I'm only wanting to create two output groups in this case. It's basically if every student has met the A group condition, then I'll output them as A_students and those students will considered students who doing okay at school. Then you can think of the B students as those who need more attention or tuition. So if a school has only A students then they'll go into a list of schools/students who don't need further improvement so I can forget about them. The schools with B's would be reviewed.

 

Thanks for your help.

Shmuel
Garnet | Level 18

Depending on the complexity of your condindition, it seems to me that the most simple code

will be:

 

data Stud_a Other;

 set have;

       if <condition> then output stud_a;

       else output stud_b;

 run;

 

Without knowing the <condition> - I cannot understand why to read twice the group.

Kurt_Bremser
Super User

You only need a single flag variable that holds the group and can be used to determine 'other' status:

data
  a_students
  b_students
  other
;
do until (last.schoolid);
  set have;
  by schoolid notsorted;
  if first.schoolid then _group = group;
  if _group ne group then _group = 'O';
end;
do until (last.schoolid);
  set have;
  by schoolid notsorted;
  select (_group);
    when ('A') output a_students;
    when ('B') output b_students;
    when ('O') output other;
  end;
end;
drop _group;
run;
Buzzy_Bee
Quartz | Level 8

Thanks so much for writing that. Now I can see where I was going wrong - I should have used _check not equal to in that first do until loop.

If I rewrite mine now using your logic, but so I just get the A_students and other group that I'm needing, this below works perfectly. I do need to retain all of the variables in both outputs, not just schoolID.

 

DATA A_students
		other;
do until (last.SchoolID);
     set have;
     by SchoolID notsorted;
     if first.SchoolID then _check=Group;
     if _check NE Group then _check='B';
end;
do until (last.SchoolID);
     set have;
     by SchoolID notsorted;
  		select (_check);
    	when ('A') output a_students;
    	when ('B') output other;
  		end;
end;
drop _:;
run;

hackathon24-white-horiz.png

The 2025 SAS Hackathon has begun!

It's finally time to hack! Remember to visit the SAS Hacker's Hub regularly for news and updates.

Latest Updates

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

SAS Training: Just a Click Away

 Ready to level-up your skills? Choose your own adventure.

Browse our catalog!

Discussion stats
  • 11 replies
  • 2572 views
  • 4 likes
  • 3 in conversation