BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
Buzzy_Bee
Quartz | Level 8

I feel like this should be a fairly simple problem, but I can't get it right unfortunately.

Say I have got some schools with a few students in each, and some students belong to group A and others belong to group B. The first school ID is 12345, and all three students are in the A group, so I want this school and all three of its students to go into the a_students output.

The next school, with school ID 99888, has three students also, but Jane is in group A and the other two are in group B. Since not every student is in A group, that school doesn't meet the criteria and so the school and its three students get put in the 'other' output. The last school has on B group students so goes in the other output.

For my first part of code, I've tried to flag it if first.SchoolID is group A. But for the second part I just can't get it right so I have put ****** in the part that I can't fix. Thanks for your help.

data have;
	infile datalines dlm=',' dsd truncover;
	input SchoolID $ StudentID :$4. Name :$7. Group :$10. ;
	datalines;
12345,1336,Li,A
12345,2337,Amy,A
12345,3338,Rahul,A
99888,2222,Tim,B
99888,2222,Jane,A
99888,2222,Mary,B
11777,1123,George,B
11777,1124,Xi,B
;
DATA A_students 
		other;
do until (last.SchoolID);
     set have;
     by SchoolID notsorted;
     if first.SchoolID then _check_A=Group;
     if Group='A' then _flag=1;
end;
do until (last.SchoolID);
     set have;
     by SchoolID notsorted;
     if ******** then output A_students;
     else output other;
end;
run;
1 ACCEPTED SOLUTION

Accepted Solutions
Kurt_Bremser
Super User

You only need a single flag variable that holds the group and can be used to determine 'other' status:

data
  a_students
  b_students
  other
;
do until (last.schoolid);
  set have;
  by schoolid notsorted;
  if first.schoolid then _group = group;
  if _group ne group then _group = 'O';
end;
do until (last.schoolid);
  set have;
  by schoolid notsorted;
  select (_group);
    when ('A') output a_students;
    when ('B') output b_students;
    when ('O') output other;
  end;
end;
drop _group;
run;

View solution in original post

11 REPLIES 11
Buzzy_Bee
Quartz | Level 8
Sorry this line has a typo. Should say "The last school has ONLY B group students so goes in the other output."
Buzzy_Bee
Quartz | Level 8

Hi Kurt,

Any technique at all that solves it would make me happy. When I did the SAS programming course a while back, I thought the first and last variables looked so handy, and yet some problems I just can't get right using them. When I tried this, it outputted any occurrence of a student in group A into the output. I just can't solve it for the case of every student must be in group A or that school and its students don't go into the A_students output.

 

But any quick and dirty technique is welcome. It's a simplified version of something I'm trying to do at work, but due to lockdown I'm working alone with no experts to ask. 

Buzzy_Bee
Quartz | Level 8

Sorry - I don't think I answered your question properly. Yes - they'll always either be flagged as A or B. There won't be any other options in there and there is no chance of a group C or D etc in the future. Group A means the student met a certain criteria, so B is just the 'else' condition for students that didn't fit into group A.

Kurt_Bremser
Super User

Then you can simplify the code I posted to:

data
  a_students
  other
;
_group = 'A';
do until (last.schoolid);
  set have;
  by schoolid notsorted;
  if group ne 'A' then _group = 'O';
end;
do until (last.schoolid);
  set have;
  by schoolid notsorted;
  if _group = 'A'
  then output a_students;
  else output other;
end;
drop _group;
run;
Kurt_Bremser
Super User

And if you only need to find the schoolid, it's even simpler:

data
  a_students
  other
;
_group = 'A';
do until (last.schoolid);
  set have;
  by schoolid notsorted;
  if group ne 'A' then _group = 'O';
end;
if _group = 'A'
then output a_students;
else output other;
keep schoolid;
run;
Shmuel
Garnet | Level 18

Are you trying to create a separate dataset per group for those rows that fit other condition(s) too?

Something like:

data stud_a
        stud_b
       other;
set have;
      if group="A" and <other condition> then output stud_a; else
      if group="B" and <other condition> then output stud_b;
      else output other;
run;
Buzzy_Bee
Quartz | Level 8

Hi Shmuel, 

No, I don't need an output for 'b_students'. I'm only wanting to create two output groups in this case. It's basically if every student has met the A group condition, then I'll output them as A_students and those students will considered students who doing okay at school. Then you can think of the B students as those who need more attention or tuition. So if a school has only A students then they'll go into a list of schools/students who don't need further improvement so I can forget about them. The schools with B's would be reviewed.

 

Thanks for your help.

Shmuel
Garnet | Level 18

Depending on the complexity of your condindition, it seems to me that the most simple code

will be:

 

data Stud_a Other;

 set have;

       if <condition> then output stud_a;

       else output stud_b;

 run;

 

Without knowing the <condition> - I cannot understand why to read twice the group.

Kurt_Bremser
Super User

You only need a single flag variable that holds the group and can be used to determine 'other' status:

data
  a_students
  b_students
  other
;
do until (last.schoolid);
  set have;
  by schoolid notsorted;
  if first.schoolid then _group = group;
  if _group ne group then _group = 'O';
end;
do until (last.schoolid);
  set have;
  by schoolid notsorted;
  select (_group);
    when ('A') output a_students;
    when ('B') output b_students;
    when ('O') output other;
  end;
end;
drop _group;
run;
Buzzy_Bee
Quartz | Level 8

Thanks so much for writing that. Now I can see where I was going wrong - I should have used _check not equal to in that first do until loop.

If I rewrite mine now using your logic, but so I just get the A_students and other group that I'm needing, this below works perfectly. I do need to retain all of the variables in both outputs, not just schoolID.

 

DATA A_students
		other;
do until (last.SchoolID);
     set have;
     by SchoolID notsorted;
     if first.SchoolID then _check=Group;
     if _check NE Group then _check='B';
end;
do until (last.SchoolID);
     set have;
     by SchoolID notsorted;
  		select (_check);
    	when ('A') output a_students;
    	when ('B') output other;
  		end;
end;
drop _:;
run;
How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

SAS Training: Just a Click Away

 Ready to level-up your skills? Choose your own adventure.

Browse our catalog!

Discussion stats
  • 11 replies
  • 2670 views
  • 4 likes
  • 3 in conversation