- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
I (SAS 9.3) have a dataset for a training course where students' block grades are in individual rows. If a student fails a block they are held back until the next class where they reattempt the block and must pass before they can continue in training.
I am wanting to create a "first-attempt only" dataset that contains only their block grades up until their point of failure. Conceptually I want SAS to look at each ID and within ID retain only the rows associated with the student's first training attempt. First training attempt is defined as their performance in the training class they started in (e.g., 202001).
For example, in the dataset below, student 1001 progressed through training in class 202001 until Block_3 where they failed. They then resumed training in class 202002 and finished all five blocks. For student 1001 I want to keep their performance in class 202001 (their first training attempt) and drop all observations of subsequent training classes (202002).
This is an example of my dataset. My goal is, within each ID, keep only the observations associated with the trainees first class. For 1001 that's class 202001, for 1002 that's class 202004, etc.
ID | Class | Training_Block | Grade | Status |
1001 | 202001 | Block_1 | 88 | PASS |
1001 | 202001 | Block_2 | 76 | PASS |
1001 | 202001 | Block_3 | 68 | FAIL |
1001 | 202002 | Block_3 | 71 | PASS |
1001 | 202002 | Block_4 | 86 | PASS |
1001 | 202002 | Block_5 | 90 | PASS |
1002 | 202004 | Block_1 | 50 | FAIL |
1002 | 202006 | Block_1 | 73 | PASS |
1002 | 202006 | Block_2 | 91 | PASS |
1002 | 202006 | Block_3 | 95 | PASS |
1002 | 202006 | Block_4 | 71 | PASS |
1002 | 202006 | Block_5 | 78 | PASS |
1003 | 202003 | Block_1 | 76 | PASS |
1003 | 202003 | Block_2 | 89 | PASS |
1003 | 202003 | Block_3 | 80 | PASS |
1003 | 202003 | Block_4 | 95 | PASS |
1003 | 202003 | Block_5 | 97 | PASS |
1004 | 202008 | Block_1 | 74 | PASS |
1004 | 202008 | Block_2 | 70 | PASS |
1004 | 202008 | Block_3 | 60 | FAIL |
1004 | 202009 | Block_3 | 82 | PASS |
1004 | 202009 | Block_4 | 68 | FAIL |
1004 | 202010 | Block_4 | 72 | PASS |
1004 | 202010 | Block_5 | 90 | PASS |
This is what I'm conceptually wanting SAS to do. Keep all observations with first class value within each ID and drop all subsequent class values within each ID:
ID | Class | Training_Block | Grade | Status |
1001 | 202001 | Block_1 | 88 | PASS |
1001 | 202001 | Block_2 | 76 | PASS |
1001 | 202001 | Block_3 | 68 | FAIL |
1002 | 202004 | Block_1 | 50 | FAIL |
1003 | 202003 | Block_1 | 76 | PASS |
1003 | 202003 | Block_2 | 89 | PASS |
1003 | 202003 | Block_3 | 80 | PASS |
1003 | 202003 | Block_4 | 95 | PASS |
1003 | 202003 | Block_5 | 97 | PASS |
1004 | 202008 | Block_1 | 74 | PASS |
1004 | 202008 | Block_2 | 70 | PASS |
1004 | 202008 | Block_3 | 60 | FAIL |
This is my desired output:
ID | Class | Training_Block | Grade | Status |
1001 | 202001 | Block_1 | 88 | PASS |
1001 | 202001 | Block_2 | 76 | PASS |
1001 | 202001 | Block_3 | 68 | FAIL |
1002 | 202004 | Block_1 | 50 | FAIL |
1003 | 202003 | Block_1 | 76 | PASS |
1003 | 202003 | Block_2 | 89 | PASS |
1003 | 202003 | Block_3 | 80 | PASS |
1003 | 202003 | Block_4 | 95 | PASS |
1003 | 202003 | Block_5 | 97 | PASS |
1004 | 202008 | Block_1 | 74 | PASS |
1004 | 202008 | Block_2 | 70 | PASS |
1004 | 202008 | Block_3 | 60 | FAIL |
I feel like I've done this before and it involved do loops and first.dot and last.dot processing, but for the life of me I can't figure out how to do it this time.
Thanks in advance for help!
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
Try next code:
data want;
set have;
by id ;
retain flag;
if first.id then flag=0;
if status = 'FAIL' then flag=1;
if not flag=1;
run;