data dat3;
input name $ id $ exam score;
datalines;
John A01 1 89
John A01 2 90
John A01 3 92
John A01 3 95
Mary A02 1 92
Mary A02 3 81
Mary A02 3 85
;
run;
proc transpose data=dat3 out=dat3_out1 (drop=_name_) prefix=test_;
var score;
by name;
id exam;
run;
Log Issue:
ERROR: The ID value "test_3" occurs twice in the same BY group.
NOTE: The above message was for the following BY group:
name=John
ERROR: The ID value "test_3" occurs twice in the same BY group.
NOTE: The above message was for the following BY group:
name=Mary
ERROR: All BY groups were bad.
NOTE: The SAS System stopped processing this step because of errors.
NOTE: There were 7 observations read from the data set WORK.DAT3.
WARNING: The data set WORK.DAT3_OUT1 may be incomplete. When this step was stopped there were 0
observations and 0 variables.
WARNING: Data set WORK.DAT3_OUT1 was not replaced because this step was stopped.
NOTE: PROCEDURE TRANSPOSE used (Total process time):
real time 0.03 seconds
cpu time 0.01 seconds
Accepted Solutions
This looks like the right approach. However, we don't really know how many duplicates there might be, or for which exam values they might occur. In order to handle many possible variations, the data preparation needs to take care of additional situations. After creating DAT3:
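/* Bring repeated exams for the same person together */
proc sort data=dat3;
by name id exam;
run;
/* Number the occurrences of each exam: 1 for the first, 2 for the second, ... */
data temp;
set dat3;
by name id exam;
if first.exam then group=1;
else group + 1;
run;
/* Order by occurrence number so it can be used as a BY variable */
proc sort data=temp;
by name id group;
run;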
At that point, the final PROC TRANSPOSE in @Ksharp's solution would be appropriate.
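For completeness, the last step might look like the sketch below. It reuses the PROC TRANSPOSE from @Ksharp's post, run against TEMP as prepared above. The DROP= list is an addition that is not in either post; it assumes _NAME_ and GROUP are not wanted in the result.

```sas
/* Sketch: final transpose after the sort / count / sort preparation.    */
/* TEMP must already be sorted by name id group.                          */
/* drop=_name_ group is an assumption about the desired output columns.  */
proc transpose data=temp out=want (drop=_name_ group) prefix=test_;
  by name id group;   /* one output row per person and occurrence number */
  id exam;            /* exam value becomes the suffix: test_1, test_2, ... */
  var score;
run;
```

Because GROUP restarts at 1 for every exam, each BY group contains at most one observation per exam value, so the "ID value occurs twice" error should no longer be triggered. For the sample data this gives two rows per name, with the second row holding only the repeated exam 3 score.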
The first step has nothing to do with programming. What do you want the result to look like? You have two observations for John, A01, exam 3. Do you want to use the 92 and ignore the 95, or use the 95 and ignore the 92? Do you want both? If so, what would the output data set look like?
Your answers determine how the program needs to change.
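To see which combinations are affected before deciding, a quick check along these lines may help. It is only a sketch, assuming the DAT3 data set from the question. The names `n_obs`, `sorted` and `last_only` are illustrative and not part of the original post.

```sas
/* List every name/id/exam combination that occurs more than once */
proc sql;
  select name, id, exam, count(*) as n_obs
  from dat3
  group by name, id, exam
  having count(*) > 1;
quit;

/* One possible answer: keep only the last score recorded for each exam. */
/* PROC SORT keeps the original order within equal keys by default.      */
proc sort data=dat3 out=sorted;
  by name id exam;
run;

data last_only;
  set sorted;
  by name id exam;
  if last.exam;       /* subsetting IF: keep the final row of each exam */
run;
```

For the sample data, the query should report exam 3 for both John and Mary. If both scores are needed instead, the duplicates have to be spread across separate output rows, which calls for a different data preparation than the keep-last step shown here.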
Required Output:
name  id   test_1  test_2  test_3
JOHN  A01  89      90      92
JOHN  A01  .       .       95
MARY  A02  92      .       81
MARY  A02  .       .       85
Make a group variable before proc transpose.
data dat3;
input name $ id $ exam score;
datalines;
John A01 1 89
John A01 2 90
John A01 3 92
John A01 3 95
Mary A02 1 92
Mary A02 3 81
Mary A02 3 85
;
run;
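data temp;
set dat3;
by name id;
/* Call LAG on every row so its queue stays in step with the data */
prev_exam = lag(exam);
/* Start a new group for each new id and for each repeated exam */
if first.id or exam = prev_exam then group + 1;
drop prev_exam;
run;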
proc transpose data=temp out=want prefix=test_;
by name id group;
id exam;
var score;
run;