Solved: Proc Transpose - Repeated Values

JAR · Posted 02-13-2012 05:21 AM

Dear All,

I have the following table

Teacher	Score	Student
Teacher1	A	Tom
Teacher1	A	Harry
Teacher1	B	Ron
Teacher2	A	Neval

I need to transpose it in this manner:

Teacher	A's	B's
Teacher1	Tom	Ron
Teacher1	Harry
Teacher2	Neval

Howerever, Proc Transose sends and error as ID (score) has duplicates.

Could someone help me to write a proc code so that when the repeatation happends, the entry must be moved to the next row (as in the second one)?

Thanking you in advance,

Sincerely,

Jijil Ramakrishnan

Astounding · Posted 02-13-2012 09:28 AM

JAR,

Here's an approach that is deceptively short. You have to understand some of the inner workings of the DATA step. In this case, you'll get a note about a many-to-many merge. It's harmless this time, but may not be in other DATA steps:

data final;

merge original (rename=(student=A) where=(score='A'))

original (rename=(student=B) where=(score='B'));

by teacher;

output;

A = ' ';

B = ' ';

run;

All of the replies so far are relying on your data containing just A and B. All could be extended to handle other values as well. But it's a harder problem to tackle if you don't know the values of SCORE ahead of time.

Good luck.

View solution in original post

art297 · Posted 02-13-2012 08:38 AM

If it doesn't matter which people get paired, as long as they are on the currect teacher/grade columns and variables, then you could use something like:

data have;

input (Teacher Score Student) ($);

cards;

Teacher1 A Tom

Teacher1 A Harry

Teacher1 B Ron

Teacher2 A Neval

;

data a (keep=Teacher A) b (keep=Teacher B);

set have;

if Score eq "A" then do;

a=Student;

b='ZZZZZZZZ';

output a;

output b;

end;

else if Score eq "B" then do;

b=Student;

a='ZZZZZZZZ';

output a;

output b;

end;

run;

proc sort data=a;

by teacher a;

run;

proc sort data=b;

by teacher b;

run;

data want;

set a;

set b;

if catt(a,b) ne 'ZZZZZZZZZZZZZZZZ';

if a eq 'ZZZZZZZZ' then call missing(a);

if b eq 'ZZZZZZZZ' then call missing(b);

run;

data_null__ · Posted 02-13-2012 09:19 AM

Are you willing to create one more variable?

dm 'clear log; clear output;';
data test;
   input teacher:$10. grade :$1. name :$8.;
   cards;
Teacher1 A Tom 
Teacher1 A Harry 
Teacher1 B Ron 
Teacher2 A Neval 
;;;;
   run;
proc sort data=test;
   by teacher grade name;
   run;
data test;
   set test;
   by teacher grade;
   if first.grade then index=0;
   index +1;
   run;
proc sort data=test;
   by teacher index grade;
   run;
proc transpose data=test out=want(drop=_name_);;
   by teacher index;
   var name;
   id grade;
   run;
proc print;
   run;

Linlin · Posted 02-13-2012 09:57 AM

Hi Data _null_,

Why did you use ';;;;' instead of ';'?

Thank you!

art297 · Posted 02-13-2012 10:02 AM

Linlin: a lot of people use ;;;; instead of ; to end a cards, datalines, cards4 or datalines4 section as they are needed for cards4 and datalines4 and don't pose a problem for the other two.

Cards4 and Datalines4 are necessary when the input contains a semi-colon

data_null__ · Posted 02-13-2012 10:08 AM

I like to be VERY careful. :smileygrin:

Linlin · Posted 02-13-2012 10:19 AM

Thank you Art and Data _null_!

Astounding · Posted 02-13-2012 09:28 AM

JAR,

Here's an approach that is deceptively short. You have to understand some of the inner workings of the DATA step. In this case, you'll get a note about a many-to-many merge. It's harmless this time, but may not be in other DATA steps:

data final;

merge original (rename=(student=A) where=(score='A'))

original (rename=(student=B) where=(score='B'));

by teacher;

output;

A = ' ';

B = ' ';

run;

All of the replies so far are relying on your data containing just A and B. All could be extended to handle other values as well. But it's a harder problem to tackle if you don't know the values of SCORE ahead of time.

Good luck.

JAR · Posted 02-14-2012 08:44 AM

Your code is astounding! It worked perfectly.

There is some sort of elegance in this.

Thanks Mr. Atounding!

I wonder how would you have dealt with it, if the scores are not predetermined.

Jijil

Astounding · Posted 02-14-2012 09:24 AM

Jijil,

Thank you. If the scores are not predetermined, macro language gets involved. (I didn't check data_null_'s approach, but it might work for "normal" scores. It would run into trouble if the same data set had scores of both $ and # because the ID statement in PROC TRANSPOSE would try to create a variable named _ for both of those. A similar issue would arise if the data contained both A+ and A-.)

The details might be burdensome, but this is the approach I would take. Capture a list of distinct values for SCORE. (Be careful not to overcount if any of the SCORE values actually contains an embedded blank, such as A +.) Use two sets of loops in macro language to generate SCORE_01 - SCORE_nn as needed and to label them appropriately. For example, in the MERGE statement, one iteration through the loop would generate:

original (rename=(student=score_01) where=(score="&first_value_found"))

Use a second loop through the same values later, to generate labels, for example:

label score_01 = "Score was &first_value_found");

Even that might be a hair too simple if any of the original SCORE values contain double quotes, but that's the idea.

Good luck.

Ksharp · Posted 02-13-2012 11:13 PM

data test;
   input teacher:$10. grade :$1. name :$8.;
   cards;
Teacher1 A Tom 
Teacher1 A Harry 
Teacher1 B Ron 
Teacher2 A Neval 
;;;;
   run;
proc sql noprint;
 select distinct cats('test(where=(teacher="',teacher,'" and grade="',grade,'") rename=(name=',grade,'))') into : list separated by ' '
  from test;
quit;
%put &list;
data want(drop=grade);
 merge &list ;
 by teacher;
 output;
 call missing(of _all_);
run;

Ksharp

Proc Transpose - Repeated Values

Re: Proc Transpose - Repeated Values

Proc Transpose - Repeated Values

Re: Proc Transpose - Repeated Values

Re: Proc Transpose - Repeated Values

Re: Proc Transpose - Repeated Values

Re: Proc Transpose - Repeated Values

Re: Proc Transpose - Repeated Values

Re: Proc Transpose - Repeated Values

Proc Transpose - Repeated Values

Re: Proc Transpose - Repeated Values

Re: Proc Transpose - Repeated Values

Registration is open

SAS Training: Just a Click Away