## Transform data format to accommodate Proc expectations

Solved
Regular Contributor
Posts: 238

# Transform data format to accommodate Proc expectations

Suppose we are faced with results of 100 races by 100 runners.  Consider a subset of the results in the following spreadsheet:

 Order Finished Race 1 Race 2 Race 3 1 D E D 2 A D B 3 E * A 4 C B C 5 B C *

A SAS procedure I'm trying to use shows an example with the data in a different form, as follows:

 Runner Order_1 Order_2 Order 3 A 2 * 3 B 5 4 2 C 4 5 4 D 1 2 1 E 3 1 *

As can be seen, some runners miss certain races.

The above transformation was easy enough to do manually.  But 100 x 100 is another story.  Could one of you please help with the SAS code that will transform from the first format to the second?

Your help will be greatly appreciated.

By the way, the SAS example I am attempting to replicate using my data set is the following:

Example 74.2 Principal Components of Basketball Rankings

SAS/STAT(R) 12.1 User's Guide

Thanks much!

Nicholas Kormanik

Accepted Solutions
Solution
‎04-30-2013 09:58 PM
Posts: 5,539

## Re: Transform data format to accommodate Proc expectations

Use this :

data input;
input rank (race_1-race_3) (\$) ;
cards;
1 D E D
2 A D B
3 E * A
4 C B C
5 B C *
;

data list;
set input;
array r race_:;
do race = 1 to dim(r);
name = r{race};
if anyalpha(name) then output;
end;
drop race_:;
run;

proc sort data=list; by name race; run;

proc transpose data=list out=want(drop=_name_) prefix=Order_;
by name;
var rank;
id race;
run;

option missing=*;
proc print data=want noobs; var name order_1-order_3; run;
option missing=.;

PG

PG

All Replies
Solution
‎04-30-2013 09:58 PM
Posts: 5,539

## Re: Transform data format to accommodate Proc expectations

Use this :

data input;
input rank (race_1-race_3) (\$) ;
cards;
1 D E D
2 A D B
3 E * A
4 C B C
5 B C *
;

data list;
set input;
array r race_:;
do race = 1 to dim(r);
name = r{race};
if anyalpha(name) then output;
end;
drop race_:;
run;

proc sort data=list; by name race; run;

proc transpose data=list out=want(drop=_name_) prefix=Order_;
by name;
var rank;
id race;
run;

option missing=*;
proc print data=want noobs; var name order_1-order_3; run;
option missing=.;

PG

PG
Regular Contributor
Posts: 238

## Re: Transform data format to accommodate Proc expectations

Outstanding, , masterfully done.

I'd never have gotten it on my own.

Posts: 3,167

## Re: Transform data format to accommodate Proc expectations

Indeed, your solution not only precise, but also intuitive. FWIW, here is a hash() approach (I know, I know, it is awful. Hey, Ksharp is online, hopefully he will come up a better Hash solution):

data input;

input rank (race_1-race_3) (\$) ;

cards;

1 D E D

2 A D B

3 E * A

4 C B C

5 B C *

;

data _null_;

call symputx('obs',nobs);

stop;

set input nobs=nobs;

run;

proc transpose data=input out=have1;

var _all_;

run;

data _null_;

call symputx('var',nobs-1);

stop;

set have1 nobs=nobs;

run;

%put &var &obs;

data want;

dcl hash hoh();

dcl hiter hih('hoh');

hoh.definekey('_NAME_');

hoh.definedata('_name_', 'hh');

hoh.definedone();

dcl hash hh();

length player col: rank \$8.;

do _n_=1 by 1 until (last);

set have1 end=last;

array col(&obs);

array _rank(&var) \$8.;

array t(&obs) \$8. _temporary_;

if _n_ >1 and hoh.find() ne 0 then do;

hh=_new_ hash(ordered: 'a');

hh.definekey ('player');

hh.definedata('rank', 'player');

hh.definedone();

hoh.replace();

end;

if _n_=1 then do i=1 to &obs.;

t(i)=left(col(i));

end;

else do i=1 to &obs.;

rank=t(i);

player=col(i);

hh.replace();

end;

end;

do until (last1);

set input (keep=race_1) end=last1;

rc=hih.first(); i=0;

do rc=0 by 0 while (rc=0);

i+1;

if hh.find(key: race_1)=0 then _rank(i)=rank;

else _rank(i)='*';

rc=hih.next();

end;

output;

end;

stop;

keep player _r:;

run;

Haikuo

Super User
Posts: 10,784

## Re: Transform data format to accommodate Proc expectations

Dear Haikuo,

I think PG's method is good enough. Hash Table has its advantage when a large data .

Under this situation , PG is good, If You let me write, I also the code like PG.

See you later.

Ksharp

Regular Contributor
Posts: 238

## Re: Transform data format to accommodate Proc expectations

I'm glad for this addition by , as the actual data sets I'm dealing with are pretty large.

So if the second version of transform given here is better suited to larger data sets, I'll use it.

Now I need that mentioned example translated, using this present data set created....

Example 74.2 Principal Components of Basketball Rankings

SAS/STAT(R) 12.1 User's Guide

Thanks much!

Super User
Posts: 10,784

## Re: Transform data format to accommodate Proc expectations

OK. Here is .

I am afraid that I will be blamed by Dr .

I should not spoon feed .

If you are familiar with PCA, it should be easy to do it.

```data input;
input rank (race1-race3) (\$) ;
cards;
1 D E D
2 A D B
3 E . A
4 C B C
5 B C .
;
run;

data _null_;
if _n_ eq 1 then do;
length order1 - order3 8 k \$ 8;
declare hash ha(ordered:'y');
ha.definekey('k');
ha.definedata('k','order1','order2','order3');
ha.definedone();
end;
set input end=last;
array o{3} order1-order3 ;
array r{3} race1-race3 ;
do i=1 to dim(r);
if not missing(r{i}) then do;
k=r{i};
rc=ha.find();
o{i}=rank;
ha.replace();
call missing(of order:);
end;
end;
if last then ha.output(dataset:'bballm');
run;

proc means data=bballm noprint;
output out=maxrank
max=morder1 morder2 morder3  ;
run;

data bball;
set bballm;
if _n_=1 then set maxrank;
array services[3] order1-order3;
array maxranks[3] morder1-morder3 ;
keep k order1-order3 ;
do i=1 to 3 ;
if services=. then services=maxranks+1;
end;
run;

ods graphics on;

proc prinqual data=bball out=tbball scores n=1 tstandard=z
plots=transformations;
transform untie(order1-order3);
id k ;
run;

* Perform the Final Principal Component Analysis;
proc factor nfactors=1 plots=scree;
ods select factorpattern screeplot;
var Torder1-Torder3 ;
run;

proc sort;
by Prin1;
run;

* Display Scores on the First Principal Component;
proc print;
var k Prin1;
run;

```

Ksharp

Regular Contributor
Posts: 238

## Re: Transform data format to accommodate Proc expectations

I'm sure glad you translated that mini-monster.  The results are quite interesting:

 Runner Principal Component 1 D -2.69500 E -0.19243 A 0.75650 B 1.06547 C 1.06547

Assuming correct translation, and correct methodology, this PCA approach solves a tie issue between D and E (that resulted in a simpler form of solution script).

Also, if the PC loading values have meaning (good question for ), I suppose we could say the tie between D and E wasn't even close.  Whereas the performance of B and C is very close.  Too, some clustering is possible, eye-balling, or using one of SAS clustering procedures.

Thanks much for your help here!

I still need to convert your script to accommodate my 100x100 data sets, so the work is far from over.  I may have to come back and ask for more help in doing that.

Super User
Posts: 10,784

## Re: Transform data format to accommodate Proc expectations

Ha. HaiKuo,

Here is :

```data input;
input rank (race1-race3) (\$) ;
cards;
1 D E D
2 A D B
3 E . A
4 C B C
5 B C .
;
run;
options missing='*';
data _null_;
if _n_ eq 1 then do;
length order1 - order3 8 k \$ 8;
declare hash ha(ordered:'y');
ha.definekey('k');
ha.definedata('k','order1','order2','order3');
ha.definedone();
end;
set input end=last;
array o{3} order1-order3 ;
array r{3} race1-race3 ;
do i=1 to dim(r);
if not missing(r{i}) then do;
k=r{i};
rc=ha.find();
o{i}=rank;
ha.replace();
call missing(of order:);
end;
end;
if last then ha.output(dataset:'want');
run;

```

Ksharp

Posts: 3,167

## Re: Transform data format to accommodate Proc expectations

Bravo! This is ingenious! Very Happy to learn! Thank you!

Haikuo

🔒 This topic is solved and locked.