## Transform data format to accommodate Proc expectations, Take #2

Solved
Regular Contributor
Posts: 238

# Transform data format to accommodate Proc expectations, Take #2

I've tried repeatedly to follow what you provided in the first go-round.  But with my actual data, it just didn't work.

.

So, I'll give the actual data here, and beg for more help.  Suppose we are looking at 'marathon' results.  Assume 5 'races' by hundreds of 'runners'.  We only examine the top 20 finalists (but perhaps the top 30, or top 50).  Here is the order they come in for the 5 races....

.

 Order Race_1 Race_2 Race_3 Race_4 Race_5 1 22604_22804 23401_23403 23401_23403 23401_23403 23401_23403 2 22402_22604 20506_23003 22604a_23005 22702_23003 20506_21402 3 22601_22604a 22604a_23005 20506_23003 20506_21402 21802_23401 4 20206_21503 21402_23401 20506_21402 21702_23306 21402_21805 5 21503a_22002 21802_23401 21702_23306 21402_22706 20801_23003 6 22402_23003 21702_23306 22702_23003 21801_22603a 22805_23002 7 20706_21605 21402_23403 21005_21301 20505_21402 21402_21806 8 20901_22803 20801_22604a 20801_23004 21402_21805 22702_23003 9 21402_22804 20506_21402 20801_22604a 21402_22705 21402_22706 10 21401_23101 23002_23403 20505_21402 20506_23003 21702_23306 11 20301_20501 21806_23004 21805_23003 20801_23004 21802_23405 12 20305_20901 20801_23004 21801_22603a 22604a_23005 21801_22603a 13 20301_23005 22301_23003 22604a_23002 22604a_23002 22001_23003 14 21502_21806 21005_21301 21402_22706 20301_21801 20505_21402 15 20802_21005 21103_21801 22703_23003 21402_21806 21402_21703 16 21602_22804 22703_23003 21402_23401 21402_22703 21802_23403 17 21004_21401 23002_23401 20301_21801 22703_23003 21402_22703 18 20203_22401 20505_21402 21402_22705 22706_23003 21402_22705 19 21605_23202 21402_22705 21402_23403 21401_21501 20801_21402 20 21502_22602a 21402_23405 21806_23003 21805_23003 21402_23401

.

I need to transform the above data format to show the Runner in the first column, followed by the Order each runner comes in for the particular race....

.

 Runner Order_1 Order_2 Order_3 Order_4 Order_5 20506_21402 * * * 3 2 20506_23003 * 2 3 10 * 22402_22604 2 * * * * 22601_22604a 3 * * * * 22604_22804 1 * * * * 22604a_23005 * 3 2 12 * 22702_23003 * * 6 2 8 23401_23403 * 1 1 1 1

.

Note that 23401_23403 placed first in the last four races, but did not even place in the top 20 in the first race.

.
The reason for doing the above transformation is so that I can run Proc Prinqual -- roughly the same example as:
.

Example 74.2 Principal Components of Basketball Rankings

SAS/STAT(R) 12.1 User's Guide

.

.

Thanks.

.

Nicholas Kormanik

.

Accepted Solutions
Solution
‎05-15-2013 12:46 AM
Super User
Posts: 10,766

## Re: Transform data format to accommodate Proc expectations, Take #2

OK.

```data input;
infile cards expandtabs;
input rank (race1-race5) (: \$40.) ;
cards;
1     22604_22804     23401_23403     23401_23403     23401_23403     23401_23403
2     22402_22604     20506_23003     22604a_23005     22702_23003     20506_21402
3     22601_22604a     22604a_23005     20506_23003     20506_21402     21802_23401
4     20206_21503     21402_23401     20506_21402     21702_23306     21402_21805
5     21503a_22002     21802_23401     21702_23306     21402_22706     20801_23003
6     22402_23003     21702_23306     22702_23003     21801_22603a     22805_23002
7     20706_21605     21402_23403     21005_21301     20505_21402     21402_21806
8     20901_22803     20801_22604a     20801_23004     21402_21805     22702_23003
9     21402_22804     20506_21402     20801_22604a     21402_22705     21402_22706
10     21401_23101     23002_23403     20505_21402     20506_23003     21702_23306
11     20301_20501     21806_23004     21805_23003     20801_23004     21802_23405
12     20305_20901     20801_23004     21801_22603a     22604a_23005     21801_22603a
13     20301_23005     22301_23003     22604a_23002     22604a_23002     22001_23003
14     21502_21806     21005_21301     21402_22706     20301_21801     20505_21402
15     20802_21005     21103_21801     22703_23003     21402_21806     21402_21703
16     21602_22804     22703_23003     21402_23401     21402_22703     21802_23403
17     21004_21401     23002_23401     20301_21801     22703_23003     21402_22703
18     20203_22401     20505_21402     21402_22705     22706_23003     21402_22705
19     21605_23202     21402_22705     21402_23403     21401_21501     20801_21402
20     21502_22602a     21402_23405     21806_23003     21805_23003     21402_23401
;
run;

data _null_;
if _n_ eq 1 then do;
length order1 - order5 8 k \$ 40;
declare hash ha(ordered:'y');
ha.definekey('k');
ha.definedata('k','order1','order2','order3','order4','order5');
ha.definedone();
end;
set input end=last;
array o{*} order1-order5 ;
array r{*} race1-race5 ;
do i=1 to dim(r);
if not missing(r{i}) then do;
k=r{i};
rc=ha.find();
o{i}=rank;
ha.replace();
call missing(of order:);
end;
end;
if last then ha.output(dataset:'bballm');
run;

proc means data=bballm noprint;
output out=maxrank
max=morder1 morder2 morder3 morder4 morder5 ;
run;

data bball;
set bballm;
if _n_=1 then set maxrank;
array services order1-order5;
array maxranks morder1-morder5 ;
keep k order1-order5 ;
do i=1 to 5 ;
if services=. then services=maxranks+1;
end;
run;

ods graphics on;

proc prinqual data=bball out=tbball scores n=1 tstandard=z
plots=transformations;
transform untie(order1-order5);
id k ;
run;

* Perform the Final Principal Component Analysis;
proc factor nfactors=1 plots=scree;
ods select factorpattern screeplot;
var Torder1-Torder5 ;
run;

proc sort;
by Prin1;
run;

* Display Scores on the First Principal Component;
proc print;
var k Prin1;
run;

```

Ksharp

All Replies
Super User
Posts: 23,683

## Re: Transform data format to accommodate Proc expectations, Take #2

Here's the first step:

data step1;

set have;

array races (5) race1-race5;

do i= 1 to dim(races);

race=i;

Runner=race(i);

output;

end;

Keep order race runner;

run;

You can use proc transpose for the second step. If you can't get it to work, post your attempt including code.

Regular Contributor
Posts: 238

## Re: Transform data format to accommodate Proc expectations, Take #2

, I appreciate you were trying to get me to learn.  Thanks.

It would be great, though, just for completeness for anyone else coming across this thread, if you could put the rest of the code.  The first part seemed beautifully written.

Super User
Posts: 10,766

## Re: Transform data format to accommodate Proc expectations, Take #2

Oh. That would not happen, why race3 4 5  have two champions ?

Regular Contributor
Posts: 238

## Re: Transform data format to accommodate Proc expectations, Take #2

Oops.  I made a mistake there.  Was trying to fill it in manually.  Not an easy transformation.  Well..., that's exactly the reason for wanting the SAS code to do this, so that the job gets done properly.

I've gone back and tried to make corrections.

Thanks, , for noticing and pointing out the error.

Solution
‎05-15-2013 12:46 AM
Super User
Posts: 10,766

## Re: Transform data format to accommodate Proc expectations, Take #2

OK.

```data input;
infile cards expandtabs;
input rank (race1-race5) (: \$40.) ;
cards;
1     22604_22804     23401_23403     23401_23403     23401_23403     23401_23403
2     22402_22604     20506_23003     22604a_23005     22702_23003     20506_21402
3     22601_22604a     22604a_23005     20506_23003     20506_21402     21802_23401
4     20206_21503     21402_23401     20506_21402     21702_23306     21402_21805
5     21503a_22002     21802_23401     21702_23306     21402_22706     20801_23003
6     22402_23003     21702_23306     22702_23003     21801_22603a     22805_23002
7     20706_21605     21402_23403     21005_21301     20505_21402     21402_21806
8     20901_22803     20801_22604a     20801_23004     21402_21805     22702_23003
9     21402_22804     20506_21402     20801_22604a     21402_22705     21402_22706
10     21401_23101     23002_23403     20505_21402     20506_23003     21702_23306
11     20301_20501     21806_23004     21805_23003     20801_23004     21802_23405
12     20305_20901     20801_23004     21801_22603a     22604a_23005     21801_22603a
13     20301_23005     22301_23003     22604a_23002     22604a_23002     22001_23003
14     21502_21806     21005_21301     21402_22706     20301_21801     20505_21402
15     20802_21005     21103_21801     22703_23003     21402_21806     21402_21703
16     21602_22804     22703_23003     21402_23401     21402_22703     21802_23403
17     21004_21401     23002_23401     20301_21801     22703_23003     21402_22703
18     20203_22401     20505_21402     21402_22705     22706_23003     21402_22705
19     21605_23202     21402_22705     21402_23403     21401_21501     20801_21402
20     21502_22602a     21402_23405     21806_23003     21805_23003     21402_23401
;
run;

data _null_;
if _n_ eq 1 then do;
length order1 - order5 8 k \$ 40;
declare hash ha(ordered:'y');
ha.definekey('k');
ha.definedata('k','order1','order2','order3','order4','order5');
ha.definedone();
end;
set input end=last;
array o{*} order1-order5 ;
array r{*} race1-race5 ;
do i=1 to dim(r);
if not missing(r{i}) then do;
k=r{i};
rc=ha.find();
o{i}=rank;
ha.replace();
call missing(of order:);
end;
end;
if last then ha.output(dataset:'bballm');
run;

proc means data=bballm noprint;
output out=maxrank
max=morder1 morder2 morder3 morder4 morder5 ;
run;

data bball;
set bballm;
if _n_=1 then set maxrank;
array services order1-order5;
array maxranks morder1-morder5 ;
keep k order1-order5 ;
do i=1 to 5 ;
if services=. then services=maxranks+1;
end;
run;

ods graphics on;

proc prinqual data=bball out=tbball scores n=1 tstandard=z
plots=transformations;
transform untie(order1-order5);
id k ;
run;

* Perform the Final Principal Component Analysis;
proc factor nfactors=1 plots=scree;
ods select factorpattern screeplot;
var Torder1-Torder5 ;
run;

proc sort;
by Prin1;
run;

* Display Scores on the First Principal Component;
proc print;
var k Prin1;
run;

```

Ksharp

Regular Contributor
Posts: 238

## Re: Transform data format to accommodate Proc expectations, Take #2

Beautiful, beautiful, beautiful!!

Thanks soooo much.

Nicholas

Regular Contributor
Posts: 238

## Re: Transform data format to accommodate Proc expectations, Take #2

, by the way, might as well ask now..., would it be okay to include the top 50 rows using your code?  Or is it just limited to 20, as requested in the original post?

I'll try.  But just wanted to ask you now since it's before you presently.

Thanks!

Super User
Posts: 10,766

## Re: Transform data format to accommodate Proc expectations, Take #2

OK.

At the bottom of my code , You can print top 50 observations.

* Display Scores on the First Principal Component;

proc print obs=50;

var k Prin1;

run;

Regular Contributor
Posts: 238

## Re: Transform data format to accommodate Proc expectations, Take #2

, I mean in the cards section at top..., can I include 50 lines?

Super User
Posts: 10,766