Help using Base SAS procedures

Transform data format to accommodate Proc expectations, Take #2

Accepted Solution Solved
Reply
Regular Contributor
Posts: 223
Accepted Solution

Transform data format to accommodate Proc expectations, Take #2

I've tried repeatedly to follow what you provided in the first go-round.  But with my actual data, it just didn't work.

.

So, I'll give the actual data here, and beg for more help.  Suppose we are looking at 'marathon' results.  Assume 5 'races' by hundreds of 'runners'.  We only examine the top 20 finalists (but perhaps the top 30, or top 50).  Here is the order they come in for the 5 races....

.

OrderRace_1Race_2Race_3Race_4Race_5
122604_2280423401_2340323401_2340323401_2340323401_23403
222402_2260420506_2300322604a_2300522702_2300320506_21402
322601_22604a22604a_2300520506_2300320506_2140221802_23401
420206_2150321402_2340120506_2140221702_2330621402_21805
521503a_2200221802_2340121702_2330621402_2270620801_23003
622402_2300321702_2330622702_2300321801_22603a22805_23002
720706_2160521402_2340321005_2130120505_2140221402_21806
820901_2280320801_22604a20801_2300421402_2180522702_23003
921402_2280420506_2140220801_22604a21402_2270521402_22706
1021401_2310123002_2340320505_2140220506_2300321702_23306
1120301_2050121806_2300421805_2300320801_2300421802_23405
1220305_2090120801_2300421801_22603a22604a_2300521801_22603a
1320301_2300522301_2300322604a_2300222604a_2300222001_23003
1421502_2180621005_2130121402_2270620301_2180120505_21402
1520802_2100521103_2180122703_2300321402_2180621402_21703
1621602_2280422703_2300321402_2340121402_2270321802_23403
1721004_2140123002_2340120301_2180122703_2300321402_22703
1820203_2240120505_2140221402_2270522706_2300321402_22705
1921605_2320221402_2270521402_2340321401_2150120801_21402
2021502_22602a21402_2340521806_2300321805_2300321402_23401

.

I need to transform the above data format to show the Runner in the first column, followed by the Order each runner comes in for the particular race....

.

RunnerOrder_1Order_2Order_3Order_4Order_5
20506_21402***32
20506_23003*2310*
22402_226042****
22601_22604a3****
22604_228041****
22604a_23005*3212*
22702_23003**628
23401_23403*1111

.

Note that 23401_23403 placed first in the last four races, but did not even place in the top 20 in the first race.

.
The reason for doing the above transformation is so that I can run Proc Prinqual -- roughly the same example as:
.

Example 74.2 Principal Components of Basketball Rankings

SAS/STAT(R) 12.1 User's Guide

.

I'll greatly appreciate your help.

.

Thanks.

.

Nicholas Kormanik

.


Accepted Solutions
Solution
‎05-15-2013 12:46 AM
Super User
Posts: 10,041

Re: Transform data format to accommodate Proc expectations, Take #2

Posted in reply to NicholasKormanik

OK.

data input;
infile cards expandtabs;
input rank (race1-race5) (: $40.) ;
cards;
1     22604_22804     23401_23403     23401_23403     23401_23403     23401_23403
2     22402_22604     20506_23003     22604a_23005     22702_23003     20506_21402
3     22601_22604a     22604a_23005     20506_23003     20506_21402     21802_23401
4     20206_21503     21402_23401     20506_21402     21702_23306     21402_21805
5     21503a_22002     21802_23401     21702_23306     21402_22706     20801_23003
6     22402_23003     21702_23306     22702_23003     21801_22603a     22805_23002
7     20706_21605     21402_23403     21005_21301     20505_21402     21402_21806
8     20901_22803     20801_22604a     20801_23004     21402_21805     22702_23003
9     21402_22804     20506_21402     20801_22604a     21402_22705     21402_22706
10     21401_23101     23002_23403     20505_21402     20506_23003     21702_23306
11     20301_20501     21806_23004     21805_23003     20801_23004     21802_23405
12     20305_20901     20801_23004     21801_22603a     22604a_23005     21801_22603a
13     20301_23005     22301_23003     22604a_23002     22604a_23002     22001_23003
14     21502_21806     21005_21301     21402_22706     20301_21801     20505_21402
15     20802_21005     21103_21801     22703_23003     21402_21806     21402_21703
16     21602_22804     22703_23003     21402_23401     21402_22703     21802_23403
17     21004_21401     23002_23401     20301_21801     22703_23003     21402_22703
18     20203_22401     20505_21402     21402_22705     22706_23003     21402_22705
19     21605_23202     21402_22705     21402_23403     21401_21501     20801_21402
20     21502_22602a     21402_23405     21806_23003     21805_23003     21402_23401
;
run;


data _null_;
 if _n_ eq 1 then do;
   length order1 - order5 8 k $ 40;
   declare hash ha(ordered:'y');
    ha.definekey('k');
     ha.definedata('k','order1','order2','order3','order4','order5');
     ha.definedone();
 end;
set input end=last;
array o{*} order1-order5 ;
array r{*} race1-race5 ;
do i=1 to dim(r);
 if not missing(r{i}) then do;
                             k=r{i};
                                    rc=ha.find();
                                    o{i}=rank;
                                    ha.replace();
                                    call missing(of order:);
                                   end;
end;
if last then ha.output(dataset:'bballm');
run;

proc means data=bballm noprint;
   output out=maxrank
      max=morder1 morder2 morder3 morder4 morder5 ;
run;

data bball;
   set bballm;
   if _n_=1 then set maxrank;
   array services
  • order1-order5;    array maxranks
  • morder1-morder5 ;    keep k order1-order5 ;    do i=1 to 5 ;       if services=. then services=maxranks+1;    end; run; ods graphics on; proc prinqual data=bball out=tbball scores n=1 tstandard=z    plots=transformations;    transform untie(order1-order5);    id k ; run; * Perform the Final Principal Component Analysis; proc factor nfactors=1 plots=scree;    ods select factorpattern screeplot;    var Torder1-Torder5 ; run; proc sort;    by Prin1; run; * Display Scores on the First Principal Component; proc print;    var k Prin1; run;
  • Ksharp

    View solution in original post


    All Replies
    Super User
    Posts: 19,837

    Re: Transform data format to accommodate Proc expectations, Take #2

    Posted in reply to NicholasKormanik

    Here's the first step:

    data step1;

    set have;

    array races (5) race1-race5;

    do i= 1 to dim(races);

         race=i;

         Runner=race(i);

         output;

    end;

    Keep order race runner;

    run;

    You can use proc transpose for the second step. If you can't get it to work, post your attempt including code.

    Regular Contributor
    Posts: 223

    Re: Transform data format to accommodate Proc expectations, Take #2

    , I appreciate you were trying to get me to learn.  Thanks.

    It would be great, though, just for completeness for anyone else coming across this thread, if you could put the rest of the code.  The first part seemed beautifully written.

    Super User
    Posts: 10,041

    Re: Transform data format to accommodate Proc expectations, Take #2

    Posted in reply to NicholasKormanik

    Oh. That would not happen, why race3 4 5  have two champions ?

    Regular Contributor
    Posts: 223

    Re: Transform data format to accommodate Proc expectations, Take #2

    Oops.  I made a mistake there.  Was trying to fill it in manually.  Not an easy transformation.  Well..., that's exactly the reason for wanting the SAS code to do this, so that the job gets done properly.

    I've gone back and tried to make corrections.

    Thanks, , for noticing and pointing out the error.

    Solution
    ‎05-15-2013 12:46 AM
    Super User
    Posts: 10,041

    Re: Transform data format to accommodate Proc expectations, Take #2

    Posted in reply to NicholasKormanik

    OK.

    data input;
    infile cards expandtabs;
    input rank (race1-race5) (: $40.) ;
    cards;
    1     22604_22804     23401_23403     23401_23403     23401_23403     23401_23403
    2     22402_22604     20506_23003     22604a_23005     22702_23003     20506_21402
    3     22601_22604a     22604a_23005     20506_23003     20506_21402     21802_23401
    4     20206_21503     21402_23401     20506_21402     21702_23306     21402_21805
    5     21503a_22002     21802_23401     21702_23306     21402_22706     20801_23003
    6     22402_23003     21702_23306     22702_23003     21801_22603a     22805_23002
    7     20706_21605     21402_23403     21005_21301     20505_21402     21402_21806
    8     20901_22803     20801_22604a     20801_23004     21402_21805     22702_23003
    9     21402_22804     20506_21402     20801_22604a     21402_22705     21402_22706
    10     21401_23101     23002_23403     20505_21402     20506_23003     21702_23306
    11     20301_20501     21806_23004     21805_23003     20801_23004     21802_23405
    12     20305_20901     20801_23004     21801_22603a     22604a_23005     21801_22603a
    13     20301_23005     22301_23003     22604a_23002     22604a_23002     22001_23003
    14     21502_21806     21005_21301     21402_22706     20301_21801     20505_21402
    15     20802_21005     21103_21801     22703_23003     21402_21806     21402_21703
    16     21602_22804     22703_23003     21402_23401     21402_22703     21802_23403
    17     21004_21401     23002_23401     20301_21801     22703_23003     21402_22703
    18     20203_22401     20505_21402     21402_22705     22706_23003     21402_22705
    19     21605_23202     21402_22705     21402_23403     21401_21501     20801_21402
    20     21502_22602a     21402_23405     21806_23003     21805_23003     21402_23401
    ;
    run;
    
    
    data _null_;
     if _n_ eq 1 then do;
       length order1 - order5 8 k $ 40;
       declare hash ha(ordered:'y');
        ha.definekey('k');
         ha.definedata('k','order1','order2','order3','order4','order5');
         ha.definedone();
     end;
    set input end=last;
    array o{*} order1-order5 ;
    array r{*} race1-race5 ;
    do i=1 to dim(r);
     if not missing(r{i}) then do;
                                 k=r{i};
                                        rc=ha.find();
                                        o{i}=rank;
                                        ha.replace();
                                        call missing(of order:);
                                       end;
    end;
    if last then ha.output(dataset:'bballm');
    run;
    
    proc means data=bballm noprint;
       output out=maxrank
          max=morder1 morder2 morder3 morder4 morder5 ;
    run;
    
    data bball;
       set bballm;
       if _n_=1 then set maxrank;
       array services
  • order1-order5;    array maxranks
  • morder1-morder5 ;    keep k order1-order5 ;    do i=1 to 5 ;       if services=. then services=maxranks+1;    end; run; ods graphics on; proc prinqual data=bball out=tbball scores n=1 tstandard=z    plots=transformations;    transform untie(order1-order5);    id k ; run; * Perform the Final Principal Component Analysis; proc factor nfactors=1 plots=scree;    ods select factorpattern screeplot;    var Torder1-Torder5 ; run; proc sort;    by Prin1; run; * Display Scores on the First Principal Component; proc print;    var k Prin1; run;
  • Ksharp

    Regular Contributor
    Posts: 223

    Re: Transform data format to accommodate Proc expectations, Take #2

    Beautiful, beautiful, beautiful!!

    Thanks soooo much.

    Nicholas

    Regular Contributor
    Posts: 223

    Re: Transform data format to accommodate Proc expectations, Take #2

    , by the way, might as well ask now..., would it be okay to include the top 50 rows using your code?  Or is it just limited to 20, as requested in the original post?

    I'll try.  But just wanted to ask you now since it's before you presently.

    Thanks!

    Super User
    Posts: 10,041

    Re: Transform data format to accommodate Proc expectations, Take #2

    Posted in reply to NicholasKormanik

    OK.

    At the bottom of my code , You can print top 50 observations.

    * Display Scores on the First Principal Component;

    proc print obs=50;

       var k Prin1;

    run;

    Regular Contributor
    Posts: 223

    Re: Transform data format to accommodate Proc expectations, Take #2

    , I mean in the cards section at top..., can I include 50 lines?

    Super User
    Posts: 10,041

    Re: Transform data format to accommodate Proc expectations, Take #2

    Posted in reply to NicholasKormanik

    Yes. You can add any rows as you want.

    🔒 This topic is solved and locked.

    Need further help from the community? Please ask a new question.

    Discussion stats
    • 10 replies
    • 249 views
    • 6 likes
    • 3 in conversation