BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
NKormanik
Barite | Level 11

Suppose we are faced with results of 100 races by 100 runners.  Consider a subset of the results in the following spreadsheet:

Order FinishedRace 1Race 2Race 3
1DED
2ADB
3E*A
4CBC
5BC*

A SAS procedure I'm trying to use shows an example with the data in a different form, as follows:

RunnerOrder_1Order_2Order 3
A2*3
B542
C454
D121
E31*

As can be seen, some runners miss certain races.

The above transformation was easy enough to do manually.  But 100 x 100 is another story.  Could one of you please help with the SAS code that will transform from the first format to the second?

Your help will be greatly appreciated.

By the way, the SAS example I am attempting to replicate using my data set is the following:

Example 74.2 Principal Components of Basketball Rankings

SAS/STAT(R) 12.1 User's Guide

Thanks much!

Nicholas Kormanik

1 ACCEPTED SOLUTION

Accepted Solutions
PGStats
Opal | Level 21

Use this :


data input;
input rank (race_1-race_3) ($) ;
cards;
1 D E D
2 A D B
3 E * A
4 C B C
5 B C *
;

data list;
set input;
array r race_:;
do race = 1 to dim(r);
     name = r{race};
     if anyalpha(name) then output;
     end;
drop race_:;
run;

proc sort data=list; by name race; run;

proc transpose data=list out=want(drop=_name_) prefix=Order_;
by name;
var rank;
id race;
run;

option missing=*;
proc print data=want noobs; var name order_1-order_3; run;
option missing=.;

PG

PG

View solution in original post

9 REPLIES 9
PGStats
Opal | Level 21

Use this :


data input;
input rank (race_1-race_3) ($) ;
cards;
1 D E D
2 A D B
3 E * A
4 C B C
5 B C *
;

data list;
set input;
array r race_:;
do race = 1 to dim(r);
     name = r{race};
     if anyalpha(name) then output;
     end;
drop race_:;
run;

proc sort data=list; by name race; run;

proc transpose data=list out=want(drop=_name_) prefix=Order_;
by name;
var rank;
id race;
run;

option missing=*;
proc print data=want noobs; var name order_1-order_3; run;
option missing=.;

PG

PG
NKormanik
Barite | Level 11

Outstanding, , masterfully done.

I'd never have gotten it on my own.

Haikuo
Onyx | Level 15

Indeed, your solution not only precise, but also intuitive. FWIW, here is a hash() approach (I know, I know, it is awful. Hey, Ksharp is online, hopefully he will come up a better Hash solution):

data input;

input rank (race_1-race_3) ($) ;

cards;

1 D E D

2 A D B

3 E * A

4 C B C

5 B C *

;

data _null_;

  call symputx('obs',nobs);

  stop;

  set input nobs=nobs;

run;

proc transpose data=input out=have1;

var _all_;

run;

data _null_;

  call symputx('var',nobs-1);

  stop;

  set have1 nobs=nobs;

run;

%put &var &obs;

data want;

  dcl hash hoh();

  dcl hiter hih('hoh');

  hoh.definekey('_NAME_');

  hoh.definedata('_name_', 'hh');

  hoh.definedone();

  dcl hash hh();

  length player col: rank $8.;

    do _n_=1 by 1 until (last);

           set have1 end=last;

            array col(&obs);

  array _rank(&var) $8.;

                array t(&obs) $8. _temporary_;

  if _n_ >1 and hoh.find() ne 0 then do;

    hh=_new_ hash(ordered: 'a');

    hh.definekey ('player');

    hh.definedata('rank', 'player');

    hh.definedone();

    hoh.replace();

  end;

                if _n_=1 then do i=1 to &obs.;

                  t(i)=left(col(i));

  end;

  else do i=1 to &obs.;

   rank=t(i);

   player=col(i);

   hh.replace();

  end;

  end;

  do until (last1);

      set input (keep=race_1) end=last1;

     rc=hih.first(); i=0;

     do rc=0 by 0 while (rc=0);

  i+1;

  if hh.find(key: race_1)=0 then _rank(i)=rank;

     else _rank(i)='*';

  rc=hih.next();

  end;

  output;

  end;

  stop;

  keep player _r:;

  run;

Haikuo

Ksharp
Super User

Dear Haikuo,

I think PG's method is good enough. Hash Table has its advantage when a large data .

Under this situation , PG is good, If You let me write, I also the code like PG.

See you later.

Ksharp

NKormanik
Barite | Level 11

  I'm glad for this addition by , as the actual data sets I'm dealing with are pretty large.

So if the second version of transform given here is better suited to larger data sets, I'll use it.

Now I need that mentioned example translated, using this present data set created....

Example 74.2 Principal Components of Basketball Rankings

SAS/STAT(R) 12.1 User's Guide

Thanks much!

Ksharp
Super User

OK. Here is .

I am afraid that I will be blamed by Dr .

I should not spoon feed .

If you are familiar with PCA, it should be easy to do it.

data input;
input rank (race1-race3) ($) ;
cards;
1 D E D
2 A D B
3 E . A
4 C B C
5 B C .
;
run;

data _null_;
 if _n_ eq 1 then do;
   length order1 - order3 8 k $ 8;
   declare hash ha(ordered:'y');
    ha.definekey('k');
     ha.definedata('k','order1','order2','order3');
     ha.definedone();
 end;
set input end=last;
array o{3} order1-order3 ;
array r{3} race1-race3 ;
do i=1 to dim(r);
 if not missing(r{i}) then do;
                             k=r{i};
                                    rc=ha.find();
                                    o{i}=rank;
                                    ha.replace();
                                    call missing(of order:);
                                   end;
end;
if last then ha.output(dataset:'bballm');
run;

proc means data=bballm noprint;
   output out=maxrank
      max=morder1 morder2 morder3  ;
run;

data bball;
   set bballm;
   if _n_=1 then set maxrank;
   array services[3] order1-order3;
   array maxranks[3] morder1-morder3 ;
   keep k order1-order3 ;
   do i=1 to 3 ;
      if services=. then services=maxranks+1;
   end;
run;

ods graphics on;

proc prinqual data=bball out=tbball scores n=1 tstandard=z
   plots=transformations;
   transform untie(order1-order3);
   id k ;
run;

* Perform the Final Principal Component Analysis;
proc factor nfactors=1 plots=scree;
   ods select factorpattern screeplot;
   var Torder1-Torder3 ;
run;

proc sort;
   by Prin1;
run;

* Display Scores on the First Principal Component;
proc print;
   var k Prin1;
run;

Ksharp

NKormanik
Barite | Level 11

  I'm sure glad you translated that mini-monster.  The results are quite interesting:

RunnerPrincipal Component 1
D-2.69500
E-0.19243
A0.75650
B1.06547
C1.06547

Assuming correct translation, and correct methodology, this PCA approach solves a tie issue between D and E (that resulted in a simpler form of solution script).

Also, if the PC loading values have meaning (good question for ), I suppose we could say the tie between D and E wasn't even close.  Whereas the performance of B and C is very close.  Too, some clustering is possible, eye-balling, or using one of SAS clustering procedures.

Thanks much for your help here!

I still need to convert your script to accommodate my 100x100 data sets, so the work is far from over.  I may have to come back and ask for more help in doing that.

Ksharp
Super User

Ha. HaiKuo,

Here is :

data input;
input rank (race1-race3) ($) ;
cards;
1 D E D
2 A D B
3 E . A
4 C B C
5 B C .
;
run;
options missing='*';
data _null_;
 if _n_ eq 1 then do;
   length order1 - order3 8 k $ 8;
   declare hash ha(ordered:'y');
    ha.definekey('k');
     ha.definedata('k','order1','order2','order3');
     ha.definedone();
 end;
set input end=last;
array o{3} order1-order3 ;
array r{3} race1-race3 ;
do i=1 to dim(r);
 if not missing(r{i}) then do;
                             k=r{i};
                                    rc=ha.find();
                                    o{i}=rank;
                                    ha.replace();
                                    call missing(of order:);
                                   end;
end;
if last then ha.output(dataset:'want');
run;

Ksharp

Haikuo
Onyx | Level 15

Bravo! This is ingenious! Very Happy to learn! Thank you!

Haikuo

sas-innovate-2024.png

Join us for SAS Innovate April 16-19 at the Aria in Las Vegas. Bring the team and save big with our group pricing for a limited time only.

Pre-conference courses and tutorials are filling up fast and are always a sellout. Register today to reserve your seat.

 

Register now!

What is Bayesian Analysis?

Learn the difference between classical and Bayesian statistical approaches and see a few PROC examples to perform Bayesian analysis in this video.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 9 replies
  • 1212 views
  • 8 likes
  • 4 in conversation