BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
bibbnd
Fluorite | Level 6

I want to extract the name from a variable based on the highest value in the fields.

here is a sample of the data

   id   wht_rev  blk_rev  asn_rev iph_rev

   b1    265        114          34        7

   b2    65          218         5          22

   b4    78          101        12          4

 

so i want to create a variable called ethnic based on the highest value from each row of the data set. the result should look like

 id    ethnic

b1     wht

b2     blk

b4     blk

 

I am not sure where to begin at all     

 

 

1 ACCEPTED SOLUTION

Accepted Solutions
PeterClemmensen
Tourmaline | Level 20

Try this.

 

data have;
input id $ wht_rev blk_rev asn_rev iph_rev;
datalines;
b1 265 114 34 7
b2 65 218 5 22
b4 78 101 12 4
;

data want(keep = id ethnic);
   set have;
   array a wht_rev -- iph_rev;
   idx = whichn(max(of a[*]), of a[*]);
   ethnic = scan(vname(a[idx]), 1, '_');
run;

 

Result:

 

id  ethnic 
b1  wht 
b2  blk 
b4  blk 

 

View solution in original post

5 REPLIES 5
novinosrin
Tourmaline | Level 20

data have;
input   id $  wht_rev  blk_rev  asn_rev iph_rev;
cards;
   b1    265        114          34        7

   b2    65          218         5          22

   b4    78          101        12          4
   ;

data want;
 set have;
 length want $32;
 array t(*) wht_rev--iph_rev;
 max=max(of t(*)); /*keep the max value in dataset*/
 want=scan(vname(t(whichn(max,of t(*)))),1,'_');
run;

 Please feel free to add a KEEP or DROP statement to choose what variables you need to remain or drop 🙂

novinosrin
Tourmaline | Level 20

HI @bibbnd  Please be aware WHICHN is not really the greatest function if you have a very very wide dataset. That doesn't quite do the binary search as we would like. At least in my little experience with what I have tested so far, it's rather slow when we have to deal with very wide datasets.

 

Sometimes, cycling through the list is far cleaner, easier and surprisingly faster in my experience. So-


data have;
input   id $  wht_rev  blk_rev  asn_rev iph_rev;
cards;
   b1    265        114          34        7

   b2    65          218         5          22

   b4    78          101        12          4
   ;

data want;
 set have;
 array t(*) wht_rev--iph_rev;
 max=max(of t(*)); 
 length want $32;
 do _n_=1 to dim(t);
  if t(_n_)=max then do;
    want=scan(vname(t(_n_)),1,'_');
	leave;
  end;
 end;
run;
 

Agree, it's boring but it's convenient. 

 

 

PeterClemmensen
Tourmaline | Level 20

Try this.

 

data have;
input id $ wht_rev blk_rev asn_rev iph_rev;
datalines;
b1 265 114 34 7
b2 65 218 5 22
b4 78 101 12 4
;

data want(keep = id ethnic);
   set have;
   array a wht_rev -- iph_rev;
   idx = whichn(max(of a[*]), of a[*]);
   ethnic = scan(vname(a[idx]), 1, '_');
run;

 

Result:

 

id  ethnic 
b1  wht 
b2  blk 
b4  blk 

 

andreas_lds
Jade | Level 19

If you don't like data steps, you could use proc transpose + proc sort (twice):

proc transpose data=have out=transposed name=ethnic;
	by id;
	var _numeric_;
run;

proc sort data=transposed out=sorted;
	by id descending col1;
run;

proc sort data=sorted out=want(drop=col1) nodupkey;
	by id;
run;

 

EDIT: Processing data thrice is, of course, less efficient than then data-step-solution already provided 😉

novinosrin
Tourmaline | Level 20

@andreas_lds  My SQL colleagues/folks here sitting besides me would rather have your solution as opposed to the rest. Not kidding, honestly here it seems the focus is ONLY credit risk loan performance and appetite and not SAS efficiency for some reason.  

 

Rule: What is simple that produces results is business for us. So Kudos!!

sas-innovate-2024.png

Available on demand!

Missed SAS Innovate Las Vegas? Watch all the action for free! View the keynotes, general sessions and 22 breakouts on demand.

 

Register now!

What is Bayesian Analysis?

Learn the difference between classical and Bayesian statistical approaches and see a few PROC examples to perform Bayesian analysis in this video.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 5 replies
  • 1371 views
  • 1 like
  • 4 in conversation