DATA Step, Macro, Functions and more

how to get the name of the variable with the max value?

Accepted Solution Solved
Reply
New Contributor
Posts: 3
Accepted Solution

how to get the name of the variable with the max value?

hi,

in my dataset, i have incm1, incm2, incm3 and incm4 for each customer. some have only 1 income, some have 2 or more. i want to create a new variable with the max value of the 4 incomes and at the same time, i need to create a flag indicating which income is the max.

for example, the max income of a record is the incm1, the new income value is easy, it's just new_income=max(incm1,incm2,incm3,incm4).

i know using if statement and matching the new income value and the 4 income values can show me the source of the income.

if new_income=incm1 then flag='incm1'; else if new_income=incm2 then flag='incm2'; etc..

but are there any other ways to create the new income flag which shows 'incm1' for this record?

thanks


Accepted Solutions
Solution
‎05-12-2013 01:34 PM
Respected Advisor
Posts: 4,655

Re: how to get the name of the variable with the max value?

You could use an array and function vname as follows :

data have;
input incm1 incm2 incm3 incm4;
datalines;
1 2 3 4
;

data want;
set have;
array inc{*} incm:;
incMax = max(of inc{*});
do _n_ = 1 to dim(inc);
     if inc{_n_} = incMax then incId = vname(inc{_n_});
     end;
run;

proc print; run;

PG

PG

View solution in original post


All Replies
Solution
‎05-12-2013 01:34 PM
Respected Advisor
Posts: 4,655

Re: how to get the name of the variable with the max value?

You could use an array and function vname as follows :

data have;
input incm1 incm2 incm3 incm4;
datalines;
1 2 3 4
;

data want;
set have;
array inc{*} incm:;
incMax = max(of inc{*});
do _n_ = 1 to dim(inc);
     if inc{_n_} = incMax then incId = vname(inc{_n_});
     end;
run;

proc print; run;

PG

PG
New Contributor
Posts: 3

Re: how to get the name of the variable with the max value?

this one is great. thank you very much.

Super User
Posts: 9,687

Re: how to get the name of the variable with the max value?

PG,

You also can use whichn() to instead of DO LOOP . make it more succinct .

Respected Advisor
Posts: 4,655

Re: how to get the name of the variable with the max value?

Thank you . I had forgotten about whichn.

PG
Super Contributor
Posts: 1,040

Re: how to get the name of the variable with the max value?

Hi Ksharp,

hope you are doing good.

i still have question understanding this concept..

I expect x0 to have a value of 5!!!!!!!becos .  occupies the 5th position????????

why is x0 in the below example having a value of missing???

Thanks

data _null_;

   array dates

  • Columbus Hastings Nicea US_Independence missing
  •                   Magna_Carta Gutenberg

                      (1492 1066 325 1776 . 1215 1450);

       x0=whichn(., of dates

  • );
  •    x1=whichn(1492, of dates

  • );
  •    x2=whichn(1066, of dates

  • );
  •    x3=whichn(1450, of dates

  • );
  •    x4=whichn(1000, of dates

  • );
  •    put x0= / x1= / x2= / x3= / x4=;

    run;

    SAS writes the following output to the log:

    x0=.

    x1=1

    x2=2

    x3=7

    x4=0

    Super User
    Posts: 9,687

    Re: how to get the name of the variable with the max value?

    It looks like whichn() will ignore missing value, like sum() , mean()

    Super Contributor
    Posts: 1,040

    Re: how to get the name of the variable with the max value?

    Hi,

    Could you please explain :

    do _n_ = 1 to dim(inc);

    Generally _n_ is from top to bottom. Here we are saying to do it across(column wise)????

    Also in the below log why is it performing four times???it should be in a single step???

    Thanks

    52   data want;
    53   set have;
    54   array inc{*} incm:;
    55   incMax = max(of inc{*});
    56   do _n_ = 1 to dim(inc);
    57        if inc{_n_} = incMax then incId = vname(inc{_n_});
    58       putlog _all_;
    59        end;
    60   run;

    incm1=1 incm2=2 incm3=3 incm4=4 incMax=4 incId=  _ERROR_=0 _N_=1
    incm1=1 incm2=2 incm3=3 incm4=4 incMax=4 incId=  _ERROR_=0 _N_=2
    incm1=1 incm2=2 incm3=3 incm4=4 incMax=4 incId=  _ERROR_=0 _N_=3
    incm1=1 incm2=2 incm3=3 incm4=4 incMax=4 incId=incm4 _ERROR_=0 _N_=4

        

    Respected Advisor
    Posts: 4,655

    Re: how to get the name of the variable with the max value?

    Hi,

    Variable _n_ is always created by SAS within a datastep. Normally _n_ counts the number of iterations of the datastep and is not saved in the resulting dataset. When you don't need its value, you can do anything you want with _n_. In my little program I used it as the control variable for my DO loop. That cheap trick saves me the trouble of dropping the variable (since it is not saved).

    Array inc contains all variables beginning with incm (incm1, incm2, incm3, incm4). The DO loop scans the elements of array inc to find which one has the maximum value. It has to loop dim(inc) = 4 times. Each time it loops, printlog executes. Thus it prints four lines on the log. You can see that it finds the maximum value on the last iteration.

    hth

    PG

    PG
    Respected Advisor
    Posts: 4,655

    Re: how to get the name of the variable with the max value?

    Also, like pointed out, you could also use:

    data want2;

    set have;

    array inc{*} incm:;

    incId = vname(inc{whichn(max(of inc{*}),of inc{*})});

    run;

    to avoid programming the loop.

    Quiz Smiley Happy

    1) The two codes do not always give the same result for incId. Can you tell how?

    2) Can you modify the looping code so that both codes will give the same result?

    PG

    PG
    Occasional Contributor
    Posts: 7

    Re: how to get the name of the variable with the max value?

    I'm guessing it can happen when two or more columns share the max value?

    Respected Advisor
    Posts: 4,655

    Re: how to get the name of the variable with the max value?

    Good guess! What happens then?

    PG
    Super Contributor
    Posts: 1,040

    Re: how to get the name of the variable with the max value?

    I guess the variable name of last iteration with maximum value is outputted??? If it is right then how do we correct this while using whichn?

    Thanks

    Respected Advisor
    Posts: 4,655

    Re: how to get the name of the variable with the max value?

    You are right! The looping code will return the last maximum whereas whichn will return the first one. I would find it easier to modify the looping algorithm to match the whichn algorithm than the other way around.I can think of two simple ways to get the looping code to return the first max. I will post them over the weekend if you don't find them first!

    PG

    PG
    Super Contributor
    Posts: 1,040

    Re: how to get the name of the variable with the max value?

    Hi PG,

    I could not think of a proper solution for the loop to return the max value...

    Could you help me with that/

    Thanks

    ☑ This topic is solved.

    Need further help from the community? Please ask a new question.

    Discussion stats
    • 14 replies
    • 410 views
    • 1 like
    • 5 in conversation