BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
Edmond
Calcite | Level 5

hi,

in my dataset, i have incm1, incm2, incm3 and incm4 for each customer. some have only 1 income, some have 2 or more. i want to create a new variable with the max value of the 4 incomes and at the same time, i need to create a flag indicating which income is the max.

for example, the max income of a record is the incm1, the new income value is easy, it's just new_income=max(incm1,incm2,incm3,incm4).

i know using if statement and matching the new income value and the 4 income values can show me the source of the income.

if new_income=incm1 then flag='incm1'; else if new_income=incm2 then flag='incm2'; etc..

but are there any other ways to create the new income flag which shows 'incm1' for this record?

thanks

1 ACCEPTED SOLUTION

Accepted Solutions
PGStats
Opal | Level 21

You could use an array and function vname as follows :

data have;
input incm1 incm2 incm3 incm4;
datalines;
1 2 3 4
;

data want;
set have;
array inc{*} incm:;
incMax = max(of inc{*});
do _n_ = 1 to dim(inc);
     if inc{_n_} = incMax then incId = vname(inc{_n_});
     end;
run;

proc print; run;

PG

PG

View solution in original post

14 REPLIES 14
PGStats
Opal | Level 21

You could use an array and function vname as follows :

data have;
input incm1 incm2 incm3 incm4;
datalines;
1 2 3 4
;

data want;
set have;
array inc{*} incm:;
incMax = max(of inc{*});
do _n_ = 1 to dim(inc);
     if inc{_n_} = incMax then incId = vname(inc{_n_});
     end;
run;

proc print; run;

PG

PG
Edmond
Calcite | Level 5

this one is great. thank you very much.

Ksharp
Super User

PG,

You also can use whichn() to instead of DO LOOP . make it more succinct .

PGStats
Opal | Level 21

Thank you . I had forgotten about whichn.

PG
robertrao
Quartz | Level 8

Hi Ksharp,

hope you are doing good.

i still have question understanding this concept..

I expect x0 to have a value of 5!!!!!!!becos .  occupies the 5th position????????

why is x0 in the below example having a value of missing???

Thanks

data _null_;

   array dates

  • Columbus Hastings Nicea US_Independence missing
  •                   Magna_Carta Gutenberg

                      (1492 1066 325 1776 . 1215 1450);

       x0=whichn(., of dates

  • );
  •    x1=whichn(1492, of dates

  • );
  •    x2=whichn(1066, of dates

  • );
  •    x3=whichn(1450, of dates

  • );
  •    x4=whichn(1000, of dates

  • );
  •    put x0= / x1= / x2= / x3= / x4=;

    run;

    SAS writes the following output to the log:

    x0=.

    x1=1

    x2=2

    x3=7

    x4=0

    Ksharp
    Super User

    It looks like whichn() will ignore missing value, like sum() , mean()

    robertrao
    Quartz | Level 8

    Hi,

    Could you please explain :

    do _n_ = 1 to dim(inc);

    Generally _n_ is from top to bottom. Here we are saying to do it across(column wise)????

    Also in the below log why is it performing four times???it should be in a single step???

    Thanks

    52   data want;
    53   set have;
    54   array inc{*} incm:;
    55   incMax = max(of inc{*});
    56   do _n_ = 1 to dim(inc);
    57        if inc{_n_} = incMax then incId = vname(inc{_n_});
    58       putlog _all_;
    59        end;
    60   run;

    incm1=1 incm2=2 incm3=3 incm4=4 incMax=4 incId=  _ERROR_=0 _N_=1
    incm1=1 incm2=2 incm3=3 incm4=4 incMax=4 incId=  _ERROR_=0 _N_=2
    incm1=1 incm2=2 incm3=3 incm4=4 incMax=4 incId=  _ERROR_=0 _N_=3
    incm1=1 incm2=2 incm3=3 incm4=4 incMax=4 incId=incm4 _ERROR_=0 _N_=4

        

    PGStats
    Opal | Level 21

    Hi,

    Variable _n_ is always created by SAS within a datastep. Normally _n_ counts the number of iterations of the datastep and is not saved in the resulting dataset. When you don't need its value, you can do anything you want with _n_. In my little program I used it as the control variable for my DO loop. That cheap trick saves me the trouble of dropping the variable (since it is not saved).

    Array inc contains all variables beginning with incm (incm1, incm2, incm3, incm4). The DO loop scans the elements of array inc to find which one has the maximum value. It has to loop dim(inc) = 4 times. Each time it loops, printlog executes. Thus it prints four lines on the log. You can see that it finds the maximum value on the last iteration.

    hth

    PG

    PG
    PGStats
    Opal | Level 21

    Also, like pointed out, you could also use:

    data want2;

    set have;

    array inc{*} incm:;

    incId = vname(inc{whichn(max(of inc{*}),of inc{*})});

    run;

    to avoid programming the loop.

    Quiz Smiley Happy

    1) The two codes do not always give the same result for incId. Can you tell how?

    2) Can you modify the looping code so that both codes will give the same result?

    PG

    PG
    aland1
    Calcite | Level 5

    I'm guessing it can happen when two or more columns share the max value?

    PGStats
    Opal | Level 21

    Good guess! What happens then?

    PG
    robertrao
    Quartz | Level 8

    I guess the variable name of last iteration with maximum value is outputted??? If it is right then how do we correct this while using whichn?

    Thanks

    PGStats
    Opal | Level 21

    You are right! The looping code will return the last maximum whereas whichn will return the first one. I would find it easier to modify the looping algorithm to match the whichn algorithm than the other way around.I can think of two simple ways to get the looping code to return the first max. I will post them over the weekend if you don't find them first!

    PG

    PG
    robertrao
    Quartz | Level 8

    Hi PG,

    I could not think of a proper solution for the loop to return the max value...

    Could you help me with that/

    Thanks

    sas-innovate-2024.png

    Join us for SAS Innovate April 16-19 at the Aria in Las Vegas. Bring the team and save big with our group pricing for a limited time only.

    Pre-conference courses and tutorials are filling up fast and are always a sellout. Register today to reserve your seat.

     

    Register now!

    How to Concatenate Values

    Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

    Find more tutorials on the SAS Users YouTube channel.

    Click image to register for webinarClick image to register for webinar

    Classroom Training Available!

    Select SAS Training centers are offering in-person courses. View upcoming courses for:

    View all other training opportunities.

    Discussion stats
    • 14 replies
    • 1366 views
    • 1 like
    • 5 in conversation