first occurrence and conditions within arrays

Occasional Contributor
Posts: 18

Hi,

I would be thankful if I could get help on:

Data:-

Data I_have;
  input Dfm1$ Dfm2$ Dfm3$ Dfm4$ bal1 bal2 bal3 bal4 disc_amt;
  Datalines;
y    y    y    y    200    5    33    50    40
N    N    N    y    100    44    22    24    50
N    N    N    y    42    22    22    300    500
N    N    Y    N    55    200    300    100    12
N    N    y    y    500    99    15    400    14
;
run;

Goal: record the position of the first occurrence of "Y" in the series Dfm1-Dfm4 (the array called LDF) in a separate variable called Def_month, and at the same point take Def_balance from the matching bal variable, as shown in the code below.

I have completed the code to this extent. Now, if the series of bal variables (the array Lbal) is less than or equal to 100 both just before the first occurrence of "Y" and at the first occurrence, then Def_balance should equal the variable disc_amt.

For example:

For the first observation, the first occurrence of "Y" happens in Dfm1, so Def_month gets the value 1, and since this occurrence comes from the first variable, Def_balance=bal1, that is 200.

For the second observation, the first occurrence of "Y" happens in Dfm4, so Def_month gets the value 4, and since this occurrence comes from the fourth variable, Def_balance=bal4, that is 24. But because we want the extra condition discussed above to be fulfilled, Def_balance should instead be 50, the value of disc_amt (the needed output). This happens because bal3 and bal4 (22 and 24 respectively) are both less than 100.

Data I_get;
  set I_have;
  array LDF{*}$ dfm1-dfm4;
  array Lbal{*} bal1-bal4;
  do j=1 to dim(LDF);
    if LDF(j) ="y" then do;
      Def_month=j;
      Def_balance=Lbal(j);
      leave;
    end;
  end;
run;

Data I_wanna;
  input Dfm1$ Dfm2$ Dfm3$ Dfm4$ bal1 bal2 bal3 bal4 disc_amt Def_month Def_balance;
  Datalines;
y    y    y    y    200    5    33    50    40    1    200
N    N    N    y    100    44    22    24    50    4    50
N    N    N    y    42    22    22    300    500    4    300
N    N    Y    N    55    200    300    100    12    3    300
N    N    y    y    500    99    15    400    14    3    14
;
run;



All Replies
Esteemed Advisor
Posts: 7,075

You were close with your own code.  I only had to add one line and make a minor adjustment to another:

Data I_want;

  set I_have;

  array LDF{*}$ dfm1-dfm4;

  array Lbal{*} bal1-bal4;

  do j=1 to dim(LDF);

    if upcase(LDF(j)) ="Y" then do;

      Def_month=j;

      Def_balance=ifn(lbal(j) lt 100,disc_amt,Lbal(j));

      leave;

    end;

  end;

run;

Occasional Contributor
Posts: 18

Hi Art,

Thank you.

In this case we are not checking the lag of the first occurrence of "Y". The code only looks at the point where the first "Y" occurred.

Just to make the matter clear, I have added a sixth observation to the data to show you how the code affects the results:

Data I_have;input Dfm1$ Dfm2$ Dfm3$ Dfm4$ bal1 bal2 bal3 bal4 disc_amt;
Datalines;
y    y    y    y   200    5    33   50    40  
N    N    N    y   100    44    22   24    50  
N    N    N    y   42    22    22   300    500  
N    N    Y    N   55    200    300   100    12  
N    N    y    y   500    99    15   400    14
N    N    N    y   100    44    200   24    50
;
run;


The above data is exactly the same as the previous one, except that I have added one observation. In the sixth observation, the first occurrence of "Y" happens at variable Dfm4, and at that point bal4 is 24 (which is smaller than 100). But to fulfil the second condition, the lag of the first occurrence should also have a balance lower than 100; only then can we have Def_balance=disc_amt. The first lag, which in this case is bal3, is 200 and therefore not lower than 100, so Def_balance should be 24, not 50. But the code produces 50 for the Def_balance variable.

With lots of thanks in advance.

regards,

Tony

Solution
‎11-16-2011 08:28 AM
Esteemed Advisor
Posts: 7,075

I'm not sure I follow what you are trying to accomplish, as your use of the term "lag" differs from its typical use in SAS which implies across records.  Does the following satisfy your extra condition?

Data I_want;

  set I_have;

  array LDF{*}$ dfm1-dfm4;

  array Lbal{*} bal1-bal4;

  do j=1 to dim(LDF);

    if upcase(LDF(j)) ="Y" then do;

      Def_month=j;

      use_disc_amt=disc_amt;

      if j gt 1 and lbal(j) lt 100 and lbal(j-1) ge 100

       then use_disc_amt=lbal(j);

      Def_balance=ifn(lbal(j) lt 100,use_disc_amt,Lbal(j));

      leave;

    end;

  end;

run;

Super Contributor
Posts: 1,636

Data I_want (drop=j);

  set I_have;

  array LDF{*}$ dfm1-dfm4;

  array Lbal{*} bal1-bal4;

  do j=1 to dim(LDF);

    if upcase(LDF(j)) ="Y" then do;

      Def_month=j;

      if lbal(j) ge 100 then def_balance=lbal(j);

          else if j>1 and lbal(j-1) ge 100 then  def_balance=lbal(j);

                  else def_balance=disc_amt;

           leave;

    end;

  end;

run;

Occasional Contributor
Posts: 18

Great....I am highly thankful to both Art and Linlin. Both codes were of great help.

Thanks once again.

Just a little change for the sake of fun otherwise above programs do not need any change

Data I_want;

  set I_have;

  array LDF{*}$ dfm1-dfm4;

  array Lbal{*} bal1-bal4;

  do j=1 to dim(LDF);

    if upcase(LDF(j)) ="Y" then do;

      Def_month=j;

      use_disc_amt=disc_amt;

      if j gt 1 and lbal(j) lt 100 and lbal(j-1) ge 100

       then use_disc_amt=lbal(j);

      Def_balance=ifn(lbal(j) lt 100,use_disc_amt,Lbal(j));

      leave;

    end;

  end;

run;

Valued Guide
Posts: 2,168

Surprised no one offered the function WHICHC()

Esteemed Advisor
Posts: 7,075

Peter,  I agree that whichc would have simplified the code, but I couldn't figure out how to use it and simultaneously upcase the variables being checked.

Valued Guide
Posts: 2,168

Art

you are right!

When the case of the Y is uncertain, whichC() provides no easy way (like find() function modifiers).

However, I'm surprised that these indicators have uncertain case. It is the kind of problem we would remove as the data are loaded (the $upcase. informat is simple).

I don't think we would recommend carrying information in the distinction between "y" and "Y".
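One way to remove the case problem as the data are loaded is to read the indicators with the $UPCASE informat. This is only a minimal sketch: it assumes the same layout as I_have, reuses two of its rows, and the data set name I_have_upper is made up for illustration.

```sas
data I_have_upper;   /* hypothetical name, not from the original thread */
   /* $upcase1. stores each one-character indicator in upper case as it is read */
   informat Dfm1-Dfm4 $upcase1.;
   input Dfm1-Dfm4 bal1-bal4 disc_amt;
   datalines;
y    y    y    y    200    5    33    50    40
N    N    Y    N    55    200    300    100    12
;
run;
```

After this step every Dfm value is stored in upper case, so later comparisons against "Y" should need no UPCASE call.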

When case is uncertain, I would recommend (for clarity rather than performance) a data step view to upper-case them all.

data  I_view / view= I_view ;

   set I_have ;

   array upp dfm: ;

   do over upp ;

      upp = upcase( upp ) ;

   end ;

run ;
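With the indicators upper-cased by the view above, WHICHC can locate the first "Y" directly. The following is a sketch rather than a tested solution: it assumes I_view exists as defined above, the output name I_want_whichc is made up, and the balance rule is copied from the earlier replies (strictly below 100 at the first "Y" and just before it).

```sas
data I_want_whichc;   /* hypothetical name, not from the original thread */
   set I_view;        /* assumes the view above, so every dfm value is upper case */
   array Lbal{*} bal1-bal4;
   /* WHICHC returns the position of the first argument equal to 'Y', or 0 if none */
   Def_month = whichc('Y', of dfm1-dfm4);
   if Def_month > 0 then do;
      /* keep the balance unless it and the preceding balance are both below 100 */
      if Lbal(Def_month) ge 100 then Def_balance = Lbal(Def_month);
      else if Def_month > 1 and Lbal(Def_month-1) ge 100 then Def_balance = Lbal(Def_month);
      else Def_balance = disc_amt;
   end;
run;
```

When no "Y" is found, Def_month is 0 and Def_balance is left missing.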

For performance, alternatives come to mind, like Def_month = find( cats( of dfm: ), 'y', 'i' ) ; as in

data want ;

   set I_have ;

   Def_month = find( cats( of dfm: ), 'y', 'i' ) ;

   retain dum 0 ;

   array bal(*) dum bal: ;

   if not def_month then call missing( def_balance ) ;

   else

   if max( bal( def_month ), bal( def_month+1 ) ) < 100

      then def_balance = disc_amt ;

      else def_balance = bal( def_month+1 ) ;

   drop dum ;

run ;

I added DUM before the balances in the array so that def_month=1 needs no exceptional handling.

☑ This topic is SOLVED.
