Swordfish
Calcite | Level 5

Hi,

I would be thankful if I could get help on:

Data:-

Data I_have;input Dfm1$ Dfm2$ Dfm3$ Dfm4$ bal1 bal2 bal3 bal4 disc_amt;

Datalines;

y    y    y    y    200    5    33    50    40   

N    N    N    y    100    44    22    24    50   

N    N    N    y    42    22    22    300    500   

N    N    Y    N    55    200    300    100    12   

N    N    y    y    500    99    15    400    14   

;run;

Goal:- Record the position of the first occurrence of "Y" in the series Dfm1-Dfm4 (the array called LDF) in a separate variable called Def_month, and at the same point take Def_balance from the matching bal variable, as shown in the code below.

I have completed the code to this extent. Now the extra condition: if the series of bal variables (the array Lbal) is less than or equal to 100 both just before the first occurrence of "Y" and at the first occurrence itself, then Def_balance should equal the variable disc_amt.

For example:-

For the first observation, the first occurrence of "Y" happens in Dfm1, so Def_month gets the value 1, and since this occurrence comes from the first variable, Def_balance=bal1, that is 200.

For the second observation, the first occurrence of "Y" happens in Dfm4, so Def_month gets the value 4, and since this occurrence comes from the fourth variable, Def_balance=bal4, that is 24. But because the extra condition discussed above must also be fulfilled, Def_balance should then be 50, the value from disc_amt (needed output). This happens because bal3 and bal4 (22 and 24 respectively) are both less than 100.

Data I_get;

set I_have;

array LDF{*}$ dfm1-dfm4;

array Lbal{*} bal1-bal4;

do j=1 to dim(LDF);

if LDF(j) ="y" then do;

Def_month=j;

Def_balance=Lbal(j);

leave;

end;

end;

run;

Data I_wanna;

input Dfm1$ Dfm2$ Dfm3$ Dfm4$ bal1 bal2 bal3 bal4 disc_amt Def_month Def_balance;

Datalines;

y    y    y    y    200    5    33    50    40    1    200

N    N    N    y    100    44    22    24    50    4    50

N    N    N    y    42    22    22    300    500    4    300

N    N    Y    N    55    200    300    100    12    3    300

N    N    y    y    500    99    15    400    14    3    14

;run;

1 ACCEPTED SOLUTION
art297
Opal | Level 21

I'm not sure I follow what you are trying to accomplish, as your use of the term "lag" differs from its typical use in SAS which implies across records.  Does the following satisfy your extra condition?

Data I_want;

  set I_have;

  array LDF{*}$ dfm1-dfm4;

  array Lbal{*} bal1-bal4;

  do j=1 to dim(LDF);

    if upcase(LDF(j)) ="Y" then do;

      Def_month=j;

      use_disc_amt=disc_amt;

      if j gt 1 and lbal(j) lt 100 and lbal(j-1) ge 100

       then use_disc_amt=lbal(j);

      Def_balance=ifn(lbal(j) lt 100,use_disc_amt,Lbal(j));

      leave;

    end;

  end;

run;
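
One way to check this result against the expected output posted in the question is PROC COMPARE. This is only a sketch: it assumes I_want was built from the original five-observation I_have and that the I_wanna data set from the question exists in the same session.

/* Compare the derived columns with the expected values from the question. */
proc compare base=I_wanna compare=I_want;
  var Def_month Def_balance;   /* only the two derived variables are compared */
run;

Because of the VAR statement, the helper variables j and use_disc_amt in I_want are left out of the comparison.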

8 REPLIES 8
art297
Opal | Level 21

You were close with your own code.  I only had to add one line and make a minor adjustment to another:

Data I_want;

  set I_have;

  array LDF{*}$ dfm1-dfm4;

  array Lbal{*} bal1-bal4;

  do j=1 to dim(LDF);

    if upcase(LDF(j)) ="Y" then do;

      Def_month=j;

      Def_balance=ifn(lbal(j) lt 100,disc_amt,Lbal(j));

      leave;

    end;

  end;

run;

Swordfish
Calcite | Level 5

Hi Art,

Thank you.

In this case we are not checking the lag of the first occurrence of "Y". The code only looks at the point where the first "Y" occurred.

Just to make the matter clear, I have added a sixth observation to the data to show you how the code impacts the results:-

Data I_have;input Dfm1$ Dfm2$ Dfm3$ Dfm4$ bal1 bal2 bal3 bal4 disc_amt;
Datalines;
y    y    y    y   200    5    33   50    40  
N    N    N    y   100    44    22   24    50  
N    N    N    y   42    22    22   300    500  
N    N    Y    N   55    200    300   100    12  
N    N    y    y   500    99    15   400    14  
N    N    N    y   100    44    200   24    50  
;
run;


The above data is exactly the same as the previous one, except that I have added one observation. In the sixth observation, the first occurrence of "Y" happens at variable Dfm4, and at that point variable bal4 is 24 (which is smaller than 100). But to fulfil the second condition, the lag of the first occurrence should also have a balance lower than 100; only then can we have Def_balance=disc_amt. The first lag in this case is bal3, which is 200 and so not lower than 100, therefore Def_balance should be 24, not 50. But the code produces 50 for the Def_balance variable.

With lots of thanks in advance.

regards,

Tony

Linlin
Lapis Lazuli | Level 10

Data I_want (drop=j);

  set I_have;

  array LDF{*}$ dfm1-dfm4;

  array Lbal{*} bal1-bal4;

  do j=1 to dim(LDF);

    if upcase(LDF(j)) ="Y" then do;

      Def_month=j;

      if lbal(j) ge 100 then def_balance=lbal(j);

          else if j>1 and lbal(j-1) ge 100 then  def_balance=lbal(j);

                  else def_balance=disc_amt;

           leave;

    end;

  end;

run;

Swordfish
Calcite | Level 5

Great....I am highly thankful to both Art and Linlin. Both codes were of great help.

Thanks once again.

Just a little change for the sake of fun otherwise above programs do not need any change

Data I_want;

  set I_have;

  array LDF{*}$ dfm1-dfm4;

  array Lbal{*} bal1-bal4;

  do j=1 to dim(LDF);

    if upcase(LDF(j)) ="Y" then do;

      Def_month=j;

      use_disc_amt=disc_amt;

      if j gt 1 and lbal(j) lt 100 and lbal(j-1) ge 100

       then use_disc_amt=lbal(j);

      Def_balance=ifn(lbal(j) lt 100,use_disc_amt,Lbal(j));

      leave;

    end;

  end;

run;

Peter_C
Rhodochrosite | Level 12

Surprised no one offered the function WHICHC()

art297
Opal | Level 21

Peter,  I agree that whichc would have simplified the code, but I couldn't figure out how to use it and simultaneously upcase the variables being checked.

Peter_C
Rhodochrosite | Level 12

Art

you are right!

When the case of the Y is uncertain, whichC() provides no easy way to ignore case (unlike the modifiers of the find() function).

However, I'm surprised that these indicators have uncertain case. It is the kind of problem we would remove as the data are loaded (the $upcase. informat is simple).

I don't think we would recommend carrying information in the distinction between "y" and "Y".
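
As a rough sketch of the $upcase. suggestion, assuming the data are read with list input as in the original post (the data set name I_have_uc is made up for illustration, and only the first two rows are shown):

data I_have_uc;
   /* $upcase1. stores each flag as a single upper-case character */
   informat Dfm1-Dfm4 $upcase1.;
   input Dfm1-Dfm4 bal1-bal4 disc_amt;
datalines;
y    y    y    y    200    5    33    50    40
N    N    N    y    100    44    22    24    50
;
run;

With the flags stored this way, later steps can test for "Y" directly without calling upcase().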

When case is uncertain, I would recommend (for clarity rather than performance) a data step view to upper-case them all.

data  I_view / view= I_view ;

   set I_have ;

   array upp dfm: ;

   do over upp ;

      upp = upcase( upp ) ;

   end ;

run ;
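
With the flags upper-cased by the view, WHICHC() could be applied along these lines. This is a sketch only: want_whichc and prev_bal are illustrative names, and the balance rule is the one from the array solutions earlier in the thread.

data want_whichc (drop=prev_bal);
   set I_view;                              /* Dfm1-Dfm4 are upper-cased by the view */
   array Lbal{*} bal1-bal4;
   Def_month = whichc('Y', of dfm1-dfm4);   /* 0 when no flag equals 'Y' */
   if Def_month > 0 then do;
      /* balance in the month before the first 'Y'; taken as 0 for month 1 */
      if Def_month > 1 then prev_bal = Lbal(Def_month-1);
      else prev_bal = 0;
      if max(prev_bal, Lbal(Def_month)) < 100 then Def_balance = disc_amt;
      else Def_balance = Lbal(Def_month);
   end;
run;

Rows with no "Y" keep missing values for Def_month's companion Def_balance, since the DO block is skipped when WHICHC() returns 0.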

For performance, alternatives come to mind, like Def_month = find( cats( of dfm: ), 'y', 'i' ) ; as in

data want ;

   set I_have ;

   Def_month = find( cats( of dfm: ), 'y', 'i' ) ;

   retain dum 0 ;

   array bal(*) dum bal: ;

   if not def_month then call missing( def_balance ) ;

   else

   if max( bal( def_month ), bal( def_month+1 ) ) < 100

      then def_balance = disc_amt ;

      else def_balance = bal( def_month+1 ) ;

   drop dum ;

run ;

I added DUM before the Balances in the array to remove exceptional handling when def_month=1
