BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
robertrao
Quartz | Level 8


Hi All,

Could some on ehelp me in understanding the code for the question with the title shown above.

Detail codes are given for that question and could some on explain to me why the same dataset is set twice and whats the logic behind...etc etc

Thanks

1 ACCEPTED SOLUTION

Accepted Solutions
art297
Opal | Level 21

Karun,

I presume you are asking why/how the DOW works.

It is simply you taking control of the data step.  Your code has two loops (i.e., do until statements), each set(ing) the file 'have' with a by statement.

Each Account_number goes through the two loops separately.  Thus, in the first loop, it looks through all of the records that have Account_number 111 and, for each of those that contain an Arrears_Band equal to "write off", it increments a variable called flag.  If the Account_number has one such record, flag will be set to 1, if it has two it will be set to 2, etc.

Then, in the second loop, it again looks only at the Account_number 111 records, but the value of the flag variable is present in each of the records.  If it is greater than one, then the records are output;

Finally, flag is set to missing, and then dropped from the output file, and the system goes back to the first loop and reads account number 222, and so on, and so on.

View solution in original post

3 REPLIES 3
ballardw
Super User

Time to provide an example of input data and desired output as the topic is very vague.

robertrao
Quartz | Level 8

Dear Ballard,

Thanks for the help.

Below is the data I am referring to:

The output I desired is below the data;

This question has been answered (several codes seen below) earlier in the forum......BUT since I am a beginner i want the explanation of the code

Several ways of code are given below (bottom of this  reply)

/*Classic 2X DOW*/

/*Merge*/

/*Hash()*/

Only the merge method I am familiar

Even in that method there are 2 times we set the same dataset......explanation for stuff like this...........

In this data set, I need to separate the records  where  Arrears_Band at least once has “writoff” value.

data have;

informat Current_date date9.;

Input Current_date   Account_number 11-13 Arrears_Band $ 15-25;

Format Current_date date9.;

cards;

31MAY2011  111  NPNA

30JUN2011  111  writoff

31JUL2011  111  NPNA

31AUG2011  111  NPNA

30SEP2011  111  NPNA

31OCT2011  111  NPNA

30NOV2011  111  NPNA

31DEC2011  111  NPNA

31JAN2012  111  NPNA

31DEC2011  222  NPNA

31JAN2012  222  NPNA

30NOV2010  333  NPNA

31DEC2010  333  NPNA

31JAN2011  333  NPNA

28FEB2011  333  NPNA

31MAR2011  333  writoff

30APR2011  333  NPNA

31MAY2011  333  NPNA

30JUN2011  333  NPNA

31JUL2011  333  NPNA

31AUG2011  333  NPNA

30SEP2011  333  NPNA

31OCT2011  333  NPNA

30NOV2011  333  NPNA

31DEC2011  333  NPNA

31JAN2012  333  NPNA

28FEB2010  444  Current

31MAR2010  444  30 - 60

30APR2010  444  30 - 60

31MAY2010  444  Current

30JUN2010  444  Current

31JUL2010  444  Current

31AUG2010  444  Current

30SEP2010  444  Current

31OCT2010  444  Current

30NOV2010  444  Current

31DEC2010  444  Current

31JAN2011  444  1 - 30

28FEB2011  444  30 - 60

31MAR2011  444  60 - 90

30APR2011  444  90 +

31MAY2011  444  90 +

30JUN2011  444  90 +

31JUL2011  444  NPNA

31AUG2011  444  NPNA

30SEP2011  444  NPNA

31OCT2011  444  NPNA

30NOV2011  444  NPNA

31DEC2011  444  NPNA

31JAN2012  444  NPNA

28FEB2010  555  30 - 60

31MAR2010  555  30 - 60

30APR2010  555  60 - 90

31MAY2010  555  NPNA

31JAN2012  666  writoff

31JAN2012  777  NPNA

;

run;

/*Answer*/

The data set I want is this.

 

Current_date    Account_number Arrears_Band

31MAY2011  111  NPNA

30JUN2011  111  writoff

31JUL2011  111  NPNA

31AUG2011  111  NPNA

30SEP2011  111  NPNA

31OCT2011  111  NPNA

30NOV2011  111  NPNA

31DEC2011  111  NPNA

31JAN2012  111  NPNA

30NOV2010  333  NPNA

31DEC2010  333  NPNA

31JAN2011  333  NPNA

28FEB2011  333  NPNA

31MAR2011  333  writoff

30APR2011  333  NPNA

31MAY2011  333  NPNA

30JUN2011  333  NPNA

31JUL2011  333  NPNA

31AUG2011  333  NPNA

30SEP2011  333  NPNA

31OCT2011  333  NPNA

30NOV2011  333  NPNA

31DEC2011  333  NPNA

31JAN2012  333  NPNA

31JAN2012  666  writoff

 

/*Classic 2X DOW*/

data want;

  do until (last.account_number);

    set have;

      by account_number;

        if Arrears_Band='writoff' then flag+1;

   end;

   do until (last.account_number);

    set have;

      by account_number;

        if flag >=1 then output;

   end;

   call missing(flag);

   drop flag;

run;

/*Merge*/

data want;

  merge have (where=(Arrears_Band='writoff') in=a) have (in=b);

   by account_number;

   if a and b;

run;

/*Hash()*/

data want;

  if _n_=1 then do;

    if 0 then set have;

       dcl hash h(dataset:'have', multidata:'y');

       h.definekey('account_number');

       h.definedata(all:'y');

       h.definedone();

   end;

   do until (last.account_number);

     set have;

       by account_number;

         if Arrears_Band='writoff' then flag+1;

   end;

   if flag>=1 then do;

      rc=h.find();

       do rc=0 by 0 while (rc=0);

          output;

          rc=h.find_next();

       end;

    end;

    call missing(flag);

    drop flag rc;

    run;

art297
Opal | Level 21

Karun,

I presume you are asking why/how the DOW works.

It is simply you taking control of the data step.  Your code has two loops (i.e., do until statements), each set(ing) the file 'have' with a by statement.

Each Account_number goes through the two loops separately.  Thus, in the first loop, it looks through all of the records that have Account_number 111 and, for each of those that contain an Arrears_Band equal to "write off", it increments a variable called flag.  If the Account_number has one such record, flag will be set to 1, if it has two it will be set to 2, etc.

Then, in the second loop, it again looks only at the Account_number 111 records, but the value of the flag variable is present in each of the records.  If it is greater than one, then the records are output;

Finally, flag is set to missing, and then dropped from the output file, and the system goes back to the first loop and reads account number 222, and so on, and so on.

sas-innovate-2024.png

Don't miss out on SAS Innovate - Register now for the FREE Livestream!

Can't make it to Vegas? No problem! Watch our general sessions LIVE or on-demand starting April 17th. Hear from SAS execs, best-selling author Adam Grant, Hot Ones host Sean Evans, top tech journalist Kara Swisher, AI expert Cassie Kozyrkov, and the mind-blowing dance crew iLuminate! Plus, get access to over 20 breakout sessions.

 

Register now!

What is Bayesian Analysis?

Learn the difference between classical and Bayesian statistical approaches and see a few PROC examples to perform Bayesian analysis in this video.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 3 replies
  • 717 views
  • 0 likes
  • 3 in conversation