BookmarkSubscribeRSS Feed
VibhaChordiya
Calcite | Level 5

Hi,

I am new to sas array and loops. I was wondering if someone can help me with my query.

Please find below sample of my dataset. Basically I want to output only those rows where the difference betweeen reportdate of 2 records is more than 8.

so for example fromw below records - i would want to output row1, row4, row5,.....so on

i was looking at writing something like below - but its not working. I want a way to tell that reportdate of observation _n_ is....

data chk;

set test;

array reportarray {1}ReportDate ;

do until (EndofFile);

Daysdiff = reportarray[i+1] - reportarray;

if

Daysdiff > 8 then output;

end;

run;

RecordReportDateAcctNumber
107/06/09ABC123456
213/06/09ABC123456
321/06/09ABC123456
428/06/09ABC123456
502/03/10ABC123456
607/03/10ABC123456
730/04/10ABC123456
802/05/10ABC123456
930/06/10ABC123456
1004/07/10ABC123456
1111/07/10ABC123456
1218/07/10ABC123456
1330/09/10ABC123456
1403/10/10ABC123456
1530/04/11ABC123456
1601/05/11ABC123456
1730/07/11ABC123456
1831/07/11ABC123456
1913/06/09DEF567890
2021/06/09DEF567890
2128/06/09DEF567890
2205/07/09DEF567890
2302/02/10DEF567890
2409/02/10DEF567890
2516/02/10DEF567890
9 REPLIES 9
art297
Opal | Level 21

Given your example which rows do you want to output?

VibhaChordiya
Calcite | Level 5

Hi,

Thanks for looking into this. I want to output below rows:

RecordReportDateAcctNumber
107/06/09ABC123456
228/06/09ABC123456
302/03/10ABC123456
530/04/10ABC123456
730/06/10ABC123456
1130/09/10ABC123456
1330/04/11ABC123456
1530/07/11ABC123456
1713/06/09DEF567890
2102/02/10DEF567890
Ksharp
Super User

It is so weird. Are you sure row4 is what you need?

data temp;
input Record     ReportDate : ddmmyy10.     AcctNumber : $10.;
format  ReportDate ddmmyy10.;
datalines;
1     07/06/09     ABC123456
2     13/06/09     ABC123456
3     21/06/09     ABC123456
4     28/06/09     ABC123456
5     02/03/10     ABC123456
6     07/03/10     ABC123456
7     30/04/10     ABC123456
8     02/05/10     ABC123456
9     30/06/10     ABC123456
10     04/07/10     ABC123456
11     11/07/10     ABC123456
12     18/07/10     ABC123456
13     30/09/10     ABC123456
14     03/10/10     ABC123456
15     30/04/11     ABC123456
16     01/05/11     ABC123456
17     30/07/11     ABC123456
18     31/07/11     ABC123456
19     13/06/09     DEF567890
20     21/06/09     DEF567890
21     28/06/09     DEF567890
22     05/07/09     DEF567890
23     02/02/10     DEF567890
24     09/02/10     DEF567890
25     16/02/10     DEF567890
;
run;
data want;
 set temp;
 dif=dif(ReportDate);
 if  AcctNumber ne lag(AcctNumber) then output;
  else if dif gt 8 then output;
  drop dif;
run;



Ksharp
MikeZdeb
Rhodochrosite | Level 12

Hi ... just a variation on the Ksharp solution (same result) ...

data want;

set temp;

by acctnumber;

if first.acctnumber or (^first.acctnumber and dif(reportdate) gt 8);

run;

Peter_C
Rhodochrosite | Level 12

won't that always select the first row of an account number?

MikeZdeb
Rhodochrosite | Level 12

hi ... yes, it matches the response to "Given your example which rows do you want to output?"

Peter_C
Rhodochrosite | Level 12

requirement stated as

I want to output only those rows where the difference betweeen reportdate of 2 records is more than 8.

has more than 1 interpretation

  1. output pairs with with difference > 8
  2. output later of pair
  3. output earlier

none of these requirements need ARRAY handling when the data are in order

The proposed code that follows produced the rows for Requirement 1 only.

As AcctNumber is shown among the data, I'll assume differences are relevant only within an account

Two "use cases" are not demonstrated in that test data, so I extended the test data by changing the ReportDate on record 12 to 20-jul-2011 (then record 12 will appear twice - once with record 11 and once with record 13), and to check that change in AcctNumber is tested, I added record 18.5

18.5,31/08/2011,ABC123457

I think the neatest solution uses "the read-ahead-merge"

With this data step

 

data requirement1  ;

option mergeNOby = NOwarn ;

   merge test

         test( firstobs=2 keep= ReportDate  AcctNumber

                 rename=( AcctNumber = a2

                          ReportDate = d2 ) ) ;

   if part = '1' then do;

      part = '2' ;

      output ;

      part = ' ' ;

   end ;

   if AcctNumber ne a2 then delete ;

     *********************************

   * finished with this AcctNumber *

   *********************************;  

   ddif =  d2 - reportDate ;

   if ddif > 8 then do ;

      part = '1' ;

      output ;

   end ;  

   retain part ;

   format ReportDate d2 date11.  ;

run ;

option mergeNOby = warn ;

I obtained these rows

The SAS System                                        10:26 Saturday, October 22, 2011

                                  Acct

Obs    record     ReportDate     Number               d2       a2        part    ddif

  1       4      28-JUN-2009    ABC123456    02-MAR-2010    ABC123456     1       247

  2       5      02-MAR-2010    ABC123456    07-MAR-2010    ABC123456     2

  3       6      07-MAR-2010    ABC123456    30-APR-2010    ABC123456     1        54

  4       7      30-APR-2010    ABC123456    02-MAY-2010    ABC123456     2

  5       8      02-MAY-2010    ABC123456    30-JUN-2010    ABC123456     1        59

  6       9      30-JUN-2010    ABC123456    04-JUL-2010    ABC123456     2

  7      11      11-JUL-2010    ABC123456    20-JUL-2010    ABC123456     1         9

  8      12      20-JUL-2010    ABC123456    30-SEP-2010    ABC123456     2

  9      12      20-JUL-2010    ABC123456    30-SEP-2010    ABC123456     1        72

10      13      30-SEP-2010    ABC123456    03-OCT-2010    ABC123456     2

11      14      03-OCT-2010    ABC123456    30-APR-2011    ABC123456     1       209

12      15      30-APR-2011    ABC123456    01-MAY-2011    ABC123456     2

13      16      01-MAY-2011    ABC123456    30-JUL-2011    ABC123456     1        90

14      17      30-JUL-2011    ABC123456    31-JUL-2011    ABC123456     2

15      22      05-JUL-2009    DEF567890    02-FEB-2010    DEF567890     1       212

16      23      02-FEB-2010    DEF567890    09-FEB-2010    DEF567890     2

Message was edited by: Peter Crawford ### was appearing for trailing blanks on some lines of code, so it seemed a tidy idea to clean

SAS Innovate 2025: Save the Date

 SAS Innovate 2025 is scheduled for May 6-9 in Orlando, FL. Sign up to be first to learn about the agenda and registration!

Save the date!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

SAS Training: Just a Click Away

 Ready to level-up your skills? Choose your own adventure.

Browse our catalog!

Discussion stats
  • 9 replies
  • 1543 views
  • 8 likes
  • 5 in conversation