BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
Jade_SAS
Pyrite | Level 9

Hi All,

 

    I have a dataset with column properties as the first three columns, I want to come up with a do loop to get the fourth column, whihc is range:

 

    Date    Status    Group_ID    Range (Want to get the value of this field)

    date1      0          A               

    date2      0          A

    date3      1          A                 date3-date1

    date4      0          A

    date5      1          A                 date5-date4

    date6      1          A

    date7      1          A

    date8      1          B

    date9      1          B

    date10     0         B

    date11     0         B

    date12     0         B

    date13     1         B                date13-date10

    date14     1         B

      .              .          .

      .              .          .       

    Data can go on with this pattern.

 

  In this data, date is sorted ascendingly. Could you please let me now how should O come up with do loop to get the column range?

 

Thank you! Amy comments are welcome!

 

Jade 

1 ACCEPTED SOLUTION

Accepted Solutions
novinosrin
Tourmaline | Level 20

 

 data want;

    declare hash myhash();

    rc = myhash.definekey('group_id');

    rc = myhash.definedata('temp_date');

    myhash.definedone();

  do until(last.group_id);

    set have;

    by group_id;

    if myhash.check() ne 0 and status=0 then

      do;

        temp_date=date;

        myhash.add();

           call missing(temp_date);

     end;

   else if status=1 and myhash.find()=0 then

     do;

       range=date-temp_date;

       myhash.clear();

        call missing(temp_date);

      end;

output;

call missing(range);

end;

drop temp_date rc;

run;

 

 

View solution in original post

11 REPLIES 11
Astounding
PROC Star

It looks like this should do the trick:

 

data want;

set have;

by status notsorted;

if first.status then do;

   if status=0 then begin_date = date;

   else range = date - begin_date;

end;

retain begin_date;

drop begin_date;

run;

Jade_SAS
Pyrite | Level 9

Thank you for the qucik reply. From the coding, I seems to get the following results (see range_got):

 

Date    Status    Group_ID    Range (Want to get the value of this field)   Range_got

    date1      0          A               

    date2      0          A

    date3      1          A                 date3-date1                                                date3-date2

    date4      0          A

    date5      1          A                 date5-date4                                                date5-date4

    date6      1          A                                                                                     date6-date4

    date7      1          A                                                                                     date7-date4

    date8      1          B                                                                                     date8-date4

    date9      1          B                                                                                     date9-date4

    date10     0         B                                                                                    

    date11     0         B

    date12     0         B

    date13     1         B                date13-date10                                             date13-date12

    date14     1         B                                                                                     date14-date12

 

Astounding
PROC Star

At first glance, that doesn't seem possible.  The code I posted wouldn't compute anything for most of the rows.

 

Since my solution still needs to be adjusted to account for changes in GROUP_ID, I can post a revised solution later.  In the meantime, perhaps you could show the code that you actually ran that produced those calculations.

PGStats
Opal | Level 21

What is the role of groupId in your data?

PG
Jade_SAS
Pyrite | Level 9

The groupID is a higher level than status. There is a lot of groups (A, B, C, D, E,F, etc. groups) in the data,  I forgot to mention in the post.

 

Within each group, there is status with 0 and 1.

art297
Opal | Level 21

The following is just a slight modification of @Astounding's suggested code to account for Group_ID:

 

data want;
  set have;
  by group_id status notsorted;
  if first.group_id then call missing(begin_date);
  if first.status then do;
     if status=0 then begin_date = date;
      else if status eq 1 and not missing(begin_date) then range = date - begin_date;
  end;
  retain begin_date;
  drop begin_date;
run;

Art, CEO, AnalystFinder.com

 

novinosrin
Tourmaline | Level 20

 

 data want;

    declare hash myhash();

    rc = myhash.definekey('group_id');

    rc = myhash.definedata('temp_date');

    myhash.definedone();

  do until(last.group_id);

    set have;

    by group_id;

    if myhash.check() ne 0 and status=0 then

      do;

        temp_date=date;

        myhash.add();

           call missing(temp_date);

     end;

   else if status=1 and myhash.find()=0 then

     do;

       range=date-temp_date;

       myhash.clear();

        call missing(temp_date);

      end;

output;

call missing(range);

end;

drop temp_date rc;

run;

 

 

Ksharp
Super User

data have;
input Date $    Status    Group_ID  $;
cards;
    date1      0          A               
    date2      0          A
    date3      1          A          
    date4      0          A
    date5      1          A               
    date6      1          A
    date7      1          A
    date8      1          B
    date9      1          B
    date10     0         B
    date11     0         B
    date12     0         B
    date13     1         B               
    date14     1         B
;
run;

data temp;
 set have;
 by Group_ID;
 if first.Group_ID or lag(status)=1 then group+1;
run;

data want;
 set temp;
 by group;
 length first $ 100;
 retain first;
 if first.group then do;first=date;n=0;end;
 n+1;
 if last.group then range=catx('-',date,first);
 if n=1 then call missing(range);
 drop group first;
run;


Jade_SAS
Pyrite | Level 9

Thank you for your input!

Tom
Super User Tom
Super User

Seems pretty straight forward to me. You will need to retain the starting date for the block to be able to subtract it out.

First let's make your example data a little more comcrete.

data have ;
  infile cards dsd dlm='|' truncover ;
  input group $ status date :yymmdd10. expect :$20. ;
  format date yymmdd10. ;
cards;
A|0|2017-01-01|
A|0|2017-01-02|
A|1|2017-01-03|date3-date1=2
A|0|2017-01-04|
A|1|2017-01-05|date5-date4=1
A|1|2017-01-06|
A|1|2017-01-07|
B|1|2017-01-08|
B|1|2017-01-09|
B|0|2017-01-10|
B|0|2017-01-11|
B|0|2017-01-12|
B|1|2017-01-13|date13-date10=3
B|1|2017-01-14|
;

Now we just need to process by group and status, but tell SAS not to expect the data to really be sorted by those variables. That way we can find the first time that STATUS=0 or STATUS=1.

data want ;
  set have ;
  by group status notsorted ;
  retain start_dt ;
  if first.group then start_dt=.;
  if status=0 and first.status then start_dt = date ;
  if status=1 and first.status and not missing(start_dt) then want=date - start_dt ;
  format start_dt yymmdd10. ;
run;

image.png

Jade_SAS
Pyrite | Level 9
Thank you, I will test it.

hackathon24-white-horiz.png

The 2025 SAS Hackathon has begun!

It's finally time to hack! Remember to visit the SAS Hacker's Hub regularly for news and updates.

Latest Updates

How to connect to databases in SAS Viya

Need to connect to databases in SAS Viya? SAS’ David Ghan shows you two methods – via SAS/ACCESS LIBNAME and SAS Data Connector SASLIBS – in this video.

Find more tutorials on the SAS Users YouTube channel.

Discussion stats
  • 11 replies
  • 3657 views
  • 5 likes
  • 7 in conversation