Hi All,
I have a dataset with column properties as the first three columns, I want to come up with a do loop to get the fourth column, whihc is range:
Date Status Group_ID Range (Want to get the value of this field)
date1 0 A
date2 0 A
date3 1 A date3-date1
date4 0 A
date5 1 A date5-date4
date6 1 A
date7 1 A
date8 1 B
date9 1 B
date10 0 B
date11 0 B
date12 0 B
date13 1 B date13-date10
date14 1 B
. . .
. . .
Data can go on with this pattern.
In this data, date is sorted ascendingly. Could you please let me now how should O come up with do loop to get the column range?
Thank you! Amy comments are welcome!
Jade
data want;
declare hash myhash();
rc = myhash.definekey('group_id');
rc = myhash.definedata('temp_date');
myhash.definedone();
do until(last.group_id);
set have;
by group_id;
if myhash.check() ne 0 and status=0 then
do;
temp_date=date;
myhash.add();
call missing(temp_date);
end;
else if status=1 and myhash.find()=0 then
do;
range=date-temp_date;
myhash.clear();
call missing(temp_date);
end;
output;
call missing(range);
end;
drop temp_date rc;
run;
It looks like this should do the trick:
data want;
set have;
by status notsorted;
if first.status then do;
if status=0 then begin_date = date;
else range = date - begin_date;
end;
retain begin_date;
drop begin_date;
run;
Thank you for the qucik reply. From the coding, I seems to get the following results (see range_got):
Date Status Group_ID Range (Want to get the value of this field) Range_got
date1 0 A
date2 0 A
date3 1 A date3-date1 date3-date2
date4 0 A
date5 1 A date5-date4 date5-date4
date6 1 A date6-date4
date7 1 A date7-date4
date8 1 B date8-date4
date9 1 B date9-date4
date10 0 B
date11 0 B
date12 0 B
date13 1 B date13-date10 date13-date12
date14 1 B date14-date12
At first glance, that doesn't seem possible. The code I posted wouldn't compute anything for most of the rows.
Since my solution still needs to be adjusted to account for changes in GROUP_ID, I can post a revised solution later. In the meantime, perhaps you could show the code that you actually ran that produced those calculations.
What is the role of groupId in your data?
The groupID is a higher level than status. There is a lot of groups (A, B, C, D, E,F, etc. groups) in the data, I forgot to mention in the post.
Within each group, there is status with 0 and 1.
The following is just a slight modification of @Astounding's suggested code to account for Group_ID:
data want; set have; by group_id status notsorted; if first.group_id then call missing(begin_date); if first.status then do; if status=0 then begin_date = date; else if status eq 1 and not missing(begin_date) then range = date - begin_date; end; retain begin_date; drop begin_date; run;
Art, CEO, AnalystFinder.com
data want;
declare hash myhash();
rc = myhash.definekey('group_id');
rc = myhash.definedata('temp_date');
myhash.definedone();
do until(last.group_id);
set have;
by group_id;
if myhash.check() ne 0 and status=0 then
do;
temp_date=date;
myhash.add();
call missing(temp_date);
end;
else if status=1 and myhash.find()=0 then
do;
range=date-temp_date;
myhash.clear();
call missing(temp_date);
end;
output;
call missing(range);
end;
drop temp_date rc;
run;
data have; input Date $ Status Group_ID $; cards; date1 0 A date2 0 A date3 1 A date4 0 A date5 1 A date6 1 A date7 1 A date8 1 B date9 1 B date10 0 B date11 0 B date12 0 B date13 1 B date14 1 B ; run; data temp; set have; by Group_ID; if first.Group_ID or lag(status)=1 then group+1; run; data want; set temp; by group; length first $ 100; retain first; if first.group then do;first=date;n=0;end; n+1; if last.group then range=catx('-',date,first); if n=1 then call missing(range); drop group first; run;
Thank you for your input!
Seems pretty straight forward to me. You will need to retain the starting date for the block to be able to subtract it out.
First let's make your example data a little more comcrete.
data have ;
infile cards dsd dlm='|' truncover ;
input group $ status date :yymmdd10. expect :$20. ;
format date yymmdd10. ;
cards;
A|0|2017-01-01|
A|0|2017-01-02|
A|1|2017-01-03|date3-date1=2
A|0|2017-01-04|
A|1|2017-01-05|date5-date4=1
A|1|2017-01-06|
A|1|2017-01-07|
B|1|2017-01-08|
B|1|2017-01-09|
B|0|2017-01-10|
B|0|2017-01-11|
B|0|2017-01-12|
B|1|2017-01-13|date13-date10=3
B|1|2017-01-14|
;
Now we just need to process by group and status, but tell SAS not to expect the data to really be sorted by those variables. That way we can find the first time that STATUS=0 or STATUS=1.
data want ;
set have ;
by group status notsorted ;
retain start_dt ;
if first.group then start_dt=.;
if status=0 and first.status then start_dt = date ;
if status=1 and first.status and not missing(start_dt) then want=date - start_dt ;
format start_dt yymmdd10. ;
run;
Join us for SAS Innovate 2025, our biggest and most exciting global event of the year, in Orlando, FL, from May 6-9.
Early bird rate extended! Save $200 when you sign up by March 31.
Need to connect to databases in SAS Viya? SAS’ David Ghan shows you two methods – via SAS/ACCESS LIBNAME and SAS Data Connector SASLIBS – in this video.
Find more tutorials on the SAS Users YouTube channel.