Having some programming issues. I would like to take a table that has information populated for the first ID, and populate the rest of the observations for the ID according to specific conditions.
1. I would like to populate REV_DATE_LAST with the most recent REV_DATE_LAST by ID, unless the prior observation included Mark=1.
2. I would then like to calculate REV_AGE as YEARMO - REV_DATE_LAST.
3. Mark should =1 if REV_AGE is GE 12 unless FLAG=1
The intent is to review each ID every 12 months, unless they are flagged, then the review should take place once they are no longer flagged.
I have tried many different potential solutions with no avail including lags, retains, counts, etc.
HAVE:
ID | YEARMO | REV_DATE_LAST | FLAG | REV_AGE | MARK |
1 | 201803 | 201608 | 22 | 1 | |
1 | 201804 | ||||
1 | 201805 | ||||
1 | 201806 | ||||
1 | 201807 | ||||
1 | 201808 | ||||
1 | 201809 | ||||
1 | 201810 | ||||
1 | 201811 | ||||
1 | 201812 | ||||
1 | 201901 | ||||
1 | 201902 | ||||
1 | 201903 | ||||
1 | 201904 | ||||
1 | 201905 | ||||
1 | 201906 | ||||
2 | 201803 | 201607 | 1 | 21 | |
2 | 201804 | 1 | |||
2 | 201805 | 1 | |||
2 | 201806 | 1 | |||
2 | 201807 | 1 | |||
2 | 201808 | 1 | |||
2 | 201809 | 1 | |||
2 | 201810 | 1 | |||
2 | 201811 | 1 | |||
2 | 201812 | 1 | |||
2 | 201901 | 1 | |||
2 | 201902 | 1 | |||
2 | 201903 | ||||
2 | 201904 | ||||
2 | 201905 | ||||
2 | 201906 | 1 | |||
2 | 201907 | 1 | |||
2 | 201908 | 1 | |||
2 | 201909 | 1 | |||
2 | 201910 | 1 | |||
2 | 201911 | 1 | |||
2 | 201912 | 1 | |||
2 | 202001 | 1 | |||
2 | 202002 | 1 | |||
2 | 202003 | ||||
2 | 202004 | ||||
2 | 202005 |
WANT:
ID | YEARMO | REV_DATE_LAST | FLAG | REV_AGE | MARK |
1 | 201803 | 201608 | 22 | 1 | |
1 | 201804 | 201804 | 0 | . | |
1 | 201805 | 201804 | 1 | . | |
1 | 201806 | 201804 | 2 | . | |
1 | 201807 | 201804 | 3 | . | |
1 | 201808 | 201804 | 4 | . | |
1 | 201809 | 201804 | 5 | . | |
1 | 201810 | 201804 | 6 | . | |
1 | 201811 | 201804 | 7 | . | |
1 | 201812 | 201804 | 8 | . | |
1 | 201901 | 201804 | 9 | . | |
1 | 201902 | 201804 | 10 | . | |
1 | 201903 | 201804 | 11 | . | |
1 | 201904 | 201804 | 12 | 1 | |
1 | 201905 | 201905 | 0 | . | |
1 | 201906 | 201905 | 1 | . | |
2 | 201803 | 201607 | 1 | 21 | . |
2 | 201804 | 201607 | 1 | 22 | . |
2 | 201805 | 201607 | 1 | 23 | . |
2 | 201806 | 201607 | 1 | 24 | . |
2 | 201807 | 201607 | 1 | 25 | . |
2 | 201808 | 201607 | 1 | 26 | . |
2 | 201809 | 201607 | 1 | 27 | . |
2 | 201810 | 201607 | 1 | 28 | . |
2 | 201811 | 201607 | 1 | 29 | . |
2 | 201812 | 201607 | 1 | 30 | . |
2 | 201901 | 201607 | 1 | 31 | . |
2 | 201902 | 201607 | 1 | 32 | . |
2 | 201903 | 201607 | 33 | 1 | |
2 | 201904 | 201904 | 0 | . | |
2 | 201905 | 201904 | 1 | . | |
2 | 201906 | 201904 | 1 | 2 | . |
2 | 201907 | 201904 | 1 | 3 | . |
2 | 201908 | 201904 | 1 | 4 | . |
2 | 201909 | 201904 | 1 | 5 | . |
2 | 201910 | 201904 | 1 | 6 | . |
2 | 201911 | 201904 | 1 | 7 | . |
2 | 201912 | 201904 | 1 | 8 | . |
2 | 202001 | 201904 | 1 | 9 | . |
2 | 202002 | 201904 | 1 | 10 | . |
2 | 202003 | 201904 | 11 | . | |
2 | 202004 | 201904 | 12 | 1 | |
2 | 202005 | 202005 | 0 |
Your initial rev_age values for both groups are wrong, given the values of yearmo and rev_date_last.
But see this code:
data have;
infile datalines dlm="09"x dsd truncover;
input ID YEARMO :yymmn6. REV_DATE_LAST :yymmn6. FLAG REV_AGE MARK;
format yearmo rev_date_last yymmn6.;
datalines;
1 201803 201608 22 1
1 201804
1 201805
1 201806
1 201807
1 201808
1 201809
1 201810
1 201811
1 201812
1 201901
1 201902
1 201903
1 201904
1 201905
1 201906
2 201803 201607 1 21
2 201804 1
2 201805 1
2 201806 1
2 201807 1
2 201808 1
2 201809 1
2 201810 1
2 201811 1
2 201812 1
2 201901 1
2 201902 1
2 201903
2 201904
2 201905
2 201906 1
2 201907 1
2 201908 1
2 201909 1
2 201910 1
2 201911 1
2 201912 1
2 202001 1
2 202002 1
2 202003
2 202004
2 202005
;
data want;
set have (rename=(rev_date_last=_rev_date_last mark=_mark));
by id;
retain rev_date_last mark;
format rev_date_last yymmn6.;
if first.id
then do;
rev_date_last = _rev_date_last;
mark = _mark;
end;
else do;
if mark
then do;
rev_date_last = yearmo;
mark = .;
end;
end;
rev_age = intck('month',rev_date_last,yearmo);
if not flag and rev_age ge 12 then mark = 1;
drop _:;
run;
Your initial rev_age values for both groups are wrong, given the values of yearmo and rev_date_last.
But see this code:
data have;
infile datalines dlm="09"x dsd truncover;
input ID YEARMO :yymmn6. REV_DATE_LAST :yymmn6. FLAG REV_AGE MARK;
format yearmo rev_date_last yymmn6.;
datalines;
1 201803 201608 22 1
1 201804
1 201805
1 201806
1 201807
1 201808
1 201809
1 201810
1 201811
1 201812
1 201901
1 201902
1 201903
1 201904
1 201905
1 201906
2 201803 201607 1 21
2 201804 1
2 201805 1
2 201806 1
2 201807 1
2 201808 1
2 201809 1
2 201810 1
2 201811 1
2 201812 1
2 201901 1
2 201902 1
2 201903
2 201904
2 201905
2 201906 1
2 201907 1
2 201908 1
2 201909 1
2 201910 1
2 201911 1
2 201912 1
2 202001 1
2 202002 1
2 202003
2 202004
2 202005
;
data want;
set have (rename=(rev_date_last=_rev_date_last mark=_mark));
by id;
retain rev_date_last mark;
format rev_date_last yymmn6.;
if first.id
then do;
rev_date_last = _rev_date_last;
mark = _mark;
end;
else do;
if mark
then do;
rev_date_last = yearmo;
mark = .;
end;
end;
rev_age = intck('month',rev_date_last,yearmo);
if not flag and rev_age ge 12 then mark = 1;
drop _:;
run;
Kurt,
Brilliant solution. Thank you for posting! Have a great weekend.
- adschiltz
Build your skills. Make connections. Enjoy creative freedom. Maybe change the world. Registration is now open through August 30th. Visit the SAS Hackathon homepage.
Register today!Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.