Hello!
I need some help to retrieve values in the "next" lines of a report to create some metrics, look, i have the FLAG_1 and FLAG_2.
When flag_1 and flag_2 are equal to 1 the ID received the delivery okay, finish date is equal "-" and MONTHS_TO_RECEIVE equal 0
When flag_1 = 1 and flag_2 = 2 the ID didn't received in the currently month, so the months_to_receive need to be fill with the DATE (i put in bold to help look from where i retrieve the value) of the receive, the date is the next time when flag_1 =2 and flag_2 = 1.
In my data i have filled date, flag_1, flag_2, id and flag, i need to create the finish date, any idea? I tried first, retain and last...
DATE | FLAG_1 | FLAG_2 | ID | FLAG | FINISH_DATE | MONTHS_TO_RECEIVE |
202201 | 1 | 1 | A | RECEIVED - OK | - | 0 |
202202 | 1 | 2 | A | N/RECEIVED - FIRST MONTH | 202204 | 2 |
202203 | 2 | 2 | A | N/RECEIVED - WAITING | - | - |
202204 | 2 | 1 | A | RECEIVED - LATE | - | - |
202205 | 1 | 1 | A | RECEIVED - OK | - | 0 |
202206 | 1 | 2 | A | N/RECEIVED - FIRST MONTH | 202209 | 3 |
202207 | 2 | 2 | A | N/RECEIVED - WAITING | - | - |
202208 | 2 | 2 | A | N/RECEIVED - WAITING | - | - |
202209 | 2 | 1 | A | RECEIVED - LATE | - | - |
202210 | 1 | 1 | A | RECEIVED - OK | - | 0 |
202211 | 1 | 2 | A | N/RECEIVED - FIRST MONTH | 202212 | 1 |
202212 | 2 | 2 | A | RECEIVED - LATE | - | - |
I tried:
DATA TEST2; SET TEST; RETAIN FINISH_DATE; IF FLAG_1 = 1 AND FLAG_2 = 2 THEN FINISH_DATE = FINISH_DATE; BY ID DATE; RUN;
But, the same ID can receive or not in other date's. Someone can help me?
data have; infile datalines dlm="|" dsd; input date flag1 flag2 id :$1. flag : $30. finish_date months_to_receive ; datalines; 202201|1|1|A|RECEIVED - OK|-|0 202202|1|2|A|N/RECEIVED - FIRST MONTH|202204|2 202203|2|2|A|N/RECEIVED - WAITING|-|- 202204|2|1|A|RECEIVED - LATE|-|- 202205|1|1|A|RECEIVED - OK|-|0 202206|1|2|A|N/RECEIVED - FIRST MONTH|202209|3 202207|2|2|A|N/RECEIVED - WAITING|-|- 202208|2|2|A|N/RECEIVED - WAITING|-|- 202209|2|1|A|RECEIVED - LATE|-|- 202210|1|1|A|RECEIVED - OK|-|0 202211|1|2|A|N/RECEIVED - FIRST MONTH|202212|1 202212|2|2|A|RECEIVED - LATE|-|- ;
As always: please post data in usable form, so that we have something to work with that meets the data you have. This is extra-important if dates are in the data.
See here an example of your data in usable form:
data have;
infile datalines dlm="|" dsd;
input
date :yymmn6.
flag1
flag2
id :$1.
flag : $30.
finish_date :yymmn6.
months_to_receive
;
format date finish_date yymmn6.;
datalines;
202201|1|1|A|RECEIVED - OK|.|0
202202|1|2|A|N/RECEIVED - FIRST MONTH|202204|2
;
It is assumed that date and finish_date are in fact SAS date values with proper formatting.
Expand on this, and post it; make corrections where necessary so that the resulting dataset has the same attributes as your original dataset.
NEVER (as in NEVER!!!!) store date values like this. Not even when hell freezes over so hard that the brimstone turns superconducting.
Always store date values as SAS dates, so you can make use of formats and functions provided by the SAS system.
data have;
infile datalines dlm="|" dsd;
input
date :yymmn6.
flag1
flag2
id :$1.
flag : $30.
;
formnat date yymmn6.;
datalines;
202201|1|1|A|RECEIVED - OK|-|0
202202|1|2|A|N/RECEIVED - FIRST MONTH|202204|2
202203|2|2|A|N/RECEIVED - WAITING|-|-
202204|2|1|A|RECEIVED - LATE|-|-
202205|1|1|A|RECEIVED - OK|-|0
202206|1|2|A|N/RECEIVED - FIRST MONTH|202209|3
202207|2|2|A|N/RECEIVED - WAITING|-|-
202208|2|2|A|N/RECEIVED - WAITING|-|-
202209|2|1|A|RECEIVED - LATE|-|-
202210|1|1|A|RECEIVED - OK|-|0
202211|1|2|A|N/RECEIVED - FIRST MONTH|202212|1
202212|2|2|A|RECEIVED - LATE|-|-
;
proc sql;
create table want as
select
t1.*,
case
when t1.flag1 = 1 and t1.flag2 = 2
then t2.date
else .
end as date_finished format=yymmn6.,
case
when t1.flag1 = 1 and t1.flag2 = 2
then intck('month',t1.date,t2.date)
else .
end as months_to_receive
from have t1 left join have t2
on t1.id = t2.id and t2.flag = "RECEIVED - LATE" and t1.date < t2.date
group by t1.id, t1.date
having t2.date = min(t2.date)
;
quit;
An idea:
The step is hardly tested:
data want;
set have;
if _n_= 1 then do;
if 0 then set lookup;
declare hash h(dataset: 'lookup');
h.defineKey('id', 'finish_date');
h.defineDone();
end;
months_to_receive = 0;
finish_date = .;
if flag1 = 1 and flag2 = 2 then do;
months_to_receive = 1;
do until (not missing(finish_date));
finish_date = intnx('month', date, months_to_receive, 'b');
put finish_date=;
if h.check () ^= 0 then do;
finish_date = .;
months_to_receive = months_to_receive + 1;
end;
end;
end;
run;
For the bottom row (date 202212), why is the value of flag2=2? I ask, because it is apparently the month of a delivery, which I understand should be signified by flag2=1.
Assuming that is an error (which I have corrected below), this code produces what you want.
data have;
infile datalines dlm="|" dsd;
input
date :yymmn6.
flag1
flag2
id :$1.
flag : $30.
;
format date yymmn6.;
datalines;
202201|1|1|A|RECEIVED - OK|-|0
202202|1|2|A|N/RECEIVED - FIRST MONTH|202204|2
202203|2|2|A|N/RECEIVED - WAITING|-|-
202204|2|1|A|RECEIVED - LATE|-|-
202205|1|1|A|RECEIVED - OK|-|0
202206|1|2|A|N/RECEIVED - FIRST MONTH|202209|3
202207|2|2|A|N/RECEIVED - WAITING|-|-
202208|2|2|A|N/RECEIVED - WAITING|-|-
202209|2|1|A|RECEIVED - LATE|-|-
202210|1|1|A|RECEIVED - OK|-|0
202211|1|2|A|N/RECEIVED - FIRST MONTH|202212|1
202212|2|1|A|RECEIVED - LATE|-|-
;
data want (drop=_:);
do _n=1 by 1 until (flag2=1 or last.id);
set have;
by id date;
end;
if flag2=1 then _delivery_date=date; format _delivery_date yymmn6.;
do _n_=1 to _n;
set have;
if flag1=1 and flag2=1 then finish_date=.S; /*Same month delivery*/
else if flag1=2 and flag2=2 then finish_date=.W ; /*Waiting for delivery*/
else if flag1=1 and flag2=2 then finish_date=_delivery_date;
else if flag1=2 and flag2=1 then finish_date=.L; /*Received Late*/
if finish_date=.S then months_to_receipt=0;
else if not missing(finish_date) then months_to_receipt=intck('month',date,finish_date);
output;
call missing(months_to_receipt);
end;
format finish_date yymmn6. ;
run;
This code read each set of records up to a successful delivery (or end of the ID), in order to establish a delivery date. It then rereads (and outputs) the same set of records, generating the variables finish_date and months_to_receipt as needed, using the flag values and the established delivery date.
Are you ready for the spotlight? We're accepting content ideas for SAS Innovate 2025 to be held May 6-9 in Orlando, FL. The call is open until September 25. Read more here about why you should contribute and what is in it for you!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.