I have the data in following format:
ID | Date | Supply | Pay | Visit |
1 | 03JAN19 | 90 | 90 | 1 |
2 | 02JAN19 | 30 | 30 | 1 |
2 | 15FEB2019 | 30 | 30 | 2 |
3 | 02JAN19 | 30 | 30 | 1 |
3 | 05JAN19 | -30 | -30 | 2 |
3 | 30MAR19 | 30 | 30 | 3 |
4 | 03APR19 | 60 | 60 | 1 |
5 | 01JAN19 | 60 | 60 | 1 |
5 | 02FEB19 | 60 | 60 | 2 |
5 | 06FEB19 | -60 | -60 | 3 |
6 | 01MAR19 | 30 | 30 | 1 |
6 | 01APR19 | 30 | 30 | 2 |
I want to delete negative claims dollar amount and negative supply in this data (the FIRST AND SECOND VISIT of ID 3 and SECOND AND THIRD VISIT of ID 5 in this example), considering the second observations are actually a refund or reversal of the earlier observation (reversal within 7days of a visit).
Thanks in advance for the help.
data a1;
input ID date date9. Supply Pay Visit;
format date date9.;
datalines;
1 03JAN19 90 90 1
2 02JAN19 30 30 1
2 15FEB2019 30 30 2
3 02JAN19 30 30 1
3 05JAN19 -30 -30 2
3 30MAR19 30 30 3
4 03APR19 60 60 1
5 01JAN19 60 60 1
5 02FEB19 60 60 2
5 06FEB19 -60 -60 3
6 01MAR19 30 30 1
6 01APR19 30 30 2
;
This should do it
data a1;
input ID date :date9. Supply Pay Visit;
format date date9.;
datalines;
1 03JAN19 90 90 1
2 02JAN19 30 30 1
2 15FEB2019 30 30 2
3 02JAN19 30 30 1
3 05JAN19 -30 -30 2
3 30MAR19 30 30 3
4 03APR19 60 60 1
5 01JAN19 60 60 1
5 02FEB19 60 60 2
5 06FEB19 -60 -60 3
6 01MAR19 30 30 1
6 01APR19 30 30 2
;
data want (drop = v d rc);
dcl hash h (multidata : "Y");
h.definekey("ID", "Supply", "Pay");
h.definedata("d", "V");
h.definedone();
array r {9999} _temporary_ (9999 * .);
do until (last.ID);
set a1;
by ID;
v = Visit;
d = date;
h.add();
end;
v = .;
d = .;
do until (last.ID);
set a1;
by ID;
if r[visit] then continue;
rc = h.find(key : ID, key : -Supply, key : -Pay);
if rc = 0 and date + 7 > d and r[v] = . then do;
r[v] = 1;
r[visit] = 1;
continue;
end;
output;
end;
h.clear();
call missing(of r[*]);
format d date9.;
run;
Result:
ID date Supply Pay Visit 1 03JAN2019 90 90 1 2 02JAN2019 30 30 1 2 15FEB2019 30 30 2 3 30MAR2019 30 30 3 4 03APR2019 60 60 1 5 01JAN2019 60 60 1 6 01MAR2019 30 30 1 6 01APR2019 30 30 2
There can be an observation (or more) between the highlighted observations, correct?
This should do it
data a1;
input ID date :date9. Supply Pay Visit;
format date date9.;
datalines;
1 03JAN19 90 90 1
2 02JAN19 30 30 1
2 15FEB2019 30 30 2
3 02JAN19 30 30 1
3 05JAN19 -30 -30 2
3 30MAR19 30 30 3
4 03APR19 60 60 1
5 01JAN19 60 60 1
5 02FEB19 60 60 2
5 06FEB19 -60 -60 3
6 01MAR19 30 30 1
6 01APR19 30 30 2
;
data want (drop = v d rc);
dcl hash h (multidata : "Y");
h.definekey("ID", "Supply", "Pay");
h.definedata("d", "V");
h.definedone();
array r {9999} _temporary_ (9999 * .);
do until (last.ID);
set a1;
by ID;
v = Visit;
d = date;
h.add();
end;
v = .;
d = .;
do until (last.ID);
set a1;
by ID;
if r[visit] then continue;
rc = h.find(key : ID, key : -Supply, key : -Pay);
if rc = 0 and date + 7 > d and r[v] = . then do;
r[v] = 1;
r[visit] = 1;
continue;
end;
output;
end;
h.clear();
call missing(of r[*]);
format d date9.;
run;
Result:
ID date Supply Pay Visit 1 03JAN2019 90 90 1 2 02JAN2019 30 30 1 2 15FEB2019 30 30 2 3 30MAR2019 30 30 3 4 03APR2019 60 60 1 5 01JAN2019 60 60 1 6 01MAR2019 30 30 1 6 01APR2019 30 30 2
Anytime 🙂
Hi @dac_js Good morning,In our Bank especially in commercial cards division, these kind of situations has been increasingly common. If I understand you correctly, the SIGN function can be a great utility. Since, a reversal in a transaction would be a subsequent to the +VE, we could perceive as a "Forward looking" logic. What this means is the reversal is checked against the latest positive transaction for the same amount.
data a1;
input ID date :date9. Supply Pay Visit;
format date date9.;
datalines;
1 03JAN19 90 90 1
2 02JAN19 30 30 1
2 15FEB2019 30 30 2
3 02JAN19 30 30 1
3 05JAN19 -30 -30 2
3 30MAR19 30 30 3
4 03APR19 60 60 1
5 01JAN19 60 60 1
5 02FEB19 60 60 2
5 06FEB19 -60 -60 3
6 01MAR19 30 30 1
6 01APR19 30 30 2
;
data want;
if _n_=1 then do;
dcl hash h();
h.definekey('supply');
h.definedata('_iorc_');
h.definedone();
end;
array t(-138061:2936547) _temporary_;
call missing(of t(*),_iorc_);
do _n_=1 by 1 until(last.id);
set a1;
by id;
_iorc_=date;
if sign(pay)=1 then h.replace();
else if h.find(key:abs(supply))=0 then if intck('day',_iorc_,date)<=7 then do;
t(date)=date;
t(_iorc_)=_iorc_;
end;
end;
do _n_=1 to _n_;
set a1;
if t(date) then continue;
output;
end;
h.clear();
run;
Of course you have a working solution, however lately I am deep diving to make some business sense for my own personal learning. Therefore, you could care less. Cheers!
Are you ready for the spotlight? We're accepting content ideas for SAS Innovate 2025 to be held May 6-9 in Orlando, FL. The call is open until September 25. Read more here about why you should contribute and what is in it for you!
Need courses to help you with SAS Life Sciences Analytics Framework, SAS Health Cohort Builder, or other topics? Check out the Health and Life Sciences learning path for all of the offerings.