BookmarkSubscribeRSS Feed
☑ This topic is solved. Need further help from the community? Please sign in and ask a new question.
dac_js
Quartz | Level 8

I have the data in following format:
data have;
input id date :date9. Supply drug $;
datalines;
1 03JAN19 90 AB
2 02JAN19 -90 AB
2 15FEB2019 30 AB
3 02JAN19 30 AB
3 05JAN19 -30 AB
3 30MAR19 30 AB
4 03APR19 60 AB
5 01JAN19 60 CD
5 02FEB19 60 CD
5 06FEB19 -60 CD
6 01MAR19 30 CD
6 01APR19 30 CD
7 01JAN19 60 CD
7 02FEB19 60 CD
7 06FEB19 -60 XY
8 01MAR19 30 CD
8 01APR19 30 CD
;

I need help with deleting negative supply of same drug in this data (the first and second observation for ID 1, the first and second observation for ID 3, second and third observation of ID 5 in this example, but not for second and third observation of ID 7), considering the second observations are actually a refund or reversal of the earlier observation (reversal within 7days of a visit).


Thanks in advance for the help.

1 ACCEPTED SOLUTION

Accepted Solutions
mkeintz
PROC Star

This version of the code includes a date range check.

 

data have;
input id date :date9. Supply drug $;
format date date9.;
datalines;
1 03JAN19 90 AB
1 02JAN19 -90 AB
2 15FEB2019 30 AB
3 02JAN19 30 AB
3 05JAN19 -30 AB
3 30MAR19 30 AB
4 03APR19 60 AB
5 01JAN19 60 CD
5 02FEB19 60 CD
5 06FEB19 -60 CD
6 01MAR19 30 CD
6 01APR19 30 CD
7 01JAN19 60 CD
7 02FEB19 60 CD
7 06FEB19 -60 XY
8 01MAR19 30 CD
8 01APR19 30 CD
;

data want (drop=_:);
  set have (keep=id drug);
  by id drug  notsorted;
  merge have 
        have (firstobs=2 keep=supply date rename=(date=_nxt_date supply=_nxt_supply));

  retain _delete_count 0 ;
  if last.drug=0 and supply>0 and -1*supply=_nxt_supply and _nxt_date<=date+7 then _delete_count=2;

  if _delete_count=0 then output;
  else _delete_count=_delete_count-1;
run;

This program assumes data are grouped by ID and DRUG, and are sorted by ascending date within each ID/DRUG group.  I take it that obs 2, had the wrong date, but I left it there.

--------------------------
The hash OUTPUT method will overwrite a SAS data set, but not append. That can be costly. Consider voting for Add a HASH object method which would append a hash object to an existing SAS data set

Would enabling PROC SORT to simultaneously output multiple datasets be useful? Then vote for
Allow PROC SORT to output multiple datasets

--------------------------

View solution in original post

7 REPLIES 7
mkeintz
PROC Star

@dac_js wrote:

I have the data in following format:
data have;
input id date :date9. Supply drug $;
datalines;
1 03JAN19 90 AB
2 02JAN19 -90 AB
2 15FEB2019 30 AB
3 02JAN19 30 AB
3 05JAN19 -30 AB
3 30MAR19 30 AB
4 03APR19 60 AB
5 01JAN19 60 CD
5 02FEB19 60 CD
5 06FEB19 -60 CD
6 01MAR19 30 CD
6 01APR19 30 CD
7 01JAN19 60 CD
7 02FEB19 60 CD
7 06FEB19 -60 XY
8 01MAR19 30 CD
8 01APR19 30 CD
;

I need help with deleting negative supply of same drug in this data (the first and second observation for ID 1, the first and second observation for ID 3, second and third observation of ID 5 in this example, but not for second and third observation of ID 7), considering the second observations are actually a refund or reversal of the earlier observation (reversal within 7days of a visit).


Thanks in advance for the help.


But there is no "second observation for ID 1" in your sample data.  I presume the second obs in your sample should have ID=1  (not ID=2).

 

Then:

 

data have;
input id date :date9. Supply drug $;
format date date9.;
datalines;
1 03JAN19 90 AB
1 02JAN19 -90 AB
2 15FEB2019 30 AB
3 02JAN19 30 AB
3 05JAN19 -30 AB
3 30MAR19 30 AB
4 03APR19 60 AB
5 01JAN19 60 CD
5 02FEB19 60 CD
5 06FEB19 -60 CD
6 01MAR19 30 CD
6 01APR19 30 CD
7 01JAN19 60 CD
7 02FEB19 60 CD
7 06FEB19 -60 XY
8 01MAR19 30 CD
8 01APR19 30 CD
;

data want (drop=_:);
  set have (keep=id drug);
  by id drug  notsorted;
  merge have have (firstobs=2 keep=supply rename=(supply=_nxt_supply));

  retain _matches_prior_obs 0 ;
  /*If _matches prior obs or matches next obs, then do two things*/
  if _matches_prior_obs=1 or (-1*supply=_nxt_supply and last.drug=0) then do;
    delete;
    _matches_prior_obs=ifn(_matches_prior_obs=1,0,1);
  end;

run;

The program assumes that the matching pairs are always adjacent observations.  

--------------------------
The hash OUTPUT method will overwrite a SAS data set, but not append. That can be costly. Consider voting for Add a HASH object method which would append a hash object to an existing SAS data set

Would enabling PROC SORT to simultaneously output multiple datasets be useful? Then vote for
Allow PROC SORT to output multiple datasets

--------------------------
dac_js
Quartz | Level 8
Thank you for your suggestion. I tried it, but it didn't work as expected. I still have the negative values. Could you possibly offer any further guidance or an alternative solution? Thanks again for your help.
Patrick
Opal | Level 21

@dac_js I suggest to first try and create sample data that's fully representative for your real data. 

 

I assume the first two rows you've posted..

Patrick_0-1700355775791.png

...should actually look like:

Patrick_1-1700355829150.png

Could there also be additional scenarios in your real data? Like a single correction for two wrong entries?

Patrick_2-1700355933763.png

...or is it possible that you've got a single summary correction for two wrong entries?

 

Are corrections always after the fact or is it possible in your real data that a correction has been entered by mistake (a minus value) that then gets fixed with a positive value at a later date?

....and so on...

 

In order to provide code logic that will work for all your cases you need to provide sample data that reflects all the possible cases in your real data and of course you also need to tell us how to treat these cases.

Below some sample code that creates data for additional scenarios. Please amend this data as required so it reflects what you're really dealing with.

data have;
  input id date :date9. Supply drug $;
  format date date9.;
datalines;
1 02JAN19 90 AB
1 03JAN19 -90 AB
2 15FEB2019 30 AB
3 02JAN19 30 AB
3 05JAN19 -30 AB
3 30MAR19 30 AB
4 03APR19 60 AB
5 01JAN19 60 CD
5 02FEB19 60 CD
5 06FEB19 -60 CD
6 01MAR19 30 CD
6 01APR19 30 CD
7 01JAN19 60 CD
7 02FEB19 60 CD
7 06FEB19 -60 XY
8 01MAR19 30 CD
8 01APR19 30 CD
09 01apr19 -10 aa
09 02apr19 20  aa
10 01apr19 10  bb
10 02apr19 20  bb
10 03apr19 -10 bb
11 01apr19 10  cc
11 02apr19 10  cc
11 03apr19 -20 cc
12 02jan19 -90 dd
12 03jan19 90  dd
;

 

dac_js
Quartz | Level 8

Thank you very much for your response! You are right about the first two rows of sample data 

The cases I provided is more common but it is possible to have cases like this:

11 01apr19 10  cc
11 02apr19 10  cc
11 03apr19 -20 cc

The following type we do not need to deal with:

12 02jan19 -90 dd
12 03jan19 90  dd

 

 

Patrick
Opal | Level 21

Test if below does it for you.

data have;
  input id date :date9. Supply drug $;
  format date date9.;
datalines;
1 01JAN19 30 AB
1 02JAN19 90 AB
1 03JAN19 -100 AB
1 04JAN19 -10 AB
2 15FEB2019 30 AB
3 02JAN19 30 AB
3 05JAN19 40 AB
3 30MAR19 30 AB
4 03APR19 60 AB
5 01JAN19 60 CD
5 02FEB19 60 CD
5 06FEB19 -60 CD
6 01MAR19 30 CD
6 01APR19 30 CD
7 01JAN19 60 CD
7 02FEB19 60 CD
7 06FEB19 -60 XY
8 01MAR19 30 CD
8 01APR19 30 CD
10 01apr19 10  bb
10 02apr19 20  bb
10 03apr19 -10 bb
11 01apr19 10  cc
11 02apr19 10  cc
11 03apr19 -20 cc
;

proc sort data=have out=inter;
  by id drug descending date descending Supply;
run;

data want(drop=_:);
  set inter;
  by id drug;
  retain _correction;
  if first.drug then _correction=0;

  if supply<0 then _correction=sum(_correction,supply);
  else
    do;
      _Supply_adjusted=sum(supply,_correction);
      _correction      =min(sum(_correction,supply),0);
      if _Supply_adjusted>0 then 
        do;
          supply=_Supply_adjusted;
          output;
        end;
    end;
run;

proc print data=want;
run;
mkeintz
PROC Star

This version of the code includes a date range check.

 

data have;
input id date :date9. Supply drug $;
format date date9.;
datalines;
1 03JAN19 90 AB
1 02JAN19 -90 AB
2 15FEB2019 30 AB
3 02JAN19 30 AB
3 05JAN19 -30 AB
3 30MAR19 30 AB
4 03APR19 60 AB
5 01JAN19 60 CD
5 02FEB19 60 CD
5 06FEB19 -60 CD
6 01MAR19 30 CD
6 01APR19 30 CD
7 01JAN19 60 CD
7 02FEB19 60 CD
7 06FEB19 -60 XY
8 01MAR19 30 CD
8 01APR19 30 CD
;

data want (drop=_:);
  set have (keep=id drug);
  by id drug  notsorted;
  merge have 
        have (firstobs=2 keep=supply date rename=(date=_nxt_date supply=_nxt_supply));

  retain _delete_count 0 ;
  if last.drug=0 and supply>0 and -1*supply=_nxt_supply and _nxt_date<=date+7 then _delete_count=2;

  if _delete_count=0 then output;
  else _delete_count=_delete_count-1;
run;

This program assumes data are grouped by ID and DRUG, and are sorted by ascending date within each ID/DRUG group.  I take it that obs 2, had the wrong date, but I left it there.

--------------------------
The hash OUTPUT method will overwrite a SAS data set, but not append. That can be costly. Consider voting for Add a HASH object method which would append a hash object to an existing SAS data set

Would enabling PROC SORT to simultaneously output multiple datasets be useful? Then vote for
Allow PROC SORT to output multiple datasets

--------------------------
dac_js
Quartz | Level 8

Thank you! Your code is working.

SAS Innovate 2025: Save the Date

 SAS Innovate 2025 is scheduled for May 6-9 in Orlando, FL. Sign up to be first to learn about the agenda and registration!

Save the date!

Health and Life Sciences Learning

 

Need courses to help you with SAS Life Sciences Analytics Framework, SAS Health Cohort Builder, or other topics? Check out the Health and Life Sciences learning path for all of the offerings.

LEARN MORE

Discussion stats
  • 7 replies
  • 2860 views
  • 1 like
  • 3 in conversation