BookmarkSubscribeRSS Feed
☑ This topic is solved. Need further help from the community? Please sign in and ask a new question.
dac_js
Quartz | Level 8

I have the data in following format:
data have;
input id date :date9. Supply drug $;
datalines;
1 03JAN19 90 AB
2 02JAN19 -90 AB
2 15FEB2019 30 AB
3 02JAN19 30 AB
3 05JAN19 -30 AB
3 30MAR19 30 AB
4 03APR19 60 AB
5 01JAN19 60 CD
5 02FEB19 60 CD
5 06FEB19 -60 CD
6 01MAR19 30 CD
6 01APR19 30 CD
7 01JAN19 60 CD
7 02FEB19 60 CD
7 06FEB19 -60 XY
8 01MAR19 30 CD
8 01APR19 30 CD
;

I need help with deleting negative supply of same drug in this data (the first and second observation for ID 1, the first and second observation for ID 3, second and third observation of ID 5 in this example, but not for second and third observation of ID 7), considering the second observations are actually a refund or reversal of the earlier observation (reversal within 7days of a visit).


Thanks in advance for the help.

1 ACCEPTED SOLUTION

Accepted Solutions
mkeintz
PROC Star

This version of the code includes a date range check.

 

data have;
input id date :date9. Supply drug $;
format date date9.;
datalines;
1 03JAN19 90 AB
1 02JAN19 -90 AB
2 15FEB2019 30 AB
3 02JAN19 30 AB
3 05JAN19 -30 AB
3 30MAR19 30 AB
4 03APR19 60 AB
5 01JAN19 60 CD
5 02FEB19 60 CD
5 06FEB19 -60 CD
6 01MAR19 30 CD
6 01APR19 30 CD
7 01JAN19 60 CD
7 02FEB19 60 CD
7 06FEB19 -60 XY
8 01MAR19 30 CD
8 01APR19 30 CD
;

data want (drop=_:);
  set have (keep=id drug);
  by id drug  notsorted;
  merge have 
        have (firstobs=2 keep=supply date rename=(date=_nxt_date supply=_nxt_supply));

  retain _delete_count 0 ;
  if last.drug=0 and supply>0 and -1*supply=_nxt_supply and _nxt_date<=date+7 then _delete_count=2;

  if _delete_count=0 then output;
  else _delete_count=_delete_count-1;
run;

This program assumes data are grouped by ID and DRUG, and are sorted by ascending date within each ID/DRUG group.  I take it that obs 2, had the wrong date, but I left it there.

--------------------------
The hash OUTPUT method will overwrite a SAS data set, but not append. That can be costly. Consider voting for Add a HASH object method which would append a hash object to an existing SAS data set

Would enabling PROC SORT to simultaneously output multiple datasets be useful? Then vote for
Allow PROC SORT to output multiple datasets

--------------------------

View solution in original post

7 REPLIES 7
mkeintz
PROC Star

@dac_js wrote:

I have the data in following format:
data have;
input id date :date9. Supply drug $;
datalines;
1 03JAN19 90 AB
2 02JAN19 -90 AB
2 15FEB2019 30 AB
3 02JAN19 30 AB
3 05JAN19 -30 AB
3 30MAR19 30 AB
4 03APR19 60 AB
5 01JAN19 60 CD
5 02FEB19 60 CD
5 06FEB19 -60 CD
6 01MAR19 30 CD
6 01APR19 30 CD
7 01JAN19 60 CD
7 02FEB19 60 CD
7 06FEB19 -60 XY
8 01MAR19 30 CD
8 01APR19 30 CD
;

I need help with deleting negative supply of same drug in this data (the first and second observation for ID 1, the first and second observation for ID 3, second and third observation of ID 5 in this example, but not for second and third observation of ID 7), considering the second observations are actually a refund or reversal of the earlier observation (reversal within 7days of a visit).


Thanks in advance for the help.


But there is no "second observation for ID 1" in your sample data.  I presume the second obs in your sample should have ID=1  (not ID=2).

 

Then:

 

data have;
input id date :date9. Supply drug $;
format date date9.;
datalines;
1 03JAN19 90 AB
1 02JAN19 -90 AB
2 15FEB2019 30 AB
3 02JAN19 30 AB
3 05JAN19 -30 AB
3 30MAR19 30 AB
4 03APR19 60 AB
5 01JAN19 60 CD
5 02FEB19 60 CD
5 06FEB19 -60 CD
6 01MAR19 30 CD
6 01APR19 30 CD
7 01JAN19 60 CD
7 02FEB19 60 CD
7 06FEB19 -60 XY
8 01MAR19 30 CD
8 01APR19 30 CD
;

data want (drop=_:);
  set have (keep=id drug);
  by id drug  notsorted;
  merge have have (firstobs=2 keep=supply rename=(supply=_nxt_supply));

  retain _matches_prior_obs 0 ;
  /*If _matches prior obs or matches next obs, then do two things*/
  if _matches_prior_obs=1 or (-1*supply=_nxt_supply and last.drug=0) then do;
    delete;
    _matches_prior_obs=ifn(_matches_prior_obs=1,0,1);
  end;

run;

The program assumes that the matching pairs are always adjacent observations.  

--------------------------
The hash OUTPUT method will overwrite a SAS data set, but not append. That can be costly. Consider voting for Add a HASH object method which would append a hash object to an existing SAS data set

Would enabling PROC SORT to simultaneously output multiple datasets be useful? Then vote for
Allow PROC SORT to output multiple datasets

--------------------------
dac_js
Quartz | Level 8
Thank you for your suggestion. I tried it, but it didn't work as expected. I still have the negative values. Could you possibly offer any further guidance or an alternative solution? Thanks again for your help.
Patrick
Opal | Level 21

@dac_js I suggest to first try and create sample data that's fully representative for your real data. 

 

I assume the first two rows you've posted..

Patrick_0-1700355775791.png

...should actually look like:

Patrick_1-1700355829150.png

Could there also be additional scenarios in your real data? Like a single correction for two wrong entries?

Patrick_2-1700355933763.png

...or is it possible that you've got a single summary correction for two wrong entries?

 

Are corrections always after the fact or is it possible in your real data that a correction has been entered by mistake (a minus value) that then gets fixed with a positive value at a later date?

....and so on...

 

In order to provide code logic that will work for all your cases you need to provide sample data that reflects all the possible cases in your real data and of course you also need to tell us how to treat these cases.

Below some sample code that creates data for additional scenarios. Please amend this data as required so it reflects what you're really dealing with.

data have;
  input id date :date9. Supply drug $;
  format date date9.;
datalines;
1 02JAN19 90 AB
1 03JAN19 -90 AB
2 15FEB2019 30 AB
3 02JAN19 30 AB
3 05JAN19 -30 AB
3 30MAR19 30 AB
4 03APR19 60 AB
5 01JAN19 60 CD
5 02FEB19 60 CD
5 06FEB19 -60 CD
6 01MAR19 30 CD
6 01APR19 30 CD
7 01JAN19 60 CD
7 02FEB19 60 CD
7 06FEB19 -60 XY
8 01MAR19 30 CD
8 01APR19 30 CD
09 01apr19 -10 aa
09 02apr19 20  aa
10 01apr19 10  bb
10 02apr19 20  bb
10 03apr19 -10 bb
11 01apr19 10  cc
11 02apr19 10  cc
11 03apr19 -20 cc
12 02jan19 -90 dd
12 03jan19 90  dd
;

 

dac_js
Quartz | Level 8

Thank you very much for your response! You are right about the first two rows of sample data 

The cases I provided is more common but it is possible to have cases like this:

11 01apr19 10  cc
11 02apr19 10  cc
11 03apr19 -20 cc

The following type we do not need to deal with:

12 02jan19 -90 dd
12 03jan19 90  dd

 

 

Patrick
Opal | Level 21

Test if below does it for you.

data have;
  input id date :date9. Supply drug $;
  format date date9.;
datalines;
1 01JAN19 30 AB
1 02JAN19 90 AB
1 03JAN19 -100 AB
1 04JAN19 -10 AB
2 15FEB2019 30 AB
3 02JAN19 30 AB
3 05JAN19 40 AB
3 30MAR19 30 AB
4 03APR19 60 AB
5 01JAN19 60 CD
5 02FEB19 60 CD
5 06FEB19 -60 CD
6 01MAR19 30 CD
6 01APR19 30 CD
7 01JAN19 60 CD
7 02FEB19 60 CD
7 06FEB19 -60 XY
8 01MAR19 30 CD
8 01APR19 30 CD
10 01apr19 10  bb
10 02apr19 20  bb
10 03apr19 -10 bb
11 01apr19 10  cc
11 02apr19 10  cc
11 03apr19 -20 cc
;

proc sort data=have out=inter;
  by id drug descending date descending Supply;
run;

data want(drop=_:);
  set inter;
  by id drug;
  retain _correction;
  if first.drug then _correction=0;

  if supply<0 then _correction=sum(_correction,supply);
  else
    do;
      _Supply_adjusted=sum(supply,_correction);
      _correction      =min(sum(_correction,supply),0);
      if _Supply_adjusted>0 then 
        do;
          supply=_Supply_adjusted;
          output;
        end;
    end;
run;

proc print data=want;
run;
mkeintz
PROC Star

This version of the code includes a date range check.

 

data have;
input id date :date9. Supply drug $;
format date date9.;
datalines;
1 03JAN19 90 AB
1 02JAN19 -90 AB
2 15FEB2019 30 AB
3 02JAN19 30 AB
3 05JAN19 -30 AB
3 30MAR19 30 AB
4 03APR19 60 AB
5 01JAN19 60 CD
5 02FEB19 60 CD
5 06FEB19 -60 CD
6 01MAR19 30 CD
6 01APR19 30 CD
7 01JAN19 60 CD
7 02FEB19 60 CD
7 06FEB19 -60 XY
8 01MAR19 30 CD
8 01APR19 30 CD
;

data want (drop=_:);
  set have (keep=id drug);
  by id drug  notsorted;
  merge have 
        have (firstobs=2 keep=supply date rename=(date=_nxt_date supply=_nxt_supply));

  retain _delete_count 0 ;
  if last.drug=0 and supply>0 and -1*supply=_nxt_supply and _nxt_date<=date+7 then _delete_count=2;

  if _delete_count=0 then output;
  else _delete_count=_delete_count-1;
run;

This program assumes data are grouped by ID and DRUG, and are sorted by ascending date within each ID/DRUG group.  I take it that obs 2, had the wrong date, but I left it there.

--------------------------
The hash OUTPUT method will overwrite a SAS data set, but not append. That can be costly. Consider voting for Add a HASH object method which would append a hash object to an existing SAS data set

Would enabling PROC SORT to simultaneously output multiple datasets be useful? Then vote for
Allow PROC SORT to output multiple datasets

--------------------------
dac_js
Quartz | Level 8

Thank you! Your code is working.

sas-innovate-2024.png

Don't miss out on SAS Innovate - Register now for the FREE Livestream!

Can't make it to Vegas? No problem! Watch our general sessions LIVE or on-demand starting April 17th. Hear from SAS execs, best-selling author Adam Grant, Hot Ones host Sean Evans, top tech journalist Kara Swisher, AI expert Cassie Kozyrkov, and the mind-blowing dance crew iLuminate! Plus, get access to over 20 breakout sessions.

 

Register now!

New Learning Events in April

 

Join us for two new fee-based courses: Administrative Healthcare Data and SAS via Live Web Monday-Thursday, April 24-27 from 1:00 to 4:30 PM ET each day. And Administrative Healthcare Data and SAS: Hands-On Programming Workshop via Live Web on Friday, April 28 from 9:00 AM to 5:00 PM ET.

LEARN MORE

Discussion stats
  • 7 replies
  • 1256 views
  • 1 like
  • 3 in conversation