SAS Programming

DATA Step, Macro, Functions and more
BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
kbinan
Obsidian | Level 7

Good afternoon!

I am trying to ultimately create a single record to represent an Inpatient Hospital stay from detailed claims data which has an Admit and Discharge date that spans the entirety each event. 

I created Conf_Claim which tells me this is the same person and hospital all submitted on a single claim.  I also created Conf_Prov which tells me it’s the same person and hospital, regardless of claim since there can be multiple.

First Pass to create IP_Start and IP_End worked as expected:

proc sort data=work.CONF_DATA1;
by CONF_CLM FST_SRVC_DT;
run;


DATA WORK.conf_data2;
do _n_=1 by 1 until(last.CONF_CLM);
SET WORK.CONF_DATA1; 
BY CONF_CLM;
IP_START = min(IP_START, FST_SRVC_DT);
IP_END = max(IP_END,LST_SRVC_DT);
end;

do _n_=1 to _n_;
set WORK.CONF_DATA1;
output;
end;
format IP_START IP_END date9.;
run;

 

In some cases, all records come in on a single claim (CONF_CLM) so min/max is easy, as in the first 3 examples, so the above code works.  However, in other cases it is represented by multiple claims, or where I have the same member and hospital but the date are not continuous (meaning 2 or more separate events) it gets more complicated.

I had tried something similar to above, but it grouped those instances where the date was not continuous into one event – I don’t want that to happen.

proc sort data=work.CONF_DATA2;

by CONF_CLM CONF_PROV FST_SRVC_DT;

run;

DATA WORK.conf_data3;

do _n_=1 by 1 until(last.CONF_PROV);

 SET WORK.CONF_DATA2; 

  BY CONF_PROV;

  ADMIT =min(ADMIT,IP_START);

  DISCHG=max(DISCHG,IP_END);

 end;



 do _n_=1 to _n_;

  set WORK.CONF_DATA2;

  output;

 end;

 format  ADMIT DISCHG date9.;

run;

 

Example of current output:

 
CONF_CLM CONF_PROV Member FST_SRVC_DT LST_SRVC_DT PROV_NAME IP_START IP_END ADMIT DISCHG ACTUAL_ADMIT ACTUAL_DISCHG
26443576211258724_752167878 264435762611258724 Mbr #1 01Dec2018 01Dec2018 DOCTORSHOSPITALOFSARASOTA 01Dec2018 07Dec2018 01Dec2018 07Dec2018 01Dec2018 07Dec2018
264435762611258724_752167878 264435762611258724  Mbr #1 02Dec2018 03Dec2018 DOCTORSHOSPITALOFSARASOTA 01Dec2018 07Dec2018 01Dec2018 07Dec2018 01Dec2018 07Dec2018
264435762611258724_752167878 264435762611258724  Mbr #1 04Dec2018 07Dec2018 DOCTORSHOSPITALOFSARASOTA 01Dec2018 07Dec2018 01Dec2018 07Dec2018 01Dec2018 07Dec2018
264437408621113740_751297161 264437408621113740  Mbr#2 02Dec2018 02Dec2018 OAK HILL HOSPITAL 02Dec2018 04Dec2018 02Dec2018 04Dec2018 02Dec2018 04Dec2018
264437408621113740_751297161 264437408621113740  Mbr#2 03Dec2018 04Dec2018 OAK HILL HOSPITAL 02Dec2018 04Dec2018 02Dec2018 04Dec2018 02Dec2018 04Dec2018
264437539621113740_752433592 264437539621113740  Mbr#3 06Dec2018 07Dec2018 ST LUCIE MEDICAL CENTER 06Dec2018 09Dec2018 06Dec2018 09Dec2018 06Dec2018 09Dec2018
264437539621113740_752433592 264437539621113740  Mbr#3 08Dec2018 09Dec2018 ST LUCIE MEDICAL CENTER 06Dec2018 09Dec2018 06Dec2018 09Dec2018 06Dec2018 09Dec2018
264439412621694180_751692097 264439412621694180  Mbr#4 03Dec2018 04Dec2018 JFK MEDICAL CENTER 03Dec2018 04Dec2018 03Dec2018 27Dec2018 03Dec2018 04Dec2018
264439412621694180_755242526 264439412621694180  Mbr#4 24Dec2018 27Dec2018 JFK MEDICAL CENTER 24Dec2018 27Dec2018 03Dec2018 27Dec2018 24Dec2018 27Dec2018
264442542621113740_753831921 264442542621113740  Mbr#5 14Dec2018 16Dec2018 ST LUCIE MEDICAL CENTER 14Dec2018 16Dec2018 14Dec2018 24Dec2018 14Dec2018 24Dec2018
264442542621113740_754854457 264442542621113740  Mbr#5 17Dec2018 18Dec2018 ST LUCIE MEDICAL CENTER 17Dec2018 18Dec2018 14Dec2018 24Dec2018 14Dec2018 24Dec2018
264442542621113740_754976954 264442542621113740  Mbr#5 18Dec2018 24Dec2018 ST LUCIE MEDICAL CENTER 18Dec2018 24Dec2018 14Dec2018 24Dec2018 14Dec2018 24Dec2018
264442954621113740_753825785 264442954621113740  Mbr#6 14Dec2018 16Dec2018 ST LUCIE MEDICAL CENTER 14Dec2018 16Dec2018 14Dec2018 30Dec2018 14Dec2018 16Dec2018
264442954621113740_757277841 264442954621113740  Mbr#6 28Dec2018 30Dec2018 ST LUCIE MEDICAL CENTER 28Dec2018 30Dec2018 14Dec2018 30Dec2018 28Dec2018 30Dec2018
26445321621694180_755242593 264453210621694180  Mbr #7 24Dec2018 25Dec2018 JFK MEDICAL CENTER 24Dec2018 25Dec2018 24Dec2018 11Jan2019 24Dec2018 25Dec2018
264453210621694180_758430254 264453210621694180  Mbr #7 04Jan2019 09Jan2019 JFK MEDICAL CENTER 04Jan2019 11Jan2019 24Dec2018 11Jan2019 04Jan2019 11Jan2019

 

How can I make my process use the LST_SRVC_DT of a current record compared to the FST_SRVC_DT of the next records and if it is  <= 1 then treat it a part of the same event and correctly create what I actually want as shown in the ACTUAL_ADMIT and ACTUAL_DISCHG fields, otherwise treat it as a new event so I can ultimately roll this all up to one record per event?

Thank you!

1 ACCEPTED SOLUTION

Accepted Solutions
kbinan
Obsidian | Level 7

Thank you all for your suggestions/help!  I came up with a solution that works exactly like I need it to.

 

/***************************************************************************************
Compare PREVIOUS record values to CURRENT record to determine if these are continuous 
visits for a single EVENT based on if IP_START of current record is between or within
1 day of IP_END of previous record 
***************************************************************************************/

proc sort data=work.conf_data3;
by CONF_PROV MBR_SYS_ID IP_START;
run;
 
data work.conf_data4;
  format SAME_CONF  $30. ;
set work.conf_data3;
  IF CONF_PROV = lag(CONF_PROV)
     AND ((IP_START   = lag(IP_END)
      OR  (IP_START  >= lag(IP_START) 
       and IP_START  <= lag(IP_END))
	  OR  (IP_START =  lag(IP_END +1))))
THEN SAME_CONF='SAME_CONF';
ELSE SAME_CONF= ' ';
RUN;/*WORK.CONF_DATA4 has 1432 observations and 16 variables.*/


/*******************************************************************************************
Create a new EVENT_ID attribute that contains the same value for all records in an IP EVENT
 ****  this is the first instance of CONF_CLM for each EVENT
********************************************************************************************/ 
data work.CONF_DATA_ADMIT;
 format EVENT_ID $50.;
 retain event_id;
set work.conf_data4; 
	 if SAME_CONF= ' ' then EVENT_ID = CONF_CLM;
run;/*WORK.CONF_DATA_ADMIT has 1432 observations and 17 variables.*/

View solution in original post

10 REPLIES 10
Reeza
Super User
Just an FYI - I formatted your post to make the code into code snippets which makes it easier to read/copy/paste.
kbinan
Obsidian | Level 7

Thank you Reeza!

ballardw
Super User

It would really help to provide an example of your raw data and what you would expect as outcome from that example data. The values do not need to be actual just enough to show the process. Include both simple and complex cases and provide, as a minimum the starting data as a data step and better is to provide the desired result as a data step. We do need to know what the results are supposed to look like. You may also need to provide the rules you intend to use to tell the difference between "events".

 

It is not impossible that you need additional information but we need to see something to start with.

 

At the same time it may be worth considering exactly what you are intending to do with the resulting data. Sometimes collapsing data can make further steps more complicated because patients have differing numbers of output variables (potentially). If the main purpose is to have something that people read it may be that less data manipulation is needed and a report procedure such as Report or Tabulate could create groups of related columns. Maybe. I am not sure without an example of both staring and ending data what is intended.

kbinan
Obsidian | Level 7

Hello.  Yes, I did include sample data in the table embedded in my original post - you have to scroll to the right to see the whole thing  - but in there I show the results I am currently getting vs the results I actually am hoping to see.

 

I need to get a unique record identifier of an inpatient event with a start and end date, so that I can then data-mine further against ALL claims detail to capture ALL associated claims that occurred between that start and end date for a patient.  I will then be able to attach that unique identifier to all associated claims which can then support further analysis.

I hope that makes sense.

Reeza
Super User

Here are instructions on how to provide sample data as a data step:
https://communities.sas.com/t5/SAS-Communities-Library/How-to-create-a-data-step-version-of-your-dat...

 

If you can provide data in that format, I'm sure more people would be willing to help. 

 

ballardw
Super User

@kbinan wrote:

Hello.  Yes, I did include sample data in the table embedded in my original post - you have to scroll to the right to see the whole thing  - but in there I show the results I am currently getting vs the results I actually am hoping to see.

 


Example of incorrect output evidenced by the "Example of current output", as I read the description, is not an example of starting data and I cannot tell what the starting data would actually be. And it isn't a data step that is shown. If I have to attempt to make a data set from that sort of thing I am very likely to make choices that do not match your data so suggested code may not work properly or run at all.

 

Plus it is not clear what the desired output for any given input should be in a "correct result".

DavePrinsloo
Pyrite | Level 9

I dug out a very old macro ...done in SAS 9.1.3!  It was made to be used inside a DI Studio transformation, but it should hopefully run stand-alone!

If I understand your description correctly, it should do exactly what you need.   

It may use code that could be done more efficiently or more concisely with features from later SAS releases.   

Let me know if it works for you.   If this is useful, then I could cast a more experienced look at it.  

 

kbinan
Obsidian | Level 7
data WORK.SAMPLE;
  infile datalines dsd truncover;
  input CONF_CLM:$200. CONF_PROV:$200. MBR_LST_NM:$20. FST_SRVC_DT:DATE9. LST_SRVC_DT:DATE9. PROV_NAME:$70. IP_START:DATE9. IP_END:DATE9.;
  format FST_SRVC_DT DATE9. LST_SRVC_DT DATE9. IP_START DATE9. IP_END DATE9.;
  label MBR_LST_NM="MBR_LST_NM" FST_SRVC_DT="FST_SRVC_DT" LST_SRVC_DT="LST_SRVC_DT" PROV_NAME="PROV_NAME";
datalines;
264435762611258724_7521678785 264435762611258724 GRAHAM 01DEC2018 01DEC2018 DOCTORS HOSPITAL OF SARASOTA 01DEC2018 07DEC2018
264435762611258724_7521678785 264435762611258724 GRAHAM 02DEC2018 03DEC2018 DOCTORS HOSPITAL OF SARASOTA 01DEC2018 07DEC2018
264435762611258724_7521678785 264435762611258724 GRAHAM 04DEC2018 07DEC2018 DOCTORS HOSPITAL OF SARASOTA 01DEC2018 07DEC2018
264437408621113740_7512971612 264437408621113740 KOMAREK 02DEC2018 02DEC2018 OAK HILL HOSPITAL 02DEC2018 04DEC2018
264437408621113740_7512971612 264437408621113740 KOMAREK 03DEC2018 04DEC2018 OAK HILL HOSPITAL 02DEC2018 04DEC2018
264437539621113740_7524335925 264437539621113740 MANERO 06DEC2018 07DEC2018 ST LUCIE MEDICAL CENTER 06DEC2018 09DEC2018
264437539621113740_7524335925 264437539621113740 MANERO 08DEC2018 09DEC2018 ST LUCIE MEDICAL CENTER 06DEC2018 09DEC2018
264439412621694180_7516920975 264439412621694180 DORAN 03DEC2018 04DEC2018 JFK MEDICAL CENTER 03DEC2018 04DEC2018
264439412621694180_7552425263 264439412621694180 DORAN 24DEC2018 27DEC2018 JFK MEDICAL CENTER 24DEC2018 27DEC2018
264442542621113740_7538319212 264442542621113740 LADD 14DEC2018 16DEC2018 ST LUCIE MEDICAL CENTER 14DEC2018 16DEC2018
264442542621113740_7548544578 264442542621113740 LADD 17DEC2018 18DEC2018 ST LUCIE MEDICAL CENTER 17DEC2018 18DEC2018
264442542621113740_7549769544 264442542621113740 LADD 18DEC2018 24DEC2018 ST LUCIE MEDICAL CENTER 18DEC2018 24DEC2018
264442954621113740_7538257855 264442954621113740 MOORE 14DEC2018 16DEC2018 ST LUCIE MEDICAL CENTER 14DEC2018 16DEC2018
264442954621113740_7572778414 264442954621113740 MOORE 28DEC2018 30DEC2018 ST LUCIE MEDICAL CENTER 28DEC2018 30DEC2018
264453210621694180_7552425938 264453210621694180 WICKLES 24DEC2018 25DEC2018 JFK MEDICAL CENTER 24DEC2018 25DEC2018
264453210621694180_7584302544 264453210621694180 WICKLES 04JAN2019 09JAN2019 JFK MEDICAL CENTER 04JAN2019 11JAN2019
264453210621694180_7584302544 264453210621694180 WICKLES 10JAN2019 11JAN2019 JFK MEDICAL CENTER 04JAN2019 11JAN2019
;;;;

Here is my sample data:   What I am trying to achieve is to obtain then first FST_SRVC_DT value and the last LST_SRVC_DT value to "group" an inpatient event together.  In some cases (ie Mbr GRAHAM) this works OK because the CONF_CLM  and CONF_PROV are the same so a min/max works.  

 

In the example (Mbr MOORE) where the first and last dates are NOT continuous, the min/max approach also works.

 

However in other examples, (ie Mbr LADD) the CONF_CLM is different and CONF_PROV is the same, but the FST and LST SRVC_DT are continuous which tells me they are the same event and that is what I'm looking to roll up.

 

DavePrinsloo
Pyrite | Level 9

I dug out a very old macro ...done in SAS 9.1.3!  It was made to be used inside a DI Studio transformation, but it should run stand-alone!

If I understand your description correctly, it should do exactly what you need.   

It may use code that could be done more efficiently or more concisely with features from later SAS releases.   

Let me know if it works for you.   If this is useful, then I could document it better  

 

(I posted in the wrong place before.   sorry)

Here is the macro header:    

%macro ut_consolidate_periods (indata              = ,   /* input table */
  outdata             = ,   /* result table */
  view                = VIEW,  /* Result is VIEW or dataset */
  fromdt              = effective_from_date, /* date range start is in this column */
  todate              = effective_end_date,   /* date range end   is in this column */    
  compvar_list        = ,      /* Columns to used to compare rows to determine if a difference exists */
  sortkey             = ,      /* List of one or more columns that are the keys of the table */ 
  additional_keepvars = ,      /* Columns that are not keys and are not used for detecting differences
                                + These columns are also kept in the result - the values are taken from
                                + the last row in each date range */
  firstrowvars        = ,      /* Columns that are not keys and are not used for detecting differences   
                               + These columns are also kept in the result - the values are taken from  
                               + the first row in each date range - note that the firstrowvars= option
                               + uses more resources than the additional_keepvars= parameter.  if values 
                               + are constant in a date range then use additional_keepvars= instead of 
                               + firstrowvars= */ 
                               
  maxvalue_keepvars   = ,     /* Column(s) that must be kept where the maximum value
                               + for the column(s) for all consolidated rows rows */                                     
  must_sort           = NO,   /* If must_sort = YES then the table will be sorted by the macro  */
  inpwd               = ,     /* optional: provide sas read password if one is required */                            
  fill_gaps           = NO    /* =YES when gaps should be filled - ie. if two records have 
                              + dateranges that are not exactly consecutive then they will
                              + be consolidated anyway
                              * if the dates are actually datetimes with times at 00:00:00
                              * as dates are stored in Oracle, then periods may have a gap of 23:59:59*/ 
  );
 
 /*Module------------------------------------------------------------------------------------------
+ ut_consolidate_periods - Consolidate periods in a table
+ The ut_consolidate_periods macro works with tables that have date range columns. 
+ The macro identifies key values where there are no changes in the corresponding, non-key 
+ columns that are specified.  Identical and adjacent data is then consolidated into a single date range. 

+ This macro consolidates, into a single record, consecutive date ranges for key that is passed as a parameter.
+ The example below shows a data table before and after the Period Consolidation transformation. 
+ The key consists of Key1 and Key2. Begin Date and End Date specify the date range. 
+ The important columns for consolidation are non_key1 and non_key3. 
+ The non_key2 column is dropped from the output table because its value might not be correct 
+ for the full, consolidated date ranges. 
+
+ Input and Output Tables Using Period Consolidation Transformation 
+ Input Table
+ 	  Key1	Key2	Begin Date 	End Date 	 non_key1	 non_key2(not NB)  non_key3
+ 1	  123	  A	    01JAN2009	  31JAN2009	 21	       ABC	             XYZ
+ 2	  123	  A	    01FEB2009	  15FEB2009	 21	       DEF	             XYZ
+ 3	  123	  A	    16FEB2009	  15JUL2009	 21	       DEF	             XYZZY
+ 4	  123	  A	    16JUL2009	  31OCT2009	 21	       BCD	             XYZZY
+ 5	  123	  A	    01NOV2009	  31DEC2009	 21	       CEF	             XYZZY
+ 6	  125	  A	    01JAN2009	  31AUG2009	 21	       ABC	             CCC
+ 7	  125	  A	    01SEP2009	  31DEC2009	 22	       ABC	             CCC
+ 8	  126	  A	    01JAN2009	  15FEB2009	 22	       FED	             DDD
+ 9	  126	  A	    16FEB2009	  31MAR2009	 22	       DEF	             DDD
+ 10	126	  A	    01AUG2009	  31DEC2009	 22	       DEF	             DDD
+                                                                    
+ Output Table
+ 	Key1	Key2	Begin Date 	End Date 	   non_key1	 non_key3	        contributing rows 
+                                                                   from input table 
+ 1	123	  A	    01JAN2009	  15FEB2009	   21	       XYZ	             1,2
+ 2	123	  A	    16FEB2009	  31DEC2009	   21	       XYZZY             3-5
+ 3	125	  A	    01JAN2009	  31DEC2009	   21	       CCC	             6,7
+ 4	126	  A	    01JAN2009	  31MAR2009	   22	       DDD	             8,9
+ 5	126	  A	    01AUG2009	  31DEC2009	   22	       DDD	             10 
+      (This row is not concatenated to 9 because there is a gap between 
+       the end date of 9 and the begin date of 10.)
+      If option fill_gaps = YES is used then result would consolidate rows 4 and 5:
+ 	Key1	Key2	Begin Date 	End Date 	   non_key1	 non_key3	        contributing rows
+                                                                   from input table 
+ 4	126	  A	    01JAN2009	  31DEC2009	   22	       DDD	             8,9,10         
 
 
kbinan
Obsidian | Level 7

Thank you all for your suggestions/help!  I came up with a solution that works exactly like I need it to.

 

/***************************************************************************************
Compare PREVIOUS record values to CURRENT record to determine if these are continuous 
visits for a single EVENT based on if IP_START of current record is between or within
1 day of IP_END of previous record 
***************************************************************************************/

proc sort data=work.conf_data3;
by CONF_PROV MBR_SYS_ID IP_START;
run;
 
data work.conf_data4;
  format SAME_CONF  $30. ;
set work.conf_data3;
  IF CONF_PROV = lag(CONF_PROV)
     AND ((IP_START   = lag(IP_END)
      OR  (IP_START  >= lag(IP_START) 
       and IP_START  <= lag(IP_END))
	  OR  (IP_START =  lag(IP_END +1))))
THEN SAME_CONF='SAME_CONF';
ELSE SAME_CONF= ' ';
RUN;/*WORK.CONF_DATA4 has 1432 observations and 16 variables.*/


/*******************************************************************************************
Create a new EVENT_ID attribute that contains the same value for all records in an IP EVENT
 ****  this is the first instance of CONF_CLM for each EVENT
********************************************************************************************/ 
data work.CONF_DATA_ADMIT;
 format EVENT_ID $50.;
 retain event_id;
set work.conf_data4; 
	 if SAME_CONF= ' ' then EVENT_ID = CONF_CLM;
run;/*WORK.CONF_DATA_ADMIT has 1432 observations and 17 variables.*/

sas-innovate-white.png

Our biggest data and AI event of the year.

Don’t miss the livestream kicking off May 7. It’s free. It’s easy. And it’s the best seat in the house.

Join us virtually with our complimentary SAS Innovate Digital Pass. Watch live or on-demand in multiple languages, with translations available to help you get the most out of every session.

 

Register now!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

SAS Training: Just a Click Away

 Ready to level-up your skills? Choose your own adventure.

Browse our catalog!

Discussion stats
  • 10 replies
  • 3083 views
  • 0 likes
  • 4 in conversation