BookmarkSubscribeRSS Feed
Wolverine
Pyrite | Level 9

I’m working with Medicare data. There are tens of millions of beneficiaries, and each beneficiary can have multiple healthcare visits during the year. Each visit can have multiple records. The records fall into 2 general categories: claims (what was actually billed) and chart review (a doctor reviews the chart and adds or deletes diagnoses). Each record has 25 diagnosis variables (DX1-DX25). There are usually only 2-3 diagnoses per record, but some visits do indeed use all 25 DX variables.

 

It is possible to have visits with claims records only. There were visits with chart review records only, but those were deleted in a previous step. In later steps, the diagnosis codes will be compared to lists of codes for various medical conditions, so we can identify beneficiaries who have each of these conditions. This is accomplished by loading the list of DX codes for each condition into a series of macro variables. An array is used to compare the list of codes in the macro variable to all 25 DX variables, and a flag for each condition is generated when there is a match.

 

But the goal here is to create a single record for each visit that has the final list of diagnoses after all chart review diagnoses have been added or removed.

 

Here are the basic steps:

  • Identify all the records from a visit using a unique identifier stored in a variable called stay_key.
  • Select the claim record where CLM_FINL_ACTN_IND=”Y”. However, it’s not that simple. It is possible to have multiple claim records where CLM_FINL_ACTN_IND=”Y”. In that case, the highest CLM_CNTL_NUM is selected among these records. This is now the “base” record to which all chart review diagnoses are added or deleted. Note: only one claim record is selected. Diagnoses from other claim records are NOT added or deleted from the base record.
  • IF CLM_MDCL_REC ^=8 in the chart review record, then the diagnosis codes listed in DX1-DX25 should be added to the base record. Then the chart review record can be dropped.
  • IF CLM_MDCL_REC =8, then then the diagnosis codes listed in DX1-DX25 should be deleted from the base record. As before, the chart review record can then be dropped.

One possible approach I thought of to add diagnoses from the chart review record is to run PROC SQL so that the file merges with itself. I could use Where statements to add DX vars from chart review records where CLM_MDCL_REC ^=8 to base record. However, this would expand it from 25 DX variables to 50 -- or more if there are multiple chart reviews to add. That could make the later array step take a long time. For this reason, it might be acceptable to keep the chart review(s) as a separate record. HOWEVER, when deleting diagnoses, all of the records would have to be checked.

 

For deleting diagnosis codes from chart review records where CLM_MDCL_REC =8, these codes could be loaded into a macro variable. Then an array could be used to compare macro variable to DX vars in base record. When match is found, set the DX variable in the base record containing that DX code to blank.

 

Below is an example of how the data is laid out (although I added a blank line between visits to make it easier to read) and what the final output table should look like (again, without the blank line I added). And there is code to create the "have" table in SAS.

HAVE    
stay_key CLM_FINL_ACTN_IND CLM_CNTL_NUM CLM_CHRT_RVW_SW CLM_MDCL_REC  DX1 DX2 DX3 DX4    
1234-01/05/18 N 22571     D217 D218 D220 D221    
1234-01/05/18 N 22686     D217 D219 D220      
1234-01/05/18 Y 22989     D217 D220 D221      
1234-01/05/18 N 30111 Y   E348          
                     
1234-03/20/18 Y 31250     D217 D218 D220 D221    
1234-03/20/18 Y 31585     D217 D219 D220 F122    
1234-03/20/18 N 32622 Y   D217 D220 D221      
1234-03/20/18 N 32745 Y   G585          
1234-03/20/18 N 32899 Y 8 F122          
1234-03/20/18 N 32963 Y 8 D218          
                     
WANT    
stay_key CLM_FINL_ACTN_IND CLM_CNTL_NUM CLM_CHRT_RVW_SW CLM_MDCL_REC  DX1 DX2 DX3 DX4 DX5 DX6
1234-01/05/18 Y 22989     D217 D220 D221   E348  
                     
1234-03/20/18 Y 31585     D217 D219 D220   D221 G585
data have;
infile datalines dsd dlm=',' truncover;
input stay_key	$ CLM_FINL_ACTN_IND $ CLM_CNTL_NUM CLM_CHRT_RVW_SW $	CLM_MDCL_REC DX1 $ DX2 $ DX3 $ DX4 $;
datalines;
1234-01/05/18 N 22571   D217 D218 D220 D221
1234-01/05/18 N 22686   D217 D219 D220 
1234-01/05/18 Y 22989   D217 D220 D221 
1234-01/05/18 N 30111 Y  E348        
1234-03/20/18 Y 31250   D217 D218 D220 D221
1234-03/20/18 Y 31585   D217 D219 D220 F122
1234-03/20/18 N 32622 Y  D217 D220 D221 
1234-03/20/18 N 32745 Y  G585   
1234-03/20/18 N 32899 Y 8 F122   
1234-03/20/18 N 32963 Y 8 D218
;RUN;
1 REPLY 1
quickbluefish
Barite | Level 11

You can try this - seems to work and produces what's in your 'want' dataset for those two records, but definitely not tested thoroughly.  

data have;
infile cards dsd dlm=',' truncover;
length stay_key $13 clm_finl_actn_ind $1 clm_cntl_num $5 
	clm_chrt_rvw_sw $1 clm_mdcl_rec $1 dx001-dx025 $4;
input stay_key clm_finl_actn_ind clm_cntl_num  
	clm_chrt_rvw_sw clm_mdcl_rec dx001-dx004;
cards;
1234-01/05/18,N,22571,,,D217,D218,D220,D221
1234-01/05/18,N,22686,,,D217,D219,D220,
1234-01/05/18,Y,22989,,,D217,D220,D221,
1234-01/05/18,N,30111,Y,,E348,,,,,,,,
1234-03/20/18,Y,31250,,,D217,D218,D220,D221
1234-03/20/18,Y,31585,,,D217,D219,D220,F122
1234-03/20/18,N,32622,Y,,D217,D220,D221,
1234-03/20/18,N,32745,Y,,G585,,,
1234-03/20/18,N,32899,Y,8,F122,,,
1234-03/20/18,N,32963,Y,8,D218,,,
;
RUN;

proc sort data=have;
by
	stay_key
	DESCENDING clm_chrt_rvw_sw
	clm_finl_actn_ind
	clm_cntl_num
	;
run;

data want;
set have;
by
	stay_key
	DESCENDING clm_chrt_rvw_sw
	clm_finl_actn_ind
	clm_cntl_num
	;
array rm {100} $4 _temporary_;
array ad {100} $4 _temporary_;
length dx026-dx100 $4;  * for adding extra dx from other records ;
array dx {*} dx001-dx025; * for any given existing record, only look at 1-25 ;
array kp {100} $4 _temporary_;
if first.stay_key then call missing(of rm[*], of ad[*], of kp[*], rmnum, adnum, kpnum);
if clm_chrt_rvw_sw="Y" then do;
	* chart review record? ;
	if clm_mdcl_rec="8" then do;
		* dx to remove ;
		do i=1 to dim(dx);
			if missing(dx[i]) then leave;
			if dx[i] not in rm then do;
				rmnum+1;
				rm[rmnum]=dx[i];
			end;
		end;
	end;
	else do;
		* dx to add ;
		do i=1 to dim(dx);
			if missing(dx[i]) then leave;
			if dx[i] not in ad then do;
				adnum+1;
				ad[adnum]=dx[i];
			end;
		end;
	end;
end;
else if last.stay_key and clm_finl_actn_ind="Y" then do;
	* this is the base record -- add / remove stuff based on chart review, then output ;
	do i=1 to dim(dx);
		* gather up the existing dx from the base record as long as not in rm list ;
		if missing(dx[i]) then leave;
		if dx[i] not in rm then do;
			kpnum+1;
			kp[kpnum]=dx[i];
		end;
	end;
	call missing(of dx[*]); * empty the existing dx array for now ;
	i=0;
	do while (i<kpnum);
		* write out the base row dx that were not removed ;
		i+1;
		dx[i]=kp[i];
	end;
	j=0; * note that i does not get reset ;
	do while (j<adnum);
		* now add anything from the add list (chart rvw recs ^= 8) ;
		j+1;
		if ad[j] not in rm and ad[j] not in kp then do; * make sure add-list dx is not also in rm list ;
			i+1;
			dx[i]=ad[j];  * note: i, j ;
		end;
	end;
	output;
end;
drop rmnum adnum kpnum i j;
run;

proc print data=want (drop=dx016-dx100); run;  * just showing dx1-15 here ;

Catch up on SAS Innovate 2026

Nearly 200 sessions are now available on demand with the SAS Innovate Digital Pass.

Explore Now →
How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

SAS Training: Just a Click Away

 Ready to level-up your skills? Choose your own adventure.

Browse our catalog!

Discussion stats
  • 1 reply
  • 281 views
  • 1 like
  • 2 in conversation