Hello,
Thank you for this community so far. It has been a great help!
In my project, comorbidities are identified using ICD-10 diagnosis codes where there is ≥ 1 inpatient or ≥ 2 outpatient claims (on different dates) during the 12-month baseline period.
I am having trouble coding the requirement of at least two outpatient claims from the medical file. I used the following code, but the sample size decreased drastically, so I am not sure if I am doing it right.
Data test;
input patid $ comorb_date YYMMDD10 diag$;
cards;
1 2019/01/23 AZER
1 2019/01/23 AZER
1 2021/04/04 TRE
2 2018/08/09 TYR
2 2018/08/09 QWK
run;
The desired output (data want) would look like this:
1 2019/01/23 AZER
1 2021/04/04 TRE
2 2018/08/09 TYR
2 2018/08/09 QWK
Patid 1 had the same diag code twice on 2019/01/23, so one row was removed. Patid 2 has different diagnosis codes on the same date, so both rows are retained.
Tried code -
Proc sql;
create table OP_twodiagnosis as
select patid, count (distinct comorb_date) as cnt_op_dx
from test
group by patid, diag
;
quit;
Data created.OP_twodiagnosis_2;
set created.OP_twodiagnosis;
if cnt_op_dx >= 2;
run;
Best,
Shweta
First, please test your example data step code. The INPUT statement doesn't actually use YYMMDD10 as the informat because the period at the end is missing. Also, a CARDS or DATALINES block ends with a line containing a ; .
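For reference, a corrected version of the example data step might look like the sketch below. The colon modifier on the informat and the FORMAT statement are my additions (they are not in the original post); the data rows are the ones posted.

```sas
data test;
  /* The colon modifier lets list input apply the informat;
     note the trailing period on yymmdd10. */
  input patid $ comorb_date :yymmdd10. diag $;
  /* Assumption: display the date in a readable form instead of a raw number */
  format comorb_date yymmdd10.;
  datalines;
1 2019/01/23 AZER
1 2019/01/23 AZER
1 2021/04/04 TRE
2 2018/08/09 TYR
2 2018/08/09 QWK
;
run;
```

Without the FORMAT statement the dates are still read correctly but print as SAS date numbers.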
It is a good idea on this forum to paste data step code (and other code as well) into a text box opened with the </> icon, which prevents the forum software from reformatting text pasted into the main message window. With a data step, that reformatting can shift column positions enough that some input statements throw errors or read values incorrectly, and pasted text sometimes picks up invisible characters that cause other problems.
You didn't provide much example data and completely skipped over how to tell the difference between an inpatient and outpatient observation. This might provide some help as a starting point for next steps using that information.
SQL is pretty weak about anything regarding order of processing, and you have an explicit order requirement (no two or more of the same diag on a date per patient) and possibly an implied one in the inpatient/outpatient bit, since different counts are used for each rule.
proc sort data=test out=testless nodupkey;
   by patid comorb_date diag;
run;

/* add a diag counter */
data helpful;
   set testless;
   by patid comorb_date;
   retain counter;
   if first.patid then counter=1;
   else counter+1;
run;
I added a counter variable so you can see the way things increment.
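As a possible next step, here is a minimal sketch of the "two or more outpatient claims on different dates" check. It assumes that dataset test was read correctly, that every row in it is an outpatient claim (the example data has no inpatient/outpatient indicator, so you would need to subset on one first), and that the rule is applied per individual diag code rather than per group of codes. The dataset names op_dates and op_two_dates and the variable n_dates are made up for illustration.

```sas
/* One row per patient, diagnosis and date: repeats of the same
   diag on the same date are dropped. */
proc sort data=test out=op_dates nodupkey;
  by patid diag comorb_date;
run;

/* Count the distinct dates within each patid/diag group and keep
   one row for groups with two or more. */
data op_two_dates;
  set op_dates;
  by patid diag;
  if first.diag then n_dates = 0;   /* reset at the start of each group */
  n_dates + 1;                      /* sum statement, value is retained */
  if last.diag and n_dates >= 2;    /* output only qualifying groups */
  keep patid diag n_dates;
run;
```

With the five example rows, no patid/diag combination has two different dates, so this would return no rows. If your real data behaves similarly, a large drop in sample size may be what the rule actually implies rather than a coding error.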
Questions: