BookmarkSubscribeRSS Feed
☑ This topic is solved. Need further help from the community? Please sign in and ask a new question.
dominiquec
Calcite | Level 5

I am working with patient hospitalization data, which includes a series of diagnosis code flags for each hospitalization. In our study design, patients have different exposures during different periods of time (exposure_type). I want to summarize the hospitalizations within each exposure type for each patient by creating an 'ever diagnosed flag' for each of the diagnosis codes in the study. This 'ever diagnosed flag' will indicate if a patient ever had that diagnosis code associated with any of their visits during each exposure period. This is my code so far: 

data want;
set have;
by patient_id exposure_type;

array oldflags {20} diag1-diag20;
array overallflg {20} everdiag1-everdiag20;

if first.patient_id and first.exposure_type then do i = 1;
overallflg[i] = 0;
end;

do i = 1 to 20;
overallflg[i] = max(oldflags[i]);
end;
keep patient_id exposure_type everdiag1-everdiag20;
run;

When I run this above code, the max value for each diagnosis flag isn't carrying over. everdiag1-everdiag20 are basically copies of the original flags, and the max value within each unique ID and exposure date is not retained. What am I missing? Thank you for your time.

1 ACCEPTED SOLUTION

Accepted Solutions
PaigeMiller
Diamond | Level 26

If I understand you properly (and I'm not 100% sure that I do understand you properly), then this doesn't require arrays at all. This is a job for some procedure that can look at different rows for each patient_id and exposure_type, and find out where the values of 1 exist.

 

In other words, this is a job for PROC SUMMARY (the SUPERMAN of SAS data aggregation™).

 

data have;
    input patient_id exposure_type hosp_dt :mmddyy10. diag1 diag2 diag3 diag4;
cards;
100100	1	01.01.1995	0	1	0	1
100100	1	06.15.1995	1	0	0	1
100100	2	10.01.1995	1	0	0	0
100100	2	12.15.1995	1	0	1	0
;

proc summary data=have nway;
    class patient_id exposure_type;
    var diag1-diag4;
    output out=want(drop=_:) max=everdiag1-everdiag4;
run;

 

Now, this output data set named WANT has only one row per patient_id and exposure_type. You could merge it back into the original data set if you want multiple rows per patient_id and exposure_type, as shown in your desired output. If you want to do that, I leave that to you to figure out what the code is.

--
Paige Miller

View solution in original post

5 REPLIES 5
PaigeMiller
Diamond | Level 26

It would be very helpful if you could show us a portion of the actual data, or even fake data arranged to be in a similar arrangement to the actual data. And then it would also be very helpful if you showed us the incorrect output and showed us what the correct output ought to  be.

--
Paige Miller
dominiquec
Calcite | Level 5

Certainly!

An example of what the data currently look like is below:

patient_idexposure_typehosp_dtdiag1diag2diag3diag4
100100101.01.19950101
100100106.15.19951001
100100210.01.19951000
100100212.15.19951010

 

Right now, my results look like a copy of the original file: 

patient_idexposure_typeeverdiag1everdiag2everdiag3everdiag4
10010010101
10010011001
10010021000
10010021010

For patient 100100, the everdiag for diag1 should be 1 for both exposure types 1 and 2, as at least one hospital visit had that diagnosis associated with it during each exposure type. However, the everdiag flag for diag3 should only be 1 for only exposure type 2, as this diagnosis only appeared during a hospitalization when the patient was in exposure type 2.

 

See below for how I would like the final dataset to look like. All the everdiag flags should look the same within each unique patient ID and exposure type:

patient_idexposure_typeeverdiag1everdiag2everdiag3everdiag4
10010011101
10010011101
10010021010
10010021010
PaigeMiller
Diamond | Level 26

If I understand you properly (and I'm not 100% sure that I do understand you properly), then this doesn't require arrays at all. This is a job for some procedure that can look at different rows for each patient_id and exposure_type, and find out where the values of 1 exist.

 

In other words, this is a job for PROC SUMMARY (the SUPERMAN of SAS data aggregation™).

 

data have;
    input patient_id exposure_type hosp_dt :mmddyy10. diag1 diag2 diag3 diag4;
cards;
100100	1	01.01.1995	0	1	0	1
100100	1	06.15.1995	1	0	0	1
100100	2	10.01.1995	1	0	0	0
100100	2	12.15.1995	1	0	1	0
;

proc summary data=have nway;
    class patient_id exposure_type;
    var diag1-diag4;
    output out=want(drop=_:) max=everdiag1-everdiag4;
run;

 

Now, this output data set named WANT has only one row per patient_id and exposure_type. You could merge it back into the original data set if you want multiple rows per patient_id and exposure_type, as shown in your desired output. If you want to do that, I leave that to you to figure out what the code is.

--
Paige Miller
dominiquec
Calcite | Level 5
Thank you, this gave me exactly what I needed! I definitely overcomplicated this and need to practice more with proc summary.
ballardw
Super User

I would expect to see comparing if a value you have matches some other values, such as list of actual codes. I don't see any such comparison.

 

And when you use

do i = 1 to 20;
overallflg[i] = max(oldflags[i]);
end;

this is exactly the same as

do i = 1 to 20;
overallflg[i] = oldflags[i];
end;

because you are asking for the Max of exactly one value, oldflags[i];

If you want to use the largest value of an array, which doesn't make much sense to me in terms of your described problem the code would be

do i = 1 to 20;
overallflg[i] = max( of oldflags(*));
end;

but that sets all of the overallflg to the same value. Which is why I don't think that is what you actually want either.

sas-innovate-2024.png

Available on demand!

Missed SAS Innovate Las Vegas? Watch all the action for free! View the keynotes, general sessions and 22 breakouts on demand.

 

Register now!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 5 replies
  • 734 views
  • 0 likes
  • 3 in conversation