10-15-2016 06:08 PM
I have a dataset of patients with up to ten different diagnoses (over ten different columns). Each patient, additionally, may have more than one occurrence.
How do I program SAS to count the number of patients (unique IDs) with a particular set of diagnoses (i.e. diagnosis 1 or diagnosis 2 or diagnosis 3)? I do not want a frequency count of diagnoses as one patient could return multiple times for the same diagnoses.
10-15-2016 06:27 PM
Depends a bit. A usual recommendation is to flip your data from wide to long and then do a double proc freq or SQL distinct to get the number of unique people, and use a WHERE clause or filter your diagnosis.
Another option is is to create a flag for the diagnosis and loop through all the diagnoses and set the flag to 1 if the diagnosis is found.
Then you do a distinct count with a filter on the flag value.
A diagnosis usually isn't a single code, case definitions vary and both methods are equivalent.
If you need to work with multiple diagnoses the first method is easier to extend.
10-15-2016 06:38 PM
Do you recommend using the methos described in the following link to transform the data?
10-18-2016 07:53 PM
I tried to do proc transpose. My data has over 3 million observations and many columns. There are ten separate columns for diagnoses alone.
After running for two hours, I was unable to open the transposed file due to memory issues (not sure what that really means as everything else opens okay).
10-18-2016 11:10 PM
Then I suggest a datastep to transpose instead. Not all records have 10 diagnoses, many will be empty, correct?
3 milllion x10 diagnoses - 30 million. Not a small amount but shouldn't cause SAS to crash...
10-15-2016 06:33 PM
Thanks, Reeza. I will try the first option and keep you updated.
I have SAS 9.4 and SAS Enterprise Guide 7.1. Should my post be in a different discussion board?