Hi guys,
I am working on a simple PROC SQL command that creates distinct patient level data for each category. I have to do this step because the next macro is looking for a unique patient level data as one of its inputs.
My end goal:
Produce means of "amount paid" for each category.
This is how the original data (sourcedata) looks like:
Patient | Claims_ID | Description |
1 | 12 | A |
1 | 13 | A |
1 | 15 | Y |
2 | 16 | D |
2 | 19 | B |
2 | 26 | D |
Issue here is:
I have to adjust with an existing macro that takes only unique patient level data - to create "amount paid" at patient level..So I have to first create individual category data sets (unique at patient level) from the master data. After this step, I can pass them onto the macro.
My approach:
%let cat = Diagnosis1 Diagnosis2 Diagnosis3.... Diagnosis200;
/* macro below simply creates distinct patient level data for each category - this is used to identify patients in that category */
%macro create1();
%do i = 1 %to &countcat.; /*countcat contains number of categories say 200 */
%let eachCat = %scan(&cat, &i); /* cat is a mcro variable containing each of the 200 category names separated by space */
PROC SQL;
Create table uniquePatient_&eachCat. /* creates dataset for each category */
select distinct patientID FROM
Sourcedata
where prxmacth("m,&eachCat./oi", category) > 0; /* this filters the data to that category */
quit;
%end;
%mend();
%create1(); /* invoke the macro */
Sourcedata has 50million observations which are at patientID and claims ID level.
This simple macro is taking 3.5 minutes to create dataset for each category so basically it will take 10 hours to complete this run!
I need distinct patient ID data that will be entered into another macro otherwise I would have thought of anything other than creating so many datasets as done above. I have heard that you can create format libraries that can speed up this process..any idea on how to go about it and if it may be applicable in this scenario - will be greatly appreciated!
Regards
Are you ready for the spotlight? We're accepting content ideas for SAS Innovate 2025 to be held May 6-9 in Orlando, FL. The call is open until September 25. Read more here about why you should contribute and what is in it for you!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.