BookmarkSubscribeRSS Feed
☑ This topic is solved. Need further help from the community? Please sign in and ask a new question.
wlierman
Lapis Lazuli | Level 10

I am looking at 5 classes of psychotropic meds (anti_anxiety, anti_psychotic, anti_depressant, mood_stabilizer, stimulants) over two years 2022 and 2023.

The age groups are 0-5, 6-12, 13-17.

 

Each year was extracted by quarter by the month of the quarter.  So there are 4 groupings for each drug class:

anti_anxiety_2022_q1_1           anti_anxiety for 2022 for qtr 1 January

anti_anxiety_2022_q1_2          anti_anxiety for 2022 for qtr 1 February

anti_anxiety_2022_q1_3          anti_anxiety for 2022 for qtr 1 March

                                           . . .

anti_anxiety_2022_q4_1           anti_anxiety for 2022 for qtr 4 October

anti_anxiety_2022_q4_2          anti_anxiety for 2022 for qtr 4 November

anti_anxiety_2022_q4_3          anti_anxiety for 2022 for qtr 4 December

 

Data for each of the other 4 med classes are have similar construction

 

The data is cleaned up a little (e.g. if header_amount_paid = 0, record is deleted)

Age groupings are added

SQL coding is used to create

So a hypothetical anxiety quarterly dataset (Q1) looks like

 

From DOS          Recipient_ID        Total_Paid     Age_group

30Jan22              EO401J5K               132.86            0-5

05FEB22             BY3267G1                 43.55           13-17

28MAR22            PA601M4L               67.73             6-12

 

The four quarters for each med were set together to give

 

Anxiety_2022_final

Psychotic_2022_final

Depressant_2022_final

Mood_Stable_2022_final

Stimulants_2022_final

 

The request is to get a count of unique recipients that received all 5 drugs for each age grouping.

 

My approach was to do the following

 

Data all drugs_2022;

  Set anti_anxiety_2022

         anti-psychotic_2022

         anti_depress_2022

         anti_mood_2022

         anti_stimulant_2022;

run;

 

proc sort data=all_drugs_2022

                  out=uniquerecipients_2022 nodups  dupout=dup_recipients_2022 nodupkey;

                  by Recipient_ID;

proc sql noprint;

create table unique_rec_meds_2022 AS

select  From_DOS, Recipient_ID, Sum(header_amount_paid) AS Total_paid, Age_group,

COUNT(Age_group) AS A_grp

From all_drugs_2022

Group By age_group;

quit;

 

So the final result would be a excel table (results are hypothetical for illustration)

 

CY 2022        Age_group       Recipients         Payments

2022                   0-5                     8,235                $1,644,876

                            6-12                   41,875             $39,546,233

                           13-17                   54,712            $31,876,098

and a similar table for CY 2023

I thought this would produce a table of the number of children who received all

psychotropic drugs, by age group, and payments

 

But the requester doesn't think so.

 

At this point, I am confused then on how to arrive at the correct result.

 

Any suggestions and direction will be appreciated.

 

Thanks.

 

wlierman

 

 

 

 

 

The same final output

 

 

 

data anxiety_2022_q1_1;

   set anxiety_2022;

 

 

 

 

 

1 ACCEPTED SOLUTION

Accepted Solutions
PaigeMiller
Diamond | Level 26

The request is to get a count of unique recipients that received all 5 drugs for each age grouping.

 

My approach was to do the following

 

Data all drugs_2022;

  Set anti_anxiety_2022

         anti-psychotic_2022

         anti_depress_2022

         anti_mood_2022

         anti_stimulant_2022;

run;

 

This should work: 

data all_drugs_2022;
    merge anti_anxiety_2022(in=in1 keep=recipient_id age_group) anti_psychotic_2022(in=in2 keep=recipient_id age_group)
        anti_depress_2022(in=in3 keep=recipient_id age_group) anti_mood_2022(in=in4 keep=recipient_id age_group)
        anti_stimulant_2022(in=in5 keep=recipient_id age_group);
    by recipient_id;
    if in1 and in2 and in3 and in4 and in5;
run;

/* Get the count */
proc sql;
     create table count as select count(distinct recipient_id) from all_drugs_2022 group by age_group;
quit;

 

This assumes all data sets are sorted by recipient_id

 

--
Paige Miller

View solution in original post

8 REPLIES 8
ballardw
Super User

Caveat: without actual data of some sort determining "all" of anything is problematic.

 

Have you verified that none of your Recipient_ID values changed age_group in any of the data sets? It is very likely that from one calendar month to another that an individual would change age and likely age group. So a rule involving that situation as it relates to the desired report is needed before starting. Since you are looking overall at a two year period you are certain to have some change age groups and how that affects the report is needed.

 

I suspect that some details of the required report are missing.

 

I see nothing in your attempt that takes the different medications into account. Your SQL dumps all the records into one call and counts by age_group only, nothing by medication.

 

I suggest providing a small example of data that looks like your 5 anti_anxiety_2022, anti-psychotic_2022 etc. data sets. Make sure to have at least 2 participants that are "all 5 medications" and some that aren't. Then show what the expected result (MANUALLY calculated which is why the set should be small) from is expected to look like.

wlierman
Lapis Lazuli | Level 10
Hello,
Thank you for responding. I will follow your suggestions and prepare a small data set for further examination

Wlierman
wlierman
Lapis Lazuli | Level 10
Thank you for your suggestions and help.
A_Kh
Lapis Lazuli | Level 10

I assume AGE, AGE_GROUP are only Baseline observations and don't change over time. 
To calculate RECIPENTS number and TOTAL_AMOUNT spent by MEDICATION, AGE_GROUP, QUARTER and YEAR I would do smth like this:

%let drugs= 'anti_anxiety' 'anti_psychotic' 'anti_depressant' 'mood_stabilizer' 'stimulants';

data have; 
	set <input data>; 
	where lowcase(MEDS) in (&drugs) and HEADER_AMOUNT_PAID >0 and (FROM_DOS between '01Jan2022'd and '31Dec2023'd);
	year= year(FROM_DOS); 
	if month(FROM_DOS) in (1, 2, 3) then QUARTER='q1';
	else if month(FROM_DOS) in (4, 5, 6) then QUARTER='q2';
	else if month(FROM_DOS) in (7, 8, 9) then QUARTER='q3';
	else if QUARTER='q4';
run; 
proc means data=have n sum noprint nway;
	class YEAR QUARTER AGE_GROUP MEDS; 
	ways 4; 
	var HEADER_AMOUNT_PAID;
	output out= summary n=RECIPIENTS sum=PAYMENTS;
	format payments dollar12.; 
run; 
PaigeMiller
Diamond | Level 26

The request is to get a count of unique recipients that received all 5 drugs for each age grouping.

 

My approach was to do the following

 

Data all drugs_2022;

  Set anti_anxiety_2022

         anti-psychotic_2022

         anti_depress_2022

         anti_mood_2022

         anti_stimulant_2022;

run;

 

This should work: 

data all_drugs_2022;
    merge anti_anxiety_2022(in=in1 keep=recipient_id age_group) anti_psychotic_2022(in=in2 keep=recipient_id age_group)
        anti_depress_2022(in=in3 keep=recipient_id age_group) anti_mood_2022(in=in4 keep=recipient_id age_group)
        anti_stimulant_2022(in=in5 keep=recipient_id age_group);
    by recipient_id;
    if in1 and in2 and in3 and in4 and in5;
run;

/* Get the count */
proc sql;
     create table count as select count(distinct recipient_id) from all_drugs_2022 group by age_group;
quit;

 

This assumes all data sets are sorted by recipient_id

 

--
Paige Miller
wlierman
Lapis Lazuli | Level 10
Thank you. That does work (though it is somewhat hard to explain at least for me). The only other filter (that I didn't mention) was that the recipients needed to be unique (deduplicated).I did that which lowered the over all total. Thanks again for your help and suggestion.
PaigeMiller
Diamond | Level 26

@wlierman wrote:
Thank you. That does work (though it is somewhat hard to explain at least for me). The only other filter (that I didn't mention) was that the recipients needed to be unique (deduplicated).I did that which lowered the over all total. Thanks again for your help and suggestion.

Explanation: The variable IN1 is 1 if the data comes from anti_anxiety_2022 and 0 otherwise. The variable IN2 is 1 if the data comes from anti_psychotic_2022 and 0 otherwise. When all five data sets are merged, you want the cases where all IN1 through IN5 have value of 1 (can't have any of them zero) and so that's what the IF statement looks for.

 

Doesn't the PROC SQL use of DISTINCT de-duplicate the counts?

--
Paige Miller
wlierman
Lapis Lazuli | Level 10

I didn't use the Distinct in proc Sql (though I could have).  I ran a proc sort (by Recipient_ID) 

 

proc sort data=Analyze2.all_drugs_2022
out=Analyze2.uniquerecipients_2022 nodups;          /*dupout = Analyze2.dup_recipients_2023 nodupkey;*/
by Recipient_ID;
run;

 

We have a large state Medicaid program. But even so, it is somewhat disconcerting that there are over 50K (at least from our Business Intelligence / Business Objects data warehouse) recipients (children in the 0-5 age group that have received all five classes of drugs. Though I think some of that is the NDC code (like " children acetaminophen" as well as others that I would not have considered in a psychotropic drug class).

Ready to join fellow brilliant minds for the SAS Hackathon?

Build your skills. Make connections. Enjoy creative freedom. Maybe change the world. Registration is now open through August 30th. Visit the SAS Hackathon homepage.

Register today!
How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 8 replies
  • 382 views
  • 3 likes
  • 4 in conversation