BookmarkSubscribeRSS Feed
☑ This topic is solved. Need further help from the community? Please sign in and ask a new question.
wlierman
Lapis Lazuli | Level 10

I am looking at 5 classes of psychotropic meds (anti_anxiety, anti_psychotic, anti_depressant, mood_stabilizer, stimulants) over two years 2022 and 2023.

The age groups are 0-5, 6-12, 13-17.

 

Each year was extracted by quarter by the month of the quarter.  So there are 4 groupings for each drug class:

anti_anxiety_2022_q1_1           anti_anxiety for 2022 for qtr 1 January

anti_anxiety_2022_q1_2          anti_anxiety for 2022 for qtr 1 February

anti_anxiety_2022_q1_3          anti_anxiety for 2022 for qtr 1 March

                                           . . .

anti_anxiety_2022_q4_1           anti_anxiety for 2022 for qtr 4 October

anti_anxiety_2022_q4_2          anti_anxiety for 2022 for qtr 4 November

anti_anxiety_2022_q4_3          anti_anxiety for 2022 for qtr 4 December

 

Data for each of the other 4 med classes are have similar construction

 

The data is cleaned up a little (e.g. if header_amount_paid = 0, record is deleted)

Age groupings are added

SQL coding is used to create

So a hypothetical anxiety quarterly dataset (Q1) looks like

 

From DOS          Recipient_ID        Total_Paid     Age_group

30Jan22              EO401J5K               132.86            0-5

05FEB22             BY3267G1                 43.55           13-17

28MAR22            PA601M4L               67.73             6-12

 

The four quarters for each med were set together to give

 

Anxiety_2022_final

Psychotic_2022_final

Depressant_2022_final

Mood_Stable_2022_final

Stimulants_2022_final

 

The request is to get a count of unique recipients that received all 5 drugs for each age grouping.

 

My approach was to do the following

 

Data all drugs_2022;

  Set anti_anxiety_2022

         anti-psychotic_2022

         anti_depress_2022

         anti_mood_2022

         anti_stimulant_2022;

run;

 

proc sort data=all_drugs_2022

                  out=uniquerecipients_2022 nodups  dupout=dup_recipients_2022 nodupkey;

                  by Recipient_ID;

proc sql noprint;

create table unique_rec_meds_2022 AS

select  From_DOS, Recipient_ID, Sum(header_amount_paid) AS Total_paid, Age_group,

COUNT(Age_group) AS A_grp

From all_drugs_2022

Group By age_group;

quit;

 

So the final result would be a excel table (results are hypothetical for illustration)

 

CY 2022        Age_group       Recipients         Payments

2022                   0-5                     8,235                $1,644,876

                            6-12                   41,875             $39,546,233

                           13-17                   54,712            $31,876,098

and a similar table for CY 2023

I thought this would produce a table of the number of children who received all

psychotropic drugs, by age group, and payments

 

But the requester doesn't think so.

 

At this point, I am confused then on how to arrive at the correct result.

 

Any suggestions and direction will be appreciated.

 

Thanks.

 

wlierman

 

 

 

 

 

The same final output

 

 

 

data anxiety_2022_q1_1;

   set anxiety_2022;

 

 

 

 

 

1 ACCEPTED SOLUTION

Accepted Solutions
PaigeMiller
Diamond | Level 26

The request is to get a count of unique recipients that received all 5 drugs for each age grouping.

 

My approach was to do the following

 

Data all drugs_2022;

  Set anti_anxiety_2022

         anti-psychotic_2022

         anti_depress_2022

         anti_mood_2022

         anti_stimulant_2022;

run;

 

This should work: 

data all_drugs_2022;
    merge anti_anxiety_2022(in=in1 keep=recipient_id age_group) anti_psychotic_2022(in=in2 keep=recipient_id age_group)
        anti_depress_2022(in=in3 keep=recipient_id age_group) anti_mood_2022(in=in4 keep=recipient_id age_group)
        anti_stimulant_2022(in=in5 keep=recipient_id age_group);
    by recipient_id;
    if in1 and in2 and in3 and in4 and in5;
run;

/* Get the count */
proc sql;
     create table count as select count(distinct recipient_id) from all_drugs_2022 group by age_group;
quit;

 

This assumes all data sets are sorted by recipient_id

 

--
Paige Miller

View solution in original post

8 REPLIES 8
ballardw
Super User

Caveat: without actual data of some sort determining "all" of anything is problematic.

 

Have you verified that none of your Recipient_ID values changed age_group in any of the data sets? It is very likely that from one calendar month to another that an individual would change age and likely age group. So a rule involving that situation as it relates to the desired report is needed before starting. Since you are looking overall at a two year period you are certain to have some change age groups and how that affects the report is needed.

 

I suspect that some details of the required report are missing.

 

I see nothing in your attempt that takes the different medications into account. Your SQL dumps all the records into one call and counts by age_group only, nothing by medication.

 

I suggest providing a small example of data that looks like your 5 anti_anxiety_2022, anti-psychotic_2022 etc. data sets. Make sure to have at least 2 participants that are "all 5 medications" and some that aren't. Then show what the expected result (MANUALLY calculated which is why the set should be small) from is expected to look like.

wlierman
Lapis Lazuli | Level 10
Hello,
Thank you for responding. I will follow your suggestions and prepare a small data set for further examination

Wlierman
wlierman
Lapis Lazuli | Level 10
Thank you for your suggestions and help.
A_Kh
Barite | Level 11

I assume AGE, AGE_GROUP are only Baseline observations and don't change over time. 
To calculate RECIPENTS number and TOTAL_AMOUNT spent by MEDICATION, AGE_GROUP, QUARTER and YEAR I would do smth like this:

%let drugs= 'anti_anxiety' 'anti_psychotic' 'anti_depressant' 'mood_stabilizer' 'stimulants';

data have; 
	set <input data>; 
	where lowcase(MEDS) in (&drugs) and HEADER_AMOUNT_PAID >0 and (FROM_DOS between '01Jan2022'd and '31Dec2023'd);
	year= year(FROM_DOS); 
	if month(FROM_DOS) in (1, 2, 3) then QUARTER='q1';
	else if month(FROM_DOS) in (4, 5, 6) then QUARTER='q2';
	else if month(FROM_DOS) in (7, 8, 9) then QUARTER='q3';
	else if QUARTER='q4';
run; 
proc means data=have n sum noprint nway;
	class YEAR QUARTER AGE_GROUP MEDS; 
	ways 4; 
	var HEADER_AMOUNT_PAID;
	output out= summary n=RECIPIENTS sum=PAYMENTS;
	format payments dollar12.; 
run; 
PaigeMiller
Diamond | Level 26

The request is to get a count of unique recipients that received all 5 drugs for each age grouping.

 

My approach was to do the following

 

Data all drugs_2022;

  Set anti_anxiety_2022

         anti-psychotic_2022

         anti_depress_2022

         anti_mood_2022

         anti_stimulant_2022;

run;

 

This should work: 

data all_drugs_2022;
    merge anti_anxiety_2022(in=in1 keep=recipient_id age_group) anti_psychotic_2022(in=in2 keep=recipient_id age_group)
        anti_depress_2022(in=in3 keep=recipient_id age_group) anti_mood_2022(in=in4 keep=recipient_id age_group)
        anti_stimulant_2022(in=in5 keep=recipient_id age_group);
    by recipient_id;
    if in1 and in2 and in3 and in4 and in5;
run;

/* Get the count */
proc sql;
     create table count as select count(distinct recipient_id) from all_drugs_2022 group by age_group;
quit;

 

This assumes all data sets are sorted by recipient_id

 

--
Paige Miller
wlierman
Lapis Lazuli | Level 10
Thank you. That does work (though it is somewhat hard to explain at least for me). The only other filter (that I didn't mention) was that the recipients needed to be unique (deduplicated).I did that which lowered the over all total. Thanks again for your help and suggestion.
PaigeMiller
Diamond | Level 26

@wlierman wrote:
Thank you. That does work (though it is somewhat hard to explain at least for me). The only other filter (that I didn't mention) was that the recipients needed to be unique (deduplicated).I did that which lowered the over all total. Thanks again for your help and suggestion.

Explanation: The variable IN1 is 1 if the data comes from anti_anxiety_2022 and 0 otherwise. The variable IN2 is 1 if the data comes from anti_psychotic_2022 and 0 otherwise. When all five data sets are merged, you want the cases where all IN1 through IN5 have value of 1 (can't have any of them zero) and so that's what the IF statement looks for.

 

Doesn't the PROC SQL use of DISTINCT de-duplicate the counts?

--
Paige Miller
wlierman
Lapis Lazuli | Level 10

I didn't use the Distinct in proc Sql (though I could have).  I ran a proc sort (by Recipient_ID) 

 

proc sort data=Analyze2.all_drugs_2022
out=Analyze2.uniquerecipients_2022 nodups;          /*dupout = Analyze2.dup_recipients_2023 nodupkey;*/
by Recipient_ID;
run;

 

We have a large state Medicaid program. But even so, it is somewhat disconcerting that there are over 50K (at least from our Business Intelligence / Business Objects data warehouse) recipients (children in the 0-5 age group that have received all five classes of drugs. Though I think some of that is the NDC code (like " children acetaminophen" as well as others that I would not have considered in a psychotropic drug class).

hackathon24-white-horiz.png

2025 SAS Hackathon: There is still time!

Good news: We've extended SAS Hackathon registration until Sept. 12, so you still have time to be part of our biggest event yet – our five-year anniversary!

Register Now

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

SAS Training: Just a Click Away

 Ready to level-up your skills? Choose your own adventure.

Browse our catalog!

Discussion stats
  • 8 replies
  • 2253 views
  • 3 likes
  • 4 in conversation