I need to know how to get two sets of date ranges to equal Q1 or Q2. For example, 01Jan2021 - 31Mar2021 = Q1 and 01Apr2021-30Jun2021 = Q2. I thought this would be a cut and dry thing but I'm spinning my wheels for days on this.
How would I create this new column to show whether these date ranges are Q1 and Q2?
Thanks
Avoid the "any" informats, instead use the proper informat for the dates. Use YYMMDD10. for dates stored as strings like 2021-06-18.
You have haven't shown how the date ranges are stored in a SAS data set. The QTR function works fine with individual dates:
data have;
do my_date='01jan2021'd to '30jun2021'd by 5;
output;
end;
format my_date date9.;
run;
data want;
set have;
my_qtr=qtr(my_date);
run;
proc print;
run;
The data for eventDate is stored as 01-01-2021 or dd-mm-yyyy
What does "by 5" mean?
@anonymous_user wrote:
The data for eventDate is stored as 01-01-2021 or dd-mm-yyyy
What does "by 5" mean?
Instead of writing out every date for Jan through June of 2021 for the sample dataset, I wrote out every 5th date. It's a smaller sample.
I think you didn't run the program and look at the proc print results, which would have answered your question.
Is there a way that this can be used for 500k rows of data?
I'm trying to do a summary by quarter for the last data step:
data have;
do eventDate='01jan2021' to '30jun2021';
output;
end;
format eventDate date9.;
run;
data want;
set have;
my_qtr=qtr(eventDate);
run;
proc print;
run;
data OUTPAT2;
merge OUTPAT want;
run;
/*E&M pull - over 147,000 rows*/
data EMpull;
set OUTPAT2 ;
where ( '99201' <= cpt <= '99215')
or (cpt between '99241' and '99245')
or (cpt between '99381' and '99397')
or (cpt between '99354' and '99355')
or (cpt between '99401' and '99412');
run;
I'm trying to do a summary by quarter for the last data step:
Please be more specific. What do you mean by summary? PROC MEANS? PROC SUMMARY? Something else?
What output do you want? A table that can be included in a report? A SAS data set? Something else? Show us an example of what you want.
A PROC Summary showing total costs for each quarter for the EM codes.
Just assign the QTR format to the date variable and use it in the CLASS statement. PROC SUMMARY will use the formatted values to build groups.
So let's assume your source dataset is name MYDATA and the analysis variable is named COST.
Here is code to run PROC SUMMARY on the subset of data that are EM codes that occurred in the first half of 2021.
proc summary data=MYDATA ;
where (cpt between '99201' and '99215')
or (cpt between '99241' and '99245')
or (cpt between '99381' and '99397')
or (cpt between '99354' and '99355')
or (cpt between '99401' and '99412')
;
where also eventdate between '01JAN2021'd and '30JUN2021'd ;
class eventdate ;
format eventdate yyq6. ;
var cost ;
output out=WANT sum=QTR_COST;
run;
This should result in three observations (assuming you have dates in both quarters) . One will be the total for all 6 months and the other two will be for the individual quarters.
If EVENTDATE is NOT a DATE variable then first run a data step (or perhaps create a view) that creates a DATE variable you can use with the above code.
data for_summary;
set mydata ;
real_date = input(eventdate,anydtdte.);
format real_date date9.;
run;
Now just use REAL_DATE in place EVENTDATE in the PROC SUMMARY code.
It worked but some of the dates under the 'real_date" column are not showing. (There are only 2 quarters in my data, btw.)
This is what I did:
data for_summary;
set OUTPAT ;
real_date = input(eventdate,anydtdte.);
format real_date date9.;
run;
proc summary data=for_summary;
where (cpt between '99201' and '99215')
or (cpt between '99241' and '99245')
or (cpt between '99381' and '99397')
or (cpt between '99354' and '99355')
or (cpt between '99401' and '99412')
;
where also real_date between '01JAN2021'd and '30JUN2021'd;
class real_date ;
format real_date yyq6. ;
var cost ;
output out=EMpull2 sum=QTR_COST;
run;
The final output looks like this for some reason:
real_date _TYPE_ _FREQ_ QTR_COST
. 0 90160 $$$$$
2021Q1 1 39753 $$$$$
2021Q2 2 50407 $$$$$
Looks good to me. The _type_ 0 observation contains the total sum.
When running the first part, the real_date column isn't populating dates in all of the cells. Not sure why though. Also I'm noticing the dates are not accurate. Is this suppose to be the case? For example, eventDate is 2021-06-18 but in real_date cell it says 01JUN2021.
Change ANYDTDTE to ANYDTDTE10:
data want;
eventdate = '2021-06-18';
real_date = input(eventdate,anydtdte10.);
format real_date date9.;
put _all_;
run;
Avoid the "any" informats, instead use the proper informat for the dates. Use YYMMDD10. for dates stored as strings like 2021-06-18.
how would I rename to "year_qtr" in the proc summary section?
Registration is now open for SAS Innovate 2025 , our biggest and most exciting global event of the year! Join us in Orlando, FL, May 6-9.
Sign up by Dec. 31 to get the 2024 rate of just $495.
Register now!
What’s the difference between SAS Enterprise Guide and SAS Studio? How are they similar? Just ask SAS’ Danny Modlin.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.