Hi,
I have a sql query written by a previous programmer -
PROC SQL;
CREATE TABLE WORK.BASE_PGM AS SELECT DISTINCT
(COUNT (DISTINCT MEMBER_KEY_CHAR )) AS DEMO_PARTICIPATING_USERS
,PRCHSR_CD AS DEMO_PRCHSR_CODE
,CASE WHEN GDR IS NOT MISSING THEN GDR
ELSE 'UNKNOWN' END AS DEMO_GDR_GROUPING
,CASE WHEN ETHNICITY IS NOT MISSING THEN ETHNICITY
ELSE 'UNKNOWN' END AS DEMO_ETHN_GROUPING
,((TODAY()-DATEPART(DOB_DT))/365.25) AS DEMO_AGE
,CASE WHEN CALCULATED DEMO_AGE BETWEEN 18 AND 34.99 THEN '18 - 34'
WHEN CALCULATED DEMO_AGE BETWEEN 35 AND 44.99 THEN '35 - 44'
WHEN CALCULATED DEMO_AGE BETWEEN 45 AND 54.99 THEN '45 - 55'
WHEN CALCULATED DEMO_AGE >= 55 AND 44.99 THEN 'OVER 55'
ELSE 'UNKNOWN' END AS DEMO_AGE_GROUPING
,PGM_NM AS PGM_NM
,OPPORTUNITY_COMPLETED AS DETAIL_OPPORTUNITY_COMPLETED
,ACTION_ITEM_RESPONSE_COMPLETED AS DETAIL_ACTION_ITEM_COMPLETED
FROM F_MEMBER_PROGRAM_BASE_PGM AS F_MEMBER_PROGRAM_BASE
WHERE F_MEMBER_PROGRAM_BASE.PROGRAM_ENROLLED > 0
GROUP BY PRCHSR_CD,DOB_DT,GDR;
QUIT;
Does anyone know if there could be performance issues with calculating the age and using it again in a case statement? The report is taking excess time and my hunch is this query is taking more time than necessary.
Any thoughts/feedback will be helpful.
Thanks,
saspert.
I wouldn't think so, but your only other option would be to create and apply a format to the calculated demo_age. You could always extract a few records and compare the two methods.
I would think that your excessive time is due to something else.
There are so many factors that influence how a SQL proc or datastep runs that I find the quickest way to test it is try and see which affects the running time by commenting out parts and comparing the differences.
This isn't 100% by far, but it works pretty well.
Another possibility is the two distinct options in the code, they have to require a sort of some kind on the back end...
FWIW, I ran a VERY unscientific test and discovered that the way it was written was faster. However, in doing that, I noticed a couple of probable errors in your inherited code that you might want to fix.
WHEN CALCULATED DEMO_AGE BETWEEN 45 AND 54.99 THEN '45 - 55'
WHEN CALCULATED DEMO_AGE >= 55 AND 44.99 THEN 'OVER 55'
In the first of those lines, I would think that should be '45 - 54'
In the second of those line, the AND 44.99 doesn't make any sense and the resulting value should be 'OVER 54'
Anyhow, the code I ran was as follows:
data F_MEMBER_PROGRAM_BASE_PGM;
informat dob_dt anydtdtm21.;
input (PRCHSR_CD GDR) ($) DOB_DT;
member_key_char=1;
PRCHSR_CD=1;
ETHNICITY="1";
PGM_NM=1;
PROGRAM_ENROLLED=1;
OPPORTUNITY_COMPLETED=1;
ACTION_ITEM_RESPONSE_COMPLETED=1;
do i=1 to 10000;output;end;
cards;
1 1 2/20/1970:12:10:00
1 1 2/20/1970:12:10:00
2 2 2/20/1960:12:10:00
2 2 2/20/1960:12:10:00
3 3 2/20/1949:12:10:00
3 3 2/20/1949:12:10:00
;
PROC SQL;
CREATE TABLE WORK.BASE_PGM AS SELECT DISTINCT
(COUNT (DISTINCT MEMBER_KEY_CHAR )) AS DEMO_PARTICIPATING_USERS
,PRCHSR_CD AS DEMO_PRCHSR_CODE
,CASE WHEN GDR IS NOT MISSING THEN GDR
ELSE 'UNKNOWN' END AS DEMO_GDR_GROUPING
,CASE WHEN ETHNICITY IS NOT MISSING THEN ETHNICITY
ELSE 'UNKNOWN' END AS DEMO_ETHN_GROUPING
,((TODAY()-DATEPART(DOB_DT))/365.25) AS DEMO_AGE
,CASE WHEN CALCULATED DEMO_AGE BETWEEN 18 AND 34.99 THEN '18 - 34'
WHEN CALCULATED DEMO_AGE BETWEEN 35 AND 44.99 THEN '35 - 44'
WHEN CALCULATED DEMO_AGE BETWEEN 45 AND 54.99 THEN '45 - 55'
WHEN CALCULATED DEMO_AGE >= 55 THEN 'OVER 55'
ELSE 'UNKNOWN' END AS DEMO_AGE_GROUPING
,PGM_NM AS PGM_NM
,OPPORTUNITY_COMPLETED AS DETAIL_OPPORTUNITY_COMPLETED
,ACTION_ITEM_RESPONSE_COMPLETED AS DETAIL_ACTION_ITEM_COMPLETED
FROM F_MEMBER_PROGRAM_BASE_PGM AS F_MEMBER_PROGRAM_BASE
WHERE F_MEMBER_PROGRAM_BASE.PROGRAM_ENROLLED > 0
GROUP BY PRCHSR_CD,DOB_DT,GDR;
QUIT;
proc format;
value agegrp
18-34.99="18 - 34"
35-44.99="35 - 44"
45-54.99="18 - 55"
other="OVER 55";
run;
PROC SQL;
CREATE TABLE WORK.BASE_PGM AS SELECT DISTINCT
(COUNT (DISTINCT MEMBER_KEY_CHAR )) AS DEMO_PARTICIPATING_USERS
,PRCHSR_CD AS DEMO_PRCHSR_CODE
,CASE WHEN GDR IS NOT MISSING THEN GDR
ELSE 'UNKNOWN' END AS DEMO_GDR_GROUPING
,CASE WHEN ETHNICITY IS NOT MISSING THEN ETHNICITY
ELSE 'UNKNOWN' END AS DEMO_ETHN_GROUPING
,((TODAY()-DATEPART(DOB_DT))/365.25) AS DEMO_AGE
,put(((TODAY()-DATEPART(DOB_DT))/365.25),agegrp.) AS DEMO_AGE_GROUPING
,PGM_NM AS PGM_NM
,OPPORTUNITY_COMPLETED AS DETAIL_OPPORTUNITY_COMPLETED
,ACTION_ITEM_RESPONSE_COMPLETED AS DETAIL_ACTION_ITEM_COMPLETED
FROM F_MEMBER_PROGRAM_BASE_PGM AS F_MEMBER_PROGRAM_BASE
WHERE F_MEMBER_PROGRAM_BASE.PROGRAM_ENROLLED > 0
GROUP BY PRCHSR_CD,DOB_DT,GDR;
QUIT;
Thank you Reeza and Art for the feedback.
Are you ready for the spotlight? We're accepting content ideas for SAS Innovate 2025 to be held May 6-9 in Orlando, FL. The call is open until September 25. Read more here about why you should contribute and what is in it for you!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.