BookmarkSubscribeRSS Feed
saspert
Pyrite | Level 9

Hi,

I have a sql query written by a previous programmer -

PROC SQL;
CREATE TABLE WORK.BASE_PGM AS SELECT  DISTINCT
(COUNT (DISTINCT MEMBER_KEY_CHAR )) AS DEMO_PARTICIPATING_USERS
,PRCHSR_CD AS DEMO_PRCHSR_CODE
,CASE WHEN GDR IS NOT MISSING THEN GDR
ELSE 'UNKNOWN' END AS DEMO_GDR_GROUPING
,CASE WHEN ETHNICITY IS NOT MISSING THEN ETHNICITY
ELSE 'UNKNOWN' END AS DEMO_ETHN_GROUPING
,((TODAY()-DATEPART(DOB_DT))/365.25) AS DEMO_AGE
,CASE WHEN CALCULATED DEMO_AGE BETWEEN 18 AND 34.99 THEN '18 - 34'
WHEN CALCULATED DEMO_AGE BETWEEN 35 AND 44.99 THEN '35 - 44'
WHEN CALCULATED DEMO_AGE BETWEEN 45 AND 54.99 THEN '45 - 55'
WHEN CALCULATED DEMO_AGE >= 55 AND 44.99 THEN 'OVER 55'
ELSE 'UNKNOWN' END AS DEMO_AGE_GROUPING 
,PGM_NM AS PGM_NM
,OPPORTUNITY_COMPLETED AS DETAIL_OPPORTUNITY_COMPLETED
,ACTION_ITEM_RESPONSE_COMPLETED AS DETAIL_ACTION_ITEM_COMPLETED
FROM F_MEMBER_PROGRAM_BASE_PGM AS F_MEMBER_PROGRAM_BASE
WHERE F_MEMBER_PROGRAM_BASE.PROGRAM_ENROLLED > 0
GROUP BY PRCHSR_CD,DOB_DT,GDR;
QUIT;

Does anyone know if there could be performance issues with calculating the age and using it again in a case statement? The report is taking excess time and my hunch is this query is taking more time than necessary.

Any thoughts/feedback will be helpful.

Thanks,
saspert.

4 REPLIES 4
art297
Opal | Level 21

I wouldn't think so, but your only other option would be to create and apply a format to the calculated demo_age.  You could always extract a few records and compare the two methods.

I would think that your excessive time is due to something else.

Reeza
Super User

There are so many factors that influence how a SQL proc or datastep runs that I find the quickest way to test it is try and see which affects the running time by commenting out parts and comparing the differences.

This isn't 100% by far, but it works pretty well.

Another possibility is the two distinct options in the code, they have to require a sort of some kind on the back end...

art297
Opal | Level 21

FWIW, I ran a VERY unscientific test and discovered that the way it was written was faster.  However, in doing that, I noticed a couple of probable errors in your inherited code that you might want to fix.

WHEN CALCULATED DEMO_AGE BETWEEN 45 AND 54.99 THEN '45 - 55'

WHEN CALCULATED DEMO_AGE >= 55 AND 44.99 THEN 'OVER 55'

In the first of those lines, I would think that should be '45 - 54'

In the second of those line, the AND 44.99 doesn't make any sense and the resulting value should be 'OVER 54'

Anyhow, the code I ran was as follows:

data F_MEMBER_PROGRAM_BASE_PGM;

  informat dob_dt anydtdtm21.;

  input (PRCHSR_CD GDR) ($) DOB_DT;

  member_key_char=1;

  PRCHSR_CD=1;

  ETHNICITY="1";

  PGM_NM=1;

  PROGRAM_ENROLLED=1;

  OPPORTUNITY_COMPLETED=1;

  ACTION_ITEM_RESPONSE_COMPLETED=1;

  do i=1 to 10000;output;end;

  cards;

  1 1 2/20/1970:12:10:00

  1 1 2/20/1970:12:10:00

  2 2 2/20/1960:12:10:00

  2 2 2/20/1960:12:10:00

  3 3 2/20/1949:12:10:00

  3 3 2/20/1949:12:10:00

;

PROC SQL;

CREATE TABLE WORK.BASE_PGM AS SELECT  DISTINCT

(COUNT (DISTINCT MEMBER_KEY_CHAR )) AS DEMO_PARTICIPATING_USERS

,PRCHSR_CD AS DEMO_PRCHSR_CODE

,CASE WHEN GDR IS NOT MISSING THEN GDR

ELSE 'UNKNOWN' END AS DEMO_GDR_GROUPING

,CASE WHEN ETHNICITY IS NOT MISSING THEN ETHNICITY

ELSE 'UNKNOWN' END AS DEMO_ETHN_GROUPING

,((TODAY()-DATEPART(DOB_DT))/365.25) AS DEMO_AGE

,CASE WHEN CALCULATED DEMO_AGE BETWEEN 18 AND 34.99 THEN '18 - 34'

WHEN CALCULATED DEMO_AGE BETWEEN 35 AND 44.99 THEN '35 - 44'

WHEN CALCULATED DEMO_AGE BETWEEN 45 AND 54.99 THEN '45 - 55'

WHEN CALCULATED DEMO_AGE >= 55 THEN 'OVER 55'

ELSE 'UNKNOWN' END AS DEMO_AGE_GROUPING 

,PGM_NM AS PGM_NM

,OPPORTUNITY_COMPLETED AS DETAIL_OPPORTUNITY_COMPLETED

,ACTION_ITEM_RESPONSE_COMPLETED AS DETAIL_ACTION_ITEM_COMPLETED

FROM F_MEMBER_PROGRAM_BASE_PGM AS F_MEMBER_PROGRAM_BASE

WHERE F_MEMBER_PROGRAM_BASE.PROGRAM_ENROLLED > 0

GROUP BY PRCHSR_CD,DOB_DT,GDR;

QUIT;

proc format;

  value agegrp

  18-34.99="18 - 34"

  35-44.99="35 - 44"

  45-54.99="18 - 55"

  other="OVER 55";

run;

PROC SQL;

CREATE TABLE WORK.BASE_PGM AS SELECT  DISTINCT

(COUNT (DISTINCT MEMBER_KEY_CHAR )) AS DEMO_PARTICIPATING_USERS

,PRCHSR_CD AS DEMO_PRCHSR_CODE

,CASE WHEN GDR IS NOT MISSING THEN GDR

ELSE 'UNKNOWN' END AS DEMO_GDR_GROUPING

,CASE WHEN ETHNICITY IS NOT MISSING THEN ETHNICITY

ELSE 'UNKNOWN' END AS DEMO_ETHN_GROUPING

,((TODAY()-DATEPART(DOB_DT))/365.25) AS DEMO_AGE

,put(((TODAY()-DATEPART(DOB_DT))/365.25),agegrp.) AS DEMO_AGE_GROUPING

,PGM_NM AS PGM_NM

,OPPORTUNITY_COMPLETED AS DETAIL_OPPORTUNITY_COMPLETED

,ACTION_ITEM_RESPONSE_COMPLETED AS DETAIL_ACTION_ITEM_COMPLETED

FROM F_MEMBER_PROGRAM_BASE_PGM AS F_MEMBER_PROGRAM_BASE

WHERE F_MEMBER_PROGRAM_BASE.PROGRAM_ENROLLED > 0

GROUP BY PRCHSR_CD,DOB_DT,GDR;

QUIT;

saspert
Pyrite | Level 9

Thank you Reeza and Art for the feedback.

sas-innovate-2024.png

Join us for SAS Innovate April 16-19 at the Aria in Las Vegas. Bring the team and save big with our group pricing for a limited time only.

Pre-conference courses and tutorials are filling up fast and are always a sellout. Register today to reserve your seat.

 

Register now!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 4 replies
  • 776 views
  • 0 likes
  • 3 in conversation