BookmarkSubscribeRSS Feed
Jepi
Calcite | Level 5

Hi, 

I am counting the number of flavored ice cream then joining that data to a historical table. The issue I'm having is the PROC FREQ data check shows all ice cream 0 or 1 (no missing). But after executing the PROC SQL step there is missing data values for ice cream for various weeks. The variable total is sometimes zero. I'm thinking that may be part of the problem. There should be no missing values. It inhibits future calculations.

 

DATA SUMMER_FUN;

      SET ICECREAM;

     

      *COUNT VAR FOR SQL;

      CT = 1;

      IF ICECREAM =     "STRAWBERRY"            THEN STRAW1 = 1;

      IF ICECREAM =     "CHOCOLATE"                   THEN CHOCO1 = 1;

      IF ICECREAM =     "BUTER_PECAN"                 THEN BUTTER1=1;

      IF ICECREAM =     "VANILLA"               THEN VAN1=1;

     

      BY PURCHASE_ID;

      RETAIN  STRAW2 CHOCO2 BUTTER2 BALOX2 OUT2 IN2;

 

      IF FIRST.CASE_ID THEN DO;

      STRAW2 = 0;

      CHOCO2 = 0;

      BUTTER2 = 0;

      VAN2 = 0;

      END;

 

      IF STRAW1 = 1 THEN STRAW2 = STRAW2 + 1;

      IF CHOCO1 = 1 THEN CHOCO2 = CHOCO2 + 1;

      IF BUTTER1 = 1 THEN BUTTER2 = BUTTER2 + 1;

      IF VAN1 = 1 THEN VAN2 = VAN2 + 1;

     

     

      *ASSIGN FINAL COUNTING VALUE FOR NEXT SQL STEP;

            IF STRAW2 GE 1 THEN STRAWBERRY = 1;

                  ELSE STRAWBERRY = 0;

            IF CHOCO2 GE 1 THEN CHOCOLATE = 1;

                  ELSE CHOCOLATE = 0;

            IF BUTTER2 GE 1 THEN BUTTER_PECAN= 1;

                  ELSE BUTTER_PECAN= 0;

            IF VAN2 GE 1 THEN VANILLA = 1;

                  ELSE VANILLA=0;

           

      DROP STAW1 STRAW2 CHOCO1 CHOCO2 BUTTER1 BUTTER2 VAN1 VAN2;

      IF LAST.PURCHASE_ID THEN OUTPUT SUMMER_FUN;

RUN;

 

      /*DATA CHECK ON ICE CREAM COUNTING*/

      PROC FREQ DATA= SUMMER_FUN;

            TABLES BUTTER_PECAN STRAWBERRY CHOCOLATE VANILLA/LIST MISSING;

            TITLE "DATA CHECK: COUNTING STEP FOR ICE CREAM";

      RUN;TITLE;

 

*MERGE WITH BASELINE INFORMATION, COUNT TOTAL AND TYPES OF ICE CREAM;

PROC SQL;

      CREATE TABLE WEEKLY_SUMMER_FUN AS

            SELECT UNIQUE(B.WEEK_AB), B.WEEK_DISPLAY, SUM(A.CT) AS TOTAL,

                  SUM(A.STRAWBERRY) AS STRAWBERRY_COUNT,

                  SUM(A.CHOCOLATE) AS CHOCOLATE_COUNT,

                  SUM(A.BUTTER_PECAN) AS BUTTER_PECAN_COUNT,

                  SUM(A.VANILLA) AS VANILLA_COUNT,

                   

                  B.BASELINE,

                  B.STD_DEV,

                  B.STD_DEV + B.BASELINE AS STD_DEV_1,

                  B.STD_DEV*2 + B.BASELINE AS STD_DEV_2,

                  B.NEG,

                  B.OFF_THRESH,

                  CALCULATED STRAWBERRY_COUNT/CALCULATED TOTAL AS PER_STRAW FORMAT PERCENT9.2,

                  CALCULATED CHOCCOLATE_COUNT/CALCULATED TOTAL AS PER_CHOCO FORMAT PERCENT9.2,

                  CALCULATED BUTTER_PECAN_COUNT/CALCULATED TOTAL AS PER_BUTTER FORMAT PERCENT9.2,

                  CALCULATED VANILLA_COUNT/CALCULATED TOTAL AS PER_VAN FORMAT PERCENT9.2,

                 

                  FROM SUMMER_FUN AS A RIGHT JOIN SUMMER.SALES_BASELINE AS B

                        ON A.CATEGORY IN ("A","B") AND A.WEEK_AB = B.WEEK_AB

                        GROUP BY B.WEEK_AB, B.WEEK_DISPLAY;

QUIT;

 

2 REPLIES 2
Tom
Super User Tom
Super User

I don't see a variable named ICE_CREAM so let's take this one instead.

SUM(A.CT) AS TOTAL

Since A is the LEFT table in your RIGHT JOIN it is possible that A.CT is missing for every observation in the group.

 

You could use the normal SAS function SUM(,) to force zeros into TOTAL when the SQL aggregate function SUM() result is missing.

SUM(SUM(A.CT),0) AS TOTAL

 Or the COALESCE() function.

COALESCE(SUM(A.CT),0) AS TOTAL

 

ballardw
Super User

I think that you want to fix this first:

      BY PURCHASE_ID;

      RETAIN  STRAW2 CHOCO2 BUTTER2 BALOX2 OUT2 IN2;

      IF FIRST.CASE_ID THEN DO;

This would throw a note of "First.case_id is unitialized" because Case_id is not on the BY statement. So First.case_id is never true [uninitialized means the variable is always missing]

Therefore none of the Straw2 through Balox2 are ever set with a value 0. Then all the Straw+1 are equivalent to missing + 1 which SAS will set to missing. And all of the "final counts" will be 0.

 

You can, in this case, replace these statements

  IF STRAW1 = 1 THEN STRAW2 = STRAW2 + 1;

with

Straw2+straw1;

Which will increment the retained variable by the amount of straw1. If straw1 is missing or 0 then no change, otherwise add.

Side effect is if you haven't used a Retain statement this will imply one.

Or if not comfortable with that

  IF STRAW1 = 1 THEN STRAW2 = Sum(STRAW2, 1) ;

Sum function will return 1 where Straw2+1 returns missing when Straw2 is not initialized to a non-missing value.

SAS Innovate 2025: Save the Date

 SAS Innovate 2025 is scheduled for May 6-9 in Orlando, FL. Sign up to be first to learn about the agenda and registration!

Save the date!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

SAS Training: Just a Click Away

 Ready to level-up your skills? Choose your own adventure.

Browse our catalog!

Discussion stats
  • 2 replies
  • 445 views
  • 0 likes
  • 3 in conversation