I am counting the number of flavored ice cream then joining that data to a historical table. The issue I'm having is the PROC FREQ data check shows all ice cream 0 or 1 (no missing). But after executing the PROC SQL step there is missing data values for ice cream for various weeks. The variable total is sometimes zero. I'm thinking that may be part of the problem. There should be no missing values. It inhibits future calculations.
*COUNT VAR FOR SQL;
CT = 1;
IF ICECREAM = "STRAWBERRY" THEN STRAW1 = 1;
IF ICECREAM = "CHOCOLATE" THEN CHOCO1 = 1;
IF ICECREAM = "BUTER_PECAN" THEN BUTTER1=1;
IF ICECREAM = "VANILLA" THEN VAN1=1;
RETAIN STRAW2 CHOCO2 BUTTER2 BALOX2 OUT2 IN2;
IF FIRST.CASE_ID THEN DO;
STRAW2 = 0;
CHOCO2 = 0;
BUTTER2 = 0;
VAN2 = 0;
IF STRAW1 = 1 THEN STRAW2 = STRAW2 + 1;
IF CHOCO1 = 1 THEN CHOCO2 = CHOCO2 + 1;
IF BUTTER1 = 1 THEN BUTTER2 = BUTTER2 + 1;
IF VAN1 = 1 THEN VAN2 = VAN2 + 1;
*ASSIGN FINAL COUNTING VALUE FOR NEXT SQL STEP;
IF STRAW2 GE 1 THEN STRAWBERRY = 1;
ELSE STRAWBERRY = 0;
IF CHOCO2 GE 1 THEN CHOCOLATE = 1;
ELSE CHOCOLATE = 0;
IF BUTTER2 GE 1 THEN BUTTER_PECAN= 1;
ELSE BUTTER_PECAN= 0;
IF VAN2 GE 1 THEN VANILLA = 1;
DROP STAW1 STRAW2 CHOCO1 CHOCO2 BUTTER1 BUTTER2 VAN1 VAN2;
IF LAST.PURCHASE_ID THEN OUTPUT SUMMER_FUN;
/*DATA CHECK ON ICE CREAM COUNTING*/
PROC FREQ DATA= SUMMER_FUN;
TABLES BUTTER_PECAN STRAWBERRY CHOCOLATE VANILLA/LIST MISSING;
TITLE "DATA CHECK: COUNTING STEP FOR ICE CREAM";
*MERGE WITH BASELINE INFORMATION, COUNT TOTAL AND TYPES OF ICE CREAM;
CREATE TABLE WEEKLY_SUMMER_FUN AS
SELECT UNIQUE(B.WEEK_AB), B.WEEK_DISPLAY, SUM(A.CT) AS TOTAL,
SUM(A.STRAWBERRY) AS STRAWBERRY_COUNT,
SUM(A.CHOCOLATE) AS CHOCOLATE_COUNT,
SUM(A.BUTTER_PECAN) AS BUTTER_PECAN_COUNT,
SUM(A.VANILLA) AS VANILLA_COUNT,
B.STD_DEV + B.BASELINE AS STD_DEV_1,
B.STD_DEV*2 + B.BASELINE AS STD_DEV_2,
CALCULATED STRAWBERRY_COUNT/CALCULATED TOTAL AS PER_STRAW FORMAT PERCENT9.2,
CALCULATED CHOCCOLATE_COUNT/CALCULATED TOTAL AS PER_CHOCO FORMAT PERCENT9.2,
CALCULATED BUTTER_PECAN_COUNT/CALCULATED TOTAL AS PER_BUTTER FORMAT PERCENT9.2,
CALCULATED VANILLA_COUNT/CALCULATED TOTAL AS PER_VAN FORMAT PERCENT9.2,
FROM SUMMER_FUN AS A RIGHT JOIN SUMMER.SALES_BASELINE AS B
ON A.CATEGORY IN ("A","B") AND A.WEEK_AB = B.WEEK_AB
GROUP BY B.WEEK_AB, B.WEEK_DISPLAY;
I don't see a variable named ICE_CREAM so let's take this one instead.
SUM(A.CT) AS TOTAL
Since A is the LEFT table in your RIGHT JOIN it is possible that A.CT is missing for every observation in the group.
You could use the normal SAS function SUM(,) to force zeros into TOTAL when the SQL aggregate function SUM() result is missing.
SUM(SUM(A.CT),0) AS TOTAL
Or the COALESCE() function.
COALESCE(SUM(A.CT),0) AS TOTAL
I think that you want to fix this first:
BY PURCHASE_ID; RETAIN STRAW2 CHOCO2 BUTTER2 BALOX2 OUT2 IN2; IF FIRST.CASE_ID THEN DO;
This would throw a note of "First.case_id is unitialized" because Case_id is not on the BY statement. So First.case_id is never true [uninitialized means the variable is always missing]
Therefore none of the Straw2 through Balox2 are ever set with a value 0. Then all the Straw+1 are equivalent to missing + 1 which SAS will set to missing. And all of the "final counts" will be 0.
You can, in this case, replace these statements
IF STRAW1 = 1 THEN STRAW2 = STRAW2 + 1;
Which will increment the retained variable by the amount of straw1. If straw1 is missing or 0 then no change, otherwise add.
Side effect is if you haven't used a Retain statement this will imply one.
Or if not comfortable with that
IF STRAW1 = 1 THEN STRAW2 = Sum(STRAW2, 1) ;
Sum function will return 1 where Straw2+1 returns missing when Straw2 is not initialized to a non-missing value.
Registration is open! SAS is returning to Vegas for an AI and analytics experience like no other! Whether you're an executive, manager, end user or SAS partner, SAS Innovate is designed for everyone on your team. Register for just $495 by 12/31/2023.
If you are interested in speaking, there is still time to submit a session idea. More details are posted on the website.
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.