I need to select the customers who made purchases according to the criteria of each of my MASTER_ID -and save it in a table for each of my MASTER_ID
I have the following matrix (MATRIX_TABLE):
MASTER_ID | CAMPAIGNS | PRODUCT | START_DATE | END_DATE |
1 | K1 | PROD_88T | 01/01/2021 | 05/01/2021 |
2 | k1 | PROD_55X | 20/12/2020 | 20/02/2021 |
3 | k2 | PROD_75A | 15/01/2021 | 15/03/2021 |
4 | K1 | PROD_95Y | 01/04/2021 | 14/04/2021 |
OBS: The same campaign can be repeated in several lines, since the same campaign can encompass several different products and different start and end dates.
And I have the following table that lists all the purchases my customers made during my campaigns.
PURCHASES ( PURCHASES_TABLE)
CLIENT_ID | CLIENT_NAME | PRODUCT | AMOUNT_PURCHASE | PURCHASE_DATE |
CLI_1 | CLIENT 1 | PROD_95Y | 200 | 01/10/2020 |
CLI_2 | CLIENT 2 | PROD_88T | 300 | 15/02/2021 |
CLI_3 | CLIENT 3 | PROD_95Y | 100 | 11/01/2021 |
My code runs, but I'm not getting my expected result. I generate three tables (PURCHASES_ & MASTER_ID), but the results are the same in the three tables. That is, it does not select the criteria according to each MASTER_ID
This is my attempt to get the purchases of the K1 Campaign:
%global campaign;
%macro DDC (campaign);
PROC SQL noprint;
SELECT count(*)
INTO :n_var
FROM MATRIX_TABLE;
QUIT;
%DO I = 1 %TO &n_var;
data _null_;
set WORK.MATRIX_TABLE;
if _n_= &I;
call symput('campaign', campaign);
call symput('MASTER_ID', MASTER_ID);
run;
PROC SQL;
CREATE TABLE PURCHASES_&MASTER_ID AS
SELECT t2.CLIENT_ID, T2.CLIENT_NAME, t1.PRODUCT
FROM matrix_table t1, PURCHASES_TABLE T2
WHERE t1.PRODUCT = T2.PRODUCT
and t2.PURCHASE_DATE >= t1.START_DATE
and t2.PURCHASE_DATE <= t1.END_DATE;
QUIT;
%end;
%mend DDC;
%DDC (K1);
Can you state the criteria in words and show what you expect for the output given the example data?
First, @ballardw is asking what you want the data sets to look like. Second, your CALL SYMPUT has trailing blanks when I run it, unless I made a mistake somewhere. Third, is there an actual difference between 'k1' and 'K1'? Fourth, is there a reason you just have a numeric identifier for the data sets as opposed to the campaigns identifier?
I'm thinking that this can be streamlined and more data-driven, but there are a lot of unknowns out there.
@Neeydchi wrote:
EXPECTED OUTPUT FOR %DDC (K1);
TABLE PURCHASES_1 ( PURCHASES_&MASTER_ID):
CLIENT_ID, CLIENT_NAME, PRODUCT for all customers who purchased "PROD_88T" between "01/01/2021" and "05/01/2021"
TABLE PURCHASES_2 ( PURCHASES_&MASTER_ID):
CLIENT_ID, CLIENT_NAME, PRODUCT for all customers who purchased " PROD_55X" between "20/12/2020" and "20/02/2021"
TABLE PURCHASES_4 ( PURCHASES_&MASTER_ID):
CLIENT_ID, CLIENT_NAME, PRODUCT for all customers who purchased " PROD_95Y" between "01/04/2021" and "14/04/2021"
Since you are showing dates with quotes then that appears the "date" is a character value and "between" means alphabetically not chronologically. At least from what you show. So between "20/12/2020" and "20/02/2021" does not include '01/01/2021' because '01' is not greater than "20" at the start of the string. If your data has character values for dates then the first step is convert them to SAS data values.
Are you ready for the spotlight? We're accepting content ideas for SAS Innovate 2025 to be held May 6-9 in Orlando, FL. The call is open until September 25. Read more here about why you should contribute and what is in it for you!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.