I have a transaction data set with the following fields:
Date | StoreID | ProductID | Sales | Price
On the other hand I have a Catalog with the promotions and date start and finish
DATE START | DATE FINISH | PROMOTION
What I need is to create automatically a column by each promotion in the catalog.
First I tried transpose data which transposed promotion, but I dont know how to make automatic case when calculation for each promotion conditions so that it gives 1 or 0, depending if the day of the transaction was between the range on the dates of the promotion, so I ended manually creating each promotion as a calculated field with the following function, which works but I cant scale it if its thousands of promotions, another problem was that the name of the promotions have from file characters that are not allowed in SAS, so I had to use a generic name and assign an alias with the actual name that comes in file.
CASE WHEN (t1.Date >= "18Jan2019"d and t1.Date <= "18May2019"d) THEN 1 ELSE 0 END
I guess there must be a more automatic way which I dont know, if there is and someone knows please tell me
UNTESTED CODE (If you want tested code, provide sample data using these instructions: https://communities.sas.com/t5/SAS-Communities-Library/How-to-create-a-data-step-version-of-your-dat...)
proc sql;
create table want as select *
from dataset2 as a left join dataset1 as b
on b.date>=a.date_start and b.date<=date_finish;
quit;
I don't get what you are trying to create.
If sounds like you want to merge your list of products and your list of promotions to create a list that might have multiple rows of promotions per product. You can then use PROC TRANSPOSE to convert the promotion name into a new variable name. Not sure what value you want the variable to have however.
proc sql;
create table TALL as
select a.*
, b.promotion
, 'X' as flag
from products a
left join promotions b
on a.date between b.date_start and b.date_finish
order by a.date,a.storeid,a.productid,a.sales,a.price
;
quit;
proc transpose data=tall out=want ;
by date storeid productid sales price ;
id promotion;
var flag ;
run;
I cant join since there is no key between transactions and promotion ID, I have to make a column by each promotion in the catalog and evaluate the case when calculation, so that the promotion variable will be binary (1 = Sales was made between the promotion range, 0 = Sale was not made in promotion range), what I did manually was this but I want to know if I can make it automatic with thousand more promotions
PROC SQL;
CREATE TABLE WORK.QUERY_FOR_ANDREASAMPLEXXXXXX AS
SELECT t1.Sucursal,
t1.Producto,
t1.Fecha,
t1.Demanda_positiva,
t1.Demanda_negativa,
t1.Talla,
t1.Precio,
t1.Demanda,
t1.Descuento,
/* PROMOCION1 */
(CASE WHEN (t1.Fecha >= "18Jan2019"d and t1.Fecha <= "18May2019"d) THEN 1 ELSE 0 END) LABEL=
"catalogocalzadoprimavera2019mex" AS PROMOCION1,
/* PROMOCION14 */
(CASE WHEN (t1.Fecha >= "18Feb2019"d and t1.Fecha <= "18May2019"d) THEN 1 ELSE 0 END) LABEL=
"catálogoiubellezaintegralprimavera2019eua" AS PROMOCION14,
/* PROMOCION2 */
(CASE WHEN (t1.Fecha >= "18Jan2019"d and t1.Fecha <= "18May2019"d) THEN 1 ELSE 0 END) LABEL=
"catalogovestirinteriorprimavera2019eua" AS PROMOCION2,
/* PROMOCION3 */
(CASE WHEN (t1.Fecha >= "18Jan2019"d and t1.Fecha <= "18May2019"d) THEN 1 ELSE 0 END) LABEL=
"catalogovestirexteriorprimavera2019eua" AS PROMOCION3,
/* PROMOCION4 */
(CASE WHEN (t1.Fecha >= "18Jan2019"d and t1.Fecha <= "18May2019"d) THEN 1 ELSE 0 END) LABEL=
"catalogocalzadoprimavera2019eua" AS PROMOCION4,
/* PROMOCION5 */
(CASE WHEN (t1.Fecha >= "18Jan2019"d and t1.Fecha <= "18May2019"d) THEN 1 ELSE 0 END) LABEL=
"catalogobellezaintegralprimavera2019mex" AS PROMOCION5,
/* PROMOCION6 */
(CASE WHEN (t1.Fecha >= "18Jan2019"d and t1.Fecha <= "18May2019"d) THEN 1 ELSE 0 END) LABEL=
"catalogovestirexteriorprimavera2019mex" AS PROMOCION6,
/* PROMOCION7 */
(CASE WHEN (t1.Fecha >= "18Jan2019"d and t1.Fecha <= "18May2019"d) THEN 1 ELSE 0 END) LABEL=
"catalogovestirinteriorprimavera2019mex" AS PROMOCION7,
/* PROMOCION8 */
(CASE WHEN (t1.Fecha >= "09Jan2019"d and t1.Fecha <= "09Mar2019"d) THEN 1 ELSE 0 END) LABEL=
"catálogooutlet12019méxico" AS PROMOCION8,
/* PROM0CION9 */
(CASE WHEN (t1.Fecha >= "23Dec2018"d and t1.Fecha <= "23Feb2019"d) THEN 1 ELSE 0 END) LABEL=
"lifestyles&sportenero/febrero" AS PROM0CION9,
/* PROMOCION10 */
(CASE WHEN (t1.Fecha >= "23Jan2019"d and t1.Fecha <= "23Feb2019"d) THEN 1 ELSE 0 END) LABEL=
"catalogofuerzayestiloenero-febrero2019" AS PROMOCION10,
/* PROMOCION111 */
(CASE WHEN (t1.Fecha >= "16Jan2019"d and t1.Fecha <= "16Mar2019"d) THEN 1 ELSE 0 END) LABEL=
"catálogooutlet12019eua" AS PROMOCION111,
/* PROMOCION12 */
(CASE WHEN (t1.Fecha >= "06Jan2019"d and t1.Fecha <= "06Apr2019"d) THEN 1 ELSE 0 END) LABEL=
"promotorprofesionaleindustrial" AS PROMOCION12,
/* PROMOCION13 */
(CASE WHEN (t1.Fecha >= "16Feb2019"d and t1.Fecha <= "16Mar2019"d) THEN 1 ELSE 0 END) LABEL=
"promotorbotasybotines" AS PROMOCION13,
/* PROMOCION15 */
(CASE WHEN (t1.Fecha >= "08Feb2019"d and t1.Fecha <= "08Jun2019"d) THEN 1 ELSE 0 END) LABEL=
"lifestyles&urbano2019marzoeua" AS PROMOCION15,
/* PROMOCION16 */
(CASE WHEN (t1.Fecha >= "01Feb2019"d and t1.Fecha <= "01Jun2019"d) THEN 1 ELSE 0 END) LABEL=
"catalogofuerzayestilomarzo2019" AS PROMOCION16,
/* PROMOCION17 */
(CASE WHEN (t1.Fecha >= "01Feb2019"d and t1.Fecha <= "01Jun2019"d) THEN 1 ELSE 0 END) LABEL=
"catálogourbanoprimavera2019" AS PROMOCION17,
/* PROMOCION18 */
(CASE WHEN (t1.Fecha >= "24Feb2019"d and t1.Fecha <= "24Aug2019"d) THEN 1 ELSE 0 END) LABEL=
"catalogotropicalprimavera2019mexico" AS PROMOCION18,
/* PROMOCION19 */
(CASE WHEN (t1.Fecha >= "24Feb2019"d and t1.Fecha <= "24Aug2019"d) THEN 1 ELSE 0 END) LABEL=
"catalogosushineprimavera2019usa" AS PROMOCION19,
/* PROMOCION20 */
(CASE WHEN (t1.Fecha >= "04Mar2019"d and t1.Fecha <= "04May2019"d) THEN 1 ELSE 0 END) LABEL=
"catálogooutlet22019méxico" AS PROMOCION20,
/* PROMOCION21 */
(CASE WHEN (t1.Fecha >= "23Feb2019"d and t1.Fecha <= "23Mar2019"d) THEN 1 ELSE 0 END) LABEL=
"programaestelar1erbimestreprimavera2019eua" AS PROMOCION21,
/* PROMOCION22 */
(CASE WHEN (t1.Fecha >= "23Jan2019"d and t1.Fecha <= "23Mar2019"d) THEN 1 ELSE 0 END) LABEL=
"programaestelar1erbimestreprimavera2019" AS PROMOCION22,
/* PROMOCION23 */
(CASE WHEN (t1.Fecha >= "04Mar2019"d and t1.Fecha <= "04May2019"d) THEN 1 ELSE 0 END) LABEL=
"catálogooutlet22019eua" AS PROMOCION23,
/* PROMOCION24 */
(CASE WHEN (t1.Fecha >= "18Mar2019"d and t1.Fecha <= "18May2019"d) THEN 1 ELSE 0 END) LABEL=
"programaestelar2ndobimestreprimavera2019eua" AS PROMOCION24,
/* PROMOCION25 */
(CASE WHEN (t1.Fecha >= "11Mar2019"d and t1.Fecha <= "11May2019"d) THEN 1 ELSE 0 END) LABEL=
"catálogooutlet32019méxico" AS PROMOCION25,
/* PROMOCION26 */
(CASE WHEN (t1.Fecha >= "18Apr2019"d and t1.Fecha <= "18May2019"d) THEN 1 ELSE 0 END) LABEL=
"catálogooutlet32019eua" AS PROMOCION26,
/* PROMOCION27 */
(CASE WHEN (t1.Fecha >= "24Apr2019"d and t1.Fecha <= "24Aug2019"d) THEN 1 ELSE 0 END) LABEL=
"catálogoprofesionaleindustrial2019" AS PROMOCION27,
/* PROMOCION11 */
(CASE WHEN (t1.Fecha >= "18Jan2019"d and t1.Fecha <= "18May2019"d) THEN 1 ELSE 0 END) LABEL=
"catalogocalzadoprimavera2019mex" AS PROMOCION11
FROM SASHELP.ANDREASAMPLEXXXXXX t1;
QUIT;
Sounds like you just want to do the join based and then just count the number of observations.
For example to see how many records fall within those date something like this:
proc sql;
create table TALL as
select b.promotion,count(*) as n_sales
from A inner join B
on a.date between b.start_data and b.end_date
group by promotion
;
quit;
If you did want to convert it from a dataset with 2 variables and multiple observations into a dataset with one observation and multiple variables use PROC TRANSPOSE.
Best is to provide a small example of your starting data sets and what you want the result to look like.
Catalog has a very specific meaning when using SAS, so it is important to let us know if this is a SAS catalog or just a data set that you are calling a "catalog", possibly because that is what your shop calls it.
I suspect you actually have 2 or more data sets. If the date variables are all SAS date values this could be relatively simple.
Perhaps something similar to:
data work.trans; input transactionid $ transdate :date9.; format transdate date9.; datalines; 1111 11JAN2019 2222 23JAN2019 3333 17JUL2019 ; data work.promo; input promstart :date9. promend : date9. Promoid $; format promstart promend date9.; datalines; 01Jan2019 30Jan2019 PPP111 01May2019 31May2019 PPP222 01Jul2019 31Jul2019 PPP333 ; Proc sql; create table want as select a.*, b.transactionid from work.promo as a left join work.trans as b on a.promstart le b.transdate le a.promend ; quit;
I used transactionid instead of the storied, productid, sales and price.
Or logic using that information. Not quite clear exactly what you want.
Here is a sample data with the 2 tables that I have and the desired outcome.
Thanks
You show what appears to be a start date of "27ene2019" and end "18may2019" which supposed has 1/24/2019 in the range from your shown desired output. I don't really see how that is possible as 27 of any month comes after 1/24 of any year.
Here is a workable example with varied dates, see if it comes closer to what you want.
data work.trans; input Date : mmddyy10. Store $ Product $ Sales Price; format date date9.; datalines; 1/24/2019 900 84134803 2 23 1/24/2019 900 84291601 4 12 2/24/2019 900 84291601 6 12 1/24/2019 900 84291601 1 12 4/25/2019 700 61234803 3 11 1/25/2019 700 61234803 2 11 1/25/2019 700 61234803 7 11 5/10/2019 700 61234803 1 11 1/25/2019 700 61234803 1 11 3/25/2019 700 61234803 2 11 6/25/2019 700 61234803 3 11 ; data work.promo; input promstart :date9. promend : date9. Promoid $; format promstart promend date9.; datalines; 01Jan2019 31May2019 Promo1 01Feb2019 18May2019 Promo2 01Mar2019 18May2019 Promo3 01Apr2019 18May2019 Promo4 01May2019 18May2019 Promo5 ; Proc sql; create table work.want as select a.*, b.promoid, (b.promstart le a.date le b.promend) as value from work.trans as a, work.promo as b order by a.Date,a.Store,a.product,a.Sales,a.Price ; quit; proc transpose data=work.want out=work.trans (drop=_name_) let; by Date Store product Sales Price; var value; id promoid; run;
I had to play some games with the order to get the BY statement to work in the Proc transpose so your Promo variables may not be in the column order by default that you expect. Don't worry about that until you write reports. OR
I tried your program but it does not deliver the expected output and the variables with their value.
@sebastiangonzal wrote:
I tried your program but it does not deliver the expected output and the variables with their value.
Not a single detail of how the result of my program does not meet expected output or the values.
So, using MY example program post the result and very carefully provide what the expected result should have been.
Your "example", besides the date issue being impossible, showed every single value in all of the promoid groups so there really was no way to tell what might not fit.
If your issue is the order of the Promoid variables then that is not an issue as you can control any report output to a desired order once you have a data set. And there are at least a dozen or more examples of requests on the forum for how to force an order of variables.
Registration is now open for SAS Innovate 2025 , our biggest and most exciting global event of the year! Join us in Orlando, FL, May 6-9.
Sign up by Dec. 31 to get the 2024 rate of just $495.
Register now!
SAS' Charu Shankar shares her PROC SQL expertise by showing you how to master the WHERE clause using real winter weather data.
Find more tutorials on the SAS Users YouTube channel.