BookmarkSubscribeRSS Feed
sebastiangonzal
Calcite | Level 5

I have a transaction data set with the following fields:

 

Date | StoreID | ProductID | Sales | Price

 

On the other hand I have a Catalog with the promotions and date start and finish

 

DATE START | DATE FINISH | PROMOTION

 

What I need is to create automatically a column by each promotion in the catalog.

 

First I tried transpose data which transposed promotion, but I dont know how to make automatic case when calculation for each promotion conditions so that it gives 1 or 0, depending if the day of the transaction was between the range on the dates of the promotion, so I ended manually creating each promotion as a calculated field with the following function, which works but I cant scale it if its thousands of promotions, another problem was that the name of the promotions have from file characters that are not allowed in SAS, so I had to use a generic name and assign an alias with the actual name that comes in file.

 

CASE WHEN (t1.Date >= "18Jan2019"d and t1.Date <= "18May2019"d) THEN 1 ELSE 0 END

 

I guess there must be a more automatic way which I dont know, if there is and someone knows please tell me

9 REPLIES 9
PaigeMiller
Diamond | Level 26

UNTESTED CODE (If you want tested code, provide sample data using these instructions: https://communities.sas.com/t5/SAS-Communities-Library/How-to-create-a-data-step-version-of-your-dat...)

 

proc sql;
     create table want as select *
     from dataset2 as a left join dataset1 as b
     on b.date>=a.date_start and b.date<=date_finish;
quit;
--
Paige Miller
Tom
Super User Tom
Super User

I don't get what you are trying to create.

If sounds like you want to merge your list of products and your list of promotions to create a list that might have multiple rows of promotions per product. You can then use PROC TRANSPOSE to convert the promotion name into a new variable name.  Not sure what value you want the variable to have however.

 

proc sql;
create table TALL as
  select a.*
       , b.promotion
       , 'X' as flag
  from products a
  left join promotions b
    on a.date between b.date_start and b.date_finish
  order by a.date,a.storeid,a.productid,a.sales,a.price
;
quit;

proc transpose data=tall out=want ;
  by date storeid productid sales price ;
  id promotion;
  var flag ;
run;
sebastiangonzal
Calcite | Level 5

I cant join since there is no key between transactions and promotion ID, I have to make a column by each promotion in the catalog and evaluate the case when calculation, so that the promotion variable will be binary (1 = Sales was made between the promotion range, 0 = Sale was not made in promotion range), what I did manually was this but I want to know if I can make it automatic with thousand more promotions

 

PROC SQL;
   CREATE TABLE WORK.QUERY_FOR_ANDREASAMPLEXXXXXX AS 
   SELECT t1.Sucursal, 
          t1.Producto, 
          t1.Fecha, 
          t1.Demanda_positiva, 
          t1.Demanda_negativa, 
          t1.Talla, 
          t1.Precio, 
          t1.Demanda, 
          t1.Descuento, 
          /* PROMOCION1 */
            (CASE WHEN (t1.Fecha >= "18Jan2019"d and t1.Fecha <= "18May2019"d) THEN 1 ELSE 0 END) LABEL=
            "catalogocalzadoprimavera2019mex" AS PROMOCION1, 
          /* PROMOCION14 */
            (CASE WHEN (t1.Fecha >= "18Feb2019"d and t1.Fecha <= "18May2019"d) THEN 1 ELSE 0 END) LABEL=
            "catálogoiubellezaintegralprimavera2019eua" AS PROMOCION14, 
          /* PROMOCION2 */
            (CASE WHEN (t1.Fecha >= "18Jan2019"d and t1.Fecha <= "18May2019"d) THEN 1 ELSE 0 END) LABEL=
            "catalogovestirinteriorprimavera2019eua" AS PROMOCION2, 
          /* PROMOCION3 */
            (CASE WHEN (t1.Fecha >= "18Jan2019"d and t1.Fecha <= "18May2019"d) THEN 1 ELSE 0 END) LABEL=
            "catalogovestirexteriorprimavera2019eua" AS PROMOCION3, 
          /* PROMOCION4 */
            (CASE WHEN (t1.Fecha >= "18Jan2019"d and t1.Fecha <= "18May2019"d) THEN 1 ELSE 0 END) LABEL=
            "catalogocalzadoprimavera2019eua" AS PROMOCION4, 
          /* PROMOCION5 */
            (CASE WHEN (t1.Fecha >= "18Jan2019"d and t1.Fecha <= "18May2019"d) THEN 1 ELSE 0 END) LABEL=
            "catalogobellezaintegralprimavera2019mex" AS PROMOCION5, 
          /* PROMOCION6 */
            (CASE WHEN (t1.Fecha >= "18Jan2019"d and t1.Fecha <= "18May2019"d) THEN 1 ELSE 0 END) LABEL=
            "catalogovestirexteriorprimavera2019mex" AS PROMOCION6, 
          /* PROMOCION7 */
            (CASE WHEN (t1.Fecha >= "18Jan2019"d and t1.Fecha <= "18May2019"d) THEN 1 ELSE 0 END) LABEL=
            "catalogovestirinteriorprimavera2019mex" AS PROMOCION7, 
          /* PROMOCION8 */
            (CASE WHEN (t1.Fecha >= "09Jan2019"d and t1.Fecha <= "09Mar2019"d) THEN 1 ELSE 0 END) LABEL=
            "catálogooutlet12019méxico" AS PROMOCION8, 
          /* PROM0CION9 */
            (CASE WHEN (t1.Fecha >= "23Dec2018"d and t1.Fecha <= "23Feb2019"d) THEN 1 ELSE 0 END) LABEL=
            "lifestyles&sportenero/febrero" AS PROM0CION9, 
          /* PROMOCION10 */
            (CASE WHEN (t1.Fecha >= "23Jan2019"d and t1.Fecha <= "23Feb2019"d) THEN 1 ELSE 0 END) LABEL=
            "catalogofuerzayestiloenero-febrero2019" AS PROMOCION10, 
          /* PROMOCION111 */
            (CASE WHEN (t1.Fecha >= "16Jan2019"d and t1.Fecha <= "16Mar2019"d) THEN 1 ELSE 0 END) LABEL=
            "catálogooutlet12019eua" AS PROMOCION111, 
          /* PROMOCION12 */
            (CASE WHEN (t1.Fecha >= "06Jan2019"d and t1.Fecha <= "06Apr2019"d) THEN 1 ELSE 0 END) LABEL=
            "promotorprofesionaleindustrial" AS PROMOCION12, 
          /* PROMOCION13 */
            (CASE WHEN (t1.Fecha >= "16Feb2019"d and t1.Fecha <= "16Mar2019"d) THEN 1 ELSE 0 END) LABEL=
            "promotorbotasybotines" AS PROMOCION13, 
          /* PROMOCION15 */
            (CASE WHEN (t1.Fecha >= "08Feb2019"d and t1.Fecha <= "08Jun2019"d) THEN 1 ELSE 0 END) LABEL=
            "lifestyles&urbano2019marzoeua" AS PROMOCION15, 
          /* PROMOCION16 */
            (CASE WHEN (t1.Fecha >= "01Feb2019"d and t1.Fecha <= "01Jun2019"d) THEN 1 ELSE 0 END) LABEL=
            "catalogofuerzayestilomarzo2019" AS PROMOCION16, 
          /* PROMOCION17 */
            (CASE WHEN (t1.Fecha >= "01Feb2019"d and t1.Fecha <= "01Jun2019"d) THEN 1 ELSE 0 END) LABEL=
            "catálogourbanoprimavera2019" AS PROMOCION17, 
          /* PROMOCION18 */
            (CASE WHEN (t1.Fecha >= "24Feb2019"d and t1.Fecha <= "24Aug2019"d) THEN 1 ELSE 0 END) LABEL=
            "catalogotropicalprimavera2019mexico" AS PROMOCION18, 
          /* PROMOCION19 */
            (CASE WHEN (t1.Fecha >= "24Feb2019"d and t1.Fecha <= "24Aug2019"d) THEN 1 ELSE 0 END) LABEL=
            "catalogosushineprimavera2019usa" AS PROMOCION19, 
          /* PROMOCION20 */
            (CASE WHEN (t1.Fecha >= "04Mar2019"d and t1.Fecha <= "04May2019"d) THEN 1 ELSE 0 END) LABEL=
            "catálogooutlet22019méxico" AS PROMOCION20, 
          /* PROMOCION21 */
            (CASE WHEN (t1.Fecha >= "23Feb2019"d and t1.Fecha <= "23Mar2019"d) THEN 1 ELSE 0 END) LABEL=
            "programaestelar1erbimestreprimavera2019eua" AS PROMOCION21, 
          /* PROMOCION22 */
            (CASE WHEN (t1.Fecha >= "23Jan2019"d and t1.Fecha <= "23Mar2019"d) THEN 1 ELSE 0 END) LABEL=
            "programaestelar1erbimestreprimavera2019" AS PROMOCION22, 
          /* PROMOCION23 */
            (CASE WHEN (t1.Fecha >= "04Mar2019"d and t1.Fecha <= "04May2019"d) THEN 1 ELSE 0 END) LABEL=
            "catálogooutlet22019eua" AS PROMOCION23, 
          /* PROMOCION24 */
            (CASE WHEN (t1.Fecha >= "18Mar2019"d and t1.Fecha <= "18May2019"d) THEN 1 ELSE 0 END) LABEL=
            "programaestelar2ndobimestreprimavera2019eua" AS PROMOCION24, 
          /* PROMOCION25 */
            (CASE WHEN (t1.Fecha >= "11Mar2019"d and t1.Fecha <= "11May2019"d) THEN 1 ELSE 0 END) LABEL=
            "catálogooutlet32019méxico" AS PROMOCION25, 
          /* PROMOCION26 */
            (CASE WHEN (t1.Fecha >= "18Apr2019"d and t1.Fecha <= "18May2019"d) THEN 1 ELSE 0 END) LABEL=
            "catálogooutlet32019eua" AS PROMOCION26, 
          /* PROMOCION27 */
            (CASE WHEN (t1.Fecha >= "24Apr2019"d and t1.Fecha <= "24Aug2019"d) THEN 1 ELSE 0 END) LABEL=
            "catálogoprofesionaleindustrial2019" AS PROMOCION27, 
          /* PROMOCION11 */
            (CASE WHEN (t1.Fecha >= "18Jan2019"d and t1.Fecha <= "18May2019"d) THEN 1 ELSE 0 END) LABEL=
            "catalogocalzadoprimavera2019mex" AS PROMOCION11
      FROM SASHELP.ANDREASAMPLEXXXXXX t1;
QUIT;
Tom
Super User Tom
Super User

Sounds like you just want to do the join based and then just count the number of observations.

For example to see how many records fall within those date something like this:

proc sql;
create table TALL as
select b.promotion,count(*) as n_sales
 from A inner join B
 on a.date between b.start_data and b.end_date
group by promotion
;
quit;

If you did want to convert it from a dataset with 2 variables and multiple observations into a dataset with one observation and multiple variables use PROC TRANSPOSE.

ballardw
Super User

Best is to provide a small example of your starting data sets and what you want the result to look like.

 

Catalog has a very specific meaning when using SAS, so it is important to let us know if this is a SAS catalog or just a data set that you are calling a "catalog", possibly because that is what your shop calls it.

 

I suspect you actually have 2 or more data sets. If the date variables are all SAS date values this could be relatively simple.

 

Perhaps something similar to:

data work.trans;
   input transactionid $ transdate :date9.;
   format transdate date9.;
datalines;
1111   11JAN2019
2222   23JAN2019
3333   17JUL2019
;

data work.promo;
   input promstart :date9. promend : date9. Promoid $;
   format promstart promend date9.;
datalines;
01Jan2019 30Jan2019   PPP111
01May2019 31May2019   PPP222
01Jul2019 31Jul2019   PPP333
;

Proc sql;
   create table want as
   select a.*, b.transactionid
   from work.promo as a 
        left join
        work.trans as b
   on a.promstart le b.transdate le a.promend
   ;
quit;

I used transactionid instead of the storied, productid, sales and price.

Or logic using that information. Not quite clear exactly what you want.

sebastiangonzal
Calcite | Level 5

Here is a sample data with the 2 tables that I have and the desired outcome.

 

Thanks

ballardw
Super User

You show what appears to be a start date of "27ene2019" and end "18may2019" which supposed has 1/24/2019 in the range from your shown desired output. I don't really see how that is possible as 27 of any month comes after 1/24 of any year.

 

Here is a workable example with varied dates, see if it comes closer to what you want.

data work.trans;
input Date :  mmddyy10.	Store $	Product $	Sales	Price;
format date date9.;
datalines;
1/24/2019	900	84134803	2	23
1/24/2019	900	84291601	4	12
2/24/2019	900	84291601	6	12
1/24/2019	900	84291601	1	12
4/25/2019	700	61234803	3	11
1/25/2019	700	61234803	2	11
1/25/2019	700	61234803	7	11
5/10/2019	700	61234803	1	11
1/25/2019	700	61234803	1	11
3/25/2019	700	61234803	2	11
6/25/2019	700	61234803	3	11
;

data work.promo;
   input promstart :date9. promend : date9. Promoid $;
   format promstart promend date9.;
datalines;
01Jan2019 31May2019   Promo1
01Feb2019 18May2019   Promo2
01Mar2019 18May2019   Promo3
01Apr2019 18May2019   Promo4
01May2019 18May2019   Promo5
;

Proc sql;
   create table work.want as
   select a.*, b.promoid, (b.promstart le a.date le b.promend) as value
   from work.trans as a,        
        work.promo as b
   order by a.Date,a.Store,a.product,a.Sales,a.Price 
   ;
quit;

proc transpose data=work.want out=work.trans (drop=_name_) let;
   by Date Store product Sales	Price;
   var value;
   id promoid;
run;

I had to play some games with the order to get the BY statement to work in the Proc transpose so your Promo variables may not be in the column order by default that you expect. Don't worry about that until you write reports. OR

 

sebastiangonzal
Calcite | Level 5

I tried your program but it does not deliver the expected output and the variables with their value.

ballardw
Super User

@sebastiangonzal wrote:

I tried your program but it does not deliver the expected output and the variables with their value.


Not a single detail of how the result of my program does not meet expected output or the values.

So, using MY example program post the result and very carefully provide what the expected result should have been.

 

Your "example", besides the date issue being impossible, showed every single value in all of the promoid groups so there really was no way to tell what might not fit.

 

If your issue is the order of the Promoid variables then that is not an issue as you can control any report output to a desired order once you have a data set. And there are at least a dozen or more examples of requests on the forum for how to force an order of variables.

sas-innovate-2024.png

Available on demand!

Missed SAS Innovate Las Vegas? Watch all the action for free! View the keynotes, general sessions and 22 breakouts on demand.

 

Register now!

Mastering the WHERE Clause in PROC SQL

SAS' Charu Shankar shares her PROC SQL expertise by showing you how to master the WHERE clause using real winter weather data.

Find more tutorials on the SAS Users YouTube channel.

Discussion stats
  • 9 replies
  • 838 views
  • 0 likes
  • 4 in conversation