BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
Georg_UPB
Fluorite | Level 6

Hi,

I have a dataset containing for each store and date the IDs of different product types (Product_ID) and the amount of sold units (Product_sold). The data are already sorted by store and date.

My tasks are to

* consolidate sales for identical product types which were sold in the same store and on the same day (done in the first PROC SQL step in the example below)

* insert columns having the information sorted by product type in ascending order (Product_ID_asc Product_sold_asc) and columns having the information sorted in descending order (Product_ID_desc Product_sold_desc) for each store and date.

I'd really like to know if there are better solutions than the (pretty slow) one I posted below. The original dataset is A and A2 represents the desired result.

1) Dataset A is very big. Is it really necessary to create a second table (A2) in order to consolidate sales (first PROC SQL step)?

2) Aren't there more efficient ways that take advantage of the fact that the data is already sorted by store and date? I still mention both variables after group by/sort by.

Thanks. I really appreciate any suggestions.

Data A;

   Format

      Store $4.

      Date date11.

      Product_ID $6.

      Product_sold best2.;

   Input

      Store :$4.

      Date :date11.

      Product_ID :$6.

      Product_sold best2.;

Datalines;

West 19-FEB-2013 Type_B 9

West 19-FEB-2013 Type_A 9

West 19-FEB-2013 Type_B 11

West 20-FEB-2013 Type_C 5

West 20-FEB-2013 Type_A 6

West 20-FEB-2013 Type_B 7

West 20-FEB-2013 Type_A 10

Run;

Proc Sql;

   Create Table A2

   As Select

      Store, Date, Product_ID, SUM(Product_sold) AS Product_sold

   From

      A

   Group By

      Store, Date, Product_ID;

Quit;

Proc Sort

   Data=A2 Out=A2Product_asc

   (

      Keep=Product_ID Product_sold

      Rename=(Product_ID=Product_ID_asc Product_sold=Product_sold_asc)

   );

   By Store Date Product_ID;

Run;

Proc Sort

   Data=A2 Out=A2Product_desc

   (

      Keep=Product_ID Product_sold

      Rename=(Product_ID=Product_ID_desc Product_sold=Product_sold_desc)

   );

   By Store Date Descending Product_ID;

Run;

Data A2;

   Set A2(Keep=Store Date);

   Set A2Product_asc;

   Set A2Product_desc;

Run;

1 ACCEPTED SOLUTION

Accepted Solutions
Astounding
PROC Star

I would guess that PROC SUMMARY would be faster than PROC SQL:

proc summary data=A nway;

  by store date;

  class product_id;;

  var product_sold;

  output out=a2 (keep=store date product_id product_sold_asc rename=(product_id=product_id_asc)) sum=product_sold_asc;

run;

Of course, there's just one way to be sure which would be faster.  And this does require the data to be in sorted order BY STORE DATE (which you had indicated was the case).

As RichardinOz pointed out, the result will be sorted already (regardless of whether you use SQL or SUMMARY), and you only need two data sets not three.

Good luck.

View solution in original post

3 REPLIES 3
RichardinOz
Quartz | Level 8

Your first sort is redundant (omit it!) because GROUP BY in SAS will generate a sorted table.  To be certain, you could insert and ORDER BY clause.

Data A2;

   Set A2 (Rename = (Product_Sold = Product_Sold_Asc ) );

   Set A2Product_desc;

Run;

Richard

Astounding
PROC Star

I would guess that PROC SUMMARY would be faster than PROC SQL:

proc summary data=A nway;

  by store date;

  class product_id;;

  var product_sold;

  output out=a2 (keep=store date product_id product_sold_asc rename=(product_id=product_id_asc)) sum=product_sold_asc;

run;

Of course, there's just one way to be sure which would be faster.  And this does require the data to be in sorted order BY STORE DATE (which you had indicated was the case).

As RichardinOz pointed out, the result will be sorted already (regardless of whether you use SQL or SUMMARY), and you only need two data sets not three.

Good luck.

Georg_UPB
Fluorite | Level 6

Thank you for your help! Your suggestions improved the code a lot!

sas-innovate-2024.png

Join us for SAS Innovate April 16-19 at the Aria in Las Vegas. Bring the team and save big with our group pricing for a limited time only.

Pre-conference courses and tutorials are filling up fast and are always a sellout. Register today to reserve your seat.

 

Register now!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 3 replies
  • 549 views
  • 3 likes
  • 3 in conversation