duanzongran
Obsidian | Level 7

Dear all,

I want to summarize sales data across areas. For example, sum_20240105 should hold the total of s_20240105, and the 'sum_' prefix is required. There are many such columns. How can I code this without writing one sum statement per column, as I do below?

data sales;
input area $ s_20240105 s_20240109 s_20240112 s_20240122 s_20240129 s_20240209 s_20240212 s_20240222;
datalines;
NO1 1 2 4 5 6 7 8 9
NO2 2 5 9 5 8 6 8 10
NO3 2 4 6 8 7 4 9 12
;
run;

data want;
set sales end=last;
sum_20240105+s_20240105;
sum_20240109+s_20240109;
sum_20240112+s_20240112;
sum_20240122+s_20240122;
sum_20240129+s_20240129;
sum_20240209+s_20240209;
sum_20240212+s_20240212;
sum_20240222+s_20240222;
if last then output;
keep  sum_:;
run;

The wanted data set is shown in the attached screenshot (捕获.JPG).

Accepted Solutions
mkeintz
PROC Star

Use PROC SUMMARY, which lets you rename each output statistic in the OUTPUT statement. For instance, with only two variables you could write:

proc summary data=sales;
  var s_20240105 s_20240109;
  output out=want   sum(s_20240105 s_20240109) = SUM_20240105 SUM_20240109;
run;

But you have many variables to rename. Query DICTIONARY.COLUMNS in PROC SQL to build the macro variables &VARLIST and &SUMLIST, which supply the two sides of the rename:

data sales;
input area $ s_20240105 s_20240109 s_20240112 s_20240122 s_20240129 s_20240209 s_20240212 s_20240222;
datalines;
NO1 1 2 4 5 6 7 8 9
NO2 2 5 9 5 8 6 8 10
NO3 2 4 6 8 7 4 9 12
;
run;

proc sql noprint;
  select distinct 
         name                       ,cats('SUM_',scan(name,2,'_'))
  into   :varlist separated by ' '  ,:sumlist separated by ' '
  from dictionary.columns
  where libname='WORK' and memname='SALES' and upcase(scan(name,1,'_'))='S';
quit;
%put &=varlist;
%put &=sumlist;

proc summary data=sales;
  var s_:;
  output out=want (drop=_type_ _freq_) sum(&varlist)=&sumlist;
run;

3 REPLIES
Tom
Super User

Just ask PROC SUMMARY to do that.

proc summary data=sales nway ;
   var s_: ;
   output out=want sum=;
run;

This will produce a dataset like the one you asked for, except that it keeps the original variable names.
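
If the SUM_ prefix really is mandatory, one option is to keep the short sum= form and rename the variables afterwards. The following is only a sketch under a few assumptions: the summary output is WORK.WANT, every variable to rename starts with s_, and the macro variable name RENAMES is made up for this illustration.

```sas
/* Sample data from the question */
data sales;
  input area $ s_20240105 s_20240109 s_20240112 s_20240122
        s_20240129 s_20240209 s_20240212 s_20240222;
  datalines;
NO1 1 2 4 5 6 7 8 9
NO2 2 5 9 5 8 6 8 10
NO3 2 4 6 8 7 4 9 12
;
run;

/* Summarise, keeping the original variable names */
proc summary data=sales nway;
  var s_: ;
  output out=want(drop=_type_ _freq_) sum=;
run;

/* Build old=new pairs such as s_20240105=sum_20240105 */
/* RENAMES is a hypothetical macro variable name       */
proc sql noprint;
  select catx('=', name, cats('sum_', substr(name, 3)))
    into :renames separated by ' '
    from dictionary.columns
    where libname='WORK' and memname='WANT'
      and upcase(substr(name, 1, 2))='S_';
quit;

/* Rename in place without rewriting the data */
proc datasets library=work nolist;
  modify want;
  rename &renames;
quit;
```

PROC DATASETS only changes the header of WANT, so the rename step costs about the same however many rows the dataset has.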

 

Does it really matter if the names start with SUM instead of S? Why?

 

It will probably be MUCH easier if you move that numeric suffix (that looks like a DATE string) out of the variable NAME and into its own variable.

proc transpose data=sales out=sales_t(rename=(col1=sales)) name=date_char ;
  by area ;
  var s_: ;
run;

data sales_t;
  set sales_t;
  date = input(substr(date_char,3),yymmdd8.);
  format date yymmdd10.;
run;

proc summary data=sales_t nway;
  class date;
  var sales ;
  output out=want_t sum=sum_sales;
run;

Results

                                          sum_
Obs          date    _TYPE_    _FREQ_    sales

 1     2024-01-05       1         3         5
 2     2024-01-09       1         3        11
 3     2024-01-12       1         3        19
 4     2024-01-22       1         3        18
 5     2024-01-29       1         3        21
 6     2024-02-09       1         3        17
 7     2024-02-12       1         3        25
 8     2024-02-22       1         3        31
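
If a single wide row with sum_ names is still needed at the end, the long result can be transposed back. This is a sketch, not a tested part of the answer above: the WANT_T here is a small stand-in built from the first three rows of the results, and WANT_WIDE is a made-up output name. It relies on PROC TRANSPOSE building the new variable names from the formatted value of the ID variable.

```sas
/* Stand-in for WANT_T: first three rows of the results above */
data want_t;
  input date :yymmdd10. sum_sales;
  format date yymmdd10.;
  datalines;
2024-01-05 5
2024-01-09 11
2024-01-12 19
;
run;

/* One row out; yymmddn8. writes 20240105, so the name is sum_20240105 */
proc transpose data=want_t out=want_wide(drop=_name_) prefix=sum_;
  id date;
  format date yymmddn8.;
  var sum_sales;
run;
```

Keeping the data long for the analysis and going wide only for the final report means the date stays a real SAS date until the last step.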

 
