Dear all,
I want to summarize sales data across different areas. For example, sum_20240105 should hold the total of s_20240105, and the 'sum_' prefix is a must. There are lots of columns, so how can I code this without writing one statement per column?
data sales;
input area $ s_20240105 s_20240109 s_20240112 s_20240122 s_20240129 s_20240209 s_20240212 s_20240222;
datalines;
NO1 1 2 4 5 6 7 8 9
NO2 2 5 9 5 8 6 8 10
NO3 2 4 6 8 7 4 9 12
;
run;
data want;
set sales end=last;
sum_20240105+s_20240105;
sum_20240109+s_20240109;
sum_20240112+s_20240112;
sum_20240122+s_20240122;
sum_20240129+s_20240129;
sum_20240209+s_20240209;
sum_20240212+s_20240212;
sum_20240222+s_20240222;
if last then output;
keep sum_:;
run;
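The wanted data set is a single row of column totals:
sum_20240105 sum_20240109 sum_20240112 sum_20240122 sum_20240129 sum_20240209 sum_20240212 sum_20240222
5 11 19 18 21 17 25 31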
Use PROC SUMMARY, which can rename the statistics it outputs. For instance, if you had only two variables you could write:
proc summary data=sales;
var s_20240105 s_20240109;
output out=want sum(s_20240105 s_20240109) = SUM_20240105 SUM_20240109;
run;
But you have a lot of variables to rename. Use the dictionary.columns table in PROC SQL to build the macro variables &VARLIST and &SUMLIST, which supply the old and new names for the rename:
data sales;
input area $ s_20240105 s_20240109 s_20240112 s_20240122 s_20240129 s_20240209 s_20240212 s_20240222;
datalines;
NO1 1 2 4 5 6 7 8 9
NO2 2 5 9 5 8 6 8 10
NO3 2 4 6 8 7 4 9 12
run;
proc sql noprint;
select distinct
name ,cats('SUM_',scan(name,2,'_'))
into :varlist separated by ' ' ,:sumlist separated by ' '
from dictionary.columns
where libname='WORK' and memname='SALES' and upcase(scan(name,1,'_'))='S';
quit;
%put &=varlist;
%put &=sumlist;
proc summary data=sales;
var s_:;
output out=want (drop=_type_ _freq_) sum(&varlist)=&sumlist;
run;
Just ask PROC SUMMARY to do that.
proc summary data=sales nway ;
var s_: ;
output out=want sum=;
run;
This will produce a dataset like the one you asked for, except that it keeps the original variable names.
Does it really matter if the names start with SUM instead of S? Why?
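If the sum_ prefix really is required, one option is to rename the variables after PROC SUMMARY has run. The following is only a sketch: it assumes the summary data set is WORK.WANT as created above, and the macro variable name RENAMES is made up for illustration.

```sas
/* Build "old=new" pairs such as s_20240105=sum_20240105 for every s_ variable in WORK.WANT. */
/* RENAMES is a hypothetical macro variable name chosen for this sketch. */
proc sql noprint;
  select catx('=', name, cats('sum_', scan(name, 2, '_')))
    into :renames separated by ' '
    from dictionary.columns
    where libname='WORK' and memname='WANT'
      and upcase(scan(name, 1, '_'))='S';
quit;
%put &=renames;

/* Apply the renames in place; PROC DATASETS changes only the header, not the data. */
proc datasets lib=work nolist;
  modify want;
  rename &renames;
quit;
```

The _TYPE_ and _FREQ_ variables are left untouched because their names do not start with S_.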
It will probably be MUCH easier if you move that numeric suffix (that looks like a DATE string) out of the variable NAME and into its own variable.
proc transpose data=sales out=sales_t(rename=(col1=sales)) name=date_char ;
by area ;
var s_: ;
run;
data sales_t;
set sales_t;
date = input(substr(date_char,3),yymmdd8.);
format date yymmdd10.;
run;
proc summary data=sales_t nway;
class date;
var sales ;
output out=want_t sum=sum_sales;
run;
Results
Obs          date    _TYPE_    _FREQ_    sum_sales
  1    2024-01-05         1         3            5
  2    2024-01-09         1         3           11
  3    2024-01-12         1         3           19
  4    2024-01-22         1         3           18
  5    2024-01-29         1         3           21
  6    2024-02-09         1         3           17
  7    2024-02-12         1         3           25
  8    2024-02-22         1         3           31
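If a single wide row with sum_YYYYMMDD names is still needed at the end, the long summary can be transposed back. This is a minimal sketch, assuming WANT_T exists as created by the PROC SUMMARY step above; WANT_WIDE is a hypothetical output name.

```sas
/* Transpose the long summary back to one wide row. */
/* PROC TRANSPOSE builds the new variable names from the formatted ID values, */
/* so formatting DATE as YYMMDDN8. (no separators) gives names like sum_20240105. */
proc transpose data=want_t out=want_wide(keep=sum_:) prefix=sum_;
  id date;
  var sum_sales;
  format date yymmddn8.;
run;
```

Keeping the data long for the analysis and widening only for the final report means the date stays usable as a real date value until the last step.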
Thank you @Tom .