05-24-2012 03:18 PM
I have 100 Variable in my dataset. I need to find n nmiss min max mean median for each variable..
When I tried writing proc means/summary with output statement it is giving only one record in out dataset(don't know which variable' summary stats is for). Is it not possible to get summary stats in to a dataset for all variables with one proc statement without using macros to pass each variable and appending it to the base table.
please help me on the same.
05-24-2012 06:59 PM
I downloaded your Macro a few minutes ago, the sas code has a modified date 05/17/2012. are you sure you put your updated version on line?
I have sas 9.3, Windons 7 and office 2010. Below is the log file.
31 ! ExcelLoad(ExcelPath=help,dsn=%scan(&SYSDSN,2),sheet=&dsn,lib=%scan(&SYSDSN,1))
31 ! ;
32 %let _dsn=&dsn;
33 %let _lib=&lib;
34 %let _sheet=&sheet;
35 %let _ExcelPath=&ExcelPath;
37 %if %quote(&_ExcelPath)=help %then %do;
38 %put NOTE: Use this macro to copy a SAS dataset with more than 255
38 ! variables to an Excel workbook.;
39 %put NOTE: Usage : ExcelLoad(ExcelPath=, lib=, dsn=, sheet=);
40 %put NOTE: ExcelPath : File path to Excel workbook;
41 %put NOTE: lib : SAS library (default: library of most recently created
41 ! dataset);
42 %put NOTE: dsn : SAS dataset name (default: most recently created dataset)
42 ! ;
43 %put NOTE: sheet : excel workbook sheet name to create (default: same as
43 ! dataset name);
44 %put NOTE: You cannot overwrite an existing sheet with ExcelLoad.;
46 %else %do;
48 %let colBlock=200;
49 %let _SYSDSN=&SYSDSN;
51 libname _xl Excel "&_ExcelPath";
53 proc sql noprint;
54 select count(*) into :sheetExists from
55 dictionary.tables where libname="_XL" and
56 (upcase(memname)=upcase("&_sheet.") or
56 ! upcase(memname)=upcase("&_sheet.$"));
59 %if &sheetExists %then %do;
60 %put ERROR: Sheet &_sheet already exists in workbook &_ExcelPath.;
61 %put ERROR: You must delete sheet &_sheet in Excel or choose another sheet
61 ! name.;
63 %else %do;
64 option nonotes;
65 proc fcmp outlib=sasuser.fcmp.excel;
66 function _ExcelCol(colNo) $4;
67 n = max(int(colNo),1);
68 length colStr $4;
69 colStr = "";
70 do while(n>0);
71 m = mod(n-1,26);
72 colStr = cat(byte(65+m),colStr);
73 n = int((n-m-1)/26);
75 return (trim(colStr));
78 data _null_; call symput("restoreCMPLIB",getoption("CMPLIB","keyword"));
78 ! run;
79 options append=(cmplib=sasuser.fcmp);
81 data _null_;
82 call symput("nbVars", put(attrn(open("&_lib..&_dsn."),"NVARS"),6.0));
83 call symput("nbLines",
83 ! put(1+attrn(open("&_lib..&_dsn."),"NLOBS"),8.0));
86 %let vBeg=1;
87 %do %while(&vBeg<=&nbVars);
88 %if &vBeg+&colBlock-1>&nbVars
89 %then %let vEnd=&nbVars;
90 %else %let vEnd=%eval(&vBeg+&colBlock-1);
92 proc sql noprint;
93 select name into :vars separated by ","
94 from dictionary.columns
95 where libname=upcase("&_lib") and upcase(memname)=upcase("&_dsn")
96 and varnum between &vBeg and &vEnd ;
98 %if &vBeg=1
99 %then %let tName=&_sheet;
100 %else %let
101 proc sql;
102 create table _xl."&tName."n as select &vars. from &_lib..&_dsn.;
104 %let vBeg=%eval(&vBeg+&colBlock);
106 proc fcmp outlib=sasuser.fcmp.Excel;
107 deletefunc Excel._ExcelCol;
109 options &restoreCMPLIB.;
110 option notes;
111 %put NOTE: Dataset &_lib..&_dsn was copied to sheet &_sheet in workbook
113 libname _xl clear;
114 %let SYSDSN=&_SYSDSN;
116 %mend ExcelLoad;
117 data test;
118 array _v(*) var1-var300;
119 do i=1 to 300;
NOTE: The data set WORK.TEST has 1 observations and 301 variables.
NOTE: DATA statement used (Total process time):
real time 0.01 seconds
cpu time 0.01 seconds
123 %ExcelLoad(ExcelPath=c:\temp\forum\myFile.xlsx, lib=work, dsn=test,
NOTE: Libref _XL was successfully assigned as follows:
Physical Name: c:\temp\forum\myFile.xlsx
NOTE: PROCEDURE SQL used (Total process time):
real time 0.16 seconds
cpu time 0.03 seconds
ERROR: Error attempting to CREATE a DBMS table. ERROR: Execute: This table
contains cells that are outside the range of cells defined in this
WARNING: File deletion failed for _XL.'mySheet$GS1:KO2'n.DATA.
NOTE: Dataset work.test was copied to sheet mySheet in workbook
NOTE: Libref _XL has been deassigned.
05-24-2012 07:34 PM
That's the old version. Did you get it from
When I see it, it says modified on 2012-05-20 and is open to everyone. I just deleted the two archived versions. So, it should now be impossible to get an older version now. Thank you again Linlin.
05-25-2012 01:34 PM
No I hadn't. But now that I do, I don't get the results that I expected.
PROC TABULATE seems better at producing report tables than datasets.
05-28-2012 12:28 AM
It seems that proc tabulate is not so smart. Use SASHELP.CLASS as an example.
%macro mean(tname,vname); Proc means data=&tname noprint; var &vname ; output out=temp(drop=_:) n=n nmiss=nmiss min=min max=max mean=mean median=median ; run; data temp;length vname $ 40;set temp;vname="&vname";run; proc append base=want data=temp force;run; %mend mean; data _null_; set sashelp.vcolumn(where=(libname='SASHELP' and memname='CLASS' and type='num')) ; call execute(cats('%mean(',libname,'.',memname,',',name,')')); run;
05-31-2012 01:59 PM
Looping over variables in a data set as you suggest is generally a bad idea. When the number of variables and obs is relatively small performance is not too bad but as either grow performance can suffer significantly.
In general it is better to let SAS summarize many variables at once. It is very good at that. However there is the problem of the output not in the format that is most pleasing. Of course that can be remedied more efficiently after the data are summarized as the number of obs to be process will be reduced.
proc summary data=sashelp.heart nway;
output out=Stats0 n= nmiss= min= max= mean= median= / autoname;
proc transpose data=stats0 out=stats1;
by sex _type_ _freq_;
length Variable $32 Statistic $8;
Variable = substr(_name_,1,p-2);
Statistic = substr(_name_,p);
proc sort data=stats1;
by Sex Variable;
proc transpose data=stats1 out=stats2(drop=_name_);
by Sex Variable;
06-01-2012 01:37 AM
No, I don't think your code is good one. Since you mentioned lots of variables in a table.
In your code, every variable will generate four new variables, when a table has lots of variables (i.e. twenty thousand ), you think proc means; _numeric_ ; can hold so many variables? , and more ,you are using proc transpose which is undisputed way to slow down . Therefore, I don't think your code is better or faster than mine.