I have 100 Variable in my dataset. I need to find n nmiss min max mean median for each variable..
When I tried writing proc means/summary with output statement it is giving only one record in out dataset(don't know which variable' summary stats is for). Is it not possible to get summary stats in to a dataset for all variables with one proc statement without using macros to pass each variable and appending it to the base table.
please help me on the same.
Sivaji.
Use autoname :
Proc means data=myData;
var _numeric_;
outpur out=myStats n= nmiss= min= max= mean= median= / autoname;
run;
PG
should be:
output out=myStats n= nmiss= min= max= mean= median= / autoname;
PG, have you made changes to your Excel macro? Thanks - Linin
Should be: Linlin !
Yes, I solved a few problems with the macro. Last update was on 2012-05-20. Please let me know, if you try it again. Thanks.
Regards
PG
Hi PG,
I downloaded your Macro a few minutes ago, the sas code has a modified date 05/17/2012. are you sure you put your updated version on line?
I have sas 9.3, Windons 7 and office 2010. Below is the log file.
31 %macro
31 ! ExcelLoad(ExcelPath=help,dsn=%scan(&SYSDSN,2),sheet=&dsn,lib=%scan(&SYSDSN,1))
31 ! ;
32 %let _dsn=&dsn;
33 %let _lib=&lib;
34 %let _sheet=&sheet;
35 %let _ExcelPath=&ExcelPath;
36
37 %if %quote(&_ExcelPath)=help %then %do;
38 %put NOTE: Use this macro to copy a SAS dataset with more than 255
38 ! variables to an Excel workbook.;
39 %put NOTE: Usage : ExcelLoad(ExcelPath=, lib=, dsn=, sheet=);
40 %put NOTE: ExcelPath : File path to Excel workbook;
41 %put NOTE: lib : SAS library (default: library of most recently created
41 ! dataset);
42 %put NOTE: dsn : SAS dataset name (default: most recently created dataset)
42 ! ;
43 %put NOTE: sheet : excel workbook sheet name to create (default: same as
43 ! dataset name);
44 %put NOTE: You cannot overwrite an existing sheet with ExcelLoad.;
45 %end;
46 %else %do;
47
48 %let colBlock=200;
49 %let _SYSDSN=&SYSDSN;
50
51 libname _xl Excel "&_ExcelPath";
52
53 proc sql noprint;
54 select count(*) into :sheetExists from
55 dictionary.tables where libname="_XL" and
56 (upcase(memname)=upcase("&_sheet.") or
56 ! upcase(memname)=upcase("&_sheet.$"));
57 quit;
58
59 %if &sheetExists %then %do;
60 %put ERROR: Sheet &_sheet already exists in workbook &_ExcelPath.;
61 %put ERROR: You must delete sheet &_sheet in Excel or choose another sheet
61 ! name.;
62 %end;
63 %else %do;
64 option nonotes;
65 proc fcmp outlib=sasuser.fcmp.excel;
66 function _ExcelCol(colNo) $4;
67 n = max(int(colNo),1);
68 length colStr $4;
69 colStr = "";
70 do while(n>0);
71 m = mod(n-1,26);
72 colStr = cat(byte(65+m),colStr);
73 n = int((n-m-1)/26);
74 end;
75 return (trim(colStr));
76 endsub;
77 run;
78 data _null_; call symput("restoreCMPLIB",getoption("CMPLIB","keyword"));
78 ! run;
79 options append=(cmplib=sasuser.fcmp);
80
81 data _null_;
82 call symput("nbVars", put(attrn(open("&_lib..&_dsn."),"NVARS"),6.0));
83 call symput("nbLines",
83 ! put(1+attrn(open("&_lib..&_dsn."),"NLOBS"),8.0));
84 run;
85
86 %let vBeg=1;
87 %do %while(&vBeg<=&nbVars);
88 %if &vBeg+&colBlock-1>&nbVars
89 %then %let vEnd=&nbVars;
90 %else %let vEnd=%eval(&vBeg+&colBlock-1);
91
92 proc sql noprint;
93 select name into :vars separated by ","
94 from dictionary.columns
95 where libname=upcase("&_lib") and upcase(memname)=upcase("&_dsn")
96 and varnum between &vBeg and &vEnd ;
97 quit;
98 %if &vBeg=1
99 %then %let tName=&_sheet;
100 %else %let
100! tName=&_sheet.$%sysfunc(compress(%sysfunc(_ExcelCol(&vBeg))1:%sysfunc(_ExcelCo
100! l(&vEnd))&nbLines));
101 proc sql;
102 create table _xl."&tName."n as select &vars. from &_lib..&_dsn.;
103 quit;
104 %let vBeg=%eval(&vBeg+&colBlock);
105 %end;
106 proc fcmp outlib=sasuser.fcmp.Excel;
107 deletefunc Excel._ExcelCol;
108 run;
109 options &restoreCMPLIB.;
110 option notes;
111 %put NOTE: Dataset &_lib..&_dsn was copied to sheet &_sheet in workbook
111! &_ExcelPath..;
112 %end;
113 libname _xl clear;
114 %let SYSDSN=&_SYSDSN;
115 %end;
116 %mend ExcelLoad;
117 data test;
118 array _v(*) var1-var300;
119 do i=1 to 300;
120 _v(i)=i;
121 end;
122 run;
NOTE: The data set WORK.TEST has 1 observations and 301 variables.
NOTE: DATA statement used (Total process time):
real time 0.01 seconds
cpu time 0.01 seconds
123 %ExcelLoad(ExcelPath=c:\temp\forum\myFile.xlsx, lib=work, dsn=test,
123! sheet=mySheet);
NOTE: Libref _XL was successfully assigned as follows:
Engine: EXCEL
Physical Name: c:\temp\forum\myFile.xlsx
NOTE: PROCEDURE SQL used (Total process time):
real time 0.16 seconds
cpu time 0.03 seconds
ERROR: Error attempting to CREATE a DBMS table. ERROR: Execute: This table
contains cells that are outside the range of cells defined in this
spreadsheet..
WARNING: File deletion failed for _XL.'mySheet$GS1:KO2'n.DATA.
NOTE: Dataset work.test was copied to sheet mySheet in workbook
c:\temp\forum\myFile.xlsx.
NOTE: Libref _XL has been deassigned.
That's the old version. Did you get it from
https://communities.sas.com/docs/DOC-1260 ?
When I see it, it says modified on 2012-05-20 and is open to everyone. I just deleted the two archived versions. So, it should now be impossible to get an older version now. Thank you again Linlin.
PG
Hi PG,
I tried but no luck. Maybe I am doing something wrong. I will email you the log file
https://communities.sas.com/docs/DOC-1260
Thanks - Linlin
Thanks for the help and it is giving it all columns info in a single row.. I need it by different rows for each variable.
If it's a table you want, then use proc tabulate :
proc tabulate data =myData format=best7.;
var _numeric_;
table (_numeric_),(n nmiss mean min max median);
run;
PG
Have you looked at the OUT= data from this TABULATE or even the ODS OUTPUT <don't know the name>=stats;
No I hadn't. But now that I do, I don't get the results that I expected.
PROC TABULATE seems better at producing report tables than datasets.
PG
If you remove the OUT= then ODS OUTPUT will work. However they are both the same, mostly.
It seems that proc tabulate is not so smart. Use SASHELP.CLASS as an example.
%macro mean(tname,vname); Proc means data=&tname noprint; var &vname ; output out=temp(drop=_:) n=n nmiss=nmiss min=min max=max mean=mean median=median ; run; data temp;length vname $ 40;set temp;vname="&vname";run; proc append base=want data=temp force;run; %mend mean; data _null_; set sashelp.vcolumn(where=(libname='SASHELP' and memname='CLASS' and type='num')) ; call execute(cats('%mean(',libname,'.',memname,',',name,')')); run;
Ksharp
Looping over variables in a data set as you suggest is generally a bad idea. When the number of variables and obs is relatively small performance is not too bad but as either grow performance can suffer significantly.
In general it is better to let SAS summarize many variables at once. It is very good at that. However there is the problem of the output not in the format that is most pleasing. Of course that can be remedied more efficiently after the data are summarized as the number of obs to be process will be reduced.
proc summary data=sashelp.heart nway;
class sex;
var _numeric_;
output out=Stats0 n= nmiss= min= max= mean= median= / autoname;
run;
proc transpose data=stats0 out=stats1;
by sex _type_ _freq_;
run;
data stats1;
set stats1;
call scan(_name_,-1,p,l,'_');
length Variable $32 Statistic $8;
Variable = substr(_name_,1,p-2);
Statistic = substr(_name_,p);
run;
proc sort data=stats1;
by Sex Variable;
run;
proc transpose data=stats1 out=stats2(drop=_name_);
by Sex Variable;
var col1;
id Statistic;
idlabel Statistic;
run;
Proc print;
run;
NULL,
No, I don't think your code is good one. Since you mentioned lots of variables in a table.
In your code, every variable will generate four new variables, when a table has lots of variables (i.e. twenty thousand ), you think proc means; _numeric_ ; can hold so many variables? , and more ,you are using proc transpose which is undisputed way to slow down . Therefore, I don't think your code is better or faster than mine.
Ksharp
Registration is now open for SAS Innovate 2025 , our biggest and most exciting global event of the year! Join us in Orlando, FL, May 6-9.
Sign up by Dec. 31 to get the 2024 rate of just $495.
Register now!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.