DATA Step, Macro, Functions and more

Summary stats

Reply
Occasional Contributor
Posts: 17

Summary stats

I have 100 Variable in my dataset. I need to find n nmiss min max mean median for each variable..

When I tried writing proc means/summary with output statement it is giving only one record in out dataset(don't know which variable' summary stats is for). Is it not possible to get summary stats in to a dataset for all variables with one proc statement without using macros to pass each variable and appending it to the base table.

please help me on the same.

Sivaji.

Respected Advisor
Posts: 4,925

Re: Summary stats

Use autoname :

Proc means data=myData;

var _numeric_;

outpur out=myStats n= nmiss= min= max= mean= median= / autoname;

run;

PG

PG
Super Contributor
Posts: 1,636

Re: Summary stats

should be:

output out=myStats n= nmiss= min= max= mean= median= / autoname;

PG, have you made changes to your Excel macro?  Thanks - Linin

Respected Advisor
Posts: 4,925

Re: Summary stats

Should be: Linlin Smiley Wink!

Yes, I solved a few problems with the macro. Last update was on 2012-05-20. Please let me know, if you try it again. Thanks.

Regards

PG

PG
Super Contributor
Posts: 1,636

Re: Summary stats

Hi PG,

I downloaded your Macro a few minutes ago, the sas code has a modified date 05/17/2012. are you sure you put your updated version on line?

I have sas 9.3, Windons 7 and office 2010. Below is the log file.

31   %macro

31 ! ExcelLoad(ExcelPath=help,dsn=%scan(&SYSDSN,2),sheet=&dsn,lib=%scan(&SYSDSN,1))

31 ! ;

32   %let _dsn=&dsn;

33   %let _lib=&lib;

34   %let _sheet=&sheet;

35   %let _ExcelPath=&ExcelPath;

36

37   %if %quote(&_ExcelPath)=help %then %do;

38       %put NOTE: Use this macro to copy a SAS dataset with more than 255

38 ! variables to an Excel workbook.;

39       %put NOTE: Usage : ExcelLoad(ExcelPath=, lib=, dsn=, sheet=);

40       %put NOTE: ExcelPath : File path to Excel workbook;

41       %put NOTE: lib : SAS library (default: library of most recently created

41 ! dataset);

42       %put NOTE: dsn : SAS dataset name (default: most recently created dataset)

42 ! ;

43       %put NOTE: sheet : excel workbook sheet name to create (default: same as

43 ! dataset name);

44       %put NOTE: You cannot overwrite an existing sheet with ExcelLoad.;

45   %end;

46   %else %do;

47

48   %let colBlock=200;

49   %let _SYSDSN=&SYSDSN;

50

51   libname _xl Excel "&_ExcelPath";

52

53   proc sql noprint;

54   select count(*) into :sheetExists from

55   dictionary.tables where libname="_XL" and

56       (upcase(memname)=upcase("&_sheet.") or

56 ! upcase(memname)=upcase("&_sheet.$"));

57   quit;

58

59   %if &sheetExists %then %do;

60       %put ERROR: Sheet &_sheet already exists in workbook &_ExcelPath.;

61       %put ERROR: You must delete sheet &_sheet in Excel or choose another sheet

61 !  name.;

62   %end;

63   %else %do;

64       option nonotes;

65       proc fcmp outlib=sasuser.fcmp.excel;

66           function _ExcelCol(colNo) $4;

67               n = max(int(colNo),1);

68               length colStr $4;

69               colStr = "";

70               do while(n>0);

71                   m = mod(n-1,26);

72                   colStr = cat(byte(65+m),colStr);

73                   n = int((n-m-1)/26);

74               end;

75               return (trim(colStr));

76           endsub;

77       run;

78       data _null_; call symput("restoreCMPLIB",getoption("CMPLIB","keyword"));

78 ! run;

79       options append=(cmplib=sasuser.fcmp);

80

81       data _null_;

82           call symput("nbVars", put(attrn(open("&_lib..&_dsn."),"NVARS"),6.0));

83           call symput("nbLines",

83 ! put(1+attrn(open("&_lib..&_dsn."),"NLOBS"),8.0));

84       run;

85

86       %let vBeg=1;

87       %do %while(&vBeg<=&nbVars);

88           %if &vBeg+&colBlock-1>&nbVars

89               %then %let vEnd=&nbVars;

90               %else %let vEnd=%eval(&vBeg+&colBlock-1);

91

92           proc sql noprint;

93               select name into :vars separated by ","

94               from dictionary.columns

95               where libname=upcase("&_lib") and upcase(memname)=upcase("&_dsn")

96               and varnum between &vBeg and &vEnd ;

97           quit;

98           %if &vBeg=1

99               %then %let tName=&_sheet;

100              %else %let

100! tName=&_sheet.$%sysfunc(compress(%sysfunc(_ExcelCol(&vBeg))1:%sysfunc(_ExcelCo

100! l(&vEnd))&nbLines));

101          proc sql;

102              create table _xl."&tName."n as select &vars. from &_lib..&_dsn.;

103          quit;

104          %let vBeg=%eval(&vBeg+&colBlock);

105      %end;

106      proc fcmp outlib=sasuser.fcmp.Excel;

107          deletefunc Excel._ExcelCol;

108      run;

109      options &restoreCMPLIB.;

110      option notes;

111      %put NOTE: Dataset &_lib..&_dsn was copied to sheet &_sheet in workbook

111! &_ExcelPath..;

112  %end;

113  libname _xl clear;

114  %let SYSDSN=&_SYSDSN;

115  %end;

116  %mend ExcelLoad;

117  data test;

118  array _v(*) var1-var300;

119   do i=1 to 300;

120   _v(i)=i;

121   end;

122   run;

NOTE: The data set WORK.TEST has 1 observations and 301 variables.

NOTE: DATA statement used (Total process time):

      real time           0.01 seconds

      cpu time            0.01 seconds

123   %ExcelLoad(ExcelPath=c:\temp\forum\myFile.xlsx, lib=work, dsn=test,

123! sheet=mySheet);

NOTE: Libref _XL was successfully assigned as follows:

      Engine:        EXCEL

      Physical Name: c:\temp\forum\myFile.xlsx

NOTE: PROCEDURE SQL used (Total process time):

      real time           0.16 seconds

      cpu time            0.03 seconds

ERROR: Error attempting to CREATE a DBMS table. ERROR: Execute: This table

       contains cells that are outside the range of cells defined in this

       spreadsheet..

WARNING: File deletion failed for _XL.'mySheet$GS1:KO2'n.DATA.

NOTE: Dataset work.test was copied to sheet mySheet in workbook

c:\temp\forum\myFile.xlsx.

NOTE: Libref _XL has been deassigned.

Respected Advisor
Posts: 4,925

Re: Summary stats

That's the old version. Did you get it from

https://communities.sas.com/docs/DOC-1260 ?

When I see it, it says modified on 2012-05-20 and is open to everyone. I just deleted the two archived versions. So, it should now be impossible to get an older version now. Thank you again Linlin.

PG

PG
Super Contributor
Posts: 1,636

Re: Summary stats

Hi PG,

I tried but no luck. Maybe I am doing something wrong. I will email you the log file

https://communities.sas.com/docs/DOC-1260

Thanks - Linlin

Occasional Contributor
Posts: 17

Re: Summary stats

Thanks for the help and it is giving it all columns info in a single row.. I need it by different rows for each variable.

Respected Advisor
Posts: 4,925

Re: Summary stats

If it's a table you want, then use proc tabulate :

proc tabulate data =myData format=best7.;

var _numeric_;

table (_numeric_),(n nmiss mean min max median);

run;

PG

PG
Respected Advisor
Posts: 3,799

Re: Summary stats

Have you looked at the OUT= data from this TABULATE or even the ODS OUTPUT <don't know the name>=stats;

Respected Advisor
Posts: 4,925

Re: Summary stats

Posted in reply to data_null__

No I hadn't. But now that I do, I don't get the results that I expected.

  1. The OUT= dataset contains only 1 obs with a variable for each combination of variable and statistic (i.e. one variable per cell in the table)
  2. The ODS OUTPUT Table=myTable statement genetrates the WARNING "Data set definition for 'Table' has been overridden." and no dataset.

PROC TABULATE seems better at producing report tables than datasets.

PG

PG
Respected Advisor
Posts: 3,799

Re: Summary stats

If you remove the OUT= then ODS OUTPUT will work.  However they are both the same, mostly.

Super User
Posts: 10,028

Re: Summary stats

It seems that proc tabulate is not so smart. Use SASHELP.CLASS as an example.

%macro mean(tname,vname);
Proc means data=&tname noprint;
var &vname ;
output out=temp(drop=_:) n=n nmiss=nmiss min=min max=max mean=mean median=median ;
run;
data temp;length vname $ 40;set temp;vname="&vname";run;
proc append base=want data=temp force;run;
%mend mean;
data _null_;
 set sashelp.vcolumn(where=(libname='SASHELP' and memname='CLASS' and type='num'))  ;
 call execute(cats('%mean(',libname,'.',memname,',',name,')'));
run;



Ksharp

Respected Advisor
Posts: 3,799

Re: Summary stats

Looping over variables in a data set as you suggest is generally a bad idea.  When the number of variables and obs is relatively small performance is not too bad but as either grow performance can suffer significantly.

In general it is better to let SAS summarize many variables at once.  It is very good at that.  However there is the problem of the output not in the format that is most pleasing.  Of course that can be remedied more efficiently after the data are summarized as the number of obs to be process will be reduced.

proc summary data=sashelp.heart nway;

   class sex;

   var _numeric_;

   output out=Stats0 n= nmiss= min= max= mean= median= / autoname;

   run;

proc transpose data=stats0 out=stats1;

   by sex _type_ _freq_;

   run;

data stats1;

   set stats1;

   call scan(_name_,-1,p,l,'_');

   length Variable $32 Statistic $8;

   Variable  = substr(_name_,1,p-2);

   Statistic = substr(_name_,p);

   run;

proc sort data=stats1;

   by Sex Variable;

   run;

proc transpose data=stats1 out=stats2(drop=_name_);

   by Sex Variable;

   var col1;

   id Statistic;

   idlabel Statistic;

   run;

Proc print;

   run;

Super User
Posts: 10,028

Re: Summary stats

Posted in reply to data_null__

NULL,

No, I don't think your code is good one. Since you mentioned lots of variables in a table.

In your code, every variable will generate four new variables, when a table has lots of variables (i.e. twenty thousand ), you think proc means; _numeric_ ; can hold so many variables? , and more ,you are using proc transpose which is undisputed way to slow down . Therefore, I don't think your code is better or faster than mine.

Ksharp

Ask a Question
Discussion stats
  • 20 replies
  • 956 views
  • 3 likes
  • 7 in conversation