How do I create a large data table with descriptive statistics for each observation?

Reply
Occasional Contributor
Posts: 5

How do I create a large data table with descriptive statistics for each observation?

Hi everyone,

My goal is to create a large data table that has descriptive statistics accompanying each observation. I am loading in a large file for which each observation has its own mean, standard deviation and percentiles. The variable that I am using to calculate all of this is named 'tapefile'. Basically what I am trying to do is create a large data table that neatly displays these statistics. I have been trying to accomplish this by saving the proc univariate command as a variable and then printing. I am new to sas so I am not entirely sure if this is even possible. Below is a sample of the data table that I seek to achieve.

ObservationMeanStd Dev50th Percentile60th Percentile70th Percentile80th Percentile90th Percentile100th Percentile
1
2
3
4

Here is a sample of my code as well:

options ls=176 ps=44;

libname master 'library location';

filename input0 'file location';

data detail;

        infile input0 lrecl=605 recfm=f;

                        input (long list of ebcdic variables);

run;

proc sort data=detail;

        by tapefile;

run;

proc univariate data=detail noprint;

        var tapefile;

        output out=res mean=mean pctlpts = 50 to 100 by 10 pctlpre=p;

run;

proc print data=detail label;

        var tapefile tab_guca_mod_1 res;

        label

        p50="50th Percentile"

        p60="60th Percentile"

        p70="70th Percentile"

        p80="80th Percentile"

        p90="90th Percentile"

        p100="100th Percentile";

run;

Thank you for any help you may be able to provide; I greatly appreciate it.

Super User
Posts: 19,806

Re: How do I create a large data table with descriptive statistics for each observation?

Assuming you're asking to add in the overall statistics rather than the statistics for "each observation has its own mean, standard deviation and percentiles".

You can calculate the stats in proc means and then merge then in using a SQL cross join.

proc means data=sashelp.class ;

var height;

output out=stats mean= std= p50= p60= p70= p80= p90= max=/autoname;

run;

proc sql;

    create table class as

    select *

    from sashelp.class

    cross join stats;

quit;

Super User
Posts: 11,343

Re: How do I create a large data table with descriptive statistics for each observation?

Are you trying to get the same set of percentiles attached to all values of tapefile? Or are you using TAPEFILE as shorthand to indicate you want to do the same thing to each variable in the (long list of ebcdic variables)? You might show some dummy output of what you expect your final table to look like.

Regular Contributor
Posts: 184

Re: How do I create a large data table with descriptive statistics for each observation?

Try this technique:

data detail ;

input tapefile larry more curly ;

cards ;

3 31 32 33

1 11 12 13

2 21 22 23

;

proc sort data=detail ;

by tapefile ;

run ;

proc transpose data=detail out=long ;

by tapefile ;

var larry more curly ;

run ;

proc univariate data=long noprint ;

by tapefile ;

output out=res mean=mean

pctlpts = 50 to 100 by 10 pctlpre=p ;

run ;

proc print data=res label ;

label

p50="50th Percentile"

p60="60th Percentile"

p70="70th Percentile"

p80="80th Percentile"

p90="90th Percentile"

p100="100th Percentile" ;

run ;

Torri92 wrote:

Hi everyone,

My goal is to create a large data table that has descriptive statistics accompanying each observation. I am loading in a large file for which each observation has its own mean, standard deviation and percentiles. The variable that I am using to calculate all of this is named 'tapefile'. Basically what I am trying to do is create a large data table that neatly displays these statistics. I have been trying to accomplish this by saving the proc univariate command as a variable and then printing. I am new to sas so I am not entirely sure if this is even possible. Below is a sample of the data table that I seek to achieve.

ObservationMeanStd Dev50th Percentile60th Percentile70th Percentile80th Percentile90th Percentile100th Percentile
1
2
3
4

Here is a sample of my code as well:

options ls=176 ps=44;

libname master 'library location';

filename input0 'file location';

data detail;

        infile input0 lrecl=605 recfm=f;

                        input (long list of ebcdic variables);

run;

proc sort data=detail;

        by tapefile;

run;

proc univariate data=detail noprint;

        var tapefile;

        output out=res mean=mean pctlpts = 50 to 100 by 10 pctlpre=p;

run;

proc print data=detail label;

        var tapefile tab_guca_mod_1 res;

        label

        p50="50th Percentile"

        p60="60th Percentile"

        p70="70th Percentile"

        p80="80th Percentile"

        p90="90th Percentile"

        p100="100th Percentile";

run;

Thank you for any help you may be able to provide; I greatly appreciate it.

Ask a Question
Discussion stats
  • 3 replies
  • 617 views
  • 0 likes
  • 4 in conversation