Help using Base SAS procedures

comments about how can I simplify long code (demog table)

Reply
Super Contributor
Posts: 301

comments about how can I simplify long code (demog table)

Hi guys, in order of reproducing the next table (forget Obs, only col0 placebo drug and total), I wrote the code below .

Obscol0                 placebo               drug                 total

   1No of Subjects          19                   21                   40
   2race
   3Oriental                 00.00%           14.76%           12.50%
   4Asian                    15.26%           00.00%           12.50%
   5Black                    7   36.84%           7   33.33%          14   35.00%
   6Caucasian               11   57.89%          13   61.90%          24   60.00%
   7sex
   8Female                   9   47.37%          10   47.62%          19   47.50%
   9Male                    10   52.63%          11   52.38%          21   52.50%
  10Height
  11MIN           137.5                133.3                133.3
  12MEAN          150.4                147.6                148.9
  13MAX           162.4                154.1                162.4
  14N             19                   21                   40
  15STD           7.09                 5.40                 6.34
  16Age
  17MIN           11.0                 19.0                 11.0
  18MEAN          25.6                 30.8                 28.3
  19N             19                   21                   40
  20MAX           50.0                 56.0                 56.0
  21STD           8.78                 8.33                 8.84
  22Weight
  23N             19                   21                   40
  24MIN           52.7                 54.8                 52.7
  25MEAN          64.3                 66.8                 65.6
  26STD           7.50                 7.45                 7.48
  27MAX           75.5                 85.1                 85.1

Please, could you tell me if I could simplify some step in the next code  (because, I think is a bit long, and maybe I could avoid some steps)?

The code use three macros:

1) tablesc...calculate sex, and race

2) statx...calculate Height, Weight and Age.

3) transpos...Transposing the tables.

.....

procedures used:

1)  proc sql:

a)in order to summarize the variables in (placebo, drug and total) avoiding proc freq. and proc transpose.

b)add extra row

2)proc means:..to calculate min, max, mean and std

3)proc transpose: to get the right structure of the final table

4) proc append: to concatenate tables.

-------------------------------------------------

Source dataset newtot:

                  data newtot;

                  length sex race trt $20;

                  length subno weight height age 8;

                  input subno    weight    height     sex      race   age    trt;

                  datalines;

                   1      63.7     141.85    Male      Caucasian     19    Drug

                   2      72.6     149.22    Female    Black         27    Placebo

                   3      60.7     147.64    Female    Caucasian     26    Placebo

                   4      64.2     150.93    Male      Caucasian     21    Placebo

                   5      60.6     153.56    Male      Asian         41    Placebo

                   6      66.8     139.30    Male      Caucasian     50    Placebo

                   7      75.5     147.93    Female    Caucasian     24    Drug

                   8      55.8     137.51    Female    Caucasian     21    Placebo

                   9      73.7     146.30    Male      Caucasian     28    Drug

                  10      68.2     133.28    Male      Caucasian     24    Drug

                  11      56.3     148.04    Female    Caucasian     30    Drug

                  12      60.7     141.53    Male      Black         29    Drug

                  13      63.2     147.19    Female    Caucasian     21    Placebo

                  14      74.7     155.35    Male      Black         20    Placebo

                  15      70.8     152.61    Female    Caucasian     33    Drug

                  16      71.4     147.75    Male      Black         37    Drug

                  17      60.1     162.40    Female    Caucasian     28    Placebo

                  18      53.4     152.24    Male      Caucasian     19    Placebo

                  19      65.4     142.22    Male      Caucasian     37    Drug

                  20      52.7     161.64    Female    Black         34    Placebo

                  21      57.9     153.62    Female    Black         33    Drug

                  22      62.2     147.84    Male      Black         29    Drug

                  23      72.9     139.20    Male      Black         21    Placebo

                  24      67.4     143.55    Male      Caucasian     20    Placebo

                  25      69.4     154.10    Female    Caucasian     31    Drug

                  26      72.1     149.09    Male      Caucasian     36    Drug

                  27      64.6     152.56    Female    Black         25    Placebo

                  28      75.1     155.37    Female    Caucasian     19    Placebo

                  29      63.1     153.64    Female    Black         37    Drug

                  30      54.8     149.17    Female    Black         34    Drug

                  31      75.5     149.39    Female    Black         26    Placebo

                  32      72.7     149.27    Male      Caucasian     29    Drug

                  33      68.2     149.57    Male      Caucasian     11    Placebo

                  34      69.0     152.04    Female    Caucasian     20    Drug

                  35      57.9     151.13    Male      Caucasian     31    Placebo

                  36      61.5     138.67    Male      Caucasian     56    Drug

                  37      55.5     159.60    Male      Black         25    Placebo

                  38      71.0     148.42    Male      Oriental      19    Drug

                  39      85.1     152.91    Female    Black         37    Drug

                  40      58.2     148.92    Female    Caucasian     24    Drug

                  ;

                  run;

My Code:

---------------------------------------------------------------------------------------------------------------------------------------------

options symbolgen mprint mlogic;

  *using proc sql to summarize sex and race according to the final table....avoiding proc freq and proc transpose;

  proc sql;

  create table table0n as

  select 'No of Subjects' as col0, sum(trt='Drug') as Drug, sum(trt='Placebo') as Placebo, count(trt) as total

  from newtot;

  quit;

  *creating global macrovariables for porcentages;

  data null;

  set table0n;

  call symput('drugx',drug);

  call symput('placebox',placebo);

  call symput('totalx',total);

  run;

  %put &drugx;

  %put &placebox;

  %put &totalx; 

  *convert placebo, drug and total to character variables;

  data table0c (rename=(drugc=drug placeboc=placebo totalc=total)drop=drug placebo total);

  set table0n;

  drugc=put(drug,best12.);

  placeboc=put(placebo,best12.);

  totalc=put(total,best12.);

  run;

*macro for generate tablecx, x=1 sex, x=2 race ;

  %macro tablesc(num=,var=);

proc sql;

  create table table&num as

  select &var as col0, sum(trt='Drug') as Drug, sum(trt='Placebo') as Placebo, count(trt) as total

  from newtot

  group by &var

  order by &var descending;

  quit;

*change numeric variables to character variables;

data tablec&num  (rename=(drugc=drug placeboc=placebo totalc=total)drop=drug placebo total);

  length drugc $40 placeboc $40 totalc $40;

  set table#

  drugc=put(drug,best12.)||' ' || put(drug/&drugx,percent9.2);

  placeboc=put(placebo,best12.)||' ' || put(placebo/&placebox,percent9.2);

  totalc=put(total,best12.)||' ' || put(total/&totalx,percent9.2);

  run;

*Adding new record (Header of table);

  proc sql;

insert into tablec&num

(Drug,Placebo, total, Col0)

values

(' ',' ', ' ', "&var");

quit;

proc sort data=tablec# by total; run; *Fixing the Header at beginning of the table;

*accumulating tables sex and race to get the final table;

proc append base=name0ok data=tablec# run;

%mend;

*executing macro to get the table(sex,race) = tablectotal;

%tablesc(num=1,var=race);

%tablesc(num=2,var=sex);

*macro to generate stat table tablestatx, x=1 height, x=2 age, x=3 weight;

%macro statx(var=,out=);

*sorting the source table by trt before use proc means;

proc sort data=newtot out=newtots;

by trt;

run;

*proc means by trt;

proc means data=newtots noprint;

var &var;

output out=newstat (rename=(_stat_=col0) drop=_type_ _freq_);

by trt;

run;

*proc means (total);

proc means data=newtots noprint;

var &var;

output out=newstat2(rename=(_stat_=col0) drop=_type_ _freq_);

run;

*concatenaiting the tables (total+drug);

data totnewstat;

set newstat newstat2;

if trt= ' ' then trt='Total';

run;

*sorting these tables to use proc transpose;

proc sort data=totnewstat out=totnewstat&out (rename=(&var=col1));

by  col0;

run;

%mend;

%statx(var=height,out=h);

%statx(var=weight,out=w);

%statx(var=age,out=a);

*In this macro, we will traspose the (drug+total) statistical tables;

%macro transpos(in=,name=,var=);

*formating the these tables;

data newstatformat (drop=col1);

set totnewstat∈

if col0 in ('MEAN','MAX','MIN') then value=strip(put(col1,6.1));

else if col0 in ('STD') then value=strip(put(col1,6.2));

else value=strip(put(col1,6.0));

run;

*transposing these tables;

proc transpose data=newstatformat out=&name (drop=_name_);

by  col0;

id trt;

var value;

run;

*Adding new record (Header of table);

proc sql;

insert into &name

(Drug,Placebo,Total,col0)

values

(' ',' ', ' ', "&var");

quit;

proc sort data=&name; by total; run; *Fixing the Header at beginning of the table;

*accumulating tables sex and race to get the final table;

proc append base=nameok data=&name; run;

%mend;

%transpos(in=h,name=newhstat,var=Height);

%transpos(in=a,name=newastat,var=Age);

%transpos(in=w,name=newwstat,var=Weight);

data table;

length col0 $40 drug $40 placebo $40 total $40;

set table0c Name0ok Nameok;

run;

proc sql;

create table tabledemog as

select col0 label '------', placebo, drug, total

from table;

quit;

proc print data=tabledemog; run;

-------------------------------------------------------------------------------------

Thanks.

V

Respected Advisor
Posts: 3,799

Re: comments about how can I simplify long code (demog table)

Don't be loopy. You are looping on variable names that's not good.

Process all discrete variables at once (one call to PROC FREQ, or method of your choice).  Process all continuous variables at once (one call to PROC SUMMARY).  This may actually make the code seem more complex but it will be more flexible in the long term.

Make everything more dynamic and data driven.  Let the program figure out how many.  You can write a program that accepts a few macro variable as input to make this table.

data=

continuous=

discrete=

trt=

variableOrderInOutput=

Use existing meta data LABEL and FORMAT to label rows and columns.

Derive decimal precision for statistics from the format D of the associated variable.

Don't put summary statistic into macro variables.

Super Contributor
Posts: 301

Re: comments about how can I simplify long code (demog table)

Posted in reply to data_null__

Hi Data_null,  please, could you write a short code more efficient for this table?

I can learn from there.

Thanks.

V.

Respected Advisor
Posts: 3,799

Re: comments about how can I simplify long code (demog table)

You could use something like this..

proc summary data=sashelp.class;

   class sex;

   var age height weight;

   output out=stats;

   run;

This would give you all the continuous variables summarized in one pass of the data.  Then you can figure ways to transform/transpose the data into the stats in rows structure you need.  When I do this I also convert the statistics to character using the format+decimal offset for each statistic.  N=Integer, MEAN STD + 1 dp, MIN MAX  + 0 same precision as variable.  You can do this with PROC TRANSPOSE by transposing the numeric statistics with a character variable to cause a conversion to character.  You will need to code gen a FORMAT statement to add the offset to the D of each variable associated format.  The W is not so important as long as it is wide enough, 16 should suffice most of the time.

Super Contributor
Posts: 301

Re: comments about how can I simplify long code (demog table)

Posted in reply to data_null__

Thanks, it helps.

My idea of using macro was to make it more simple if you have many discrete (class variables) or continuous (var variables) involved.

The idea of using proc sql was to get an easy way to summarize (drug, placebo, and total), rather than: sorting, summarizing (proc summary or proc freq) and transposing, to get the same result.

Respected Advisor
Posts: 3,799

Re: comments about how can I simplify long code (demog table)

As the number of variables grows that is all the more reason why you should process them all at once instead of looping on each variable.  You don't want to keep reading the same data over and over.

Your SQL may seem simple but it has hard coded values.  It may take more code to make the program data driven but in the end it will be worth the effort.

When you get ride of all the loopy parts and start using the meta data properly and have the results data drive you can then wrap it all up into a macro that will be amazingly powerful, easy to use and worth all the effort.

Super Contributor
Posts: 301

Re: comments about how can I simplify long code (demog table)

Posted in reply to data_null__

Hi Data_null_, thanks for your comment...about your last comment

1. Please could you explain what are the loopy parts are in my code?...sorry but, I don't understand...I understand loop as a (do i=1 to x end) but, I would like to know

what you mean by loop in my code.

2. meta data?...again, what does "meta data" mean?

3. and one last question ...wrap it all up in a macro that will be amazing?...sounds good but I would like an example to understand it better.

Thank you very much for your time.

V

Respected Advisor
Posts: 3,799

Re: comments about how can I simplify long code (demog table)

michtka wrote:

Hi Data_null_, thanks for your comment...about your last comment

1. Please could you explain what are the loopy parts are in my code?...sorry but, I don't understand...I understand loop as a (do i=1 to x end) but, I would like to know

what you mean by loop in my code.

%transpos(in=h,name=newhstat,var=Height);

%transpos(in=a,name=newastat,var=Age);

%transpos(in=w,name=newwstat,var=Weight);
This is loopy.  You read all the input data for each variable.  You need to use ONE call to PROC SUMMARY to get summary stats for all variable at one time.  Similar for discrete variable do them all at once.


2. meta data?...again, what does "meta data" mean? LABELS and FORMATS

3. and one last question ...wrap it all up in a macro that will be amazing?...sounds good but I would like an example to understand it better. You code is not ready to become a macro yet.

Thank you very much for your time.

V

Super Contributor
Posts: 301

Re: comments about how can I simplify long code (demog table)

Posted in reply to data_null__

Interesting...thanks.

For example, if you see  the macro %tablesc for the discrete variables:

proc sql: I summarize the discrete variable (drug, placebo and total) in a table&num (being num the parameter I change in the macro)

dataset: I change from numeric to character  format in a tablec&num

proc sql:  I add a new record (header of the tables)

I think, maybe is a long code, but it is clear what I am doing.

Maybe you are right, but I do not find at the moment a more easy way to make (for example) this discrete table.

Respected Advisor
Posts: 3,799

Re: comments about how can I simplify long code (demog table)

For discrete variables considers and approach that uses the output from PROC FREQ.

data class;

   set sashelp.class;

   trt = rantbl(123,.5);

   run;

ods output CrossTabFreqs=Counts;

proc freq data=class;

   tables trt*(age sex);

   run;

ods output close;

proc print data=counts;

   run;

Super Contributor
Posts: 301

Re: comments about how can I simplify long code (demog table)

Posted in reply to data_null__

Dear data_null_

I know what you mean...but I prefer use proc sql for this demog table (placebo drug total).

Just now, I understood to use it for summarizing avoiding (proc sort proc freq and proc transpose).

I know these two procedure are more intuitive to visualize, but proc sql make the job in a mechanical form.

Thank you for your help. I will keep in mine your expert terminology.

Thanks,

V

Ask a Question
Discussion stats
  • 10 replies
  • 434 views
  • 6 likes
  • 2 in conversation