Help using Base SAS procedures

sum up across columns and have a criteria

Reply
Super Contributor
Posts: 401

sum up across columns and have a criteria

I'm having trouble doing this, please help.

I have about 20 columns (in no particular sequence, eg. COL1=MGF3654, COL2=NTD8475, etc.) and there are about 30 number of rows..with numeric data in the cells. I want to check which columns total to 1 so I can delete thos clumns, they are of no need).  Thanks

Have:

         

                 COL1     COL2 ... COL20

2001.01       5           0           23

2001.02       89         0           4

.

2006.12       12         0          8

if COL total to 1 then I want to delete it.

Want:

                 COL1     COL3..... COL20

2001.01       5             45             23

2001.02       89           2               4

.

2006.12       12            6              8

Trusted Advisor
Posts: 1,301

sum up across columns and have a criteria

data foo;

input dte $ mgf3654 emtcol1 ntd8475 col3;

cards;

2001.01 5 0 23 45

2001.02 89 0 4 2

2006.12 12 0 8 6

;

run;

data _null_;

length names nsums $32000;

retain names nsums;

set sashelp.vcolumn(where=(libname='WORK' and memname='FOO' and type='num')) end=eof;

names=catx(' ',of names name);

nsum=strip(name) || '_sum';

nsums=catx(' ',of nsums nsum);

if eof then

  do;

   call symputx('names',names);

   call symputx('nsums',nsums);

   call symputx('vars',put(_n_,best.));

  end;

run;

%put &vars &names &nsums;

data _null_;

set foo end=eof;

array v

  • &names;
  • array s

  • &nsums;
  • do i=1 to &vars;

      s+v;

    end;

    length drops $32000;

    retain drops;

    if eof then

      do;

       do i=1 to &vars;

        if s=0 then

         do;

          drop=scan("&names",i,' ');

          drops=catx(' ',of drops drop);

         end;

       end;

       call symputx('drops',drops);

      end;

    run;

    proc sql noprint;

    alter table foo

      drop column &drops;

    quit;

    Super Contributor
    Posts: 401

    sum up across columns and have a criteria

    Thanks FriedEgg, to be honest it's a little confusing and it didn't quite work.. maybe it's something 'm doing. But basically looking to get rid of all columns that do't have any positive numbers in them.. thanks

    PROC Star
    Posts: 7,487

    sum up across columns and have a criteria

    You will have to be clearer regarding what you have and what you want.  Don't abbreviate the have in your example.  Rather, show an example with ONLY three variables and a WANT with those same variables. YOU must make it clear as to what you have and what you want.

    Trusted Advisor
    Posts: 1,301

    Re: sum up across columns and have a criteria

    I will explain so maybe you can work out the issue you have.

    First I create some data to work with.

    data foo;

    input dte $ mgf3654 emtcol1 ntd8475 col3;

    cards;

    2001.01 5 0 23 45

    2001.02 89 0 4 2

    2006.12 12 0 8 6

    ;

    run;

    I have a character field (dte) and four numeric variables without a standard naming convention (mgf3654 emtcol1 ntd8475 col3).

    The first thing I want to do is collect the metadata I want to work with.

    data _null_;

    length names nsums $32000;

    retain names nsums;

    set sashelp.vcolumn(where=(libname='WORK' and memname='FOO' and type='num')) end=eof;

    names=catx(' ',of names name);

    nsum=strip(name) || '_sum';

    nsums=catx(' ',of nsums nsum);

    if eof then

      do;

       call symputx('names',names);

       call symputx('nsums',nsums);

       call symputx('vars',put(_n_,best.));

      end;

    run;

    I will collect, dynamically the names of my numeric variables.

    I will create a second copy of these names with '_sum' added to them for a calculation step next.

    I will collect the number of variables I'm working with into macro variable 'vars'

    Now I will run a sum for all columns.  If the columns sum is 0 then I will want to drop that variable.

    data _null_;

    set foo end=eof;

    array v

  • &names;
  • array s

  • &nsums;
  • do i=1 to &vars;

      s+v;

    end;

    length drops $32000;

    retain drops;

    if eof then

      do;

       do i=1 to &vars;

        if s=0 then

         do;

          drop=scan("&names",i,' ');

          drops=catx(' ',of drops drop);

         end;

       end;

       call symputx('drops',drops);

      end;

    run;

    Create arrays using the variable lists I collected in the previous step

    The first loop will calculate the sums for each column

    When I reach the last record in the file I will loop through each variable again.  If I have calculated this columns sum to be 0 I will collect the name of the variable into a macro variable 'drops' to use later.

    Now that I have found all the variables I want to drop I will perform the action.

    proc sql noprint;

    alter table foo

      drop column &drops;

    quit;

    Super User
    Super User
    Posts: 7,074

    Re: sum up across columns and have a criteria

    Why would you want to DELETE the variables?  Perhaps you mean that you want to exclude those variables from later analysis?

    What are you ultimately doing with the data?

    To sum a "column" use PROC SUMMARY.

    proc summary data=have;

    var _numeric_;

      output out=stats(drop=_type_ _freq_) sum=;

    run;

    Then use PROC TRANSPOSE to rotate so that each row represents a variable.

    proc transpose data=stats out=vars (rename=(col1=sum)) ;

    run;  

    Now you can use PROC SQL to create macro variables with either the list of variables to keep or those to drop.

    proc sql noprint ;

    %let keep=;

      select _name_ into :keep separated by ' '

        from vars

        where sum ne 1

      ;

    %let drop=;

      select _name_ into :drop separated by ' '

         from vars

         where sum = 1

      ;

    quit;

    Now you can write code to use the original dataset and select which group of variables you want.

    proc print data=have ;

       var &keep ;

    run;

    data want ;

       set have;

       drop &drop;

    run;

    Super Contributor
    Posts: 401

    sum up across columns and have a criteria

    Brilliant.. Thanks guys.. I really appreciate it..

    Ask a Question
    Discussion stats
    • 6 replies
    • 2750 views
    • 2 likes
    • 4 in conversation