BookmarkSubscribeRSS Feed
podarum
Quartz | Level 8

I'm having trouble doing this, please help.

I have about 20 columns (in no particular sequence, eg. COL1=MGF3654, COL2=NTD8475, etc.) and there are about 30 number of rows..with numeric data in the cells. I want to check which columns total to 1 so I can delete thos clumns, they are of no need).  Thanks

Have:

         

                 COL1     COL2 ... COL20

2001.01       5           0           23

2001.02       89         0           4

.

2006.12       12         0          8

if COL total to 1 then I want to delete it.

Want:

                 COL1     COL3..... COL20

2001.01       5             45             23

2001.02       89           2               4

.

2006.12       12            6              8

6 REPLIES 6
FriedEgg
SAS Employee

data foo;

input dte $ mgf3654 emtcol1 ntd8475 col3;

cards;

2001.01 5 0 23 45

2001.02 89 0 4 2

2006.12 12 0 8 6

;

run;

data _null_;

length names nsums $32000;

retain names nsums;

set sashelp.vcolumn(where=(libname='WORK' and memname='FOO' and type='num')) end=eof;

names=catx(' ',of names name);

nsum=strip(name) || '_sum';

nsums=catx(' ',of nsums nsum);

if eof then

  do;

   call symputx('names',names);

   call symputx('nsums',nsums);

   call symputx('vars',put(_n_,best.));

  end;

run;

%put &vars &names &nsums;

data _null_;

set foo end=eof;

array v

  • &names;
  • array s

  • &nsums;
  • do i=1 to &vars;

      s+v;

    end;

    length drops $32000;

    retain drops;

    if eof then

      do;

       do i=1 to &vars;

        if s=0 then

         do;

          drop=scan("&names",i,' ');

          drops=catx(' ',of drops drop);

         end;

       end;

       call symputx('drops',drops);

      end;

    run;

    proc sql noprint;

    alter table foo

      drop column &drops;

    quit;

    podarum
    Quartz | Level 8

    Thanks FriedEgg, to be honest it's a little confusing and it didn't quite work.. maybe it's something 'm doing. But basically looking to get rid of all columns that do't have any positive numbers in them.. thanks

    art297
    Opal | Level 21

    You will have to be clearer regarding what you have and what you want.  Don't abbreviate the have in your example.  Rather, show an example with ONLY three variables and a WANT with those same variables. YOU must make it clear as to what you have and what you want.

    FriedEgg
    SAS Employee

    I will explain so maybe you can work out the issue you have.

    First I create some data to work with.

    data foo;

    input dte $ mgf3654 emtcol1 ntd8475 col3;

    cards;

    2001.01 5 0 23 45

    2001.02 89 0 4 2

    2006.12 12 0 8 6

    ;

    run;

    I have a character field (dte) and four numeric variables without a standard naming convention (mgf3654 emtcol1 ntd8475 col3).

    The first thing I want to do is collect the metadata I want to work with.

    data _null_;

    length names nsums $32000;

    retain names nsums;

    set sashelp.vcolumn(where=(libname='WORK' and memname='FOO' and type='num')) end=eof;

    names=catx(' ',of names name);

    nsum=strip(name) || '_sum';

    nsums=catx(' ',of nsums nsum);

    if eof then

      do;

       call symputx('names',names);

       call symputx('nsums',nsums);

       call symputx('vars',put(_n_,best.));

      end;

    run;

    I will collect, dynamically the names of my numeric variables.

    I will create a second copy of these names with '_sum' added to them for a calculation step next.

    I will collect the number of variables I'm working with into macro variable 'vars'

    Now I will run a sum for all columns.  If the columns sum is 0 then I will want to drop that variable.

    data _null_;

    set foo end=eof;

    array v

  • &names;
  • array s

  • &nsums;
  • do i=1 to &vars;

      s+v;

    end;

    length drops $32000;

    retain drops;

    if eof then

      do;

       do i=1 to &vars;

        if s=0 then

         do;

          drop=scan("&names",i,' ');

          drops=catx(' ',of drops drop);

         end;

       end;

       call symputx('drops',drops);

      end;

    run;

    Create arrays using the variable lists I collected in the previous step

    The first loop will calculate the sums for each column

    When I reach the last record in the file I will loop through each variable again.  If I have calculated this columns sum to be 0 I will collect the name of the variable into a macro variable 'drops' to use later.

    Now that I have found all the variables I want to drop I will perform the action.

    proc sql noprint;

    alter table foo

      drop column &drops;

    quit;

    Tom
    Super User Tom
    Super User

    Why would you want to DELETE the variables?  Perhaps you mean that you want to exclude those variables from later analysis?

    What are you ultimately doing with the data?

    To sum a "column" use PROC SUMMARY.

    proc summary data=have;

    var _numeric_;

      output out=stats(drop=_type_ _freq_) sum=;

    run;

    Then use PROC TRANSPOSE to rotate so that each row represents a variable.

    proc transpose data=stats out=vars (rename=(col1=sum)) ;

    run;  

    Now you can use PROC SQL to create macro variables with either the list of variables to keep or those to drop.

    proc sql noprint ;

    %let keep=;

      select _name_ into :keep separated by ' '

        from vars

        where sum ne 1

      ;

    %let drop=;

      select _name_ into :drop separated by ' '

         from vars

         where sum = 1

      ;

    quit;

    Now you can write code to use the original dataset and select which group of variables you want.

    proc print data=have ;

       var &keep ;

    run;

    data want ;

       set have;

       drop &drop;

    run;

    podarum
    Quartz | Level 8

    Brilliant.. Thanks guys.. I really appreciate it..

    sas-innovate-2024.png

    Join us for SAS Innovate April 16-19 at the Aria in Las Vegas. Bring the team and save big with our group pricing for a limited time only.

    Pre-conference courses and tutorials are filling up fast and are always a sellout. Register today to reserve your seat.

     

    Register now!

    What is Bayesian Analysis?

    Learn the difference between classical and Bayesian statistical approaches and see a few PROC examples to perform Bayesian analysis in this video.

    Find more tutorials on the SAS Users YouTube channel.

    Click image to register for webinarClick image to register for webinar

    Classroom Training Available!

    Select SAS Training centers are offering in-person courses. View upcoming courses for:

    View all other training opportunities.

    Discussion stats
    • 6 replies
    • 5099 views
    • 2 likes
    • 4 in conversation