BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
x2PSx
Calcite | Level 5

Hello,

 

I have a table with 387 variables and 25 rows. I want to remove all variables that have missing values, but only if the missing values is between the 6th and the 25th rows (some variables have lags and so the first 5 rows may have missing values with no problem). I am guessing this is really easy with proc sql, but I can't figure out the code for it to work. For example, I want to remove "C_CNBA11M_LAG1" to "C_CNBA11M_LAG4" in this example.

 

Thank you very much!


example.jpg
1 ACCEPTED SOLUTION

Accepted Solutions
Astounding
PROC Star

Unfortunately, since you may or may not need to run a step to remove variables, this requires macro language.  Fortunately, if you know macro language it's only medium difficult.  For example (untested code):

 

%macro remove;

   proc summary data=have (firstobs=6);

     var _numeric_;

     output out=missing_counts (drop=_type_ _freq_ ) nmiss=;

   run;

   proc transpose data=missing_counts out=transposed_counts;

     var _numeric_;

   run;

   %local remove_list;

   proc sql;

       select distinct  _name_  into : remove_list from transposed_counts where col1 > 0;

   quit;

   %if %length(&remove_list) %then %do;

       data want;

           set have (drop=&remove_list);

       run;

   %end;

   %else %put No variables had missing values from observation 6 onward.;

%mend remove;

 

%remove

View solution in original post

7 REPLIES 7
Reeza
Super User

If you want row level operations, a data step is a better approach. Especially if you need to identify rows. SQL does not necessarily honor order of the rows, especially if you don't have another variable that specifies the order of the rows.

Astounding
PROC Star

Unfortunately, since you may or may not need to run a step to remove variables, this requires macro language.  Fortunately, if you know macro language it's only medium difficult.  For example (untested code):

 

%macro remove;

   proc summary data=have (firstobs=6);

     var _numeric_;

     output out=missing_counts (drop=_type_ _freq_ ) nmiss=;

   run;

   proc transpose data=missing_counts out=transposed_counts;

     var _numeric_;

   run;

   %local remove_list;

   proc sql;

       select distinct  _name_  into : remove_list from transposed_counts where col1 > 0;

   quit;

   %if %length(&remove_list) %then %do;

       data want;

           set have (drop=&remove_list);

       run;

   %end;

   %else %put No variables had missing values from observation 6 onward.;

%mend remove;

 

%remove

x2PSx
Calcite | Level 5

Thank you, this worked for me:

 

proc summary data=HAVE(firstobs=6);

var _numeric_;

output out=MISSING_COUNTS (drop=_type__freq_)nmiss=;

run;

proc transpose data=MISSING_COUNTS out=TRANSPOSED_COUNTS;

var _numeric_;

run;

proc sql;

select distinct _name_ into : REMOVE_LIST separated by " "

from TRANSPOSED_COUNTS where col1 > 0

and _NAME_ not = "_FREQ_";

quit;

data WANT(drop=&REMOVE_LIST);

set HAVE ;

run;

 

 

However, I do have to do this for 3 tables, do you know of a way to each time "update" the macro variable REMOVE_LIST as to include variables with missing values for all 3 tables?

Astounding
PROC Star

This is a normal function of using macros.  The most common approach is to define the macro with parameters:

 

%macro remove (indata=, outdata=);

 

Then anywhere you are referring to either the input data or output data, make a reference to the parameter.  For example:

 

data &outdata;

set &indata;

 

Call the macro by supplying values for the parameters:

 

%remove (indata=have1, outdata=want1)

x2PSx
Calcite | Level 5

Thanks for your help,

 

I do know how to use macro language, but I don't know how to update a macro variable without erasing the previous list of variables to remove from the first table. Ideally, I would want my "REMOVE_LIST" to include all variables to drop from all 3 tables separated by a " " without repeating them, because I then have to remove these variables from all my tables with a simple data step. Thank you for your time!

Astounding
PROC Star

If I understand correctly, you want to remove a variable from one of your data sets if contains missing values in another of the data sets.

 

For that, the simplest way is to save three versions of &REMOVE_LIST:

 

%global remove_list1 remove_list2 remove_list3;

 

Then change the INTO destination, using one of those three variables instead of REMOVE_LIST.

 

Finally, just put them all together:

 

(drop=&remove_list1 &remove_list2 &remove_list3)

 

There may be some duplication in the lists, but you don't need to remove the duplication for SAS to figure out what to drop.

x2PSx
Calcite | Level 5

Thank you for your help, I managed to write my code. Here it is for future reference or anybody that it could help

 

 

%let SCENARIOS = S0 S1 S2;

%let REMOVE_LIST = REMOVE0 REMOVE1 REMOVE2;

 

%MACRO REMOVE_VARS_W_MISSING_VALUES ( FIRST_ROW = ,

TABLES_INPUT= ,

REMOVE_LIST=

);

 

%do i=1 %to %sysfunc(countw(&TABLES_INPUT.));

proc summary data=USERLIB.%scan(&TABLES_INPUT.,&i.)(firstobs=&FIRST_ROW.);

var _NUMERIC_;

output out=MISSING_COUNTS (drop=_TYPE__FREQ_)nmiss=;

run;

proc transpose data=MISSING_COUNTS out=TRANSPOSED_COUNTS;

var _NUMERIC_;

run;

proc sql;

select distinct _NAME_ into : %scan(&REMOVE_LIST.,&i.) separated by " "

from TRANSPOSED_COUNTS where col1 > 0

and _NAME_ not = "_FREQ_";

quit;

%end;

 

%do i=1 %to %sysfunc(countw(&TABLES_INPUT.));

data USERLIB.%scan(&TABLES_INPUT.,&i.)_MOD;

set USERLIB.%scan(&TABLES_INPUT.,&i.)(drop=&REMOVE0. &REMOVE1. &REMOVE2.);

run;

%end;

%MEND;

%REMOVE_VARS_W_MISSING_VALUES ( FIRST_ROW = 6,

TABLES_INPUT= &SCENARIOS.,

REMOVE_LIST= &REMOVE_LIST.

);

 

sas-innovate-2024.png

Join us for SAS Innovate April 16-19 at the Aria in Las Vegas. Bring the team and save big with our group pricing for a limited time only.

Pre-conference courses and tutorials are filling up fast and are always a sellout. Register today to reserve your seat.

 

Register now!

SAS Enterprise Guide vs. SAS Studio

What’s the difference between SAS Enterprise Guide and SAS Studio? How are they similar? Just ask SAS’ Danny Modlin.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 7 replies
  • 2986 views
  • 0 likes
  • 3 in conversation