DATA Step, Macro, Functions and more

How to drop or keep variables conditionally...???

Accepted Solution Solved
Reply
Contributor
Posts: 37
Accepted Solution

How to drop or keep variables conditionally...???

Hi,

I'm working on a dataset that includes 1353 observations and 15 variables.  If the variable A has a value of "No Match" then I need a output that includes variables A1, A2, and A.  If variable B has a value of "No Match" then I need output that includes variables B1, B2, and B.

 

Here is my code:

proc sql:

    create table have as

              select a.*,

                         b1.b1,

                          b1.b2,

                          b1.b3,

                          b1.b,

                          b1.b4,

                           

                          case when a.a1 not equal b1.b1 then "no match" end as alpha,

                          case when a.a2 not equal b1.b2 then "no match" end as beta

               from sail as a

               left join bus as b1 on (sail.bow = bus.wheel);

 

data want;

    set have;

         if alpha eq "no match" then output A1 A2 A;

         if beta eq "no match" then output B1 B2 B;

run;

 

(I know this last code [data want] doesn't work - but how do I tell SAS to keep a certain 3 variables from [table have] if the condition of "no match" is met?  Any ideas?

 

I'm thinking a macro or call symput would work?  please help!


Accepted Solutions
Solution
2 weeks ago
Respected Advisor
Posts: 4,737

Re: How to drop or keep variables conditionally...???

[ Edited ]

@belboy

A table has a descriptor part (that's all your variable definitions) and a data part. Everything descriptor gets created during compilation phase and you can't change it anymore during the execution phase (when you process your data). For this reason it's principally impossible to create, modify or drop variables during execution based on some data values.

 

Basically: All the information you can retrieve via a Proc Contents is "descriptor" and not changeable during execution.

 

What you can do is to map source variables into target variables - and this mapping can be conditionally. So your target table must always have the same named columns but you can in a SQL use a case statement and based on the value of some source variable populate the target value using different source variables.  

View solution in original post


All Replies
Respected Advisor
Posts: 4,737

Re: How to drop or keep variables conditionally...???

@belboy

The OUTPUT statement writes an observation to the target table. SAS tables are rectangular so you have always the same variables in all rows.

 

What you can do is the same than what you're doing in your SQL case statement: Based on a condition you assign a different value (variable) to an output variable.

 

if haveVar='Match' then wantVar=haveVar_A;
else if haveVar='No Match' then wantVar=haveVar_B;
Contributor
Posts: 37

Re: How to drop or keep variables conditionally...???

[ Edited ]

Hmmm... I don't think that gives me the desired outcome....because i don't get a new dataset with only the desired columns, if a variable has "No Match".   See this example below I pulled from @Haikuo .  This gives me only one output, but I need multiple outputs based on a condition.

 

/*this code will give me some of what i want - but not all.*/

data
_null_; set have end=done; array p(*) a1 a2 a3 a4 a5 b1-b15; array _name(5) $30 _temporary_ ; do i=1 to dim(p); if p(i)='No Match' then _name(i)=vname(p(i)); end;
if done then call symputx ('del',catx(' ', of _name(*))); run; data final; set have(drop=&del); run;



If a2 = "No Match" then all other columns are dropped in the new dataset. 

 

However, I want to keep a few columns. For instance, if a2 = "No Match" then I want a new dataset with a1, a2, and a3. If b3 = "No Match" then I want t a new dataset with b3, b4, and b5....or whatever columns I choose based on the condition of a2 or b3. */

Is there a way to do this? Build a new dataset where columns(variables) are kept / dropped based on a condition of a column(variable) in a previous table??

Respected Advisor
Posts: 3,061

Re: How to drop or keep variables conditionally...???

You have to create two data sets, one for each condition.

--
Paige Miller
Contributor
Posts: 37

Re: How to drop or keep variables conditionally...???

Posted in reply to PaigeMiller

I was checking this code from @Haikuo

 

 

data _null_;
set have end=done;
array p(*) a1 a2 a3 a4 a5 a6 b1 b2 b3 b4;
array _name(10) $30 _temporary_ ;
do i=1 to dim(p);
if p(i)='No Match' then _name(i)=vname(p(i));
end;
if done then call symputx ('del',catx(' ', of _name(*)));
run;


data final;
set have(drop=&del);
run;

 

This works a little, because if a2 = "No Match" then it will drop any other variable in a new dataset.  I could swap drop= for keep= if I wanted to keep certain variables.  

 

However, if b3 = "No Match" then the same variables contained under &del in the previous datastep will be dropped...  I need different variables conditionally to be dropped if a certain variable is selected.

 

Is there any way to modify the macro above?  It gives me the first "want" I'm looking for - but the macro is unable to drop different variables if a condition is met (in this case, if b3 = "No Match").

Super User
Posts: 6,785

Re: How to drop or keep variables conditionally...???

[ Edited ]

In addition to creating two data sets,  you have to allow that both conditions might generate "no match" and the same observation should get flagged twice.  Along the lines of:

 

data want_a_nomatch (keep=a1 a2 a)

want_b_nomatch (keep=b1 b2 b);

set have;

if alpha = "no match" then output want_a_nomatch;

if beta = "no match" then output want_b_nomatch;

run;

 

****************** EDITED:

  

As @Patrick noted, this was recommended.

 

data want_a_nomatch (keep=a1 a2 a)

want_b_nomatch (keep=b1 b2 b);

 

But this is different (and not recommended):

 

data want_a_nomatch (keep=a1 a2 a);

data want_b_nomatch (keep=b1 b2 b);

Contributor
Posts: 37

Re: How to drop or keep variables conditionally...???

Posted in reply to Astounding

So I set this up as you stated - 

data sod (keep=Role SOD_Conflict conflict sodconflictstatus);
data bfd (keep='bfd'n Description bfd_status);
set &app.RoleStatusRpt;
if sodconflictstatus = "No Match" then output work.sod;
if bfd_status = "No Match" then output work.bfd;
run;

 

But received the following error:


26 data bfd (keep='business friendly description'n Description bfd_status);

27 set &app.RoleStatusRpt;
28 if sodconflictstatus = "No Match" then output work.sod;
________
455
ERROR 455-185: Data set was not specified on the DATA statement.

29 if bfd_status = "No Match" then output work.bfd;
30 run;

NOTE: The SAS System stopped processing this step because of errors.
WARNING: The data set WORK.BFD may be incomplete. When this step was stopped there were 0 observations and 3 variables.
WARNING: Data set WORK.BFD was not replaced because this step was stopped.

Respected Advisor
Posts: 4,737

Re: How to drop or keep variables conditionally...???

@belboy

Each data step starts with statement DATA and ends with RUN; You can't have two DATA statements - but you can define multiple tables within the same DATA statement.

 

Below should work:

data 
  sod (keep=Role SOD_Conflict conflict sodconflictstatus)
  bfd (keep=bfd Description bfd_status)
  ;
Super User
Posts: 23,776

Re: How to drop or keep variables conditionally...???

You need either two queries or use a data step. 

 

PROC SQL - each query creates a single table at most. A data step can create multiple tables at once so you can change your join to a SAS merge and do it in a data step. Untested as no data was provided.

 

data     alpha (keep = a1 a2 a3)  
            beta (keep = b1 b2 b3);

merge sail (in=s) 
           bus (in=b rename=wheel=bow);

by bow; * variable names need to be the same in a data step merge;

if s; *replicates a left join;

if a1=b1 then output alpha;
if a2 ne b2 then output beta;

run;

@belboy wrote:

Hi,

I'm working on a dataset that includes 1353 observations and 15 variables.  If the variable A has a value of "No Match" then I need a output that includes variables A1, A2, and A.  If variable B has a value of "No Match" then I need output that includes variables B1, B2, and B.

 

Here is my code:

proc sql:

    create table have as

              select a.*,

                         b1.b1,

                          b1.b2,

                          b1.b3,

                          b1.b,

                          b1.b4,

                           

                          case when a.a1 not equal b1.b1 then "no match" end as alpha,

                          case when a.a2 not equal b1.b2 then "no match" end as beta

               from sail as a

               left join bus as b1 on (sail.bow = bus.wheel);

 

data want;

    set have;

         if alpha eq "no match" then output A1 A2 A;

         if beta eq "no match" then output B1 B2 B;

run;

 

(I know this last code [data want] doesn't work - but how do I tell SAS to keep a certain 3 variables from [table have] if the condition of "no match" is met?  Any ideas?

 

I'm thinking a macro or call symput would work?  please help!


 

Super User
Super User
Posts: 8,125

Re: How to drop or keep variables conditionally...???

Your problem statement is not clear. You mention you have a single dataset, but then you show an SQL join between TWO different datasets.  Are you starting with one input dataset or two?

 

Do you want one output dataset or two datasets?

You mention what variables you want, but it is not clear what observations you want.

 

Please create example input that has the variables you have mentioned and what result you want from that example.  Only include enough variables and observations to demonstrate the issue.  

 

Please show both in the form of a data step that can be run to create the data. That way people trying to solve the problem can just copy and paste the code into SAS to create the example datasets.

 

Contributor
Posts: 37

Re: How to drop or keep variables conditionally...???

Hi Tom,

The problem statement is clear.

 

I have a dataset "have" that is created via Proc Sql join (doesn't matter how many datasets it takes to create it).  I'm just providing example code to show how I created it.

 

I want one dataset "want", that contains (or keeps) variables based on the condition of the if statement.  If a variable contains "No Match", then I want the dataset to keep only those variables and/or drop the others.   

 

Unfortunately, the "want" in my original code isn't giving me the results I want.

 

I will try to think of alternatives as well - but thanks to all the members hear and the options provided so far!  Hope we can get this solved today.

 

Solution
2 weeks ago
Respected Advisor
Posts: 4,737

Re: How to drop or keep variables conditionally...???

[ Edited ]

@belboy

A table has a descriptor part (that's all your variable definitions) and a data part. Everything descriptor gets created during compilation phase and you can't change it anymore during the execution phase (when you process your data). For this reason it's principally impossible to create, modify or drop variables during execution based on some data values.

 

Basically: All the information you can retrieve via a Proc Contents is "descriptor" and not changeable during execution.

 

What you can do is to map source variables into target variables - and this mapping can be conditionally. So your target table must always have the same named columns but you can in a SQL use a case statement and based on the value of some source variable populate the target value using different source variables.  

Super User
Super User
Posts: 8,125

Re: How to drop or keep variables conditionally...???

[ Edited ]

So if you have a dataset that include A,B,A1,A2,B1,B2 and possibly some other variables. 

So it might look something like this.

data have ;
  length id 8 A B $8 A1 A2 B1 B2 8 ;
  infile cards dsd truncover ;
  input id -- B2 ;
cards;
1,Match,No Match,1,2,3,4
2,No Match,Match,5,6,7,8
3,Match,Match,9,10,11,12
4,No Match,No Match,13,14,15,16
;

What output do you want from that dataset?

Do you want two separate datasets? Should they each have 2 observations or 4?

 

New Contributor
Posts: 2

Re: How to drop or keep variables conditionally...???

1- DO a macro variable that conditionally comments and uncomments the keep and drop statements.

                                 OR

2- TRY a macro call that uses %if and %else to alternate between the keep and drop statements

☑ This topic is solved.

Need further help from the community? Please ask a new question.

Discussion stats
  • 13 replies
  • 309 views
  • 4 likes
  • 7 in conversation