BookmarkSubscribeRSS Feed
☑ This topic is solved. Need further help from the community? Please sign in and ask a new question.
mklangley
Lapis Lazuli | Level 10

Suppose I have data like the following:

data have;
    input col1 $ col2 $ col_with_diff_naming_convention $ col4 $ col5 $;
    datalines;
T  F  T  W  O
T  T  T  T  T
W  E  T  H  T
P  T  H  J  T
T  G  T  H  L
;
run;

I would like to omit all rows where the columns all equal "T". In the example above, I want all rows except row 2.

 

I could do this with hard-coding (like below) or using DICTIONARY.COLUMNS.

data want;
    set have;
    where col1 ne 'T'
        and col2 ne 'T'
        and col_with_diff_naming_convention ne 'T'
        and col4 ne 'T'
        and col5 ne 'T';
run;

But I am wondering if anybody knows a slick way to do this using arrays (or another similarly concise approach).

1 ACCEPTED SOLUTION

Accepted Solutions
ballardw
Super User

This works for your example data and is likely extensible to more "real" data like words instead of single letter values:

data want;
   set have;
   if not ('T' =  col1 = col2 = col_with_diff_naming_convention = col4 = col5 );
run;

SAS will allow multiple comparisons such as =, < , > all in one expression though < and > and similar are generally not a good idea for character values until you really understand how they work.

 

If the data is actually single character then you could concatenate all the variables into one and use the COUNTC function to see how many times T appears and if it matches the number of variables then delete the observation.

View solution in original post

4 REPLIES 4
ballardw
Super User

This works for your example data and is likely extensible to more "real" data like words instead of single letter values:

data want;
   set have;
   if not ('T' =  col1 = col2 = col_with_diff_naming_convention = col4 = col5 );
run;

SAS will allow multiple comparisons such as =, < , > all in one expression though < and > and similar are generally not a good idea for character values until you really understand how they work.

 

If the data is actually single character then you could concatenate all the variables into one and use the COUNTC function to see how many times T appears and if it matches the number of variables then delete the observation.

SASKiwi
PROC Star

This works for your example but something more flexible may be preferable:

data want;
  set have;
  if cats(of _character_) ne 'TTTTT';
run;
mklangley
Lapis Lazuli | Level 10

Thank you, @ballardw  and @SASKiwi ! I appreciate your prompt responses. Both of those are good approaches--wish I could accept both as solutions. For my use case, 'T' = col1 = col2... will be easier to maintain, so I'll go with that. 

ballardw
Super User

You may need to consider case for such comparisons and require use of either UPCASE or LOWCASE functions on all of your variables for the comparison as "ABC" is not equal to "abc" "Abc" "aBc" (etc.).

sas-innovate-2024.png

Available on demand!

Missed SAS Innovate Las Vegas? Watch all the action for free! View the keynotes, general sessions and 22 breakouts on demand.

 

Register now!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 4 replies
  • 438 views
  • 2 likes
  • 3 in conversation