BookmarkSubscribeRSS Feed
☑ This topic is solved. Need further help from the community? Please sign in and ask a new question.
mklangley
Lapis Lazuli | Level 10

Suppose I have data like the following:

data have;
    input col1 $ col2 $ col_with_diff_naming_convention $ col4 $ col5 $;
    datalines;
T  F  T  W  O
T  T  T  T  T
W  E  T  H  T
P  T  H  J  T
T  G  T  H  L
;
run;

I would like to omit all rows where the columns all equal "T". In the example above, I want all rows except row 2.

 

I could do this with hard-coding (like below) or using DICTIONARY.COLUMNS.

data want;
    set have;
    where col1 ne 'T'
        and col2 ne 'T'
        and col_with_diff_naming_convention ne 'T'
        and col4 ne 'T'
        and col5 ne 'T';
run;

But I am wondering if anybody knows a slick way to do this using arrays (or another similarly concise approach).

1 ACCEPTED SOLUTION

Accepted Solutions
ballardw
Super User

This works for your example data and is likely extensible to more "real" data like words instead of single letter values:

data want;
   set have;
   if not ('T' =  col1 = col2 = col_with_diff_naming_convention = col4 = col5 );
run;

SAS will allow multiple comparisons such as =, < , > all in one expression though < and > and similar are generally not a good idea for character values until you really understand how they work.

 

If the data is actually single character then you could concatenate all the variables into one and use the COUNTC function to see how many times T appears and if it matches the number of variables then delete the observation.

View solution in original post

4 REPLIES 4
ballardw
Super User

This works for your example data and is likely extensible to more "real" data like words instead of single letter values:

data want;
   set have;
   if not ('T' =  col1 = col2 = col_with_diff_naming_convention = col4 = col5 );
run;

SAS will allow multiple comparisons such as =, < , > all in one expression though < and > and similar are generally not a good idea for character values until you really understand how they work.

 

If the data is actually single character then you could concatenate all the variables into one and use the COUNTC function to see how many times T appears and if it matches the number of variables then delete the observation.

SASKiwi
PROC Star

This works for your example but something more flexible may be preferable:

data want;
  set have;
  if cats(of _character_) ne 'TTTTT';
run;
mklangley
Lapis Lazuli | Level 10

Thank you, @ballardw  and @SASKiwi ! I appreciate your prompt responses. Both of those are good approaches--wish I could accept both as solutions. For my use case, 'T' = col1 = col2... will be easier to maintain, so I'll go with that. 

ballardw
Super User

You may need to consider case for such comparisons and require use of either UPCASE or LOWCASE functions on all of your variables for the comparison as "ABC" is not equal to "abc" "Abc" "aBc" (etc.).

SAS Innovate 2025: Register Now

Registration is now open for SAS Innovate 2025 , our biggest and most exciting global event of the year! Join us in Orlando, FL, May 6-9.
Sign up by Dec. 31 to get the 2024 rate of just $495.
Register now!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

SAS Training: Just a Click Away

 Ready to level-up your skills? Choose your own adventure.

Browse our catalog!

Discussion stats
  • 4 replies
  • 932 views
  • 2 likes
  • 3 in conversation