Hello!
I have a dataset with 6 variables. one of those variables is state (california, idaho, nevada, ect.) How would I go about excluding states that don't begin with the letter M?
data want;
set have;
if state=:'M' then output;
if state=:'N' then output;
else delete;
run;
My goal was to delete all states that did not begin with the letter M or N. This worked for me. thanks all!
Editted note: this particular example assume STATE is all capital letters. If not, then use
UPCASE(STATE)
in every place I have used
STATE
If the old days you could
data want;
set have;
where state<'M' or state >='N';
run;
But a while ago SAS introduced the "=:" relation, which allows you to:
data want;
set have;
where not (state=:'M');
run;
The '"=:" relation, exclusively used for character values, tells SAS to take the two values in question, and see if the shorter one exactly matches the leading characters of the longer one. By "shorter" and "longer", I don't mean storage length. Instead I mean the number of characters in the value excluding the trailing blanks.
So I have tried this, but it still produces a data set with all states not just the states that begin with the letter M. I think I need a piece of code like this:
data want;
set have;
if state= "M--------" then keep;
else delete;
run;
But i'm not sure how to go about doing that...
With the IF statement this is one solution. You should read the documentation on the different types of expressions you can use in an IF statement. Trying to guess how statements work is only going to get you frustrated.
This link is a good one to explore: https://documentation.sas.com/?docsetId=lrcon&docsetTarget=lrconwhatsnew94.htm&docsetVersion=9.4&loc...
if state =: "M" then output;
Alternatively: ne is mnemonic for not equal, and eq for equal
where state ne: 'M';
data want;
set have;
if state=:'M' then output;
if state=:'N' then output;
else delete;
run;
My goal was to delete all states that did not begin with the letter M or N. This worked for me. thanks all!
Mark, instead of upcasing the whole thing, can do:
where state in: ("m","M") ;
which in my WHERE tests against sashelp.zipcode(keep=statecode) times 1000 works more than twice as fast compared to:
where upcase (state) =: "M" ;
Looks like UPCASE (and also $UPCASE1. - have tested that, too) is much more expensive than testing for another value via IN.
Kind regards
Paul D.
Nice.
SAS Innovate 2025 is scheduled for May 6-9 in Orlando, FL. Sign up to be first to learn about the agenda and registration!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.