Hello
I am trying to subset observations that are 9 digits and in AAA-SS-XXXX format (e.g. 132980984). I found some variations including values that are fewer than 9 digits, missing (.), 0, and some values that doesn't have any posted meaning (e.g. 999999999 or 888888888) So far, I am considering anything except AAA-SS-XXXX as being unknown. Can someone help me figure out how to do this?
Thank you!
What do you have so far?
Are you looking for help with a regular expression or using BASE SAS functions?
Regular Expressions is likely faster, but it's not easy to understand or modify.
Please clarify what this means:
I am trying to subset observations that are 9 digits and in AAA-SS-XXXX format (e.g. 132980984).
I would expect the example to look like 132-98-0984 from your "format" comment. By anychance has the SSN format been assigned to this variable?
Is this variable character or numeric? Your statement of values of . implies numeric but I've been fooled before.
You might post some example values and what you want as a result for those.
If some values are out of expected ranges such as your 999999999 then you should provide expected ranges.
Hello-
Sorry for the confusion- yes, they are SSN and it is numeric. Here are a few examples:
209876896
498304981
0
54376548
4326583
I basically want to grab every observation that is 9 digits thus exclude the followings:
0 / '.'/ 999999999/ 888888888
I also found a few cases that are 9 digits, but start with 888 which I am not sure how to deal with, but would like to include them for now.
Thank you!
In a data step:
if variablename in (. 0 99999999 888888888) then <do what ever>.
If you are actually creating a subset data with these values then
data want;
set have;
if variablename in (. 0 99999999 888888888);
run;
to exclude:
data want;
set have;
if variablename not in (. 0 99999999 888888888);
run;
Or use a dataset option where clause (where=( variablename not in (. 0 99999999 888888888))) if you don't want to create a separate data set but use the same data set for analysis.
Thank you both! super helpful
Are you ready for the spotlight? We're accepting content ideas for SAS Innovate 2025 to be held May 6-9 in Orlando, FL. The call is open until September 25. Read more here about why you should contribute and what is in it for you!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.