Hello
I am trying to subset observations that are 9 digits and in AAA-SS-XXXX format (e.g. 132980984). I found some variations including values that are fewer than 9 digits, missing (.), 0, and some values that doesn't have any posted meaning (e.g. 999999999 or 888888888) So far, I am considering anything except AAA-SS-XXXX as being unknown. Can someone help me figure out how to do this?
Thank you!
What do you have so far?
Are you looking for help with a regular expression or using BASE SAS functions?
Regular Expressions is likely faster, but it's not easy to understand or modify.
Please clarify what this means:
I am trying to subset observations that are 9 digits and in AAA-SS-XXXX format (e.g. 132980984).
I would expect the example to look like 132-98-0984 from your "format" comment. By anychance has the SSN format been assigned to this variable?
Is this variable character or numeric? Your statement of values of . implies numeric but I've been fooled before.
You might post some example values and what you want as a result for those.
If some values are out of expected ranges such as your 999999999 then you should provide expected ranges.
Hello-
Sorry for the confusion- yes, they are SSN and it is numeric. Here are a few examples:
209876896
498304981
0
54376548
4326583
I basically want to grab every observation that is 9 digits thus exclude the followings:
0 / '.'/ 999999999/ 888888888
I also found a few cases that are 9 digits, but start with 888 which I am not sure how to deal with, but would like to include them for now.
Thank you!
In a data step:
if variablename in (. 0 99999999 888888888) then <do what ever>.
If you are actually creating a subset data with these values then
data want;
set have;
if variablename in (. 0 99999999 888888888);
run;
to exclude:
data want;
set have;
if variablename not in (. 0 99999999 888888888);
run;
Or use a dataset option where clause (where=( variablename not in (. 0 99999999 888888888))) if you don't want to create a separate data set but use the same data set for analysis.
Thank you both! super helpful
Don't miss out on SAS Innovate - Register now for the FREE Livestream!
Can't make it to Vegas? No problem! Watch our general sessions LIVE or on-demand starting April 17th. Hear from SAS execs, best-selling author Adam Grant, Hot Ones host Sean Evans, top tech journalist Kara Swisher, AI expert Cassie Kozyrkov, and the mind-blowing dance crew iLuminate! Plus, get access to over 20 breakout sessions.
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.