Hi there! I have to create a dataset, but one data point is weird (>36). The variable is supposed to be numeric, and I'm not sure how to do that. It keeps coming up as missing when I run it as is. This is the code I've been running:
DATA filename;
INPUT age class $;
Datalines;
13 B
18 C
>36 A
21 B
;
Run;
I would love any help you could give me! Thank you!
What do you want instead of missing?
The best way is probably to read it as character and clean it.
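A minimal sketch of that approach, assuming the only stray character is ">" and that you are happy to keep just the number that follows it. The data set name `ages` and the helper variable `age_raw` are made up for illustration:

```sas
data ages;
    length age_raw $8;                 /* hypothetical helper: raw text as read */
    input age_raw $ class $;
    /* remove every '>' from the text, then convert what is left to a number */
    age = input(compress(age_raw, '>'), 8.);
    drop age_raw;
    datalines;
13 B
18 C
>36 A
21 B
;
run;
```

Note that this turns ">36" into plain 36, so the "greater than" information is lost unless you store it somewhere else.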
As @ChrisNZ wrote, one option is to read the variable as character and then clean it.
If you are absolutely sure that ">" never appears anywhere else in the data, a quick and dirty solution is to add it as a delimiter:
DATA filename;
infile datalines delimiter=' >';
INPUT age class $;
Datalines;
13 B
18 C
>36 A
21 B
;
Run;
Another quick fix is to modify the infile buffer:
DATA filename;
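input @;                        /* load the line into the input buffer and hold it */
if _infile_=:'>' then           /* =: compares only the leading characters */
_infile_=substr(_infile_,2);    /* drop the leading '>' from the buffer */
INPUT age class $;              /* read the variables from the cleaned line */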
Datalines;
13 B
18 C
>36 A
21 B
;
Run;
This is a design question rather than a programming question: what does ">36" mean? If you want that column to be numeric, you have to lay down a rule that says which number any ">dd" string (d = digit) is converted to. If you can simply say that such strings should result in a missing numeric value, the solution is quite easy.
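If missing really is the rule you settle on, one possible sketch is to keep the original INPUT statement and add the `??` modifier, which suppresses the invalid-data note in the log and keeps `_ERROR_` from being set. The data set name `ages` is made up for illustration:

```sas
data ages;
    /* ?? : ">36" cannot be read as a number, so age is set to missing quietly */
    input age ?? class $;
    datalines;
13 B
18 C
>36 A
21 B
;
run;
```

The value is still missing for that row, exactly as in the original code; the only difference is that the log stays clean.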