03-08-2018 10:09 PM
Hi there! I have to create a dataset, but one data point is weird (>36). The variable is supposed to be numeric, and I'm not sure how to do that. It keeps coming up as missing when I run it as is. This is the code I've been running:
INPUT age class $;
I would love any help you could give me! Thank you!
03-09-2018 03:34 AM
As @ChrisNZ wrote, one option is to read the variable as character and then clean it.
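A minimal sketch of that read-as-character approach, assuming the same four records used in the examples in this thread. The names `ages`, `age_raw`, and `over_limit` are made up for illustration, and keeping a flag for the ">" values is my own assumption about what you might want, not something from the original data:

```sas
data ages;
  infile datalines;
  length age_raw $8 class $1;
  /* Read age as text first so ">36" does not become missing */
  input age_raw $ class $;
  /* Hypothetical flag: 1 if the value was recorded as ">nn", else 0 */
  over_limit = (age_raw =: '>');
  /* Strip the ">" and convert the rest to a number;
     ?? suppresses log notes if the conversion still fails */
  age = input(compress(age_raw, '>'), ?? 8.);
  drop age_raw;
datalines;
13 B
18 C
>36 A
21 B
;
run;
```

With this, the third record should end up with age = 36 and over_limit = 1, so the fact that the value was really "more than 36" is not lost.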
If you are absolutely sure that ">" does not appear anywhere else in the data, a quick-and-dirty solution is to add it as a delimiter:
DATA filename;
  infile datalines delimiter=' >';
  INPUT age class $;
Datalines;
13 B
18 C
>36 A
21 B
;
Run;
Another quick fix is to modify the infile buffer:
DATA filename;
  input @;
  if _infile_=:'>' then _infile_=substr(_infile_,2);
  INPUT age class $;
Datalines;
13 B
18 C
>36 A
21 B
;
Run;
03-09-2018 03:41 AM
This is not a programming question but a design question: what does ">36" mean? If you want that column to be numeric, you have to lay down a rule for which number any ">dd" string (d = digit) should be converted to. If such strings can simply become a missing numeric value, the solution is quite easy.
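For the "just make it missing" rule, one possible sketch is to put the `??` modifier after the variable in the INPUT statement. As far as I know, this leaves age missing for any value that cannot be read as a number and keeps the "Invalid data" notes out of the log. The dataset name `ages` is only a placeholder:

```sas
data ages;
  infile datalines;
  /* ?? : unreadable values such as ">36" become missing,
     without invalid-data notes and without setting _ERROR_ */
  input age ?? class $;
datalines;
13 B
18 C
>36 A
21 B
;
run;
```

Note that this silences every invalid age value, not only the ">dd" ones, so it is only appropriate if missing really is the rule you have decided on.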