Hi there! I have to create a dataset, but one data point is weird (>36). The variable is supposed to be numeric, and I'm not sure how to do that. It keeps coming up as missing when I run it as is. This is the code I've been running:
DATA filename;
INPUT age class $;
Datalines;
13 B
18 C
>36 A
21 B
;
Run;
I would love any help you could give me! Thank you!
What do you want instead of missing?
The best way is probably to read it as character and clean it.
As @ChrisNZ wrote, one option is to read the variable as character and then clean it.
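In case it helps, here is a minimal sketch of the read-as-character approach. It assumes you want ">36" stored as 36 and want to remember that the original value was "greater than". The names want, age_raw and over_limit are made up for illustration, so adjust them to your data.

```sas
data want;
  length age_raw $ 8;                  /* hypothetical helper variable */
  input age_raw $ class $;             /* read the raw text first */
  over_limit = (age_raw =: '>');       /* 1 if the value started with ">", else 0 */
  /* remove ">" and convert to numeric; ?? suppresses invalid-data notes */
  age = input(compress(age_raw, '>'), ?? 8.);
  drop age_raw;
datalines;
13 B
18 C
>36 A
21 B
;
run;
```

Keeping the flag means you can still tell a true 36 apart from ">36" later, which a plain numeric column cannot do on its own.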
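If you are absolutely sure that ">" does not appear anywhere else in the data, a quick and dirty solution is to add it as a delimiter. Unless DSD is specified, list input treats consecutive delimiters as one, so the leading ">" is simply skipped: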
DATA filename;
infile datalines delimiter=' >';
INPUT age class $;
Datalines;
13 B
18 C
>36 A
21 B
;
Run;
Another quick fix is to modify the infile buffer:
DATA filename;
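input @;                          /* load the record and hold it */
if _infile_ =: '>' then           /* =: compares only the leading characters */
  _infile_ = substr(_infile_, 2); /* drop the leading ">" */
input age class $;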
Datalines;
13 B
18 C
>36 A
21 B
;
Run;
This is not a programming question but a design question: what does ">36" mean? If you want that column to be numeric, you have to decide on a rule for which number any ">dd" string (d = digit) should be converted to. If such strings should simply become a missing numeric value, the solution is quite easy.
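As a rough sketch of the "missing value" rule, assuming that is the rule you settle on: read the field as character and convert it with the ?? modifier, so anything that is not a valid number becomes missing without notes in the log. The names want and age_raw are only placeholders.

```sas
data want;
  length age_raw $ 8;            /* hypothetical helper variable */
  input age_raw $ class $;
  /* ">36" is not a valid number, so age is set to missing;  */
  /* ?? suppresses the invalid-data note and keeps _ERROR_=0 */
  age = input(age_raw, ?? 8.);
  drop age_raw;
datalines;
13 B
18 C
>36 A
21 B
;
run;
```

If you later choose a different rule, for example treating ">36" as 37, only the assignment to age needs to change.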