I am trying to do some analyses on a variable that SAS read in as character when it is numeric. I read the data in as a CSV with proc import. There are 100 variables, but here is a snippet:
ID BMI
1 25.987
2 23.192
3 29.901
4 21.009
And my code to read it in is
proc import datafile='~/dig_sas.csv'
out=dig
DBMS=csv
replace;
guessingrows=MAX;
getnames=YES;
run;
It reads in BMI as character. I couldn't get the correct informat/format (should be 6.3) in further steps. I also tried doing an input statement but couldn't get it to work. I don't want to do a data step with an infile/input because there are so many variables.
thanks!
Without doing a data step/infile I'm not sure how you can control the informat, but doing a second data step after the import is how I've handled this in the past.
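/* Note: with a w.d informat such as 6.3, a value that has no decimal point
   is divided by 1000, so a plain width informat (8.) is used here. */
data dig2;
set dig;
BMI2 = input(BMI,8.); /* new numeric variable; character BMI is kept */
run;
/* or, to reuse the variable name already established */
data dig2(drop=_BMI);
set dig(rename=(BMI=_BMI)); /* rename the character version out of the way */
BMI = input(_BMI,8.); /* numeric BMI takes over the original name */
run;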
Since it read it as character, I would guess that one or more records have a value that isn't numeric. I'd simply fix the problem with a very short data step:
data have;
  input bmi $;
  cards;
29.901
21.009
x
22.4
;
data want (drop=_:);
  set have (rename=(bmi=_bmi));
  bmi=input(_bmi,?? 8.);
run;
Art, CEO, AnalystFinder.com
P.S. The ?? modifier sets the value to missing when it isn't numeric, without SAS treating it as an error.
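To see which values are blocking the conversion, something like the following sketch may help. It assumes the imported data set is named dig and that BMI came in as character, as in the original question; the data set name bad_bmi is made up for illustration.

```sas
/* Keep only rows whose BMI text is non-blank but does not convert to a number */
data bad_bmi;
  set dig;
  if not missing(BMI) and missing(input(BMI, ?? 32.));
run;

/* Tabulate the offending values (e.g. NA, NULL, stray text) */
proc freq data=bad_bmi;
  tables BMI;
run;
```

If bad_bmi comes back empty, every non-blank value converts cleanly and the cause is probably something other than stray text in that column.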
If you want control over the process, then don't use proc import. It's that simple.
See Maxim 22.
Show us some values that are character. I suspect that you have something such as NULL, NA, N/A or similar somewhere in the data, occurring often enough that the GUESSING element of Proc Import decides the variable is character. If you know the first 5 rows are actually numeric in appearance, then perhaps use Guessingrows=5 (or whatever number of rows at the top works), but this may cause issues with the lengths of other character variables.
Or write a data step. The proc import code generated one, and it should be in the log, where you can copy it, paste it into the editor, and modify the informat to be best6. or similar instead of the extremely likely $6. .
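As a rough idea of what the edited data step might look like, here is a cut-down sketch. It is not the code from the actual log: it assumes ID and BMI are the first two columns of the file and omits the other 98 variables, which the generated code would list in full.

```sas
/* Sketch only: the real generated step lists all 100 variables */
data dig;
  infile '~/dig_sas.csv' delimiter=',' dsd missover firstobs=2 lrecl=32767;
  informat ID best32.;
  informat BMI best32.;   /* changed from a character informat such as $6. */
  format BMI 6.3;         /* display format only; does not affect the stored value */
  input ID BMI;
run;
```

With a numeric informat, any non-numeric text in the BMI column becomes missing and SAS writes an invalid-data note to the log, which also points to the rows that caused the original guess.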