Quick solution:
FEMALE is not the same as female. SAS treats character values as case sensitive, though you can make the difference disappear in the output by applying a format that makes the case consistent. Case doesn't matter in SAS code, but it does matter in data, so proc freq is the same as PROC FREQ. By comparison, Python and R are case sensitive in both the language AND the data.
proc freq data=have;
table sex / missing;
format sex $upcase18.;
run;
However, to fix the truncated UNKNOWN value you will need to fix the imported data itself. I'm guessing you used PROC IMPORT and didn't write a data step? In that case, I would recommend adding the following statement to the PROC IMPORT code:
guessingrows=max;
This forces SAS to scan every row of the file before it decides on variable types and lengths, so it can really slow down your import process, but you'll get cleaner data.
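For context, here is roughly where that statement sits in a full PROC IMPORT step. This is only a sketch: the file path is a placeholder I made up, and the output name is borrowed from the data step further down, so swap in your own.

```sas
/* Sketch only: the path below is a placeholder, not your real file */
proc import datafile="/path/to/your_file.csv"
    out=raw_data_from_import   /* name reused from the cleaning step below */
    dbms=csv
    replace;
    guessingrows=max;          /* scan all rows before choosing types and lengths */
run;
```

After rerunning the import, check the length of sex in PROC CONTENTS to confirm it is now long enough to hold UNKNOWN.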
Optimal solution:
Write an import step that will correctly read the file. You can use the code from the log as a starter version.
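A hand-written import step might look something like the sketch below. I'm assuming a comma-delimited file with a header row, and the column list (id, sex) and the path are hypothetical since I haven't seen your file; the code PROC IMPORT writes to the log will show your real columns.

```sas
/* Sketch only: path and column list are assumptions, adjust to your file */
data raw_data_from_import;
    infile "/path/to/your_file.csv" dsd dlm=',' firstobs=2 truncover;
    length id 8 sex $ 10;   /* set lengths explicitly so UNKNOWN is not truncated */
    input id sex $;         /* list input, columns read in file order */
run;
```

Declaring the lengths yourself means SAS never has to guess them, which is what caused the truncation in the first place.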
And then to correct the case, use a data step to clean up the data, likely using PROPCASE, which converts each word to lowercase with its first letter capitalized. Then run your proc freq.
data clean;
set raw_data_from_import;
sex = propcase(sex);
run;
proc freq data=clean;
table sex;
run;
@CatPaws wrote:
Hi!
I had an issue with SAS converting my numeric variables to character variables when imported from excel. To go around that, I saved it as a CSV then imported it. Now, I am doing a proc frequency on a character variable, and SAS is duplicating those variables. See pic below! On my spread sheet, I only have Female, Male, and Unknown, however, why are they being duplicated?
P.S How can I get UNKNOW to display the full name (UNKNOWN)?