Hi, I have a large dataset with over a thousand columns. All are coded with some special missing characters. Is there any way to export this to a .csv without retaining the special missings? I want to only have one type of missing data because this is causing my csv to have unique values like this:
1
0
A
T
F
Instead I want only unique values:
1
0
Please help.
Thanks!
-Brianna
You could also create an intermediary dataset for export where you convert all the special missings to standard missings as done below:
data period_a;
missing X I;
input Id $4. Foodpr1 Foodpr2 Foodpr3 Coffeem1 Coffeem2;
datalines;
1001 115 45 65 I 78
1002 86 27 55 72 86
1004 93 52 X 76 88
1015 73 35 43 112 108
1027 101 127 39 76 79
;
run;
options missing=' ';
data ds_for_export;
set period_a;
array nums _numeric_;
do over nums;
if missing(nums) then call missing(nums);
end;
run;
options missing=' '; is used to then print/write the standard missings as blanks and not as dots.
So you want to exclude all missing data?
@briannalindsay wrote:
Hi, I have a large dataset with over a thousand columns. All are coded with some special missing characters. Is there any way to export this to a .csv without retaining the special missings? I want to only have one type of missing data because this is causing my csv to have unique values like this:
1
0
A
T
F
Instead I want only unique values:
1
0
Please help.
Thanks!
-Brianna
No, I want all missing data (including special missing) to be exported as missing rather than coded as 'A', 'F', 'T' etc. The program that I'm using doesn't recognize special missings therefore I have many columns that have random single character responses.
Is the only way to do this to recode before I export?
How are you exporting your data?
Can you create a replication using the CLASS or CARS data set?
I have tried exporting both by proc export and by ods.
In SAS the data would look like this:
data test;
input var1 $ var2 $ var3 ;
datalines;
a a _a
b b _b
c c _c
run;
When I export to csv it would look like this:
var1 var2 var3
a a A
b b B
c c C
but I want it to look like this:
var1 var2 var3
a a .
b b .
c c .
You could try a custom format.
options missing='.';
proc format;
value num_custom_fmt
., .A-.Z = .
other = [best12.];
run;
proc print data=test;
format var3 num_custom_fmt.;
run;
Would this work for all columns or would I have to do each column one by one?
You have to apply to each column so if you have comma or currency formats it can be a pain.
You can use variable lists at least:
_numeric_ -> all numeric variables
var1 -- var100 -> all variables that are listed between var1 and var100
var1-var100 all variables indexed with var1-var100
You could also create an intermediary dataset for export where you convert all the special missings to standard missings as done below:
data period_a;
missing X I;
input Id $4. Foodpr1 Foodpr2 Foodpr3 Coffeem1 Coffeem2;
datalines;
1001 115 45 65 I 78
1002 86 27 55 72 86
1004 93 52 X 76 88
1015 73 35 43 112 108
1027 101 127 39 76 79
;
run;
options missing=' ';
data ds_for_export;
set period_a;
array nums _numeric_;
do over nums;
if missing(nums) then call missing(nums);
end;
run;
options missing=' '; is used to then print/write the standard missings as blanks and not as dots.
Registration is now open for SAS Innovate 2025 , our biggest and most exciting global event of the year! Join us in Orlando, FL, May 6-9.
Sign up by Dec. 31 to get the 2024 rate of just $495.
Register now!
Learn the difference between classical and Bayesian statistical approaches and see a few PROC examples to perform Bayesian analysis in this video.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.