- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
Hi all,
I have data with missing values. Is there anyway I can convert missing values ( . or " ") into '-9' regardless of the type of variable.
data have;
input id name$ gen$ PROCHI DATE TC HDL LDL Ttg;
datalines;
1 ram male 1 1.1 8.0 3.0 1.45 1
2 mohan male 1 1.1 8.5 2.9 1.24 .
3 shyam . 2 2.1 6.8 3.1 2.23 2
4 radha female 2 2.1 7.2 3.8 3.80 .
5 gita female 2 2.1 . 2.7 3.45 3
6 sita . 2 3.1 8.9 6.8 4.56 2
7 ranu female 3 2.1 7.5 5.9 1.83 5
8 tina female 3 2.1 3.6 2.8 1.23 8
9 shan male 4 8.1 2.8 1.1 1.67 9
;
run;
data I want like this
1 ram male 1 1.1 8.0 3.0 1.45 1
2 mohan male 1 1.1 8.5 2.9 1.24 -9
3 shyam -9 2 2.1 6.8 3.1 2.23 2
4 radha female 2 2.1 7.2 3.8 3.80 -9
5 gita female 2 2.1 -9 2.7 3.45 3
6 sita -9 2 3.1 8.9 6.8 4.56 2
7 ranu female 3 2.1 7.5 5.9 1.83 5
8 tina female 3 2.1 3.6 2.8 1.23 8
9 shan male 4 8.1 2.8 1.1 1.67 9
I have tried a below code-
data have;
set want ;
if name = " " then name = "-9";
if TC = . then TC = -9 ;
if Ttg = . then Ttg = -9;
run;
But if we have 100 variables, it's not possible to write all the variables in one command. Is there any other way?
Thanks in advance
- Tags:
- missing values
Accepted Solutions
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
@Kurt_Bremser 's great intuition of variable with length 1 byte is something you need a new assignment aka copy the values to the new variable assigned with a higher length and the rest is a breeze
data have;
input id name$ gen$ PROCHI DATE TC HDL LDL Ttg;
datalines;
1 ram male 1 1.1 8.0 3.0 1.45 1
2 mohan male 1 1.1 8.5 2.9 1.24 .
3 shyam . 2 2.1 6.8 3.1 2.23 2
4 radha female 2 2.1 7.2 3.8 3.80 .
5 gita female 2 2.1 . 2.7 3.45 3
6 sita . 2 3.1 8.9 6.8 4.56 2
7 ranu female 3 2.1 7.5 5.9 1.83 5
8 tina female 3 2.1 3.6 2.8 1.23 8
9 shan male 4 8.1 2.8 1.1 1.67 9
;
run;
data want;
set have;
array c(*) _character_ ;
array num _numeric_;
do _n_=1 to dim(c);
if missing(c(_n_)) then c(_n_)=left(-9);
end;
do _n_=1 to dim(num);
if missing(num(_n_)) then num(_n_)=-9;
end;
run;
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
I think ARRAYs would work.
But in general, the easiest way to handle missing values is to NOT convert them to –9. SAS knows what to do with missing values, it does not know what to do with –9.
Paige Miller
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
Thanks for the reply.
Actually, I have to do a genetic analysis where protocol suggests that missing values should be converted into -9
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
I quite agree with @PaigeMiller, it makes no sense to convert missings to -9, let alone changing character string to -9. I mean, really no sense at all for characters. For numerics, its possible that the stato has planned that, however I would question it and get specific confirmation. Now if it really is just numbers, then maybe:
options missing=-9;
Will work for you, if not:
data want; set have; array n{*} _numeric_; do i=1 to dim(n); if n=. then n=-9; end; run;
As for character, its not worth providing code, is a more complex process for something which doesn't make any sense.
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
And what will you do with an empty character variable of length 1?
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
Thanks for the reply.
Yes, first I have to change the format in to 2 characters then only i can add missing value data
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
@mehul4frnds wrote:
Thanks for the reply.
Yes, first I have to change the format in to 2 characters then only i can add missing value data
Changing the format won't do it. You need to create a new variable to replace the old one.
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
Thanks. I agree with you that a new variable needs to be created.
Kindly suggest about overall conversion
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
Retrieve variable attributes from dictionary.columns or sashelp.vcolumn, and create a data step from that with call execute.
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
@Kurt_Bremser 's great intuition of variable with length 1 byte is something you need a new assignment aka copy the values to the new variable assigned with a higher length and the rest is a breeze
data have;
input id name$ gen$ PROCHI DATE TC HDL LDL Ttg;
datalines;
1 ram male 1 1.1 8.0 3.0 1.45 1
2 mohan male 1 1.1 8.5 2.9 1.24 .
3 shyam . 2 2.1 6.8 3.1 2.23 2
4 radha female 2 2.1 7.2 3.8 3.80 .
5 gita female 2 2.1 . 2.7 3.45 3
6 sita . 2 3.1 8.9 6.8 4.56 2
7 ranu female 3 2.1 7.5 5.9 1.83 5
8 tina female 3 2.1 3.6 2.8 1.23 8
9 shan male 4 8.1 2.8 1.1 1.67 9
;
run;
data want;
set have;
array c(*) _character_ ;
array num _numeric_;
do _n_=1 to dim(c);
if missing(c(_n_)) then c(_n_)=left(-9);
end;
do _n_=1 to dim(num);
if missing(num(_n_)) then num(_n_)=-9;
end;
run;
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
Thank you @novinosrin so much for the reply.
I used @Kurt_Bremser 's suggestion and your codes work very well.
Thanks again
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
@mehul4frnds You are welcome. I am glad!
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
Another way is to read the data as desired in the first case:
proc format library=work; invalue $nomiss ' ','.'='-9' other=_same_; invalue nomiss ' ','.'=-9 other=_same_; run; data have; length name $ 10 gen $ 6; informat id nomiss. name $nomiss. gen $nomiss. PROCHI DATE TC HDL LDL Ttg nomiss.; input id name$ gen$ PROCHI DATE TC HDL LDL Ttg; datalines; 1 ram male 1 1.1 8.0 3.0 1.45 1 2 mohan male 1 1.1 8.5 2.9 1.24 . 3 shyam . 2 2.1 6.8 3.1 2.23 2 4 radha female 2 2.1 7.2 3.8 3.80 . 5 gita female 2 2.1 . 2.7 3.45 3 6 sita . 2 3.1 8.9 6.8 4.56 2 7 ranu female 3 2.1 7.5 5.9 1.83 5 8 tina female 3 2.1 3.6 2.8 1.23 8 9 shan male 4 8.1 2.8 1.1 1.67 9 ; run;
This requires a little more from you to set the lengths of the string variables with either a Length or Attrib statement to prevent the character informat from setting the length to 2. I did not attempt to set a default length for the $nomiss informat as I have no idea what range of values you might be attempting to read.