BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
mehul4frnds
Obsidian | Level 7

Hi all,

 

I have data with missing values. Is there anyway I can convert missing values ( . or " ") into '-9' regardless of the type of variable.

 

data have;
input id name$ gen$ PROCHI DATE TC HDL LDL Ttg;
datalines;
1   ram      male      1   1.1   8.0   3.0   1.45    1
2   mohan male      1   1.1   8.5   2.9   1.24     .
3   shyam   .           2   2.1   6.8   3.1    2.23    2 
4   radha   female   2   2.1   7.2   3.8    3.80    .
5   gita      female   2   2.1    .      2.7    3.45    3
6   sita       .            2   3.1    8.9  6.8    4.56    2
7   ranu     female  3    2.1   7.5   5.9    1.83   5
8   tina      female   3   2.1   3.6   2.8    1.23   8
9   shan    male      4   8.1   2.8   1.1    1.67   9 
;
run;

 

data I want like this 

 

1   ram      male      1   1.1   8.0   3.0   1.45    1
2   mohan male      1   1.1   8.5   2.9   1.24     -9
3   shyam   -9           2   2.1   6.8   3.1    2.23    2 
4   radha   female   2   2.1   7.2   3.8    3.80    -9
5   gita      female   2   2.1    -9      2.7    3.45    3
6   sita       -9           2   3.1    8.9  6.8    4.56    2
7   ranu     female  3    2.1   7.5   5.9    1.83   5
8   tina      female   3   2.1   3.6   2.8    1.23   8
9   shan    male      4   8.1   2.8   1.1    1.67   9 

 

I have tried a below code- 

data have;

set want ;

if name = " " then name = "-9";

if TC = . then TC = -9 ;

if Ttg = . then Ttg = -9;

run;

 

But if we have 100 variables, it's not possible to write all the variables in one command. Is there any other way?

 

Thanks in advance

 

 

1 ACCEPTED SOLUTION

Accepted Solutions
novinosrin
Tourmaline | Level 20

@Kurt_Bremser 's great intuition of variable with length 1 byte is something you need a new assignment aka copy the values to the new variable assigned with a higher length and the rest is a breeze

data have;
input id name$ gen$ PROCHI DATE TC HDL LDL Ttg;
datalines;
1   ram      male      1   1.1   8.0   3.0   1.45    1
2   mohan male      1   1.1   8.5   2.9   1.24     .
3   shyam   .           2   2.1   6.8   3.1    2.23    2 
4   radha   female   2   2.1   7.2   3.8    3.80    .
5   gita      female   2   2.1    .      2.7    3.45    3
6   sita       .            2   3.1    8.9  6.8    4.56    2
7   ranu     female  3    2.1   7.5   5.9    1.83   5
8   tina      female   3   2.1   3.6   2.8    1.23   8
9   shan    male      4   8.1   2.8   1.1    1.67   9 
;
run;


data want;
set have;
array c(*) _character_ ;
array num _numeric_;
do _n_=1 to dim(c);
if missing(c(_n_)) then c(_n_)=left(-9);
end;
do _n_=1 to dim(num);
if missing(num(_n_)) then num(_n_)=-9;
end;
run;

View solution in original post

12 REPLIES 12
PaigeMiller
Diamond | Level 26

I think ARRAYs would work.

 

But in general, the easiest way to handle missing values is to NOT convert them to –9. SAS knows what to do with missing values, it does not know what to do with –9.

--
Paige Miller
mehul4frnds
Obsidian | Level 7

Thanks for the reply.

Actually, I have to do a genetic analysis where protocol suggests that missing values should be converted into -9

RW9
Diamond | Level 26 RW9
Diamond | Level 26

I quite agree with @PaigeMiller, it makes no sense to convert missings to -9, let alone changing character string to -9.  I mean, really no sense at all for characters.  For numerics, its possible that the stato has planned that, however I would question it and get specific confirmation.  Now if it really is just numbers, then maybe:

options missing=-9;

Will work for you, if not:

data want;
  set have;
  array n{*} _numeric_;
  do i=1 to dim(n);
    if n=. then n=-9;
  end;
run;

As for character, its not worth providing code, is a more complex process for something which doesn't make any sense.

mehul4frnds
Obsidian | Level 7

Thanks for the reply.

 

Yes, first I have to change the format in to 2 characters then only i can add missing value data

Kurt_Bremser
Super User

@mehul4frnds wrote:

Thanks for the reply.

 

Yes, first I have to change the format in to 2 characters then only i can add missing value data


Changing the format won't do it. You need to create a new variable to replace the old one.

mehul4frnds
Obsidian | Level 7

Thanks. I agree with you that a new variable needs to be created. 

 

Kindly suggest about overall conversion 

novinosrin
Tourmaline | Level 20

@Kurt_Bremser 's great intuition of variable with length 1 byte is something you need a new assignment aka copy the values to the new variable assigned with a higher length and the rest is a breeze

data have;
input id name$ gen$ PROCHI DATE TC HDL LDL Ttg;
datalines;
1   ram      male      1   1.1   8.0   3.0   1.45    1
2   mohan male      1   1.1   8.5   2.9   1.24     .
3   shyam   .           2   2.1   6.8   3.1    2.23    2 
4   radha   female   2   2.1   7.2   3.8    3.80    .
5   gita      female   2   2.1    .      2.7    3.45    3
6   sita       .            2   3.1    8.9  6.8    4.56    2
7   ranu     female  3    2.1   7.5   5.9    1.83   5
8   tina      female   3   2.1   3.6   2.8    1.23   8
9   shan    male      4   8.1   2.8   1.1    1.67   9 
;
run;


data want;
set have;
array c(*) _character_ ;
array num _numeric_;
do _n_=1 to dim(c);
if missing(c(_n_)) then c(_n_)=left(-9);
end;
do _n_=1 to dim(num);
if missing(num(_n_)) then num(_n_)=-9;
end;
run;
mehul4frnds
Obsidian | Level 7

Thank you @novinosrin so much for the reply.

 

I used @Kurt_Bremser 's suggestion and your codes work very well.

 

Thanks again

 

novinosrin
Tourmaline | Level 20

@mehul4frnds You are welcome. I am glad!

ballardw
Super User

Another way is to read the data as desired in the first case:

proc format library=work;
invalue $nomiss
' ','.'='-9'
other=_same_;
invalue nomiss
' ','.'=-9
other=_same_;
run;

data have;
length name $ 10 gen $ 6;

informat id nomiss. name $nomiss. gen $nomiss. PROCHI DATE TC HDL LDL Ttg nomiss.;
input id name$ gen$ PROCHI DATE TC HDL LDL Ttg;
datalines;
1   ram   male    1   1.1   8.0   3.0   1.45   1
2   mohan male    1   1.1   8.5   2.9   1.24   .
3   shyam .       2   2.1   6.8   3.1   2.23   2 
4   radha female  2   2.1   7.2   3.8   3.80   .
5   gita  female  2   2.1   .     2.7   3.45   3
6   sita  .       2   3.1   8.9   6.8   4.56   2
7   ranu  female  3   2.1   7.5   5.9   1.83   5
8   tina  female  3   2.1   3.6   2.8   1.23   8
9   shan  male    4   8.1   2.8   1.1   1.67   9 
;
run;

This requires a little more from you to set the lengths of the string variables with either a Length or Attrib statement to prevent the character informat from setting the length to 2. I did not attempt to set a default length for the $nomiss informat as I have no idea what range of values you might be attempting to read.

 

sas-innovate-2024.png

Join us for SAS Innovate April 16-19 at the Aria in Las Vegas. Bring the team and save big with our group pricing for a limited time only.

Pre-conference courses and tutorials are filling up fast and are always a sellout. Register today to reserve your seat.

 

Register now!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 12 replies
  • 1463 views
  • 8 likes
  • 6 in conversation