Hi All,
I hope you are doing well. I was wondering how I could make "D" "P" "M" as "." Thanks!
Obs | idauniq | cesd1 | cesd2 | cesd3 | cesd4 | cesd5 | cesd6 | cesd7 | cesd8 | cesd9 |
1 | 100044 | 0 | 1 | 1 | D | |||||
2 | 100056 | 3 | 1 | 5 | ||||||
3 | 100058 | 1 | 0 | P | P | P | ||||
4 | 100075 | 0 | 0 | 1 | ||||||
5 | 100080 | 1 | 0 | 1 | ||||||
6 | 100104 | 5 | 0 | P | ||||||
7 | 103713 | 4 | 4 | 4 | ||||||
8 | 103730 | 3 | 1 | 0 | P | |||||
9 | 103735 | 1 | 0 | 0 | ||||||
10 | 103739 | 0 | 4 | M | ||||||
11 | 103743 | 1 | 2 | P | ||||||
12 | 103762 | 6 | 3 | P | ||||||
13 | 103765 | 0 | 0 | 0 | ||||||
14 | 103766 | 4 | 2 | 1 | ||||||
15 | 103788 | 0 | 0 | 1 | ||||||
16 | 103798 | 1 | 1 | 0 | ||||||
17 | 103814 | 5 | 2 | 3 | ||||||
18 | 103816 | 0 | 1 | 0 | ||||||
19 | 103820 | 0 | 1 | 1 | ||||||
20 | 103832 | 1 | 1 | P | 0 |
If they're numeric, then they're already missing. They're just special missing.
If you want to change them all to non-special missing, you could do it like (untested);
data want;
set have;
array nums {*} _numeric_ ;
do i=1 to dim(nums);
if missing(nums{i}) then nums{i}=.;
end;
drop i;
run;
But note that you'll be losing information if you do this, as those missing value letters probably have different meanings.
This implies that you have character variables not numeric. If you convert to numeric, do you need to differentiate between D/P/M and the blanks? In that case, it may be best to use special missing characters.
If it's numeric, do those D/P/M have a period in front of them?
If they're numeric, then they're already missing. They're just special missing.
If you want to change them all to non-special missing, you could do it like (untested);
data want;
set have;
array nums {*} _numeric_ ;
do i=1 to dim(nums);
if missing(nums{i}) then nums{i}=.;
end;
drop i;
run;
But note that you'll be losing information if you do this, as those missing value letters probably have different meanings.
@nwang5 wrote:
It's numeric. Yes, I want to make D/P/M as blank or "."
Then this question is important to have answered:
Do you need to differentiate between D/P/M and the blanks?
I see you have an answer but I'll also add, this is just a visual difference. The values would not be included in calculations or affect any filtering using MISSING() function.
Normally, we work with SAS data sets, and whatever this is that you are showing us is not a SAS data set.
I'm guessing that ... no never mind ... I shouldn't guess. Please tell us if these variables are numeric or character in your SAS data set, according to PROC CONTENTS or similar.
@nwang5 wrote:
It's numeric
If your data set has those variables as NUMERIC and display values like D P or M then it is very likely that your value is already missing but has had the special missing assigned. While we typically only see the . for missing value SAS has an additional 27 special missing values that are assigned as a dot followed by a letter or the underscore character. This feature is often used to designate numeric values that are missing for a specific reason. If this is survey or question results I might think that the .D was assigned when a response was "Don't Know".
You can check my thought by running code like:
ods select nlevels; proc freq data=yourdatasetname nlevels; tables cesd3; /* of if you want results for all the variables the cesd: <the colon is a wildcard for names starting with cesd*/ run;
The ODS select is to just show the nlevels table. The output table will show a summary of counts of levels and missing
An example: The data step shows one way to read data with special missing into numeric values. If you open the Have table you will see P M an D
data have; input x y; datalines; 1 .M .P 3 1 .P 1 .D . . ; ods select nlevels; proc freq data=have nlevels; tables x y; run;
The output:
Number of Variable Levels | |||
---|---|---|---|
Variable | Levels | Missing Levels | Nonmissing Levels |
x | 3 | 2 | 1 |
y | 5 | 4 | 1 |
Levels is the total number of different levels for the variable, Missing levels is number of different missing levels. We read x with . and .P so this reports 2 missing levels. We get 4 missing levels because y uses . .P .M and .D
These special missing values are treated as missing for calculations, reports or participation in models unless you use options to force the missing values into the analysis or report. They main difference is that appearance you see AND that you could assign a format so that Proc Print/Report/tabulate displays more descriptive text so there is meaning in the report. You can also test if that specific missing is present for a variable.
I typically use .D , Don't Know, .R, Refused to Answer, .N , Not Sure with survey data.
Other data you might suppress values because the sample size is unreliable/ too small or similar. The special missing allows that and then your report might show text like "Insufficient Sample". Many uses.
Use the MISSING statement to tell SAS which single letter you want the INPUT statement to treat as indications of special missing values.
missing dmp;
data have;
input idauniq (cesd1-cesd9) (2.);
count=n(of cesd1-cesd9);
mean=mean(of cesd1-cesd9);
cards;
100044 0 1 1 D
100056 3 1 5
100058 1 0 P P P
100075 0 0 1
100080 1 0 1
100104 5 0 P
103713 4 4 4
103730 3 1 0 P
103735 1 0 0
103739 0 4 M
103743 1 2 P
103762 6 3 P
103765 0 0 0
103766 4 2 1
103788 0 0 1
103798 1 1 0
103814 5 2 3
103816 0 1 0
103820 0 1 1
103832 1 1 P 0
;
Result
Are you ready for the spotlight? We're accepting content ideas for SAS Innovate 2025 to be held May 6-9 in Orlando, FL. The call is open until September 25. Read more here about why you should contribute and what is in it for you!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.