BookmarkSubscribeRSS Feed
☑ This topic is solved. Need further help from the community? Please sign in and ask a new question.
nwang5
Obsidian | Level 7

Hi All,

 

I hope you are doing well. I was wondering how I could make "D" "P" "M" as "." Thanks!

 

Obs idauniq cesd1 cesd2 cesd3 cesd4 cesd5 cesd6 cesd7 cesd8 cesd9
1 100044 0 1 1 D          
2 100056 3 1 5            
3 100058 1 0 P   P P      
4 100075 0 0 1            
5 100080 1 0     1        
6 100104 5 0   P          
7 103713 4 4 4            
8 103730 3 1   0 P        
9 103735 1 0 0            
10 103739 0 4 M            
11 103743 1 2 P            
12 103762 6 3   P          
13 103765 0 0 0            
14 103766 4 2 1            
15 103788 0 0 1            
16 103798 1 1 0            
17 103814 5 2 3            
18 103816 0 1 0            
19 103820 0 1 1            
20 103832 1 1 P 0          
1 ACCEPTED SOLUTION

Accepted Solutions
Quentin
Super User

If they're numeric, then they're already missing.  They're just special missing.

 

If you want to change them all to non-special missing, you could do it like (untested);

 

data want;
  set have;
  array nums {*} _numeric_ ;
  do i=1 to dim(nums);
    if missing(nums{i}) then nums{i}=.;
  end;
  drop i;
run;

But note that you'll be losing information if you do this, as those missing value letters probably have different meanings.

BASUG is hosting free webinars Next up: Don Henderson presenting on using hash functions (not hash tables!) to segment data on June 12. Register now at the Boston Area SAS Users Group event page: https://www.basug.org/events.

View solution in original post

13 REPLIES 13
Reeza
Super User

This implies that you have character variables not numeric. If you convert to numeric, do you need to differentiate between D/P/M and the blanks? In that case, it may be best to use special missing characters. 

 

If it's numeric, do those D/P/M have a period in front of them?

nwang5
Obsidian | Level 7
It's numeric. Yes, I want to make D/P/M as blank or "."
Quentin
Super User

If they're numeric, then they're already missing.  They're just special missing.

 

If you want to change them all to non-special missing, you could do it like (untested);

 

data want;
  set have;
  array nums {*} _numeric_ ;
  do i=1 to dim(nums);
    if missing(nums{i}) then nums{i}=.;
  end;
  drop i;
run;

But note that you'll be losing information if you do this, as those missing value letters probably have different meanings.

BASUG is hosting free webinars Next up: Don Henderson presenting on using hash functions (not hash tables!) to segment data on June 12. Register now at the Boston Area SAS Users Group event page: https://www.basug.org/events.
nwang5
Obsidian | Level 7
Thank you so much! Really appreciated
Reeza
Super User

@nwang5 wrote:
It's numeric. Yes, I want to make D/P/M as blank or "."

Then this question is important to have answered:

Do you need to differentiate between D/P/M and the blanks? 

nwang5
Obsidian | Level 7
nope~
Reeza
Super User

I see you have an answer but I'll also add, this is just a visual difference. The values would not be included in calculations or affect any filtering using MISSING() function. 

 

 

nwang5
Obsidian | Level 7
Thank you so much for pointing it! I haven't noticed that. I was wondering how I could make D/N/P as missing or blank that can be included as missing value.
PaigeMiller
Diamond | Level 26

Normally, we work with SAS data sets, and whatever this is that you are showing us is not a SAS data set.

 

I'm guessing that ... no never mind ... I shouldn't guess. Please tell us if these variables are numeric or character in your SAS data set, according to PROC CONTENTS or similar.

--
Paige Miller
nwang5
Obsidian | Level 7
It's numeric
ballardw
Super User

@nwang5 wrote:
It's numeric

If your data set has those variables as NUMERIC and display values like D P or M then it is very likely that your value is already missing but has had the special missing assigned. While we typically only see the . for missing value SAS has an additional 27 special missing values that are assigned as a dot followed by a letter or the underscore character. This feature is often used to designate numeric values that are missing for a specific reason. If this is survey or question results I might think that the .D was assigned when a response was "Don't Know".

 

You can check my thought by running code like:

ods select nlevels;
proc freq data=yourdatasetname nlevels;
   tables cesd3; /* of if you want results for all the variables the cesd: <the colon is a wildcard for names starting with cesd*/
run;

The ODS select is to just show the nlevels table. The output table will show a summary of counts of levels and missing

An example: The data step shows one way to read data with special missing into numeric values. If you open the Have table you will see P M an D

data have;
  input x y;
datalines;
1 .M
.P 3
1  .P
1 .D
. .
;

ods select nlevels;
proc freq data=have nlevels;
   tables x y;
run;

The output:

Number of Variable Levels
Variable Levels Missing Levels Nonmissing Levels
x 3 2 1
y 5 4 1

Levels is the total number of different levels for the variable, Missing levels is number of different missing levels. We read x with . and .P so this reports 2 missing levels. We get 4 missing levels because y uses . .P .M and .D

 

These special missing values are treated as missing for calculations, reports or participation in models unless you use options to force the missing values into the analysis or report. They main difference is that appearance you see AND that you could assign a format so that Proc Print/Report/tabulate displays more descriptive text so there is meaning in the report. You can also test if that specific missing is present for a variable.

 

I typically use .D , Don't Know, .R, Refused to Answer, .N , Not Sure with survey data.

Other data you might suppress values because the sample size is unreliable/ too small or similar. The special missing allows that and then your report might show text like "Insufficient Sample". Many uses.

Tom
Super User Tom
Super User

Use the MISSING statement to tell SAS which single letter you want the INPUT statement to treat as indications of special missing values.

missing dmp;
data have;
   input idauniq (cesd1-cesd9) (2.);
   count=n(of cesd1-cesd9);
   mean=mean(of cesd1-cesd9);
cards;
100044 0 1 1 D          
100056 3 1 5            
100058 1 0 P   P P      
100075 0 0 1            
100080 1 0     1        
100104 5 0   P          
103713 4 4 4            
103730 3 1   0 P        
103735 1 0 0            
103739 0 4 M            
103743 1 2 P            
103762 6 3   P          
103765 0 0 0            
103766 4 2 1            
103788 0 0 1            
103798 1 1 0            
103814 5 2 3            
103816 0 1 0            
103820 0 1 1            
103832 1 1 P 0    
;

Result

Tom_0-1682746092708.png

 

 

nwang5
Obsidian | Level 7
Thank you so much! Have a nice weekend.

sas-innovate-2024.png

Available on demand!

Missed SAS Innovate Las Vegas? Watch all the action for free! View the keynotes, general sessions and 22 breakouts on demand.

 

Register now!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 13 replies
  • 1042 views
  • 11 likes
  • 6 in conversation