- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
I have a question on how to work with variables with blank values. I recently imported in a dataset using the SAS import wizard. When I brought it in, I have a variable that has some missing values. When it brought the dataset in, insteading of assigning those values as a '.' they're blank instead. I'm trying to use that to exclude those cases, however, it isn't working since the values are blank. Any insight would be appreciated!
Accepted Solutions
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
When character strings are blank, refer to them as a blank within quotes:
if variable=' ' then delete;
Good luck.
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
When character strings are blank, refer to them as a blank within quotes:
if variable=' ' then delete;
Good luck.
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
Numeric variables denote missing as '.'.
Use can use the missing()/nmiss/cmiss functions to determine a missing value.
For example
if missing(var) then do ...;
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
I assume that it is a character variable, not a numeric variable. Indeed, missing values of character variables are stored as single blanks, not periods. To exclude observations with missing values for that variable you could use a WHERE or IF contition of the form varname ne ' ' or, more generally (suitable also for numeric variables), not missing(varname).
Example:
data have;
length c $8;
input c;
cards;
abc
.
def
; /* Please note that the period in the raw data indicates the missing value although C is character! */
data want;
set have;
where not missing(c);
run;
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
To check if a cell is empty, do not do
if var='';
use the function anyalnum :
if anyalnum(substr(strip(var),1,1)) = 0 then A="empty____"
if anyalnum(substr(strip(var),1,1)) ne 0 then A="not empty"
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
Tell us why you make that suggestion.
The hash OUTPUT method will overwrite a SAS data set, but not append. That can be costly. Consider voting for Add a HASH object method which would append a hash object to an existing SAS data set
Would enabling PROC SORT to simultaneously output multiple datasets be useful? Then vote for
Allow PROC SORT to output multiple datasets
--------------------------
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
I imported a file from Excel using the command
libname go pcfiles path = C:\temp\myfile.xlsx' textsize = 32767;
data nicefile ;
set go.'Sheet1$'n;
if _N_=1 then call execute ('proc sort data=nicefile; by var1; run;'); run;
libname go clear;
and I had a process based on missing values in a character variables.
Some cells looked empty but they where not, so my condition if var1='' was sometimes not valid.
Hence the use of the function
if anyalnum(substr(strip(var1),1,1)) ne 0
to make sure var1 starts with a letter or a number.
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
I think your advice is useful for eliminating all cells that don't contain numbers or letters, but I'm uncomfortable designating it as a way to see "if a cell is empty". I would not term a cell with non-printable characters as "empty", even if it is unwanted.
data _null_;
x='09'x;
put x= ; /* X looks like a true blank */
put x=hex2.; /* but printed in hex, we see it is not a hex 20 (e.g. a true blank) */
run;
The hash OUTPUT method will overwrite a SAS data set, but not append. That can be costly. Consider voting for Add a HASH object method which would append a hash object to an existing SAS data set
Would enabling PROC SORT to simultaneously output multiple datasets be useful? Then vote for
Allow PROC SORT to output multiple datasets
--------------------------