Efficient data cleaning and recoding

New Contributor
Posts: 2

I am new to SAS EG 7.1 and I want to recode the zero values and the character string "?" in columns X1 to X13 of my dataset to NA. I have 13 variables in total plus a response variable Y. I have imported the data and it has been converted to a SAS dataset. I have also succeeded in recoding the "0" and "1" in the response variable to "No" and "Yes" as I wanted. Please help! N.B. the data is an xlsx file. Thanks.

Super User
Posts: 22,844

Re: Efficient data cleaning and recoding

Posted in reply to ritaedeigba0
Why NA? If the variables are numeric, wouldn't a SAS missing value be more appropriate?
New Contributor
Posts: 2

Re: Efficient data cleaning and recoding

The original dataset contains 64 variables and 1 response variable, and
there are many missing values throughout the dataset. My
professor said it was better to code them as NA since there were a lot of them.
However, is there an alternative suggestion? I would like to learn it.
Super User
Posts: 22,844

Re: Efficient data cleaning and recoding

Posted in reply to ritaedeigba0

NA isn't typically used in SAS, so I would review how SAS stores and treats missing values and decide whether you want missing or some other value. A lot depends on what you're doing with the data down the line.

PROC Star
Posts: 1,263

Re: Efficient data cleaning and recoding

SAS has "special" missing values: a dot followed by a letter (.A through .Z) or an underscore (._).

So, while both . and .n would be treated as missing by computational procedures, you could tell them apart in a data step.
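
As a rough sketch of that idea (the dataset name demo, the flag variables, and the choice of .n for recoded zeros are assumptions for illustration only):

```sas
/* Sketch: recode zeros to the special missing value .n */
data demo;
  input x1;
  if x1 = 0 then x1 = .n;     /* zero becomes special missing .n        */
  flag_na  = (x1 = .n);       /* 1 only when x1 is the special .n       */
  flag_any = missing(x1);     /* 1 for ordinary . and for .n alike      */
  datalines;
0
.
5
;
run;

/* Both . and .n are counted as missing here */
proc means data=demo n nmiss mean;
  var x1;
run;
```

The flag variables show how a data step can still separate "recoded as not applicable" from "originally missing", while PROC MEANS leaves both out of the statistics.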

 

Tom

Frequent Contributor
Posts: 113

Re: Efficient data cleaning and recoding

Posted in reply to ritaedeigba0

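data test;
  /* Character array with initial values: x1='0', x2='?', x3='1' */
  array vars(3) $2 x1-x3 ("0" "?" "1");
  put x1= x2= x3=;   /* values before recoding */
  do i = 1 to dim(vars);
    /* Recode '0' and '?' to 'NA'; other values are left unchanged */
    if vars(i) in ('0', '?') then vars(i) = 'NA';
  end;
  put x1= x2= x3=;   /* after recoding: x1=NA x2=NA x3=1 */
  drop i;
run;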

Super User
Posts: 22,844

Re: Efficient data cleaning and recoding

Posted in reply to ShiroAmada

@ShiroAmada Some of those columns are numeric and some will likely be character. 
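
One possible way to cope with a mix of types, as a sketch only: build one array from the numeric variables and another from the character variables, and recode each with the matching kind of missing value. The dataset names have and want are hypothetical, and the rule used here (numeric zeros become ., character '0' and '?' become blank) is an assumption about what is wanted.

```sas
data want;
  set have;                        /* 'have' = the imported dataset (hypothetical name) */
  array nums(*)  _numeric_;        /* every numeric variable read from 'have'   */
  array chars(*) _character_;      /* every character variable read from 'have' */

  do i = 1 to dim(nums);
    if nums(i) = 0 then nums(i) = .;           /* numeric zero -> missing */
  end;

  do i = 1 to dim(chars);
    if strip(chars(i)) in ('0', '?') then chars(i) = ' ';   /* -> blank (character missing) */
  end;

  drop i;
run;
```

Note that _numeric_ and _character_ pick up every variable of that type, including the response Y, so replace them with explicit variable lists if only some columns should be recoded.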
