keeping some values of the observations unchanged

Regular Contributor
Posts: 161

Hello dear SAS experts,

I have a large data set that needs to be recoded to binary. Each item obviously has a different correct answer, but all items use the same values for OMITTED, NOT ADMINISTERED, and NOT REACHED.

Is there a way to recode without repeating the same code over and over for each item?

Thank you

R

Trusted Advisor
Posts: 2,115

Binary? That means just two values. A true binary variable cannot even represent NULL or missing.

Please provide an example with a couple of variables and their before and after values.

Respected Advisor
Posts: 3,156

Since OP has not answered yet, my wild guess would be:

data _null_;
  a='OMITTED';
  put a $binary.;
run;

Regards,

Haikuo

Regular Contributor
Posts: 161

These are binary items, with two special codes:

  • missing: 99 or 9, both recoded to 9
  • not administered: 98 or 8, both recoded to 8

Right now this is what I am doing, which gets very long:

IF M022043=4 THEN M022043=1;
else if M022043=8 then M022043=8;
else if M022043=98 then M022043=8;
else if M022043=99 then M022043=9;
else if M022043=9 then M022043=9;
ELSE M022043=0;

IF M022046=10 or  M022046=19 THEN M022046=1;
else if M022046=8 then M022046=8;
else if M022046=98 then M022046=8;
else if M022046=99 then M022046=9;
else if M022046=9 then M022046=9;
ELSE M022046=0;

Below is an example of before and after.

       before             after
  M0   M1   M02      M0   M1   M02
   1   10     2       0    1     0
   4   99     4       1    9     1
   5   10     4       0    1     1
   4   10     2       1    1     0
   4   10     3       1    1     0
   1   99     4       0    9     1
   4   10     4       1    1     1
   1   10     4       0    1     1
   4   71     4       1    0     1
   1   10     4       0    1     1
   4   10     2       1    1     0
   5   10     4       0    1     1
   4   10     2       1    1     0
   1   10     1       0    1     0
   4   10     4       1    1     1

Thank you

R

Respected Advisor
Posts: 4,173

Something along the lines of the code example below should do:

proc format;
  value M0Recode
    4    =1
    8,98 =8
    9,99 =9
    otherwise=0
    ;
  value M1Recode
    10,19 =1
    8,98  =8
    9,99  =9
    otherwise=0
    ;
run;

data have;
  infile datalines dsd;
  input M0 M1 M02;
  M0_recoded=input(put(M0,M0Recode.),8.);
  M1_recoded=input(put(M1,M1Recode.),8.);
  M02_recoded=input(put(M02,M0Recode.),8.);
datalines;
1,10,2
4,99,4
5,10,4
4,10,2
4,10,3
1,99,4
4,10,4
1,10,4
4,71,4
1,10,4
4,10,2
5,10,4
4,10,2
1,10,1
4,10,4
;
run;

proc print data=have;
run;
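
Since the original question was how to avoid repeating the same code for every item, here is a minimal sketch of how the format approach could be combined with an array, so that all items sharing the same answer key are recoded in one loop. The format name key4fmt, the array name key4, the data set names, and the three sample rows are assumptions made up for illustration; the format uses OTHER as the catch-all keyword and quoted labels.

```sas
/* Hypothetical format for items whose correct answer is 4. */
proc format;
  value key4fmt
    4     = '1'
    8, 98 = '8'
    9, 99 = '9'
    other = '0'
    ;
run;

/* Small made-up sample; only M0 and M02 share the answer key 4. */
data have;
  input M0 M02;
datalines;
1 2
4 4
98 99
;
run;

data want;
  set have;
  array key4 {*} M0 M02;          /* list every item scored with key 4 */
  do i = 1 to dim(key4);
    /* put() returns text, input() turns it back into a number */
    key4{i} = input(put(key4{i}, key4fmt.), 8.);
  end;
  drop i;
run;

proc print data=want;
run;
```

With this layout you would need one format and one array per distinct answer key rather than one IF/ELSE chain per item. Note that OTHER also catches SAS missing values, so those would be recoded to 0 here, the same as with the final ELSE in the IF/ELSE version.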

Regular Contributor
Posts: 161

Hello Patrick,

I understand the first part (shown below) but don't quite get the rest. Am I creating a new format and then putting it in a permanent data set? If yes, how does the rest of the code do that? Also, it seems like I have to use OTHER rather than OTHERWISE, is that correct?

Thank you

proc format;
  value M0Recode
    4    =1
    8,98 =8
    9,99 =9
    otherwise=0
    ;
  value M1Recode
    10,19 =1
    8,98  =8
    9,99  =9
    otherwise=0
    ;
run;

Super User
Posts: 11,343

Yes, you are creating a new format. No, it is not put in any permanent data set.

The example that Patrick gave stores the value 1, 8, 9 or 0 in the target recoded variable. If you have other variables to recode, with different values that should map to 1, 8, 9 and 0, follow the pattern in the examples.

Yes, it should be OTHER instead of OTHERWISE in the VALUE statements.
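
For reference, the two formats with that correction applied might look like the sketch below. Quoting the labels is a style choice added here, not something the earlier code did; everything else follows the values already given in the thread.

```sas
proc format;
  value M0Recode
    4     = '1'
    8, 98 = '8'
    9, 99 = '9'
    other = '0'      /* OTHER, not OTHERWISE, is the catch-all keyword */
    ;
  value M1Recode
    10, 19 = '1'
    8, 98  = '8'
    9, 99  = '9'
    other  = '0'
    ;
run;
```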

Regular Contributor
Posts: 161

thank you

Regular Contributor
Posts: 161

Hi,

I have another question.

When I apply the format to the variables, it looks like they are stored as character variables, since they are all left-justified. I was wondering why that is the case.

Thank you

R

Respected Advisor
Posts: 4,173

M1_recoded=input(put(M1,M1Recode.),8.);

1. proc format;

    value M0Recode

    ....

Creates a numeric format and stores it in a format catalog in the WORK library (WORK.FORMATS, the default behaviour).

For permanent storage of formats use: proc format lib=<permanent library>...

Then add the catalog to the format searchpath (best in autoexec or configuration file) options FMTSEARCH=(catalog-specification-1... catalog-specification-n)

2. put(M1,M1Recode.)

Applies a numeric format to a numeric variable. The PUT function always returns a character string.

3. input(<return string from put()>,8.);

Reads a string, uses informat 8. to interpret it, and returns a numeric value.

4. M1_recoded=

The result of the INPUT function (a numeric value) gets assigned to M1_recoded. Because input() returns a numeric value, M1_recoded is created as a numeric variable.

Running PROC CONTENTS on your data set shows how the variables are defined:

proc contents data=work.have;
run;
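
If you want to see the two steps separately, a small sketch like the following writes the intermediate and final types to the log. The variable names as_text, as_num, type_text and type_num are made up for this illustration; VTYPE is a standard DATA step function that returns C for character and N for numeric variables.

```sas
proc format;
  value M1Recode
    10, 19 = '1'
    8, 98  = '8'
    9, 99  = '9'
    other  = '0'
    ;
run;

data _null_;
  M1 = 19;
  as_text = put(M1, M1Recode.);    /* step 2: put() gives a character value */
  as_num  = input(as_text, 8.);    /* step 3: input() gives a numeric value */
  type_text = vtype(as_text);      /* C */
  type_num  = vtype(as_num);       /* N */
  put as_text= type_text= as_num= type_num=;
run;
```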

HTH

Patrick

Regular Contributor
Posts: 161

This is great. But would you please check my code for accuracy as well?

Will these variables still be numeric?

Thank you

proc format LIBRARY=LIB;
value R4C
  1 =1
  8,98 =8
  9,99 =9
  other=0
;
value R2C
  2=1
  8,98 =8
  9,99 =9
  other=0
;
RUN;

LIBNAME LIB 'C:\Users\Roofia\Desktop\GA\DATA\SAS\sascontrol';

OPTIONS FMTSEARCH=(LIB);
RUN;

dATA LIB.SUBSET_BK5_6try;
  SET LIB.GRADE8A ;
  format M022043 R4C. M022049 R2C.;
run;

Respected Advisor
Posts: 4,173

-  The LIBNAME statement has to come before the code that uses the libref. I would also pick some other name for the libref than LIB (which is a keyword).

LIBNAME LIB 'c:\temp';

proc format LIBRARY=LIB;
value R4C
1 =1
8,98 =8
9,99 =9
other=0
;
value R2C
2=1
8,98 =8
9,99 =9
other=0
;
RUN;

-  You don't need a 'run;' statement after a global statement such as OPTIONS. Use the APPEND= system option so that you add to the existing FMTSEARCH list instead of overwriting it.

OPTIONS APPEND=(FMTSEARCH=LIB) ;

-  Assigning a format to a variable doesn't change the value of the variable (no recoding). It only changes how the variable values will be printed. As a format statement doesn't create new variables, the variables will still be of the same type (numeric in your case). The format statement just changes the format attribute of these variables.

dATA LIB.SUBSET_BK5_6try;
  SET LIB.GRADE8A ;
  format M022043 R4C. M022049 R2C.;
run;
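
To make the difference between displaying and recoding concrete, here is a small sketch with made-up data. The data set demo and the variables item, displayed and recoded are hypothetical names. The variable displayed only carries the format, while recoded holds an actual new value.

```sas
proc format;
  value R4C
    1     = '1'
    8, 98 = '8'
    9, 99 = '9'
    other = '0'
    ;
run;

data demo;
  input item;
  displayed = item;                       /* same stored value as item   */
  format displayed R4C.;                  /* only changes how it prints  */
  recoded = input(put(item, R4C.), 8.);   /* stored value really changes */
datalines;
1
98
3
;
run;

/* Both columns look alike when printed ... */
proc print data=demo;
run;

/* ... but the sums differ, because statistics use the stored values. */
proc means data=demo sum;
  var displayed recoded;
run;
```

In PROC PRINT both columns show 1, 8 and 0. In PROC MEANS the sum of displayed is 102 (1 + 98 + 3), while the sum of recoded is 9 (1 + 8 + 0). So if later analysis needs the 0/1/8/9 values, the input(put()) assignment is needed, not just the FORMAT statement.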
