BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
Denali
Quartz | Level 8

I have a pretest survey and posttest survey with 4 categories and missing value. I would like to combine 'Definitely yes' and 'Probably yes' into "Yes", 'Probably not' and 'Definitely not' into "No" so each survey question will have "yes, no and missing", three categories in total. Since I have a list of 14 questions like this, I would like to develop a macro so I don't need to write 14 redundant codes like below:

 

data Smoke1;
set Smoke;

/*pretest question 1*/
if pre_B1_dan_c = 'Definitely yes' then pre_B1_dan_c_D = " Yes";
if pre_B1_dan_c = 'Probably yes' then pre_B1_dan_c_D = " Yes";
if pre_B1_dan_c = 'Probably not' then pre_B1_dan_c_D = " No";
if pre_B1_dan_c = 'Definitely not' then pre_B1_dan_c_D = " No";

/*posttest question 1*/

if po_B1_dan_c = 'Definitely yes' then po_B1_dan_c_D = " Yes";
if po_B1_dan_c = 'Probably yes' then po_B1_dan_c_D = " Yes";
if po_B1_dan_c = 'Probably not' then po_B1_dan_c_D = " No";
if po_B1_dan_c = 'Definitely not' then po_B1_dan_c_D = " No";

 

run;

 

Could anyone please help me with the macro? Thanks!

1 ACCEPTED SOLUTION

Accepted Solutions
gamotte
Rhodochrosite | Level 12

You can use a format :

 

proc format;
    value $yn
        "Definitely yes", "Probably yes"="Yes"
        "Definitely not", "Probably not"="No"
        other=" ";
run;

data have;
    length pre_B1_dan_c po_B1_dan_c $15.;
    pre_B1_dan_c="Definitely yes"; po_B1_dan_c="Probably not"; output;
    pre_B1_dan_c="Probably yes"; po_B1_dan_c="Definitely not"; output;
    pre_B1_dan_c=" "; po_B1_dan_c="Probably not"; output;
run;

data want;
    set have;
    pre_B1_dan_c=put(pre_B1_dan_c,$yn.);
	po_B1_dan_c=put(po_B1_dan_c,$yn.);
run;

View solution in original post

10 REPLIES 10
RW9
Diamond | Level 26 RW9
Diamond | Level 26

Use a format:

proc format;
  value $ynm
    "Definately yes","Probably yes"="Yes"
    other="Yes";
run;
data smoke1;
  set smoke;
  format pre_b1_dan_c po_b1_dan_c $ynm.;
run;

Note, not tested as you have not supplied any test data in the form of a datastep which we always ask for and is mentioned under the Post question button.

gamotte
Rhodochrosite | Level 12

You can use a format :

 

proc format;
    value $yn
        "Definitely yes", "Probably yes"="Yes"
        "Definitely not", "Probably not"="No"
        other=" ";
run;

data have;
    length pre_B1_dan_c po_B1_dan_c $15.;
    pre_B1_dan_c="Definitely yes"; po_B1_dan_c="Probably not"; output;
    pre_B1_dan_c="Probably yes"; po_B1_dan_c="Definitely not"; output;
    pre_B1_dan_c=" "; po_B1_dan_c="Probably not"; output;
run;

data want;
    set have;
    pre_B1_dan_c=put(pre_B1_dan_c,$yn.);
	po_B1_dan_c=put(po_B1_dan_c,$yn.);
run;

Denali
Quartz | Level 8

Thanks for your suggestions. I am attaching my dataset here. 

 

Below is the list of variables to be dichotimized:

pre_B1_dan_c po_B1_dan_c pre_B2_dan_ec po_B2_dan_ec pre_B3_harm_c po_B3_harm_c
pre_B4_harm_ec po_B4_harm_ec pre_B5_quit_ec po_B5_quit_ec pre_B6_addict_c po_B6_addict_c pre_B7_addict_ec po_B7_addict_ec pre_B8_media po_B8_media

 

I was testing the proc format code that yousuggested, but I would like to keep the original variable "pre_B1_dan_c" and create a new variable  "pre_B1_dan_c_D" with dichotimize categories. This way I will have both raw information and combined categories in the dataset.

 

There are 3 pairs of questions ( pre_B6_addict_c, po_B6_addict_c, pre_B7_addict_ec, po_B7_addict_ec, pre_B8_media and po_B8_media) have different categories to dichotomize, please see format $ynB.

 

proc format;


value $ynA
"Definitely yes", "Probably yes"="Yes"
"Definitely not", "Probably not"="No"
other=" ";

 

value $ynB
"Very likely", "Somewhat likely" = "Yes"
"Neither likely or unlikely", 'Somewhat unlikely', 'Very unlikely' = "No";
run;

 

 

Could you please help me with refining my code? Thank you!

Denali
Quartz | Level 8

Right now my code is as below, but it is very long if I include the all the variables.....

 

data Smoke1;
set Smoke;

 

if pre_B1_dan_c = 'Definitely yes' then pre_B1_dan_c_D = " Yes";
if pre_B1_dan_c = 'Probably yes' then pre_B1_dan_c_D = " Yes";
if pre_B1_dan_c = 'Probably not' then pre_B1_dan_c_D = " No";
if pre_B1_dan_c = 'Definitely not' then pre_B1_dan_c_D = " No";

 

if po_B1_dan_c = 'Definitely yes' then po_B1_dan_c_D = " Yes";
if po_B1_dan_c = 'Probably yes' then po_B1_dan_c_D = " Yes";
if po_B1_dan_c = 'Probably not' then po_B1_dan_c_D = " No";
if po_B1_dan_c = 'Definitely not' then po_B1_dan_c_D = " No";

 

if pre_B2_dan_ec = 'Definitely yes' then pre_B2_dan_ec_D = " Yes";
if pre_B2_dan_ec = 'Probably yes' then pre_B2_dan_ec_D = " Yes";
if pre_B2_dan_ec = 'Probably not' then pre_B2_dan_ec_D = " No";
if pre_B2_dan_ec = 'Definitely not' then pre_B2_dan_ec_D = " No";

 

if po_B2_dan_ec = 'Definitely yes' then po_B2_dan_ec_D = " Yes";
if po_B2_dan_ec = 'Probably yes' then po_B2_dan_ec_D = " Yes";
if po_B2_dan_ec = 'Probably not' then po_B2_dan_ec_D = " No";
if po_B2_dan_ec = 'Definitely not' then po_B2_dan_ec_D = " No";

 

if pre_B3_harm_c = 'Definitely yes' then pre_B3_harm_c_D = " Yes";
if pre_B3_harm_c = 'Probably yes' then pre_B3_harm_c_D = " Yes";
if pre_B3_harm_c = 'Probably not' then pre_B3_harm_c_D = " No";
if pre_B3_harm_c = 'Definitely not' then pre_B3_harm_c_D = " No";

 

if po_B3_harm_c = 'Definitely yes' then po_B3_harm_c_D = " Yes";
if po_B3_harm_c = 'Probably yes' then po_B3_harm_c_D = " Yes";
if po_B3_harm_c = 'Probably not' then po_B3_harm_c_D = " No";
if po_B3_harm_c = 'Definitely not' then po_B3_harm_c_D = " No";

 

run;

 

 

Reeza
Super User
Did you look at using arrays as mentioned? The tutorial is pretty straightforward.
gamotte
Rhodochrosite | Level 12
The use of a format allows to avoid all those ifs. By using an array as suggested by@Reeza, you can perform the same treatments for all your variables.
Denali
Quartz | Level 8

I am not familiar with the array code...... right now I am still using less simplified code below and got what I want. But I was hoping to refine it using an easier way........

 

data Smoke1;
set Smoke;

if pre_B1_dan_c = 'Definitely yes' then pre_B1_dan_c_D = " Yes";
if pre_B1_dan_c = 'Probably yes' then pre_B1_dan_c_D = " Yes";
if pre_B1_dan_c = 'Probably not' then pre_B1_dan_c_D = "No";
if pre_B1_dan_c = 'Definitely not' then pre_B1_dan_c_D = "No";

if po_B1_dan_c = 'Definitely yes' then po_B1_dan_c_D = " Yes";
if po_B1_dan_c = 'Probably yes' then po_B1_dan_c_D = " Yes";
if po_B1_dan_c = 'Probably not' then po_B1_dan_c_D = "No";
if po_B1_dan_c = 'Definitely not' then po_B1_dan_c_D = "No";

if pre_B2_dan_ec = 'Definitely yes' then pre_B2_dan_ec_D = " Yes";
if pre_B2_dan_ec = 'Probably yes' then pre_B2_dan_ec_D = " Yes";
if pre_B2_dan_ec = 'Probably not' then pre_B2_dan_ec_D = "No";
if pre_B2_dan_ec = 'Definitely not' then pre_B2_dan_ec_D = "No";

if po_B2_dan_ec = 'Definitely yes' then po_B2_dan_ec_D = " Yes";
if po_B2_dan_ec = 'Probably yes' then po_B2_dan_ec_D = " Yes";
if po_B2_dan_ec = 'Probably not' then po_B2_dan_ec_D = "No";
if po_B2_dan_ec = 'Definitely not' then po_B2_dan_ec_D = "No";

if pre_B3_harm_c = 'Definitely yes' then pre_B3_harm_c_D = " Yes";
if pre_B3_harm_c = 'Probably yes' then pre_B3_harm_c_D = " Yes";
if pre_B3_harm_c = 'Probably not' then pre_B3_harm_c_D = "No";
if pre_B3_harm_c = 'Definitely not' then pre_B3_harm_c_D = "No";

if po_B3_harm_c = 'Definitely yes' then po_B3_harm_c_D = " Yes";
if po_B3_harm_c = 'Probably yes' then po_B3_harm_c_D = " Yes";
if po_B3_harm_c = 'Probably not' then po_B3_harm_c_D = "No";
if po_B3_harm_c = 'Definitely not' then po_B3_harm_c_D = "No";

if pre_B4_harm_ec = 'Definitely yes' then pre_B4_harm_ec_D = " Yes";
if pre_B4_harm_ec = 'Probably yes' then pre_B4_harm_ec_D = " Yes";
if pre_B4_harm_ec = 'Probably not' then pre_B4_harm_ec_D = "No";
if pre_B4_harm_ec = 'Definitely not' then pre_B4_harm_ec_D = "No";

if po_B4_harm_ec = 'Definitely yes' then po_B4_harm_ec_D = " Yes";
if po_B4_harm_ec = 'Probably yes' then po_B4_harm_ec_D = " Yes";
if po_B4_harm_ec = 'Probably not' then po_B4_harm_ec_D = "No";
if po_B4_harm_ec = 'Definitely not' then po_B4_harm_ec_D = "No";

if pre_B5_quit_ec = 'Definitely yes' then pre_B5_quit_ec_D = " Yes";
if pre_B5_quit_ec = 'Probably yes' then pre_B5_quit_ec_D = " Yes";
if pre_B5_quit_ec = 'Probably not' then pre_B5_quit_ec_D = "No";
if pre_B5_quit_ec = 'Definitely not' then pre_B5_quit_ec_D = "No";

if po_B5_quit_ec = 'Definitely yes' then po_B5_quit_ec_D = " Yes";
if po_B5_quit_ec = 'Probably yes' then po_B5_quit_ec_D = " Yes";
if po_B5_quit_ec = 'Probably not' then po_B5_quit_ec_D = "No";
if po_B5_quit_ec = 'Definitely not' then po_B5_quit_ec_D = "No";

if pre_B6_addict_c = 'Very likely' then pre_B6_addict_c_D = " Yes";
if pre_B6_addict_c = 'Somewhat likely' then pre_B6_addict_c_D = " Yes";
if pre_B6_addict_c = 'Neither likely or unlikely' then pre_B6_addict_c_D = "No";
if pre_B6_addict_c = 'Somewhat unlikely' then pre_B6_addict_c_D = "No";
if pre_B6_addict_c = 'Very unlikely' then pre_B6_addict_c_D = "No";

if po_B6_addict_c = 'Very likely' then po_B6_addict_c_D = " Yes";
if po_B6_addict_c = 'Somewhat likely' then po_B6_addict_c_D = " Yes";
if po_B6_addict_c = 'Neither likely or unlikely' then po_B6_addict_c_D = "No";
if po_B6_addict_c = 'Somewhat unlikely' then po_B6_addict_c_D = "No";
if po_B6_addict_c = 'Very unlikely' then po_B6_addict_c_D = "No";

if pre_B7_addict_ec = 'Very likely' then pre_B7_addict_ec_D = " Yes";
if pre_B7_addict_ec = 'Somewhat likely' then pre_B7_addict_ec_D = " Yes";
if pre_B7_addict_ec = 'Neither likely or unlikely' then pre_B7_addict_ec_D = "No";
if pre_B7_addict_ec = 'Somewhat unlikely' then pre_B7_addict_ec_D = "No";
if pre_B7_addict_ec = 'Very unlikely' then pre_B7_addict_ec_D = "No";

if po_B7_addict_ec = 'Very likely' then po_B7_addict_ec_D = " Yes";
if po_B7_addict_ec = 'Somewhat likely' then po_B7_addict_ec_D = " Yes";
if po_B7_addict_ec = 'Neither likely or unlikely' then po_B7_addict_ec_D = "No";
if po_B7_addict_ec = 'Somewhat unlikely' then po_B7_addict_ec_D = "No";
if po_B7_addict_ec = 'Very unlikely' then po_B7_addict_ec_D = "No";

if pre_B8_media = 'Very likely' then pre_B8_media_D = " Yes";
if pre_B8_media = 'Somewhat likely' then pre_B8_media_D = " Yes";
if pre_B8_media = 'Neither likely or unlikely' then pre_B8_media_D = "No";
if pre_B8_media = 'Somewhat unlikely' then pre_B8_media_D = "No";
if pre_B8_media = 'Very unlikely' then pre_B8_media_D = "No";

if po_B8_media = 'Very likely' then po_B8_media_D = " Yes";
if po_B8_media = 'Somewhat likely' then po_B8_media_D = " Yes";
if po_B8_media = 'Neither likely or unlikely' then po_B8_media_D = "No";
if po_B8_media = 'Somewhat unlikely' then po_B8_media_D = "No";
if po_B8_media = 'Very unlikely' then po_B8_media_D = "No";

run;

Reeza
Super User

The first & third examples in the link exactly your situation, recoding variables. Is there something there that's not making sense? Did you have any issues running the example code?

 


@Denali wrote:

I am not familiar with the array code...... right now I am still using less simplified code below and got what I want. But I was hoping to refine it using an easier way........

 

gamotte
Rhodochrosite | Level 12

Here is a solution using sashelp.vcolumn to determine the columns to transform  :

 

proc format;
    value $yn
        "Definitely yes", "Probably yes"="Yes"
        "Definitely not", "Probably not"="No"
        other=" ";
run;

data have;
    length pre_B1_dan_c po_B1_dan_c $15.;
    pre_B1_dan_c="Definitely yes"; po_B1_dan_c="Probably not"; output;
    pre_B1_dan_c="Probably yes"; po_B1_dan_c="Definitely not"; output;
    pre_B1_dan_c=" "; po_B1_dan_c="Probably not"; output;
run;

data _NULL_;
    call execute("data want; set have;");
    
    do until(eof);
        set sashelp.vcolumn end=eof;
        where LIBNAME="WORK" and MEMNAME="HAVE" 
          and (upcase(NAME)=:"PRE_B" or upcase(NAME)=:"PO_B");

        call execute('length '||cats(NAME,'_D')||' $3;');
        call execute(cats(NAME, '_D=put(',NAME,',$yn.);'));
    end;
    call execute('run;');

    stop;
run;

SAS Innovate 2025: Save the Date

 SAS Innovate 2025 is scheduled for May 6-9 in Orlando, FL. Sign up to be first to learn about the agenda and registration!

Save the date!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

SAS Training: Just a Click Away

 Ready to level-up your skills? Choose your own adventure.

Browse our catalog!

Discussion stats
  • 10 replies
  • 2034 views
  • 1 like
  • 4 in conversation