Hello,
I currently have a dataset with 3 variables I need to merge, "person" "tele" and "phone". each variable has a yes/no response. I want to create one variable "visit_type" that codes "yes" for person as "1", "yes" for "tele" as "2" and "yes" for phone as "3" while dropping all of the "no"s. Can anyone help with this? Thanks.
Are your values in data set named DATASET lower case 'yes', or are some of the letters upper case?
What if all three are YES then what do you want to do?
Please provide example data. Provide input example and what output you want for that input.
All three categories are mutually exclusive, so if the visit was in person it is not tele or phone.
For example:
Input data:
person tele phone
yes no no
no yes no
no no yes
Output:
Visit_Type
1
2
3
Straightforward coding and readable:
data want;
    set have;
    /* Person, Tele and Phone are mutually exclusive */
    if person='yes' then visit_type=1;
    else if tele='yes' then visit_type=2;
    else if phone='yes' then visit_type=3;
run;Hi Paige,
I tried running that code and this is what the output looked like when I attempted to run a chi squared analysis using the new variable:
data dataseta;
set dataset;
/* Person, Tele and Phone are mutually exclusive */
if consentperson='yes' then visit_typeconsent=1;
else if consenttele='yes' then visit_typeconsent=2;
else if consentphone='yes' then visit_typeconsent=3;
run;
proc freq data=dataseta;
TABLES visit_typeconsent*consentdeviation/chisq expected nopercent;
run;
proc freq data=dataseta;
tables visit_typeconsent*consent_complete/chisq expected nopercent;
run;
| reach | 
| For visit_typeconsent * consentdeviation all data are missing because all levels of variable visit_typeconsent are missing. | 
| reach | 
| For visit_typeconsent * consent_complete all data are missing because all levels of variable visit_typeconsent are missing. | 
Are there errors or warnings in the LOG? Please show us the complete LOG, with nothing chopped out, for this data step, pasting it as text into the window that appears when you click on the </> icon. DO NOT SKIP THIS STEP.
Are your values lower case 'yes', or are some of the letters upper case?
Alternative, less coding but perhaps not as logically intuitive if you look at it
data want;
    set have;
    /* Person, Tele and Phone are mutually exclusive */
    visit_type=whichc('yes',person,tele,phone);
run;
Here is the log:
4020 data dataseta;
4021 set dataset;
4022 /* Person, Tele and Phone are mutually exclusive */
4023 if consentperson='yes' then visit_type=1;
4024 else if consenttele='yes' then visit_type=2;
4025 else if consentphone='yes' then visit_type=3;
4026 run;
NOTE: There were 166 observations read from the data set WORK.DATASET.
NOTE: The data set WORK.DATASETA has 166 observations and 254 variables.
NOTE: DATA statement used (Total process time):
real time 0.03 seconds
cpu time 0.00 seconds
4027
4028
4029 proc freq data=dataseta;
4030 TABLES consentperson*consentdeviation/chisq expected nopercent;
4031 run;
NOTE: There were 166 observations read from the data set WORK.DATASETA.
NOTE: PROCEDURE FREQ used (Total process time):
real time 0.14 seconds
cpu time 0.09 seconds
4032
4033 proc freq data=dataseta;
4034 tables consentperson*consent_complete/chisq expected nopercent;
4035 run;
NOTE: There were 166 observations read from the data set WORK.DATASETA.
NOTE: PROCEDURE FREQ used (Total process time):
real time 0.13 seconds
cpu time 0.03 seconds
4036 data dataseta;
4037 set dataset;
4038 /* Person, Tele and Phone are mutually exclusive */
4039 if consentperson='yes' then visit_typeconsent=1;
4040 else if consenttele='yes' then visit_typeconsent=2;
4041 else if consentphone='yes' then visit_typeconsent=3;
4042 run;
NOTE: There were 166 observations read from the data set WORK.DATASET.
NOTE: The data set WORK.DATASETA has 166 observations and 254 variables.
NOTE: DATA statement used (Total process time):
real time 0.02 seconds
cpu time 0.00 seconds
4043
4044
4045 proc freq data=dataseta;
4046 TABLES visit_typeconsent*consentdeviation/chisq expected nopercent;
4047 run;
NOTE: No statistics are computed for visit_typeconsent * consentdeviation because all data are
missing.
NOTE: There were 166 observations read from the data set WORK.DATASETA.
NOTE: PROCEDURE FREQ used (Total process time):
real time 0.06 seconds
cpu time 0.00 seconds
4048
4049 proc freq data=dataseta;
4050 tables visit_typeconsent*consent_complete/chisq expected nopercent;
4051 run;
NOTE: No statistics are computed for visit_typeconsent * consent_complete because all data are
missing.
NOTE: There were 166 observations read from the data set WORK.DATASETA.
NOTE: PROCEDURE FREQ used (Total process time):
real time 0.10 seconds
cpu time 0.03 seconds
It's important that when you show us the LOG, you show it to us following the instructions given, which formats the LOG properly to maximize readability, and this then gets you results faster. I no longer bother trying to decipher LOGs where the instructions have not been followed. Thus, I repeat, with emphasis:
Please show us the complete LOG, with nothing chopped out, for this data step, pasting it as text into the window that appears when you click on the </> icon. DO NOT SKIP THIS STEP.
apologies..
4036  data dataseta;
4037      set dataset;
4038      /* Person, Tele and Phone are mutually exclusive */
4039      if consentperson='yes' then visit_typeconsent=1;
4040      else if consenttele='yes' then visit_typeconsent=2;
4041      else if consentphone='yes' then visit_typeconsent=3;
4042  run;
NOTE: There were 166 observations read from the data set WORK.DATASET.
NOTE: The data set WORK.DATASETA has 166 observations and 254 variables.
NOTE: DATA statement used (Total process time):
      real time           0.02 seconds
      cpu time            0.00 seconds
4043
4044
4045  proc freq data=dataseta;
4046  TABLES visit_typeconsent*consentdeviation/chisq expected nopercent;
4047  run;
NOTE: No statistics are computed for visit_typeconsent * consentdeviation because all data are
      missing.
NOTE: There were 166 observations read from the data set WORK.DATASETA.
NOTE: PROCEDURE FREQ used (Total process time):
      real time           0.06 seconds
      cpu time            0.00 seconds
4048
4049  proc freq data=dataseta;
4050  tables visit_typeconsent*consent_complete/chisq expected nopercent;
4051  run;
NOTE: No statistics are computed for visit_typeconsent * consent_complete because all data are
      missing.
NOTE: There were 166 observations read from the data set WORK.DATASETA.
NOTE: PROCEDURE FREQ used (Total process time):
      real time           0.10 seconds
      cpu time            0.03 seconds
Are your values in data set named DATASET lower case 'yes', or are some of the letters upper case?
Thanks so much I did not realize it was case sensitive. I corrected this and it is working great now. Thank you!
It's finally time to hack! Remember to visit the SAS Hacker's Hub regularly for news and updates.
SAS' Charu Shankar shares her PROC SQL expertise by showing you how to master the WHERE clause using real winter weather data.
Find more tutorials on the SAS Users YouTube channel.
