I have a SAS Data Set containing a Character Variable: 'Question' with values as follows:
DQ01, DQ02, DQ03,...,DQ59.
I would like to recode my variable 'Question' so that the values become;
SQ01, SQ02, SQ03,..., SQ59.
How can I recode the above variable using one or more of the following:
a. ARRAYs
b. PROC Format
c. Data Steps ( If A then B Statements )
d. Other Constructs such as CASE
Thanks for your help
How about this?
data have;
input question $;
datalines;
DQ01
DQ02
DQ03
;
run;
data want;
set have;
substr(question,1,2)="SQ";
run;
How about this?
data have;
input question $;
datalines;
DQ01
DQ02
DQ03
;
run;
data want;
set have;
substr(question,1,2)="SQ";
run;
Your particular case doesn't lend itself to any of the mentioned methods.
Since your replacing a D with an S, the translate() function is sufficient. Or Substr as indicated.
Quick note, there's no CASE within a data step, that's the SQL method for implementing if-then logic.
Again, your situation is a simple recode there's no need for anything fancy. Here's a demo of the SUBSTR, TRANSLATE, and a CASE statement, though that would get long fast so it's not really a recommended method.
If your just interested in all the ways to recode a variable I suggest you search on Lexjansen.com, there's many papers written on how to implement the various ways, as well as why one may be better than another.
data want;
set have;
question2=question;
question1=question;
question = translate(question, 'S', 'D');
substr(question1, 1, 1)='S';
select(question2);
when ('DQ01') question2='SQ01';
when ('DQ02') question2='SQ02';
when ('DQ03') question2='SQ03';
otherwise question2='ERROR';
end;
run;
This is something that comes up time and time again. Having many columns is not ideal for any process. It may be useful in an output report, however at all other times working with long data will make your life so much easier. Take your problem, if your data looked like this (long format know as normalised):
data have; question="DQ01"; answer="Y"; output; question="DQ02"; answer="N"; output; question="DQ03"; answer="Y"; output; run;
Your problem is so simple to achieve:
data want; set have; substr(question,1,1)="S"; run;
No need for arrays, couting variables, changing structures etc. a simple statement. This is why programming with Normalised data is far prefereable than working with transposed data - nothing stopping you from transposing the above when you need it for report as the data you program with does not need to look like the output data.
Really can't understand where this need to work with transposed data comes from? Most standards that I have seen (take CDISC for instance) all use normalised datasets as its just far easier to work with.
Perhaps post some of your data (test) in the form of a datastep (so we can read it in), and what the output should look like. Am not really seeing the picture from the two posts.
We all work under NDA's, I am asking for some test data - which mirrors your structure, and what the output should look like. Something simple such as (which cover each occurence):
data have1; var1="abc"; var2=23; output; var1=...; run;
For each of the datasets involved. The reason is I can't see the picture of what is to happen and from what. You say you have a variable called Question, then the values go down the dataset yes? So none of your original questions match. You later mention that you need to merge datasets - there is not indication of what each of the datasets look like, so can't tell you how to do it. If they all have QUESTION and RESULT only, then you might not need to change the variable at all:
proc sql; create table WANT as select COALESCE(A.QUESTION,B.QUESTION...) as QUESTION, A.RESULT as RESULT_A, B.RESULT as RESULT_B, ... from HAVE A full join HAVE B on substr(A.QUESTION,2)=substr(B.QUESTION,2) ... quit;
Sorry, no attachment. I wouldn't be able to download certain files due to security risk anyways, and the other files I wouldn't type in. Please see this post where Reeza provides information on how to create a datastep for test data.
SAS Innovate 2025 is scheduled for May 6-9 in Orlando, FL. Sign up to be first to learn about the agenda and registration!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.