Hello,
My data has a variable "conditions" which has values from 0 to 8, depending upon what the patient has. I want to do univariate regression analysis for each one of these condition. so, need to recode into separate variables for each value. For example: data is following where each number indicates a disease, here 0: smoking status, 1: liver disease, 2: renal disease and so on.
Condition Smoker liverdiseases
0 1 0
0,1 1 1
0,2 1 0
1,3 0 1
1,2 0 1
0 1 0
3,4,6,7 0 0
Smoker and liver diseases are examples for outcome that I am looking for.
I want to recode my data into new variables : smoker, liver, renal, and so on, where 1 is no smoking, 0 is smoking, and same for other variables.
When I used code:
data name;
set previous;
where Conditions contains '0';
smoker='1';
run;
I get the data with only '0' values, and remaining gets deleted. I want to sort all and keep all the data.
Thanks
@Sakshi13 wrote:
Hello,
My data has a variable "conditions" which has values from 0 to 8, depending upon what the patient has. I want to do univariate regression analysis for each one of these condition. so, need to recode into separate variables for each value. For example: data is following where each number indicates a disease, here 0: smoking status, 1: liver disease, 2: renal disease and so on.
Condition
0
0,1
0,2
1,3
1,2
2,3,5
3,4,6,7
I want to recode tmy data into new variables : smoker, liver, renal, and so on, where 1 is no smoking, 0 is smoking, and same for other variables.
When I used code:
data name;
set previous;
where Conditions contains '0';
smoker='1';
run;
I get the data with only '0' values, and remaining gets deleted. I want to sort all and keep all the data.
Thanks
I suggest that you use 1 for a "true" or 'yes' condition. Especially since it isn't very clear what a 0 = "no liver" or "no renal" might mean??
This might get you started:
data example; informat codestr $18.; input codestr; array v(0:9) tobacco cond2-cond10;/*< place your variable names here*/ /* in the order of the code VALUES*/ do i=0 to 9; v[i]= index(codestr,put(i,best1.))=0; end; drop i; datalines; 0 0,2 3,4,5,6 0,1,2,3,4,5,6,7,8,9 ; run;
Since you didn't provide much detail in the form of which code goes with which topic can't go much more than add variables to suit.
This will not work if you have 2 digit codes or ones with decimals.
The code uses SAS behavior that a comparison will return 1 for true and 0 for false. So when the index value is not found this assigns 1.
ALL codes are single digits, from 0 to 8. entries may include anything from single code to multiple code, depending upon the health conditions a person has
I tried this but again it only defines '0' if there is only '0' in value. I want to include all entries that has zero in it.
@Sakshi13 wrote:
I tried this but again it only defines '0' if there is only '0' in value. I want to include all entries that has zero in it.
???????
Example data for ALL the variables involved in this recode AND an example of what the actual output should look like for that given data.
You should provide enough examples to see the overall picture and exercise logic for each variable involved.
Sorry, my fault. By default, INDEXW uses only blanks as delimiters. To use both blanks and commas as delimiters:
if indexw(condition, '0', ', ') then smoker=1;
else smoker=0;
@Sakshi13 wrote:
Hello,
My data has a variable "conditions" which has values from 0 to 8, depending upon what the patient has. I want to do univariate regression analysis for each one of these condition. so, need to recode into separate variables for each value. For example: data is following where each number indicates a disease, here 0: smoking status, 1: liver disease, 2: renal disease and so on.
Condition
0
0,1
0,2
1,3
1,2
2,3,5
3,4,6,7
I want to recode tmy data into new variables : smoker, liver, renal, and so on, where 1 is no smoking, 0 is smoking, and same for other variables.
When I used code:
data name;
set previous;
where Conditions contains '0';
smoker='1';
run;
I get the data with only '0' values, and remaining gets deleted. I want to sort all and keep all the data.
Thanks
I suggest that you use 1 for a "true" or 'yes' condition. Especially since it isn't very clear what a 0 = "no liver" or "no renal" might mean??
This might get you started:
data example; informat codestr $18.; input codestr; array v(0:9) tobacco cond2-cond10;/*< place your variable names here*/ /* in the order of the code VALUES*/ do i=0 to 9; v[i]= index(codestr,put(i,best1.))=0; end; drop i; datalines; 0 0,2 3,4,5,6 0,1,2,3,4,5,6,7,8,9 ; run;
Since you didn't provide much detail in the form of which code goes with which topic can't go much more than add variables to suit.
This will not work if you have 2 digit codes or ones with decimals.
The code uses SAS behavior that a comparison will return 1 for true and 0 for false. So when the index value is not found this assigns 1.
Hello, I tried that, please confirm if I should be mentioning all the entries in datalines??
@Sakshi13 wrote:
Hello, I tried that, please confirm if I should be mentioning all the entries in datalines??
I guess that you didn't mark it as solution means that you didn't fully understand the proposed code.
You didn't tell us what the different codes mean; which code should populate a '1' into which variable.
The only thing you need to amend to make the proposed code work is to list the variables in the array statement in the order of your codes. For example if variable liverdiseases should be set to 1 for code 3 then make sure this variable is listed as 3rd element in the array statement.
Hello, yes, I did mention the variables in the same order (Below is list of variables):
data example;
informat Comorbidities $18.;
input Comorbidities;
array v(0:8) smoker dm cad stroke cancer chf liver renal none ;/*< place your variable names here*/
/* in the order of the code VALUES*/
do i=0 to 8;
v[i]= index(Comorbidities,put(i,best1.))=0;
end;
drop i;
datalines;
***should I mention the complete data here? Or can I skip this part? ****
run;
You might be right, I am still trying to understand the code.
Regards
@Sakshi13 wrote:
datalines;
***should I mention the complete data here? Or can I skip this part? ****
run;
All the Infile/Input/Datalines parts are only to create the sample data. If you've got this data already in a table then of course you will replace all of this with a SET statement.
Are you ready for the spotlight? We're accepting content ideas for SAS Innovate 2025 to be held May 6-9 in Orlando, FL. The call is open until September 25. Read more here about why you should contribute and what is in it for you!
What’s the difference between SAS Enterprise Guide and SAS Studio? How are they similar? Just ask SAS’ Danny Modlin.
Find more tutorials on the SAS Users YouTube channel.