Hello Guys,
I have the following four variables. Based on their available values, I want to create another variable. The conditions are:
if all the available values of tum1, tum2, tum3 and tum4 are the same, then the new variable "homogeneity"=1;
if at least one of the values differs, then "homogeneity"=2.
I appreciate your help. Cheers
Hi Daniel,
Unfortunately, this code doesn't give me what I wanted. Running it gives "homogeneity"=2 for all observations, maybe because of the missing values in Tum4. I want to create homogeneity based on the available information. I know I have some observations with only one value (ID-9), and I want them to be coded as 3. I wanted something like the following:
Thanks in advance. I appreciate your help. Happy New Year.
STR
And what logic should be used to handle missing values? If all values are missing, should we use homogeneity=1? If three are Negative and one is missing, what should the result be?
Yes mate, missing values are causing the problem. If we have more than one available value (say we have tum1 and tum2) and both are negative, then homogeneity should be 1. If one is positive and the other is negative, then homogeneity should be 2. If all the values are missing, or if only one value is available, then I would code them as 3.
Can you help me with that?
Cheers
STR
If you create new variables in which "Negative" and "Positive" are converted to numeric dummy variables (0, 1, or missing), then computing homogeneity is a simple mathematical operation: take the range of the dummy variables in each row, then adjust for the rows that have zero or one non-missing value.
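A minimal sketch of that dummy-variable idea, assuming tum1-tum4 are character variables holding exactly "Positive" or "Negative" (case-sensitive) or blank. The names d1-d4 and i are hypothetical helpers introduced only for illustration, and the coding of 3 for rows with fewer than two available values follows the rule described earlier in the thread:

```sas
data want;
   set have;
   array tum{4} $ tum1-tum4;   /* assumption: the four variables are character */
   array d{4} d1-d4;           /* hypothetical numeric dummies: 1, 0, or missing */
   do i = 1 to 4;
      if tum{i} = 'Positive' then d{i} = 1;
      else if tum{i} = 'Negative' then d{i} = 0;
      else d{i} = .;           /* anything else is treated as missing */
   end;
   /* N counts non-missing values; zero or one leaves nothing to compare */
   if n(of d1-d4) <= 1 then homogeneity = 3;
   /* RANGE ignores missing values; a range of 0 means all available values agree */
   else if range(of d1-d4) = 0 then homogeneity = 1;
   else homogeneity = 2;
   drop i d1-d4;
run;
```

If the variables turn out to be numeric with a format attached, the dummies are unnecessary and the functions can be applied to tum1-tum4 directly.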
The missing values are displaying as a dot. Does this mean that the four variables are actually numeric, and "Positive" and "Negative" are formatted values rather than actual values? That would make things easier:
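data want;
   set have;
   /* Three or four values missing: zero or one value available, nothing to compare */
   if nmiss(of tum1-tum4) >= 3 then homogeneity = 3;
   /* MAX and MIN ignore missing values, so equality means all available values agree */
   else if max(of tum1-tum4) = min(of tum1-tum4) then homogeneity = 1;
   /* At least two available values differ */
   else homogeneity = 2;
run;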
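To check the logic, here is a small made-up test data set, assuming the variables are numeric with 0 = Negative and 1 = Positive. The format name posneg and the four rows are hypothetical and are not taken from the original data:

```sas
proc format;
   value posneg 0 = 'Negative' 1 = 'Positive';   /* hypothetical format */
run;

data have;
   input id tum1-tum4;
   format tum1-tum4 posneg.;
   datalines;
1 0 0 0 .
2 1 0 . .
3 . . 1 .
4 . . . .
;
run;
```

Running the DATA step above on this data should give homogeneity = 1 for ID 1 (three matching values, one missing), 2 for ID 2 (one Positive and one Negative), and 3 for IDs 3 and 4 (one value or no values available).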
It worked perfectly. Thanks a ton.