Hello Guys,
I have the following four variables. Based on their available values, I want to create another variable. The conditions are:
If all the available values of tum1, tum2, tum3 and tum4 are the same, the new variable "homogeneity" = 1;
If at least one of the values differs, then "homogeneity" = 2.
I appreciate your help. Cheers
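data want;
set work.have;
/* A missing value equals only another missing value, so partly missing rows get 2 */
if tum1 = tum2 and tum2 = tum3 and tum3 = tum4 then homogeneity = 1;
else homogeneity = 2;
run;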
Typed on my mobile and untested. It's always useful to post sample data with questions. Merry Christmas.
Hi Daniel,
Unfortunately, this code doesn't give me what I wanted. Running it gives "homogeneity" = 2 for all the observations, maybe because of the missing values in tum4. I want to create homogeneity based on the available info. I know I have some observations with only one value (ID-9); I want them to be coded as 3. I wanted something like the following:
Thanks in advance. I appreciate your help. Happy New Year.
STR
And what logic should be used to handle missing values? If all values are missing, should we use homogeneity=1? If three are Negative and one is missing, what should the result be?
Yes mate, missing values are causing the problem. If more than one value is available (say tum1 and tum2) and both are negative, then homogeneity should be 1. If one is positive and the other is negative, then homogeneity should be 2. If all the values are missing or only one value is available, then I would code it as 3.
Can you help me with that?
Cheers
STR
If you create new variables where "Negative" and "Positive" are converted to numeric dummy variables (0, 1, or missing), then computing homogeneity is a simple mathematical operation: take the range of the dummy variables in each row, then adjust for rows with zero or one non-missing value.
Paige Miller
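The dummy-variable idea above could be sketched roughly as follows. This is only an illustration, not code from the thread: it assumes tum1-tum4 are character variables holding exactly 'Positive', 'Negative', or blank, and the names d1-d4 and i are hypothetical helpers.

```sas
/* Sketch only: assumes tum1-tum4 are character variables containing
   'Positive', 'Negative', or blank. d1-d4 and i are hypothetical helpers. */
data want;
    set have;
    array tum{4} $ tum1-tum4;
    array d{4} d1-d4;
    do i = 1 to 4;
        /* Comparison is case-sensitive; anything else is treated as missing */
        if tum{i} = 'Positive' then d{i} = 1;
        else if tum{i} = 'Negative' then d{i} = 0;
        else d{i} = .;
    end;
    /* Zero or one non-missing value: nothing to compare */
    if n(of d1-d4) <= 1 then homogeneity = 3;
    /* RANGE ignores missing values; 0 means all available values agree */
    else if range(of d1-d4) = 0 then homogeneity = 1;
    else homogeneity = 2;
    drop i d1-d4;
run;
```

If the variables turn out to be numeric with a format attached, the conversion loop is unnecessary and the range can be taken on tum1-tum4 directly.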
The missing values are displaying as a dot. Does this mean that the four variables are actually numeric, and "Positive" and "Negative" are formatted values rather than actual values? That would make things easier:
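data want;
set have;
/* Fewer than two non-missing values: nothing to compare */
if nmiss(of tum1-tum4) >= 3 then homogeneity = 3;
/* MAX and MIN ignore missing values, so equal extremes mean all available values agree */
else if max(of tum1-tum4) = min(of tum1-tum4) then homogeneity = 1;
else homogeneity = 2;
run;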
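To try the step above without the original data, a small made-up data set along these lines may help. Everything in it is an assumption for illustration: the coding 1 = Positive and 0 = Negative, the format name tumfmt, and the five sample rows are hypothetical, not taken from the poster's data.

```sas
/* Hypothetical format: assumes 1 = Positive, 0 = Negative.
   Missing values are not covered, so they still print as a dot. */
proc format;
    value tumfmt 1 = 'Positive'
                 0 = 'Negative';
run;

/* Made-up sample rows, one per case discussed in the thread */
data have;
    input id tum1-tum4;
    format tum1-tum4 tumfmt.;
    datalines;
1 0 0 0 0
2 1 0 1 .
3 1 1 . .
4 . 0 . .
5 . . . .
;
run;
```

With these rows, the data step above should assign homogeneity 1, 2, 1, 3 and 3 to IDs 1 through 5: the available values agree in IDs 1 and 3, they differ in ID 2, and IDs 4 and 5 have fewer than two non-missing values.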
It worked perfectly. Thanks a ton.