Hi,
I have a dataset below and want to compute the columns shown in red:
Individual Month_01 Month_02 Month_03 Month_04 INSTANCES_EQUAL NUMBER_POSITIVE RATIO TYPE
X . 100 100 . 1 2 0.5 FIXED
Y 300 300 300 300 3 4 0.75 FIXED
Z 450 500 450 500 0 4 0 VARIABLE
INSTANCES_EQUAL compares the exact value in each consecutive month in pairs (i.e. month_01 against month_02, month_02 against month_03 etc). I also want to allow a leeway of +-5 either way. So taking individual X, if they had a value between 95 and 105 in Month_03 then this is fine and we can say that this is an instance. Also, it's important that it only compares values that are non-missing (i.e. greater than zero).
NUMBER_POSITIVE provides the number of positive values (i.e. greater than zero) for each row.
RATIO is a calculation where INSTANCES_EQUAL / NUMBER_POSITIVE.
TYPE is a character variable which takes ratio value and categorises this as "FIXED" if >=0.5 and "VARIABLE" if < 0.5
I feel like there is a shortcut way of doing this in one program node using ARRAYS but not sure where to start?!
Would be grateful for any advice.
Thanks
How about
data have;
input Individual $ Month_01-Month_04;
datalines;
X . 100 100 .
Y 300 300 300 300
Z 450 500 450 500
;
data want(drop=i);
set have;
array m{*} Month_01-Month_04;
instances_equal = 0;
do i=1 to dim(m)-1;
if . < abs(m[i+1]-m[i]) <= 5 then instances_equal+1;
end;
number_positive = n(of m[*]);
ratio = instances_equal / number_positive;
type = ifc(ratio ge .5, "FIXED", "VARIABLE");
run;
Try this
data have;
input Individual $ Month_01-Month_04;
datalines;
X . 100 100 .
Y 300 300 300 300
Z 450 500 450 500
;
data want(drop=i);
set have;
array m{*} Month_01-Month_04;
instances_equal = 0;
do i=1 to dim(m)-1;
if m[i+1] = m[i] then instances_equal+1;
end;
number_positive = n(of m[*]);
ratio = instances_equal / number_positive;
type = ifc(ratio ge .5, "FIXED", "VARIABLE");
run;
Thanks!
How do I amend the do loop so we accept instances where the values are within +-5 when comparing? In other words it doesn't have to be an exact match but has to be close...
How about
data have;
input Individual $ Month_01-Month_04;
datalines;
X . 100 100 .
Y 300 300 300 300
Z 450 500 450 500
;
data want(drop=i);
set have;
array m{*} Month_01-Month_04;
instances_equal = 0;
do i=1 to dim(m)-1;
if . < abs(m[i+1]-m[i]) <= 5 then instances_equal+1;
end;
number_positive = n(of m[*]);
ratio = instances_equal / number_positive;
type = ifc(ratio ge .5, "FIXED", "VARIABLE");
run;
@zishaq wrote:
Thanks!
How do I amend the do loop so we accept instances where the values are within +-5 when comparing? In other words it doesn't have to be an exact match but has to be close...
So you mean instead of testing
x = y
You want to test
-5 <= (x-y) <= 5
SAS Innovate 2025 is scheduled for May 6-9 in Orlando, FL. Sign up to be first to learn about the agenda and registration!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.