Hi,
I have a dataset below and want to compute the columns shown in red:
Individual Month_01 Month_02 Month_03 Month_04 INSTANCES_EQUAL NUMBER_POSITIVE RATIO TYPE
X . 100 100 . 1 2 0.5 FIXED
Y 300 300 300 300 3 4 0.75 FIXED
Z 450 500 450 500 0 4 0 VARIABLE
INSTANCES_EQUAL compares the exact value in each consecutive month in pairs (i.e. month_01 against month_02, month_02 against month_03 etc). I also want to allow a leeway of +-5 either way. So taking individual X, if they had a value between 95 and 105 in Month_03 then this is fine and we can say that this is an instance. Also, it's important that it only compares values that are non-missing (i.e. greater than zero).
NUMBER_POSITIVE provides the number of positive values (i.e. greater than zero) for each row.
RATIO is a calculation where INSTANCES_EQUAL / NUMBER_POSITIVE.
TYPE is a character variable which takes ratio value and categorises this as "FIXED" if >=0.5 and "VARIABLE" if < 0.5
I feel like there is a shortcut way of doing this in one program node using ARRAYS but not sure where to start?!
Would be grateful for any advice.
Thanks
How about
data have;
input Individual $ Month_01-Month_04;
datalines;
X . 100 100 .
Y 300 300 300 300
Z 450 500 450 500
;
data want(drop=i);
set have;
array m{*} Month_01-Month_04;
instances_equal = 0;
do i=1 to dim(m)-1;
if . < abs(m[i+1]-m[i]) <= 5 then instances_equal+1;
end;
number_positive = n(of m[*]);
ratio = instances_equal / number_positive;
type = ifc(ratio ge .5, "FIXED", "VARIABLE");
run;
Try this
data have;
input Individual $ Month_01-Month_04;
datalines;
X . 100 100 .
Y 300 300 300 300
Z 450 500 450 500
;
data want(drop=i);
set have;
array m{*} Month_01-Month_04;
instances_equal = 0;
do i=1 to dim(m)-1;
if m[i+1] = m[i] then instances_equal+1;
end;
number_positive = n(of m[*]);
ratio = instances_equal / number_positive;
type = ifc(ratio ge .5, "FIXED", "VARIABLE");
run;
Thanks!
How do I amend the do loop so we accept instances where the values are within +-5 when comparing? In other words it doesn't have to be an exact match but has to be close...
How about
data have;
input Individual $ Month_01-Month_04;
datalines;
X . 100 100 .
Y 300 300 300 300
Z 450 500 450 500
;
data want(drop=i);
set have;
array m{*} Month_01-Month_04;
instances_equal = 0;
do i=1 to dim(m)-1;
if . < abs(m[i+1]-m[i]) <= 5 then instances_equal+1;
end;
number_positive = n(of m[*]);
ratio = instances_equal / number_positive;
type = ifc(ratio ge .5, "FIXED", "VARIABLE");
run;
@zishaq wrote:
Thanks!
How do I amend the do loop so we accept instances where the values are within +-5 when comparing? In other words it doesn't have to be an exact match but has to be close...
So you mean instead of testing
x = y
You want to test
-5 <= (x-y) <= 5
Are you ready for the spotlight? We're accepting content ideas for SAS Innovate 2025 to be held May 6-9 in Orlando, FL. The call is open until September 25. Read more here about why you should contribute and what is in it for you!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.