## Creating a variable from comparing consecutive values

Hello,

Please, I need help creating a variable from comparing consecutive values. Baseline is the baseline lab values, Month_01-Month-05 is for follow up visit lab values and PCHNG_01-PCHNG_05 is the percent change from baseline.

The first level of the variable is where baseline is =>400 and two consecutive follow-up visits are =>350;

the second level is where the baseline is =>300 but <400 and two consecutive follow-up visits are =>300 and were also within 25% of the baseline;

the third level is where the baseline is <200 and two consecutive follow-ups were also within 25% of the baseline;

the 4th level did not meet any of these criteria.

Thank you!

``````data have;
input ID Baseline \$ Month_01-Month_05 PCHNG_01-PCHNG_05;
datalines;
1 250 .   200 100 100 150 . -20 -60 -60 -40
2 400 300 350 380 300 300 -25 -12.5 -5 -25 -25
3 600 450 500 450 500 500 -33.3 -16.7 -33.3 -16.7 -16.7
4 160 150 140 150 140 130 -6.3 -12.5 -5.3 -12.5 -18.85 350 320 330 200 200 250 -8.6 -5.7 -42.9 -42.9 -28.6
;
run;``````

1 ACCEPTED SOLUTION

Accepted Solutions

## Re: Creating a variable from comparing consecutive values

Providing sample data AND the expected result really helped to clarify things. See if below also works for your real data.

``````data have;
input ID Baseline \$ Month_01-Month_05 PCHNG_01-PCHNG_05 new_expected;
datalines;
2 400 300 350 380 300 300 -25 -12.5 -5 -25 -25 1
3 600 450 500 450 500 500 -33.3 -16.7 -33.3 -16.7 -16.7 1
5 350 320 330 200 200 250 -8.6 -5.7 -42.9 -42.9 -28.6 2
4 160 150 140 150 140 130 -6.3 -12.5 -5.3 -12.5 -18.8 3
1 250 . 200 100 100 150 . -20 -60 -60 -40 4
;

data want(drop=_:);
set have;

array aMth {*} month_:;
array aPct {*} pchng_:;

/*baseline is =>400 and two consecutive follow-up visits are =>350 */
if baseline>=400 then
do;
do _i=1 to dim(aMth)-1;
if aMth[_i]>=350 and aMth[_i+1]>350 then
do;
new_derived=1;
leave;
end;
end;
end;

/* baseline is =>300 but <400 and two consecutive follow-up visits are =>300 and were also within 25% of the baseline */
if missing(new_derived) and baseline>=300 and baseline<400 then
do;
do _i=1 to dim(aMth);
if aMth[_i]>=300 and aMth[_i+1]>300 then
do;
if abs(aPct[_i])<=25 and abs(aPct[_i+1])<=25 then
do;
new_derived=2;
leave;
end;
end;
end;
end;

/* baseline is <200 and two consecutive follow-ups were also within 25% of the baseline */
if missing(new_derived) and baseline<200 then
do;
do _i=1 to dim(aPct)-1;
if 0<=abs(aPct[_i])<=25 and 0<=abs(aPct[_i+1])<=25 then
do;
new_derived=3;
leave;
end;
end;
end;

/*4th level did not meet any of these criteria */
if missing(new_derived) then
do;
new_derived=4;
end;

run;

proc print data=want;
run;
`````` As a variation to above code here a version that uses a GOTO. I consider this one of the rather rare cases where using GOTO is still legitimate because it avoids execution of logical steps once a case has been identified.

``````data want(drop=_:);
set have;

array aMth {*} month_:;
array aPct {*} pchng_:;

/*baseline is =>400 and two consecutive follow-up visits are =>350 */
if baseline>=400 then
do;
do _i=1 to dim(aMth)-1;
if aMth[_i]>=350 and aMth[_i+1]>350 then
do;
new_derived=1;
/* case identified, do not check further */
goto done;
end;
end;
end;

/* baseline is =>300 but <400 and two consecutive follow-up visits are =>300 and were also within 25% of the baseline */
if baseline>=300 and baseline<400 then
do;
do _i=1 to dim(aMth);
if aMth[_i]>=300 and aMth[_i+1]>300 then
do;
if abs(aPct[_i])<=25 and abs(aPct[_i+1])<=25 then
do;
new_derived=2;
/* case identified, do not check further */
goto done;
end;
end;
end;
end;

/* baseline is <200 and two consecutive follow-ups were also within 25% of the baseline */
if baseline<200 then
do;
do _i=1 to dim(aPct)-1;
if 0<=abs(aPct[_i])<=25 and 0<=abs(aPct[_i+1])<=25 then
do;
new_derived=3;
/* case identified, do not check further */
goto done;
end;
end;
end;

/*4th level did not meet any of these criteria */
if missing(new_derived) then
do;
new_derived=4;
end;

/* target of goto statement */
done:;

run;
``````
7 REPLIES 7

## Re: Creating a variable from comparing consecutive values

Please show the expected output. What should the new variable show for each record in your data set?

## Re: Creating a variable from comparing consecutive values

Thank you for your response. It would level1=1, level2=2 etc.

## Re: Creating a variable from comparing consecutive values

@UcheOkoro wrote:

It would level1=1, level2=2 etc.

I still do not understand what you want. And by the way, this is not what @Reeza asked for, which was to SHOW US the expected output and what the value of the new variable would be on each row of the data set.

--
Paige Miller

## Re: Creating a variable from comparing consecutive values

My apologies for the confusion. The following is a table of the output I expect to see.

 ID Baseline Month_01 Month_02 Month_03 Month_04 Month_05 PCHNG_01 PCHNG_02 PCHNG_03 PCHNG_04 PCHNG_05 New_variable 2 400 300 350 380 300 300 -25 -12.5 -5 -25 -25 1 3 600 450 500 450 500 500 -33.3 -16.7 -33.3 -16.7 -16.7 1 5 350 320 330 200 200 250 -8.6 -5.7 -42.9 -42.9 -28.6 2 4 160 150 140 150 140 130 -6.3 -12.5 -5.3 -12.5 -18.8 3 1 250 . 200 100 100 150 . -20 -60 -60 -40 4

## Re: Creating a variable from comparing consecutive values

Providing sample data AND the expected result really helped to clarify things. See if below also works for your real data.

``````data have;
input ID Baseline \$ Month_01-Month_05 PCHNG_01-PCHNG_05 new_expected;
datalines;
2 400 300 350 380 300 300 -25 -12.5 -5 -25 -25 1
3 600 450 500 450 500 500 -33.3 -16.7 -33.3 -16.7 -16.7 1
5 350 320 330 200 200 250 -8.6 -5.7 -42.9 -42.9 -28.6 2
4 160 150 140 150 140 130 -6.3 -12.5 -5.3 -12.5 -18.8 3
1 250 . 200 100 100 150 . -20 -60 -60 -40 4
;

data want(drop=_:);
set have;

array aMth {*} month_:;
array aPct {*} pchng_:;

/*baseline is =>400 and two consecutive follow-up visits are =>350 */
if baseline>=400 then
do;
do _i=1 to dim(aMth)-1;
if aMth[_i]>=350 and aMth[_i+1]>350 then
do;
new_derived=1;
leave;
end;
end;
end;

/* baseline is =>300 but <400 and two consecutive follow-up visits are =>300 and were also within 25% of the baseline */
if missing(new_derived) and baseline>=300 and baseline<400 then
do;
do _i=1 to dim(aMth);
if aMth[_i]>=300 and aMth[_i+1]>300 then
do;
if abs(aPct[_i])<=25 and abs(aPct[_i+1])<=25 then
do;
new_derived=2;
leave;
end;
end;
end;
end;

/* baseline is <200 and two consecutive follow-ups were also within 25% of the baseline */
if missing(new_derived) and baseline<200 then
do;
do _i=1 to dim(aPct)-1;
if 0<=abs(aPct[_i])<=25 and 0<=abs(aPct[_i+1])<=25 then
do;
new_derived=3;
leave;
end;
end;
end;

/*4th level did not meet any of these criteria */
if missing(new_derived) then
do;
new_derived=4;
end;

run;

proc print data=want;
run;
`````` As a variation to above code here a version that uses a GOTO. I consider this one of the rather rare cases where using GOTO is still legitimate because it avoids execution of logical steps once a case has been identified.

``````data want(drop=_:);
set have;

array aMth {*} month_:;
array aPct {*} pchng_:;

/*baseline is =>400 and two consecutive follow-up visits are =>350 */
if baseline>=400 then
do;
do _i=1 to dim(aMth)-1;
if aMth[_i]>=350 and aMth[_i+1]>350 then
do;
new_derived=1;
/* case identified, do not check further */
goto done;
end;
end;
end;

/* baseline is =>300 but <400 and two consecutive follow-up visits are =>300 and were also within 25% of the baseline */
if baseline>=300 and baseline<400 then
do;
do _i=1 to dim(aMth);
if aMth[_i]>=300 and aMth[_i+1]>300 then
do;
if abs(aPct[_i])<=25 and abs(aPct[_i+1])<=25 then
do;
new_derived=2;
/* case identified, do not check further */
goto done;
end;
end;
end;
end;

/* baseline is <200 and two consecutive follow-ups were also within 25% of the baseline */
if baseline<200 then
do;
do _i=1 to dim(aPct)-1;
if 0<=abs(aPct[_i])<=25 and 0<=abs(aPct[_i+1])<=25 then
do;
new_derived=3;
/* case identified, do not check further */
goto done;
end;
end;
end;

/*4th level did not meet any of these criteria */
if missing(new_derived) then
do;
new_derived=4;
end;

/* target of goto statement */
done:;

run;
``````

## Re: Creating a variable from comparing consecutive values

Thank you so much, Patrick for your help!

## Re: Creating a variable from comparing consecutive values

This need to compare successive values in arrays is a good situation to use the UNTIL condition.

Edit: Let me restate that:  the need to stop a loop once a condition has been met is ready-made for the UNTIL clause.  And if the until condition is related to the new variable in this case, you can avoid a good deal of extra code.

``````data have;
input ID \$ Baseline  Month_01-Month_05 PCHNG_01-PCHNG_05 new_expected;
datalines;
2 400 300 350 380 300 300 -25    -12.5   -5   -25   -25    1
3 600 450 500 450 500 500 -33.3  -16.7  -33.3 -16.7 -16.7  1
5 350 320 330 200 200 250  -8.6   -5.7  -42.9 -42.9 -28.6  2
4 160 150 140 150 140 130  -6.3  -12.5   -5.3 -12.5 -18.8  3
1 250 .   200 100 100 150    .   -20    -60   -60   -40    4
;

data want;
set have;
array f {*} month_01-month_05 ;
array p {*} pchng_01-pchng_05 ;

new=4;
if baseline >= 400    then do i=2 to dim(f) until (new=1);
if f{i}>=350 and f{i-1}>=350 then new=1;
end;
else if baseline>=300 then do i=2 to dim(f) until (new=2);
if n(f{i},f{i-1})=2 and  min(f{i},f{i-1})>=300 and   max(abs(p{i}),abs(p{i-1})) <=25 then new=2;
end;
else if baseline<200 then do i=2 to dim(f) until (new=3);
if n(p{i},p{i-1})=2 and max(abs(p{i}),abs(p{i-1}))<=25 then new=3;
end;
run;``````

--------------------------
The hash OUTPUT method will overwrite a SAS data set, but not append. That can be costly. Consider voting for Add a HASH object method which would append a hash object to an existing SAS data set

Would enabling PROC SORT to simultaneously output multiple datasets be useful? Then vote for
Allow PROC SORT to output multiple datasets

--------------------------
Discussion stats
• 7 replies
• 309 views
• 3 likes
• 5 in conversation