turn on suggestions

Auto-suggest helps you quickly narrow down your search results by suggesting possible matches as you type.

Showing results for

Find a Community

- Home
- /
- SAS Programming
- /
- SAS Procedures
- /
- please help with use of do loops and possibly macr...

Topic Options

- RSS Feed
- Mark Topic as New
- Mark Topic as Read
- Float this Topic for Current User
- Bookmark
- Subscribe
- Printer Friendly Page

- Mark as New
- Bookmark
- Subscribe
- RSS Feed
- Permalink
- Email to a Friend
- Report Inappropriate Content

05-20-2015 09:09 AM

Hello all,

I am quite new to SAS programming therefore would appreciate any input on the possible solution of the problem I have!

I have the following data

value1 value2 value3 average1 required_result

1 9 12 7 7

8 6 9 6 7

4 7 10 2 7

6 2 1 8 7

. . . .

. . . .

. . . .

7 9 16 24 24

. . . . 24

X Y Z W

I need to do the following:

- If row n has got the following trend - value1 > value2> value3, then use the first encountered average1 value for the next 4 rows;

- after 4 rows calculate if value1 > value2 > value3 is true. If it is true then use the next first encountered average1 value for another 4 rows;

- repeat.

for example, in the first row value1 > value2> value3 (1 > 9 > 12) therefore we need to use average1 = 7 the next 4 rows starting from the first one.

can you please advise how this can be coded in SAS?

thank you so much for all your help!

Accepted Solutions

Solution

05-21-2015
08:58 AM

- Mark as New
- Bookmark
- Subscribe
- RSS Feed
- Permalink
- Email to a Friend
- Report Inappropriate Content

Posted in reply to Red_Squirrel

05-21-2015 08:58 AM

OK, it's looking a little clearer. The OR condition should look at SAVED_AVERAGE, rather than REQUIRED_RESULT. SAVED_AVERAGE is retained, but REQUIRED_RESULT begins each observation as missing:

If ((value1 < value2 < value3) or (value1 < saved_average and value2 < saved_average and value3 < saved_average)) and (_n_ > 2) and (_n_ > max_recordnum) then do;

All Replies

- Mark as New
- Bookmark
- Subscribe
- RSS Feed
- Permalink
- Email to a Friend
- Report Inappropriate Content

Posted in reply to Red_Squirrel

05-20-2015 09:47 AM

I am confused. Why 1 > 9 > 12 ?

data have;

input value1 value2 value3 average1 ;

cards;

1 9 12 7

8 6 9 6

4 7 10 2

6 2 1 8

7 9 16 24

7 9 16 24

7 9 16 24

7 9 16 24

;

run;

data have;

set have;

if mod(_n_,4)=1 then n+1;

run;

data want;

set have;

by n;

retain flag;

if first.n then do;

flag=.;

if value1 < value2 and value2 < value3 then flag=average1;

end;

run;

Xia Keshan

- Mark as New
- Bookmark
- Subscribe
- RSS Feed
- Permalink
- Email to a Friend
- Report Inappropriate Content

Posted in reply to Ksharp

05-20-2015 09:55 AM

Apologies, Xia, quick typing!

I obviously meant value1 < value2 < value3

is your solution still going to be ok for this logic?

I will try it out and let you know.

thank you!

- Mark as New
- Bookmark
- Subscribe
- RSS Feed
- Permalink
- Email to a Friend
- Report Inappropriate Content

Posted in reply to Red_Squirrel

05-20-2015 10:13 AM

I think you'll need something a little more flexible. Here's a mildly similar approach:

data want;

set have;

if (value1 < value2 < value3) and (saved_average=.) then do;

max_recordnum = _n_ + 3;

saved_average = average1;

end;

retain max_recordnum saved_average;

if _n_ <= max_recordnum then required_result = saved_average;

if _n_ = max_recordnum then saved_average = .;

drop saved_average max_recordnum;

run;

- Mark as New
- Bookmark
- Subscribe
- RSS Feed
- Permalink
- Email to a Friend
- Report Inappropriate Content

Posted in reply to Ksharp

05-20-2015 10:55 AM

Hello Xia and all,

i think I did not explain my problem quite well before, sorry!

I have attached now the picture of the excel file with the sample data (I have only my iPad with me therefore not able to do anything better).

ideally what I need is the following:

- I need to work out the required_result column using the following logic:

- for row 2 and 3 select average1 value (without using any other logic). Therefore required_result = 2 for row 2 and required_result = 3 for row 3;

- for all other rows use the following logic: if value1 < value2 < value3 then use the first encountered value of the average1 for at least 4 rows (7 in this case)

- continue using 7 after 4 rows if the following is FALSE - value1 < value2 < value3;

- if after 4 rows above the following is TRUE - value1 < value2 < value3, then use next first encountered average1 value (24 in this case) for at least 4 rows and then repeat the logic.

thank you!

- Mark as New
- Bookmark
- Subscribe
- RSS Feed
- Permalink
- Email to a Friend
- Report Inappropriate Content

Posted in reply to Red_Squirrel

05-20-2015 11:19 AM

That's actually a simpler program. One way:

data want;

set have;

retain saved_average;

if (value1 < value2 < value3) then saved_average = average1;

is saved_average > . then required_result = saved_average;

else required_result = average1;

drop saved_average;

run;

Good luck.

- Mark as New
- Bookmark
- Subscribe
- RSS Feed
- Permalink
- Email to a Friend
- Report Inappropriate Content

Posted in reply to Astounding

05-20-2015 11:43 AM

Thanks a lot for trying, Astounding!

however, in my example I have numbers instead of "." - I just did not want to write it all up, now I have and attachend a new file to explain.

- Green area - first two rows should always have required_result = average1 (without any other logic involved);

- Orange area, that's where we start comparing value1< value2 < value3 and if it is true, we need to use the first encountered average1 and use it for the next 4 rows regardless whether condition (value1< value2 < value3) is true or not (That's why it is 7 for all 4 rows).

- White area - we continue to use previous average1 value if condition (value1< value2 < value3) is FALSE (that's why it is 7 for rows 8, 9, 10).

- Blue area - if condition (value1< value2 < value3) is true then choose the first encountered average1 and use it for the next 4 rows regardless whether condition (value1< value2 < value3) is true or not (that's why it is 24 for the next 4 rows.)

- White area - we continue to use previous average1 value if condition (value1 < value2 < value3) is false (that's why it's 24 for another row. i have ended file here however I have a lot more entries down as well).

hope it makes sense.

thank you!

- Mark as New
- Bookmark
- Subscribe
- RSS Feed
- Permalink
- Email to a Friend
- Report Inappropriate Content

Posted in reply to Red_Squirrel

05-20-2015 12:15 PM

Seems to make sense ... I'll go back to a variation of the first program:

data want;

set have;

if (value1 < value2 < value3) and (_n_ > 2) and (_n_ > max_recordnum) then do;

max_recordnum = _n_ + 3;

saved_average = average1;

end;

retain max_recordnum saved_average;

if saved_average = . then required_result = average1;

else required_result = saved_average;

drop saved_average max_recordnum;

run;

- Mark as New
- Bookmark
- Subscribe
- RSS Feed
- Permalink
- Email to a Friend
- Report Inappropriate Content

Posted in reply to Astounding

05-20-2015 01:48 PM

thank you so much, you are a genius, Astounding!

That's great for what I explained so far, however what if we had additional condition:

- In addition to what we had before, we also have another condition to check (value1 < required_result and value2 < required_result and value3 < required result). If that's true or previous condition (value1 < value2 < value3) is true then required result takes the value of the first encountered average1.

As a result, you can see that the RED block has got now the required_result = 5 repeated for 4 rows and then White block immediately after is still using Required_result = 5 as both of our conditions do not hold true here (value1 < required_result and value2 < required_result and value3 < required result) is FALSE and (value1 < value 2 < value3) is also FALSE.

The challenge here is that 2nd condition is using the newly created variable required_result.

It would be great if you can help!

Thank you!

- Mark as New
- Bookmark
- Subscribe
- RSS Feed
- Permalink
- Email to a Friend
- Report Inappropriate Content

Posted in reply to Red_Squirrel

05-20-2015 02:29 PM

Red,

I'm not sure how much more time I can devote, but let's start at least.

Why does line 15 remain as is, instead of beginning a new set of 4 with required_result=6? It appears that the conditions are in place: value1 < 24 and value2 < 24 and value3 < 24.

I don't see it. Why does line 16 should trigger a new block of 4, but line 15 does not?

- Mark as New
- Bookmark
- Subscribe
- RSS Feed
- Permalink
- Email to a Friend
- Report Inappropriate Content

Posted in reply to Astounding

05-20-2015 02:57 PM

Thank you for your time, Astounding, I really appreciate it!

I am sorry, I made a silly error typing numbers in on my iPad - in line 15 it should be 27 instead of 7 (please see updated numbers attached).

therefore both conditions do not hold for line 15.

I would be so grateful if you can help here!

- Mark as New
- Bookmark
- Subscribe
- RSS Feed
- Permalink
- Email to a Friend
- Report Inappropriate Content

Posted in reply to Astounding

05-21-2015 07:21 AM

Hi all,

I tried to develop on Astounding's code and make sure that I will include the comparison with the newly extended average, however it does not work the way it should be.

can you please advise what needs to be done so that condition (value1 < saved_average and value2 < saved_average and value3 < saved_average) always tests against not saved_average but the newly created required_result variable?

Data want;

Set have;

If (value1 < value2 < value3) and (_n_ > 2) and (_n_ > max_recordnum) then do;

max_recordnum = _n_ + 3;

Saved_average = average1;

End;

Retain max_recordnum saved_average;

If saved_average = . Then required_result = average1;

Else required_result = saved_average;

If (value1 < saved_average and value2 < saved_average and value3 < saved_average) and (_n_ > max_recordnums) and (_n_ > max_recordnum) then do;

max_recordnums = _n_ + 3;

Saved_average2 = average1;

End;

Retain max_recordnums saved_average2;

If saved_average2 = . And saved_average = . Then required_result = average1;

Else required_result = saved_average;

If saved_average2 <> . Then required_result = saved_average2;

Run;

- Mark as New
- Bookmark
- Subscribe
- RSS Feed
- Permalink
- Email to a Friend
- Report Inappropriate Content

Posted in reply to Red_Squirrel

05-21-2015 08:19 AM

Red,

Here's the statement from the original that determines whether to begin another block of 4 records:

if (value1 < value2 < value3) and (_n_ > 2) and (_n_ > max_recordnum) then do;

It seems you would just need to add another condition and alter just this one statement:

if ( (value1 < value2 < value3) or (value1 < required_result and value2 < required_result and value3 < required_result) )

and (_n_ > 2) and (_n_ > max_recordnum) then do;

Does that seem about right to you?

- Mark as New
- Bookmark
- Subscribe
- RSS Feed
- Permalink
- Email to a Friend
- Report Inappropriate Content

Posted in reply to Astounding

05-21-2015 08:25 AM

Thanks, Astounding.

yes, I have already tried and the additional condition does not work.

The code results stay unchanged as if there was not any additional condition.

- Mark as New
- Bookmark
- Subscribe
- RSS Feed
- Permalink
- Email to a Friend
- Report Inappropriate Content

Posted in reply to Red_Squirrel

05-21-2015 08:37 AM

Are you sure you used OR in the right spot:

if ( (value1 < value2 < value3) **or** (value1 < required_result and value2 < required_result and value3 < required_result) )

and (_n_ > 2) and (_n_ > max_recordnum) then do;

It might help if you showed the exact code you tested.