turn on suggestions

Auto-suggest helps you quickly narrow down your search results by suggesting possible matches as you type.

Showing results for

Find a Community

- Home
- /
- SAS Programming
- /
- Base SAS Programming
- /
- rolling standard deviation separated by many categ...

Topic Options

- Subscribe to RSS Feed
- Mark Topic as New
- Mark Topic as Read
- Float this Topic for Current User
- Bookmark
- Subscribe
- Printer Friendly Page

- Mark as New
- Bookmark
- Subscribe
- Subscribe to RSS Feed
- Highlight
- Email to a Friend
- Report Inappropriate Content

12-23-2016 09:19 AM

Hello all,

The ture data has many categories(mor than 400). I want the result to show the rolling 3years std by treatment, but each treatment has different starting points. Now assuming two treatments, could you do me a great favor to get the result as follows(either one is fine)? Thanks a lot.

data have;

infile cards dlm=',' truncover;

input treatment year x;

cards;

1,1980,0

1,1981,0

1,1982,0

1,1983,0

1,1984,5

1,1985,7

1,1986,0

1,1987,6

1,1988,2

1,1989,1

2,1980,0

2,1981,0

2,1982,0

2,1983,3

2,1984,4

2,1985,6

2,1986,0

2,1987,7

2,1988,3

2,1989,2

2,1990,1

2,1991,1;

run;

**I hope the result like this:**

treatment year x std3yr

1 1980 0 .

1 1981 0 .

1 1982 0 .

1 1983 0 .

1 1984 5 .

1 1985 7 .

1 1986 0 3.6055512755

1 1987 6 3.7859388972

1 1988 2 3.0550504633

1 1989 1 2.6457513111

2 1980 0 .

2 1981 0 .

2 1982 0 .

2 1983 3 .

2 1984 4 .

2 1985 6 1.5275252317

2 1986 0 3.0550504633

2 1987 7 3.7859388972

2 1988 3 3.5118845843

2 1989 2 2.6457513111

2 1990 1 1

2 1991 1 0.5773502692

Or

treatment year x std3yr

1 1986 0 3.6055512755

1 1987 6 3.7859388972

1 1988 2 3.0550504633

1 1989 1 2.6457513111

2 1985 6 1.5275252317

2 1986 0 3.0550504633

2 1987 7 3.7859388972

2 1988 3 3.5118845843

2 1989 2 2.6457513111

2 1990 1 1

2 1991 1 0.5773502692

Accepted Solutions

Solution

01-03-2017
02:38 PM

- Mark as New
- Bookmark
- Subscribe
- Subscribe to RSS Feed
- Highlight
- Email to a Friend
- Report Inappropriate Content

Posted in reply to Jennifer925

12-23-2016 11:26 PM

All Replies

- Mark as New
- Bookmark
- Subscribe
- Subscribe to RSS Feed
- Highlight
- Email to a Friend
- Report Inappropriate Content

Posted in reply to Jennifer925

12-23-2016 09:21 AM

The solutions previously provide can be easily adapted with a BY statement. Which method did you choose to go with, rather than starting from scratch.

- Mark as New
- Bookmark
- Subscribe
- Subscribe to RSS Feed
- Highlight
- Email to a Friend
- Report Inappropriate Content

Posted in reply to Reeza

12-23-2016 09:52 AM

- Mark as New
- Bookmark
- Subscribe
- Subscribe to RSS Feed
- Highlight
- Email to a Friend
- Report Inappropriate Content

Posted in reply to Jennifer925

12-23-2016 09:57 AM

You posted this question for a single category earlier. Several solutions were provided. Which one did you want to work with? If you post that we can help modify it for multiple groups.

Rather than revisit every possible solution, it's better to start with something you have and understand.

- Mark as New
- Bookmark
- Subscribe
- Subscribe to RSS Feed
- Highlight
- Email to a Friend
- Report Inappropriate Content

Posted in reply to Reeza

12-23-2016 10:02 AM

@Reeza

I prefer using PROC SQL, could you do me a great favor to modify the code?Also, you can use the others. Thanks.

I prefer using PROC SQL, could you do me a great favor to modify the code?Also, you can use the others. Thanks.

- Mark as New
- Bookmark
- Subscribe
- Subscribe to RSS Feed
- Highlight
- Email to a Friend
- Report Inappropriate Content

Posted in reply to Jennifer925

12-23-2016 01:19 PM

I couldn't think of a way to identify the starting records in SQL while also including 0's in the data.

SQL doesn't have a concept of rows that allows counting, unless you use Cursor logic, which isn't available in SAS.

I'm sure there's a way, but its probably cumbersome...and more code than I'm willing to write at the moment.

Here's a data step solution, that's relatively straightforward.

```
data have;
infile cards dlm=',' truncover;
input treatment year x;
cards;
1,1980,0
1,1981,0
1,1982,0
1,1983,0
1,1984,5
1,1985,7
1,1986,0
1,1987,6
1,1988,2
1,1989,1
2,1980,0
2,1981,0
2,1982,0
2,1983,3
2,1984,4
2,1985,6
2,1986,0
2,1987,7
2,1988,3
2,1989,2
2,1990,1
2,1991,1
;
run;
data want1;
/*Set temporary array to store the last 3 records*/
array p{0:2} _temporary_;
set have;
by treatment; /*BY GROUP PROCESSING*/
retain count;/*Keep value of count from row to row*/
p{mod(_n_,3)} = x; /*Assign value to the array for calculations*/
/*If first treatment record then reset
the array to missing and set the counter to 0*/
if first.treatment then do;
call missing (of p(*));
count=0;
end;
/*Increment Counter if the value if starting record is found*/
if x ne 0 or count>0 then count+1;
/*Calculate STDEV*/
if nmiss(of p(*)) = 0 and max(of p(*)) ne 0 and count>= 3
then stdev = std(of p{*});
run;
proc print data=want1; run;
```

- Mark as New
- Bookmark
- Subscribe
- Subscribe to RSS Feed
- Highlight
- Email to a Friend
- Report Inappropriate Content

Posted in reply to Jennifer925

12-23-2016 11:05 AM

Jennifer925 wrote:

The ture data has many categories(mor than 400). I want the result to show the rolling 3years std by treatment, but each treatment has different starting points. Now assuming two treatments, could you do me a great favor to get the result as follows(either one is fine)? Thanks a lot.

Are the 400 categories levels of the treatment variable?

Do you have, or can you make, a dataset with the "different starting point" for each treatment.

For example it looks like you might have

Treatment Startyear

1 1986

2 1985

If so then then you could combine the rule data with the actual data by treatment, and only calculate the value for years greater than or equal to the start year.

- Mark as New
- Bookmark
- Subscribe
- Subscribe to RSS Feed
- Highlight
- Email to a Friend
- Report Inappropriate Content

Posted in reply to ballardw

12-23-2016 01:42 PM

- Mark as New
- Bookmark
- Subscribe
- Subscribe to RSS Feed
- Highlight
- Email to a Friend
- Report Inappropriate Content

Posted in reply to Jennifer925

12-23-2016 06:02 PM

Jennifer:

This is the same problem you submitted earlier, but with treatment as a by-group. Think about the logic of the single treatment solution that you accepted:

**data** want (drop=n_nonzeroes);

set have;

n_nonzeroes+(x^=**0**);

if n_nonzeroes=**0** then delete; /* No input for std until first non-zero record*/

std3yr=std(x,lag(x),lag2(x));

if lag2(n_nonzeroes)>**0**; /* No output until 2nd record following first non-zero*/

**run**;

All you have to do is

- Reset n_nonzeroes to 0 at the beginning of each BY group - a perfectly ordinary technique of SAS by-group programming.
- And change the LAG2 test to include a test for same treatment. The latter is neccessary because otherwise the "lag2(n_nonzeroes)>0" test for a given treatment could be contaminated by results from the prior treatment.

Thebelow show the changes:**bold-italics**

**data** want (drop=n_nonzeroes);

set have;

* by treatment;*

* if first.treatment=1 then n_nonzeroes=0;*

n_nonzeroes+(x^=**0**);

if n_nonzeroes=**0** then delete;

std3yr=std(x,lag(x),lag2(x));

if lag2(n_nonzeroes)>**0**** and lag2(treatment)=treatment **; /* No output until 2nd record following first non-zero */

**run**;

I notice that unlike your original question, your WANT sample now includes the records with leading zeroes in each by group. That actually makes the program a bit more compact, since there are no subsetting IF's or DELETE statements.:

**data** want2 (drop=n_nonzeroes);

set have;

by treatment;

if first.treatment=**1** then n_nonzeroes=**0**;

n_nonzeroes+(x^=**0**);

std3yr=ifn(lag2(n_nonzeroes)>**0** and lag2(treatment)=treatment,std(x,lag(x),lag2(x)),**.**);

**run**;

Solution

01-03-2017
02:38 PM

- Mark as New
- Bookmark
- Subscribe
- Subscribe to RSS Feed
- Highlight
- Email to a Friend
- Report Inappropriate Content

Posted in reply to Jennifer925

12-23-2016 11:26 PM