turn on suggestions

Auto-suggest helps you quickly narrow down your search results by suggesting possible matches as you type.

Showing results for

Find a Community

- Home
- /
- Data Management
- /
- Forum
- /
- Creating intervals from adjacent observations

Topic Options

- RSS Feed
- Mark Topic as New
- Mark Topic as Read
- Float this Topic for Current User
- Bookmark
- Subscribe
- Printer Friendly Page

- Mark as New
- Bookmark
- Subscribe
- RSS Feed
- Permalink
- Email to a Friend
- Report Inappropriate Content

01-05-2017 01:37 PM

Hi, I'm looking for an efficient way to transform data describing a step function from a format where observations give the locations of "jumps" to a format that has intervals where the step function is constant. I want to define step functions to start at 0 with a value of 0 (unless a different value is specified in the input dataset). I usually have success searching SAS forums using Google or the like, but unfortunately not this time! I'm using SAS V9.4.

For example, let's say I have data like the following:

```
data test;
input id x y;
cards;
1 0 20
1 5 15
1 10 17
2 1 5
2 10 2
3 7 8
;
run;
```

Here *id* is identifies a particular step function (there could be more than one defined in a particular dataset), *x* is the location of a jump and *y* is the value the step function takes on at that jump (technically, I guess I'm imagining that this function is right-continuous). So, function 1 starts at 0 with a value of 20, is constant until x=5 when it jumps to a value of 15, stays constant until x=10 when it assumes a value of 17 and is constant thereafter.

I'd like to end up with a dataset like:

data test2; input id xstart xstop y; cards; 1 0 5 20 1 5 10 15 1 10 . 17 2 0 1 0 2 1 10 5 2 10 . 2 3 0 7 0 3 7 . 8 ; run;

Where *id* is the same id as in the previous dataset, but xstart to xstop defines intervals where the step function is constant and *y* is the value of the function during a particular interval.

The solutions I've tried become very complex and I'd be happy for any pointers towards efficient solutions using either a data step or PROC SQL (or both).

Thanks!

Accepted Solutions

Solution

01-05-2017
07:45 PM

- Mark as New
- Bookmark
- Subscribe
- RSS Feed
- Permalink
- Email to a Friend
- Report Inappropriate Content

Posted in reply to ruser

01-05-2017 03:07 PM

Here's a DATA step approach. It might be complex, but at least it's short:

data want;

set test (rename=(x=xstart));

by id;

set test (firstobs=2 keep=x rename=(x=xstop)) test (drop=_all_);

if last.id then xstop=.;

run;

It does the heavy lifting (getting XSTART and XSTOP), but not all of the work. It doesn't insert the 0 steps.

All Replies

Solution

01-05-2017
07:45 PM

- Mark as New
- Bookmark
- Subscribe
- RSS Feed
- Permalink
- Email to a Friend
- Report Inappropriate Content

Posted in reply to ruser

01-05-2017 03:07 PM

Here's a DATA step approach. It might be complex, but at least it's short:

data want;

set test (rename=(x=xstart));

by id;

set test (firstobs=2 keep=x rename=(x=xstop)) test (drop=_all_);

if last.id then xstop=.;

run;

It does the heavy lifting (getting XSTART and XSTOP), but not all of the work. It doesn't insert the 0 steps.

- Mark as New
- Bookmark
- Subscribe
- RSS Feed
- Permalink
- Email to a Friend
- Report Inappropriate Content

Posted in reply to Astounding

01-05-2017 07:46 PM

This is great! My solutions were much longer (and didn't work exactly as I wanted). Thanks.

- Mark as New
- Bookmark
- Subscribe
- RSS Feed
- Permalink
- Email to a Friend
- Report Inappropriate Content

Posted in reply to ruser

01-05-2017 08:43 PM

another approach

```
data want2;
num=_n_+1;
set test ;
if nobs>=num then set test(keep=id x rename=(id=id2 x=xstop)) nobs=nobs point=num;
if id^=id2 then call missing(xstop);
drop id2;
run;
```

Thanks,

Jag

Jag