turn on suggestions

Auto-suggest helps you quickly narrow down your search results by suggesting possible matches as you type.

Showing results for

Find a Community

- Home
- /
- SAS Programming
- /
- Base SAS Programming
- /
- identifying the earliest event

Topic Options

- Subscribe to RSS Feed
- Mark Topic as New
- Mark Topic as Read
- Float this Topic for Current User
- Bookmark
- Subscribe
- Printer Friendly Page

- Mark as New
- Bookmark
- Subscribe
- Subscribe to RSS Feed
- Permalink
- Email to a Friend
- Report Inappropriate Content

11-25-2012 07:16 PM

I have a data set which contains people and their activity over a period of 4 days. The activity of people changes over these periods.

for example some people have been active from the beginning, some just was active in the last period and some were active in the beginning and the end only.

The data set looks like this :

Person Activitydate

---------- ---------------

A 1

A 2

A 3

A 4

B 3

B 4

C 1

C 4

D 3

E 4

which means person A has been active in all four days, person B has been active in days 3 and 4. C has been active in day 1 and 3 and D and E has been active in days 3 and 4 respectively

I want to define a new variable such as 'tenure' which shows the number of days that people have been active in days before. So the new data set should look like

Person Activitydate Tenure

---------- --------------- --------

A 1 0

A 2 1

A 3 2

A 4 3

B 3 0

B 4 1

C 1 0

C 4 1

D 3 0

E 4 0

Please note that people C and D will have no prior experience since they started late, and thus the value of tenure will be zero for both of them despite the fact that their activity dates are 3 and 4

Any hints and solutions are greatly appreciated

Thanks for reading this post

Accepted Solutions

Solution

11-26-2012
01:44 AM

- Mark as New
- Bookmark
- Subscribe
- Subscribe to RSS Feed
- Permalink
- Email to a Friend
- Report Inappropriate Content

Posted in reply to niam

11-26-2012 01:44 AM

Assuming your data has already been sorted.

data x; input person $ date; cards; a 1 a 2 a 3 a 4 b 3 b 4 ; run; data x; set x; if person ne lag(person) then tenure=-1; tenure+1; run;

Ksharp

All Replies

- Mark as New
- Bookmark
- Subscribe
- Subscribe to RSS Feed
- Permalink
- Email to a Friend
- Report Inappropriate Content

Posted in reply to niam

11-25-2012 07:36 PM

If dates are simple numbers as in your example, you can use

**proc sql;**

**create table want as**

**select person, activityDate, activityDate-min(activityDate) as tenure**

**from have**

**group by person;**

**quit;**

PG

This simple query gives the WRONG result. See Haikuo Bians' note below.

PG

- Mark as New
- Bookmark
- Subscribe
- Subscribe to RSS Feed
- Permalink
- Email to a Friend
- Report Inappropriate Content

Posted in reply to PGStats

11-25-2012 09:21 PM

PG, your solution works best if OP 's request counts calendar days instead of actual activity days. For that, DOW can offer a simple approach:

data want;

do tenure=0 by 1 until (last.person);

set have;

by person notsorted;

output;

end;

run;

Haikuo

Update:

If forcing me to use proc sql, I would have to rely on some undocumented feature:

proc sql;

select *, monotonic()-min(monotonic()) as tunure from have group by person;quit;

Solution

11-26-2012
01:44 AM

- Mark as New
- Bookmark
- Subscribe
- Subscribe to RSS Feed
- Permalink
- Email to a Friend
- Report Inappropriate Content

Posted in reply to niam

11-26-2012 01:44 AM

Assuming your data has already been sorted.

data x; input person $ date; cards; a 1 a 2 a 3 a 4 b 3 b 4 ; run; data x; set x; if person ne lag(person) then tenure=-1; tenure+1; run;

Ksharp

- Mark as New
- Bookmark
- Subscribe
- Subscribe to RSS Feed
- Permalink
- Email to a Friend
- Report Inappropriate Content

Posted in reply to niam

11-26-2012 12:51 PM

This isn't all that different from Ksharp's solution. I just prefer a slightly different set of tools because LAG can be tricky and because this version warns you if the data are not in order.

data want;

set have;

by person;

if first.person then tenure=0;

else tenure + 1;

run;

Optionally, you could extend the BY statement to verify the order even further:

by person Activitydate;

Good luck.