identifying first observations

Reply
Contributor
Posts: 29

identifying first observations

Hi, i have such dataset:

id time timestamp

1 1 123

1 1 124

1 1 125

1 2 126

1 2 127

1 2 128

1 3 129

1 3 130

1 3 131

2 1 132

2 1 133

2 1 134

2 2 135

2 2 136

2 2 137

2 3 138

2 3 139

2 3 140

...

...

~14mln ovesrvations in total. time is minute of hour, timestamp is systemdate when observation was took. id is some kind of observation name.

i need to get one more collumn so the data would look like:

id time timestamp first_time

1 1 123 1

1 1 124 0

1 1 125 0

1 2 126 1

1 2 127 0

1 2 128 0

1 3 129 1

1 3 130 0

1 3 131 0

2 1 132 1

2 1 133 0

2 1 134 0

2 2 135 1

2 2 136 0

2 2 137 0

2 3 138 1

2 3 139 0

2 3 140 0

basicaly each ID was observd 3 times during every minute, and it repeatedly done done whole hour. i need to recognize observations that were took in the begining of each minute for all ID's and for all minutes. could anyone help with that?

the only way i see so far is creating new data set like:

proc sort data = dataset; by id; run;

data dataset_new; by event_id;

     if first.id and time = 1 then first_time = 1;

run;

ant then use proc sql to join these tables. then repeat it 60 times for all minutes. have any more simple solution? thanks.

Contributor
Posts: 29

Re: identifying first observations

anyone?

Super User
Posts: 10,046

Re: identifying first observations

That is easy. Not test code.

proc sort data = dataset; by id time ; run;

data dataset_new;

set dataset;

by id time ;

     if first.time then first_time = 1;

run;

Contributor
Posts: 29

Re: identifying first observations

Hi, i got home after work yesterday and tried the code but it didnt work..

the statement

set dataset;

by id time ;

just gives error "by variables are not properly sorted in dataset"... i did sort by id time; before that.. nothing.. have any idea why the error came up?

Super Contributor
Posts: 349

Re: identifying first observations

Hi,

Try this...

data one;

input id time timestamp;

cards;

1 1 123

1 1 124

1 1 125

1 2 126

1 2 127

1 2 128

1 3 129

1 3 130

1 3 131

2 1 132

2 1 133

2 1 134

2 2 135

2 2 136

2 2 137

2 3 138

2 3 139

2 3 140

;

run;

data want;

set one;

by id time;

if first.id or first.time then   first_time =1;

else first_time =0;

run;

Thanks,

Shiva

Contributor
Posts: 29

Re: identifying first observations

i tried this code and it totaly works with this sample data set. but my original dataset contains 18 (3 of them are id timestamp and time) cullumns ant 14 000 000 rows and when i put the same code with my datafiles names it crashes and tells that "by variables are not propely sorted" although i do proc sort data = datafile; by id time; run; before that.. have any idea why?

Super Contributor
Posts: 282

Re: identifying first observations

Hi,

Check if the sort step completed successfully. In the past I think I remember seeing a message in the log saying the input SAS data set was not replaced when a step did not complete successfully or there was a space issue.

Regards,

Amir.

Super User
Posts: 5,518

Re: identifying first observations

I'm not sure that the problem is being properly attacked.  Here is my take on what needs to be done.  This step can likely be skipped, since the data may already be in order:

proc sort data=have;

   by timestamp;

run;

Then it looks like this is what you are trying to achieve:

data want;

   set have;

   by time notsorted;

   first_time = first.time;

run;

Good luck.

Contributor
Posts: 29

Re: identifying first observations


Thanks a lot!

i did

proc sort data = dataset; by id time ; run;

but when i was setting data in

data dataset_new;

set dataset;

i just used "by time;" and i didnt realize that solution is simple as that. thanks a lot guys.

Contributor
Posts: 29

Re: identifying first observations

as i found out the issue was with the dataset size i changed directories and filenames and everything worked with the first solution offered by Ksharp. thanks for all!

Ask a Question
Discussion stats
  • 9 replies
  • 532 views
  • 0 likes
  • 5 in conversation