Hi, i have such dataset:
id time timestamp
1 1 123
1 1 124
1 1 125
1 2 126
1 2 127
1 2 128
1 3 129
1 3 130
1 3 131
2 1 132
2 1 133
2 1 134
2 2 135
2 2 136
2 2 137
2 3 138
2 3 139
2 3 140
...
...
~14mln ovesrvations in total. time is minute of hour, timestamp is systemdate when observation was took. id is some kind of observation name.
i need to get one more collumn so the data would look like:
id time timestamp first_time
1 1 123 1
1 1 124 0
1 1 125 0
1 2 126 1
1 2 127 0
1 2 128 0
1 3 129 1
1 3 130 0
1 3 131 0
2 1 132 1
2 1 133 0
2 1 134 0
2 2 135 1
2 2 136 0
2 2 137 0
2 3 138 1
2 3 139 0
2 3 140 0
basicaly each ID was observd 3 times during every minute, and it repeatedly done done whole hour. i need to recognize observations that were took in the begining of each minute for all ID's and for all minutes. could anyone help with that?
the only way i see so far is creating new data set like:
proc sort data = dataset; by id; run;
data dataset_new; by event_id;
if first.id and time = 1 then first_time = 1;
run;
ant then use proc sql to join these tables. then repeat it 60 times for all minutes. have any more simple solution? thanks.
anyone?
That is easy. Not test code.
proc sort data = dataset; by id time ; run;
data dataset_new;
set dataset;
by id time ;
if first.time then first_time = 1;
run;
Hi, i got home after work yesterday and tried the code but it didnt work..
the statement
set dataset;
by id time ;
just gives error "by variables are not properly sorted in dataset"... i did sort by id time; before that.. nothing.. have any idea why the error came up?
Hi,
Try this...
data one;
input id time timestamp;
cards;
1 1 123
1 1 124
1 1 125
1 2 126
1 2 127
1 2 128
1 3 129
1 3 130
1 3 131
2 1 132
2 1 133
2 1 134
2 2 135
2 2 136
2 2 137
2 3 138
2 3 139
2 3 140
;
run;
data want;
set one;
by id time;
if first.id or first.time then first_time =1;
else first_time =0;
run;
Thanks,
Shiva
i tried this code and it totaly works with this sample data set. but my original dataset contains 18 (3 of them are id timestamp and time) cullumns ant 14 000 000 rows and when i put the same code with my datafiles names it crashes and tells that "by variables are not propely sorted" although i do proc sort data = datafile; by id time; run; before that.. have any idea why?
Hi,
Check if the sort step completed successfully. In the past I think I remember seeing a message in the log saying the input SAS data set was not replaced when a step did not complete successfully or there was a space issue.
Regards,
Amir.
I'm not sure that the problem is being properly attacked. Here is my take on what needs to be done. This step can likely be skipped, since the data may already be in order:
proc sort data=have;
by timestamp;
run;
Then it looks like this is what you are trying to achieve:
data want;
set have;
by time notsorted;
first_time = first.time;
run;
Good luck.
Thanks a lot!
i did
proc sort data = dataset; by id time ; run;
but when i was setting data in
data dataset_new;
set dataset;
i just used "by time;" and i didnt realize that solution is simple as that. thanks a lot guys.
as i found out the issue was with the dataset size i changed directories and filenames and everything worked with the first solution offered by Ksharp. thanks for all!
Build your skills. Make connections. Enjoy creative freedom. Maybe change the world. Registration is now open through August 30th. Visit the SAS Hackathon homepage.
Register today!Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.