Hi there, rather than being wordy and confusing I'll just present what I have and what I want;
What I have:
Date | Time | Data |
---|---|---|
Jan 1, 2014 | 7:30 | 123 |
Jan 1, 2014 | 7:31 | 345 |
Jan 1, 2014 | 7:33 | 567 |
Jan 2, 2014 | 7:30 | 456 |
Jan 2, 2014 | 7:32 | 234 |
What I want:
Date | Time | Data |
---|---|---|
Jan 1, 2014 | 7:30 | 123 |
Jan 1, 2014 | 7:31 | 345 |
Jan 1, 2014 | 7:32 | 345 |
Jan 1, 2014 | 7:33 | 567 |
Jan 2, 2014 | 7:30 | 456 |
Jan 2, 2014 | 7:31 | 456 |
Jan 2, 2014 | 7:32 | 234 |
The code I have
proc expand data = last out = addmissobs to = minute method=step;
by date;
id time;
run;
What I got:
Date | Time | Data |
---|---|---|
Jan 1, 2014 | 7:30 | 123 |
Jan 1, 2014 | 7:31 | 123 |
Jan 1, 2014 | 7:32 | 345 |
Jan 1, 2014 | 7:33 | 567 |
Jan 2, 2014 | 7:30 | 456 |
Jan 2, 2014 | 7:31 | 234 |
Jan 2, 2014 | 7:32 | 234 |
When I rerun the code using method=none, I noticed that after inserting the missing minutes, all existing data got lead to the next period. Ie Jan 1 7:30 data became Jan 1 7:31 data. Then it replaces the missing data with the next period data. When it was supposed to replace the missing data with the previous period data.
Please help. Thank you in advance.
Sorry I meant to say it moved my observations ahead to the next period.
data have; input date & anydtdte20. time : time9. data; format date date9. time time8.; cards; Jan 1, 2014 7:30 123 Jan 1, 2014 7:31 345 Jan 1, 2014 7:33 567 Jan 2, 2014 7:30 456 Jan 2, 2014 7:32 234 ; run; data want; merge have have(firstobs=2 rename=(date=_date time=_time) keep=date time); output; if date eq _date then do; do i=time+60 to _time-60; time=i;output; end; end; drop i _:; run;
Xia Keshan
Message was edited by: xia keshan
Hi Xia,
I don't quite get the do i=time+60 to _time-60 part.
I ran the code you provided and I ended up getting back the original data set.
To elaborate further, my data set is about 5G, going for about 3 months, starting at 7:30am to 5pm. I have over 25 variables for each period. I need observations for every minutes. If there is a particular minute missing, I need to generate the missing minute and use observations from the previous minute.
Thanks
Hi Xia,
Rechecked the code, I forgot the firstobs=2.
The new data I obtained lead all the observations. 7:30's observations are now 7:31's observations. It did generate the missing minutes - the missing Jan 1, 2014 7:32 is now there, but there are 10 of them. Not a big issue there, I can always delete replicates later. But biggest issue now is I lost my first observations each day.
60 means a minutes.
You must have some other variables like a group variable ID .
So you'd better post all your data . I am leaving now. See you tomorrow .
My data looks like this
Date Time Quantity ap_0 aqn_0 ........ ap_10 aqn_10
Jan 1, 2014 7:30 5 100.05 85 105.23 56
Jan 1, 2014 7:31 2 100.12 86 123.21 45
Jan 1, 2014 7:33 9 99.12 50 110.11 23
.
.
.
Jan 1, 2014 5:00 25 132.21 80 142.45 90
Jan 2, 2014 7:30 10 95.23 12 100.00 45
Jan 2, 2014 7:32 5 45.00 5 86.21 87
As previously mentioned, I have data stretching up to about 3 months. Each day (except for weekends) starting from 7:30 to 5:00. I want observations for each minute from 7:30 to 5:00. So I'm looking for 570 observations for each day.
My code should be worked since there is only one group variable DATE .
What wrong obs you got ? OR try this one which is almost the same as above .
data have; input date & anydtdte20. time : time9. data quantity ap_0; format date date9. time time8.; cards; Jan 1, 2014 7:30 123 5 2 Jan 1, 2014 7:31 345 2 4 Jan 1, 2014 7:33 567 3 6 Jan 2, 2014 7:30 456 8 1 Jan 2, 2014 7:32 234 9 2 ; run; data want; merge have have(firstobs=2 rename=(date=_date time=_time) keep=date time); t=time; output; if date eq _date then do; do i=time+60 to _time-60; t=i;output; end; end; format t time9.; drop i _: time; run;
Xia Keshan
Xia,
It runs, but it did not catch the missing minutes. After running your code I get back my original dataset basically.
Brian
Oh,I know where is wrong . Sorry . Let me know if it worked .
data have; input date & anydtdte20. time : time9. data quantity ap_0; format date date9. time time8.; cards; Jan 1, 2014 7:30 123 5 2 Jan 1, 2014 7:31 345 2 4 Jan 1, 2014 7:33 567 3 6 Jan 2, 2014 7:30 456 8 1 Jan 2, 2014 7:34 234 9 2 ; run; data want; merge have have(firstobs=2 rename=(date=_date time=_time) keep=date time); t=time; output; if date eq _date then do; do i=time+60 to _time-60 by 60; t=i;output; end; end; format t time9.; drop i _: time; run;
Xia Keshan
Message was edited by: xia keshan
Unfortunately it doesn't.
This is what the original dataset looks like, somewhat;
date | time | Volume |
2006-01-11 | 7:30:55 | 85 |
2006-01-11 | 7:31:59 | 170 |
2006-01-11 | 7:32:59 | 153 |
2006-01-11 | 7:34:45 | 0 |
2006-01-11 | 7:36:59 | 85 |
2006-01-11 | 7:37:52 | 51 |
2006-01-11 | 7:38:38 | 119 |
2006-01-11 | 7:39:59 | 969 |
2006-01-11 | 7:40:57 | 170 |
2006-01-11 | 7:41:59 | 561 |
2006-01-11 | 7:42:58 | 153 |
2006-01-11 | 7:43:53 | 85 |
2006-01-11 | 7:45:59 | 2499 |
2006-01-11 | 7:46:55 | 51 |
2006-01-11 | 7:48:50 | 0 |
2006-01-11 | 7:49:58 | 340 |
2006-01-11 | 7:50:43 | 17 |
2006-01-11 | 7:51:53 | 34 |
2006-01-11 | 7:52:59 | 34 |
2006-01-11 | 7:53:59 | 663 |
2006-01-11 | 7:54:52 | 0 |
2006-01-11 | 7:55:58 | 51 |
2006-01-11 | 7:56:51 | 1547 |
2006-01-11 | 7:57:59 | 459 |
2006-01-11 | 7:58:59 | 34 |
2006-01-11 | 7:59:59 | 0 |
2006-01-11 | 8:00:57 | 0 |
2006-01-11 | 8:01:32 | 0 |
2006-01-11 | 8:02:59 | 527 |
2006-01-11 | 8:04:59 | 17 |
2006-01-11 | 8:05:48 | 391 |
2006-01-11 | 8:06:50 | 68 |
2006-01-11 | 8:07:57 | 0 |
2006-01-11 | 8:09:59 | 969 |
2006-01-11 | 8:10:59 | 136 |
2006-01-11 | 8:11:55 | 85 |
2006-01-11 | 8:12:59 | 680 |
2006-01-11 | 8:13:59 | 51 |
2006-01-11 | 8:14:56 | 34 |
2006-01-11 | 8:15:58 | 0 |
2006-01-11 | 8:16:52 | 476 |
2006-01-11 | 8:17:52 | 102 |
2006-01-11 | 8:18:57 | 85 |
2006-01-11 | 8:19:59 | 221 |
2006-01-11 | 8:20:57 | 765 |
2006-01-11 | 8:21:59 | 1326 |
2006-01-11 | 8:22:59 | 1819 |
2006-01-11 | 8:23:59 | 510 |
2006-01-11 | 8:24:59 | 425 |
2006-01-11 | 8:25:59 | 255 |
2006-01-11 | 8:26:58 | 170 |
2006-01-11 | 8:28:59 | 714 |
2006-01-11 | 8:29:59 | 102 |
2006-03-27 | 7:30:31 | 0 |
2006-03-27 | 7:31:50 | 0 |
2006-03-27 | 7:32:54 | 0 |
2006-03-27 | 7:33:40 | 21 |
2006-03-27 | 7:34:59 | 4 |
2006-03-27 | 7:35:56 | 5 |
2006-03-27 | 7:36:52 | 0 |
2006-03-27 | 7:37:59 | 28 |
2006-03-27 | 7:38:59 | 18 |
2006-03-27 | 7:39:55 | 0 |
2006-03-27 | 7:40:57 | 0 |
2006-03-27 | 7:41:59 | 0 |
2006-03-27 | 7:42:57 | 0 |
2006-03-27 | 7:43:49 | 0 |
2006-03-27 | 7:44:54 | 10 |
2006-03-27 | 7:45:55 | 0 |
2006-03-27 | 7:46:52 | 2 |
2006-03-27 | 7:47:58 | 10 |
2006-03-27 | 7:48:57 | 11 |
2006-03-27 | 7:50:57 | 10 |
2006-03-27 | 7:51:45 | 0 |
2006-03-27 | 7:52:59 | 28 |
2006-03-27 | 7:53:48 | 1 |
2006-03-27 | 7:54:38 | 0 |
2006-03-27 | 7:55:50 | 5 |
2006-03-27 | 7:56:59 | 9 |
2006-03-27 | 7:58:59 | 3 |
2006-03-27 | 7:59:59 | 0 |
2006-03-27 | 8:00:45 | 7 |
2006-03-27 | 8:01:43 | 0 |
2006-03-27 | 8:02:59 | 2 |
2006-03-27 | 8:03:56 | 16 |
2006-03-27 | 8:04:47 | 6 |
2006-03-27 | 8:05:54 | 0 |
2006-03-27 | 8:06:55 | 0 |
2006-03-27 | 8:07:47 | 0 |
2006-03-27 | 8:08:59 | 0 |
2006-03-27 | 8:10:59 | 2 |
2006-03-27 | 8:11:50 | 25 |
2006-03-27 | 8:12:57 | 1 |
2006-03-27 | 8:13:33 | 0 |
2006-03-27 | 8:15:29 | 0 |
2006-03-27 | 8:16:56 | 0 |
2006-03-27 | 8:17:59 | 0 |
2006-03-27 | 8:18:59 | 30 |
2006-03-27 | 8:19:59 | 0 |
2006-03-27 | 8:20:56 | 29 |
2006-03-27 | 8:21:57 | 3 |
2006-03-27 | 8:22:53 | 0 |
2006-03-27 | 8:23:51 | 29 |
2006-03-27 | 8:24:56 | 1 |
2006-03-27 | 8:25:58 | 14 |
2006-03-27 | 8:26:59 | 24 |
2006-03-27 | 8:27:58 | 14 |
2006-03-27 | 8:28:59 | 23 |
2006-03-27 | 8:29:59 | 46 |
I ran your code, because my dataset is really large, I got back something that looks pretty much the same as what I started with. Eg on Jan 11th, I expect to find 7:33 with a volume of 153 and so on and so forth.
Thank you again for continuing to help.
Oh . Your time have the seconds which is not like your original data 7:30 .
data have;
input date & anydtdte20. time : time9. Volume;
time1=hms(hour(time),minute(time),1);
format date date9. time time8.;
cards;
2006-01-11 7:30:55 85
2006-01-11 7:31:59 170
2006-01-11 7:32:59 153
2006-01-11 7:34:45 0
2006-01-11 7:36:59 85
2006-01-11 7:37:52 51
2006-01-11 7:38:38 119
2006-01-11 7:39:59 969
2006-01-11 7:40:57 170
2006-01-11 7:41:59 561
2006-01-11 7:42:58 153
2006-01-11 7:43:53 85
2006-01-11 7:45:59 2499
2006-01-11 7:46:55 51
2006-01-11 7:48:50 0
2006-01-11 7:49:58 340
2006-01-11 7:50:43 17
2006-01-11 7:51:53 34
2006-01-11 7:52:59 34
2006-01-11 7:53:59 663
2006-01-11 7:54:52 0
2006-01-11 7:55:58 51
2006-01-11 7:56:51 1547
2006-01-11 7:57:59 459
2006-01-11 7:58:59 34
2006-01-11 7:59:59 0
2006-01-11 8:00:57 0
2006-01-11 8:01:32 0
2006-01-11 8:02:59 527
2006-01-11 8:04:59 17
2006-01-11 8:05:48 391
2006-01-11 8:06:50 68
2006-01-11 8:07:57 0
2006-01-11 8:09:59 969
2006-01-11 8:10:59 136
2006-01-11 8:11:55 85
2006-01-11 8:12:59 680
2006-01-11 8:13:59 51
2006-01-11 8:14:56 34
2006-01-11 8:15:58 0
2006-01-11 8:16:52 476
2006-01-11 8:17:52 102
2006-01-11 8:18:57 85
2006-01-11 8:19:59 221
2006-01-11 8:20:57 765
2006-01-11 8:21:59 1326
2006-01-11 8:22:59 1819
2006-01-11 8:23:59 510
2006-01-11 8:24:59 425
2006-01-11 8:25:59 255
2006-01-11 8:26:58 170
2006-01-11 8:28:59 714
2006-01-11 8:29:59 102
2006-03-27 7:30:31 0
2006-03-27 7:31:50 0
2006-03-27 7:32:54 0
2006-03-27 7:33:40 21
2006-03-27 7:34:59 4
2006-03-27 7:35:56 5
2006-03-27 7:36:52 0
2006-03-27 7:37:59 28
2006-03-27 7:38:59 18
2006-03-27 7:39:55 0
2006-03-27 7:40:57 0
2006-03-27 7:41:59 0
2006-03-27 7:42:57 0
2006-03-27 7:43:49 0
2006-03-27 7:44:54 10
2006-03-27 7:45:55 0
2006-03-27 7:46:52 2
2006-03-27 7:47:58 10
2006-03-27 7:48:57 11
2006-03-27 7:50:57 10
2006-03-27 7:51:45 0
2006-03-27 7:52:59 28
2006-03-27 7:53:48 1
2006-03-27 7:54:38 0
2006-03-27 7:55:50 5
2006-03-27 7:56:59 9
2006-03-27 7:58:59 3
2006-03-27 7:59:59 0
2006-03-27 8:00:45 7
2006-03-27 8:01:43 0
2006-03-27 8:02:59 2
2006-03-27 8:03:56 16
2006-03-27 8:04:47 6
2006-03-27 8:05:54 0
2006-03-27 8:06:55 0
2006-03-27 8:07:47 0
2006-03-27 8:08:59 0
2006-03-27 8:10:59 2
2006-03-27 8:11:50 25
2006-03-27 8:12:57 1
2006-03-27 8:13:33 0
2006-03-27 8:15:29 0
2006-03-27 8:16:56 0
2006-03-27 8:17:59 0
2006-03-27 8:18:59 30
2006-03-27 8:19:59 0
2006-03-27 8:20:56 29
2006-03-27 8:21:57 3
2006-03-27 8:22:53 0
2006-03-27 8:23:51 29
2006-03-27 8:24:56 1
2006-03-27 8:25:58 14
2006-03-27 8:26:59 24
2006-03-27 8:27:58 14
2006-03-27 8:28:59 23
2006-03-27 8:29:59 46
;
run;
data want;
merge have have(firstobs=2 rename=(date=_date time1=_time1) keep=date time1);
t=time1;
output;
if date eq _date then do;
do i=time1+60 to _time1-60 by 60;
t=i;output;
end;
end;
format t time1 time9.;
drop i _: time1 time;
run;
Xia Keshan
Hello Xia thanks for your continuous assistance. However, it wasn't the seconds that was giving me the problem. Anyway, I found what I was looking for using proc expand, method=step, and extrapolate. Don't have the code ready at the moment but it seems promising.
Thanks for your help.
Ran the code, but not getting what I want. There are still missing minutes. Here's a small chunk of my original dataset
date | time | Volume |
2006-01-11 | 7:30:55 | 85 |
2006-01-11 | 7:31:59 | 170 |
2006-01-11 | 7:32:59 | 153 |
2006-01-11 | 7:34:45 | 0 |
2006-01-11 | 7:36:59 | 85 |
2006-01-11 | 7:37:52 | 51 |
2006-01-11 | 7:38:38 | 119 |
2006-01-11 | 7:39:59 | 969 |
2006-01-11 | 7:40:57 | 170 |
2006-01-11 | 7:41:59 | 561 |
2006-01-11 | 7:42:58 | 153 |
2006-01-11 | 7:43:53 | 85 |
2006-01-11 | 7:45:59 | 2499 |
2006-01-11 | 7:46:55 | 51 |
2006-01-11 | 7:48:50 | 0 |
2006-01-11 | 7:49:58 | 340 |
2006-01-11 | 7:50:43 | 17 |
2006-01-11 | 7:51:53 | 34 |
2006-01-11 | 7:52:59 | 34 |
2006-01-11 | 7:53:59 | 663 |
2006-01-11 | 7:54:52 | 0 |
2006-01-11 | 7:55:58 | 51 |
2006-01-11 | 7:56:51 | 1547 |
2006-01-11 | 7:57:59 | 459 |
2006-01-11 | 7:58:59 | 34 |
2006-01-11 | 7:59:59 | 0 |
2006-01-11 | 8:00:57 | 0 |
2006-01-11 | 8:01:32 | 0 |
2006-01-11 | 8:02:59 | 527 |
2006-01-11 | 8:04:59 | 17 |
2006-01-11 | 8:05:48 | 391 |
2006-01-11 | 8:06:50 | 68 |
2006-01-11 | 8:07:57 | 0 |
2006-01-11 | 8:09:59 | 969 |
2006-01-11 | 8:10:59 | 136 |
2006-01-11 | 8:11:55 | 85 |
2006-01-11 | 8:12:59 | 680 |
2006-01-11 | 8:13:59 | 51 |
2006-01-11 | 8:14:56 | 34 |
2006-01-11 | 8:15:58 | 0 |
2006-01-11 | 8:16:52 | 476 |
2006-01-11 | 8:17:52 | 102 |
2006-01-11 | 8:18:57 | 85 |
2006-01-11 | 8:19:59 | 221 |
2006-01-11 | 8:20:57 | 765 |
2006-01-11 | 8:21:59 | 1326 |
2006-01-11 | 8:22:59 | 1819 |
2006-01-11 | 8:23:59 | 510 |
2006-01-11 | 8:24:59 | 425 |
2006-01-11 | 8:25:59 | 255 |
2006-01-11 | 8:26:58 | 170 |
2006-01-11 | 8:28:59 | 714 |
2006-01-11 | 8:29:59 | 102 |
2006-03-27 | 7:30:31 | 0 |
2006-03-27 | 7:31:50 | 0 |
2006-03-27 | 7:32:54 | 0 |
2006-03-27 | 7:33:40 | 21 |
2006-03-27 | 7:34:59 | 4 |
2006-03-27 | 7:35:56 | 5 |
2006-03-27 | 7:36:52 | 0 |
2006-03-27 | 7:37:59 | 28 |
2006-03-27 | 7:38:59 | 18 |
2006-03-27 | 7:39:55 | 0 |
2006-03-27 | 7:40:57 | 0 |
2006-03-27 | 7:41:59 | 0 |
2006-03-27 | 7:42:57 | 0 |
2006-03-27 | 7:43:49 | 0 |
2006-03-27 | 7:44:54 | 10 |
2006-03-27 | 7:45:55 | 0 |
2006-03-27 | 7:46:52 | 2 |
2006-03-27 | 7:47:58 | 10 |
2006-03-27 | 7:48:57 | 11 |
2006-03-27 | 7:50:57 | 10 |
2006-03-27 | 7:51:45 | 0 |
2006-03-27 | 7:52:59 | 28 |
2006-03-27 | 7:53:48 | 1 |
2006-03-27 | 7:54:38 | 0 |
2006-03-27 | 7:55:50 | 5 |
2006-03-27 | 7:56:59 | 9 |
2006-03-27 | 7:58:59 | 3 |
2006-03-27 | 7:59:59 | 0 |
2006-03-27 | 8:00:45 | 7 |
2006-03-27 | 8:01:43 | 0 |
2006-03-27 | 8:02:59 | 2 |
2006-03-27 | 8:03:56 | 16 |
2006-03-27 | 8:04:47 | 6 |
2006-03-27 | 8:05:54 | 0 |
2006-03-27 | 8:06:55 | 0 |
2006-03-27 | 8:07:47 | 0 |
2006-03-27 | 8:08:59 | 0 |
2006-03-27 | 8:10:59 | 2 |
2006-03-27 | 8:11:50 | 25 |
2006-03-27 | 8:12:57 | 1 |
2006-03-27 | 8:13:33 | 0 |
2006-03-27 | 8:15:29 | 0 |
2006-03-27 | 8:16:56 | 0 |
2006-03-27 | 8:17:59 | 0 |
2006-03-27 | 8:18:59 | 30 |
2006-03-27 | 8:19:59 | 0 |
2006-03-27 | 8:20:56 | 29 |
2006-03-27 | 8:21:57 | 3 |
2006-03-27 | 8:22:53 | 0 |
2006-03-27 | 8:23:51 | 29 |
2006-03-27 | 8:24:56 | 1 |
2006-03-27 | 8:25:58 | 14 |
2006-03-27 | 8:26:59 | 24 |
2006-03-27 | 8:27:58 | 14 |
2006-03-27 | 8:28:59 | 23 |
2006-03-27 | 8:29:59 | 46 |
Thanks for continuing to help.
I am pretty sure there is a much efficient way of doing it, however this is what I came up with
proc expand data = last out = addmissobs to = minute method=none; *identify missing minutes;
by date;
id time;
data addmissobs; *observations are 'misplaced' after proc expand command. The following 3 data steps are to realign observations back to its original time period, and keep the relevant observations;
merge addmissobs last;
by date time;
output;
data addmissobs;
set addmissobs;
time = timepart(time);
hour = hour(time);
min = minute(time);
data addmissobs;
set addmissobs;
by date hour min;
first=first.min;
last=last.min;
if first=1 and last=0 then delete;
drop first last;
run;
I am still very new to SAS and is very much interested in learning more. If someone has a much efficient code please do post.
Xian, the merge statement is definitely something that I've overlooked. Thanks again.
Are you ready for the spotlight? We're accepting content ideas for SAS Innovate 2025 to be held May 6-9 in Orlando, FL. The call is open until September 25. Read more here about why you should contribute and what is in it for you!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.