BookmarkSubscribeRSS Feed
Brian_Chee
Calcite | Level 5

Hi there, rather than being wordy and confusing I'll just present what I have and what I want;

What I have:

DateTimeData
Jan 1, 20147:30123
Jan 1, 20147:31345
Jan 1, 20147:33567
Jan 2, 20147:30456
Jan 2, 20147:32234

What I want:

DateTimeData
Jan 1, 20147:30123
Jan 1, 20147:31345
Jan 1, 20147:32345
Jan 1, 20147:33567
Jan 2, 20147:30456
Jan 2, 20147:31456
Jan 2, 20147:32234

The code I have

proc expand data = last out = addmissobs to = minute method=step;

  by date;

  id time;

run;

What I got:

DateTimeData
Jan 1, 20147:30123
Jan 1, 20147:31123
Jan 1, 20147:32345
Jan 1, 20147:33567
Jan 2, 20147:30456
Jan 2, 20147:31234
Jan 2, 20147:32234

When I rerun the code using method=none, I noticed that after inserting the missing minutes, all existing data got lead to the next period. Ie Jan 1 7:30 data became Jan 1 7:31 data. Then it replaces the missing data with the next period data. When it was supposed to replace the missing data with the previous period data.

Please help. Thank you in advance.

14 REPLIES 14
Brian_Chee
Calcite | Level 5

Sorry I meant to say it moved my observations ahead to the next period.

Ksharp
Super User
data have;
input date & anydtdte20. time : time9. data;
format date date9. time time8.;
cards;
Jan 1, 2014  7:30 123
Jan 1, 2014  7:31 345
Jan 1, 2014  7:33 567
Jan 2, 2014  7:30 456
Jan 2, 2014  7:32 234
;
run;
data want;
merge have have(firstobs=2 rename=(date=_date time=_time) keep=date time);
output;
if date eq _date then do;
  do i=time+60 to _time-60;
   time=i;output;
  end;
end;
drop i _:;
run;



Xia Keshan

Message was edited by: xia keshan

Brian_Chee
Calcite | Level 5

Hi Xia,

I don't quite get the do i=time+60 to _time-60 part.

I ran the code you provided and I ended up getting back the original data set.

To elaborate further, my data set is about 5G, going for about 3 months, starting at 7:30am to 5pm. I have over 25 variables for each period. I need observations for every minutes. If there is a particular minute missing, I need to generate the missing minute and use observations from the previous minute.

Thanks

Brian_Chee
Calcite | Level 5

Hi Xia,

Rechecked the code, I forgot the firstobs=2.

The new data I obtained lead all the observations. 7:30's observations are now 7:31's observations. It did generate the missing minutes - the missing Jan 1, 2014 7:32 is now there, but there are 10 of them. Not a big issue there, I can always delete replicates later. But biggest issue now is I lost my first observations each day.

Ksharp
Super User

60 means a minutes.

You must have some other variables like a group variable ID .

So you'd better post all your data . I am leaving now. See you tomorrow .

Brian_Chee
Calcite | Level 5

My data looks like this

Date               Time      Quantity     ap_0     aqn_0     ........     ap_10     aqn_10

Jan 1, 2014     7:30          5          100.05     85                    105.23          56

Jan 1, 2014     7:31          2          100.12     86                    123.21          45

Jan 1, 2014     7:33          9          99.12       50                    110.11          23

.

.

.

Jan 1, 2014     5:00          25        132.21     80                    142.45             90

Jan 2, 2014     7:30          10         95.23     12                    100.00              45

Jan 2, 2014     7:32          5           45.00     5                       86.21              87

As previously mentioned, I have data stretching up to about 3 months. Each day (except for weekends) starting from 7:30 to 5:00. I want observations for each minute from 7:30 to 5:00. So I'm looking for 570 observations for each day.

Ksharp
Super User

My code should be worked since there is only one group variable DATE .

What wrong obs you got ? OR try this one which is almost the same as above .

data have;
input date & anydtdte20. time : time9. data quantity ap_0;
format date date9. time time8.;
cards;
Jan 1, 2014   7:30 123 5 2
Jan 1, 2014   7:31 345 2 4
Jan 1, 2014   7:33 567 3 6
Jan 2, 2014   7:30 456 8 1
Jan 2, 2014   7:32 234 9 2
;
run;
data want;
merge have have(firstobs=2 rename=(date=_date time=_time) keep=date time);
t=time;
output;
if date eq _date then do;
  do i=time+60 to _time-60;
   t=i;output;
  end;
end;
format t time9.;
drop i _: time;
run;
 

Xia Keshan

Brian_Chee
Calcite | Level 5

Xia,

It runs, but it did not catch the missing minutes. After running your code I get back my original dataset basically.

Brian

Ksharp
Super User

Oh,I know where is wrong . Sorry . Let me know if it worked .

data have;
input date & anydtdte20. time : time9. data quantity ap_0;
format date date9. time time8.;
cards;
Jan 1, 2014   7:30 123 5 2
Jan 1, 2014   7:31 345 2 4
Jan 1, 2014   7:33 567 3 6
Jan 2, 2014   7:30 456 8 1
Jan 2, 2014   7:34 234 9 2
;
run;
data want;
merge have have(firstobs=2 rename=(date=_date time=_time) keep=date time);
t=time;
output;
if date eq _date then do;
  do i=time+60 to _time-60 by 60;
   t=i;output;
  end;
end;
format t time9.;
drop i _: time;
run;

Xia Keshan

Message was edited by: xia keshan

Brian_Chee
Calcite | Level 5

Unfortunately it doesn't.

This is what the original dataset looks like, somewhat;

datetimeVolume
2006-01-117:30:5585
2006-01-117:31:59170
2006-01-117:32:59153
2006-01-117:34:450
2006-01-117:36:5985
2006-01-117:37:5251
2006-01-117:38:38119
2006-01-117:39:59969
2006-01-117:40:57170
2006-01-117:41:59561
2006-01-117:42:58153
2006-01-117:43:5385
2006-01-117:45:592499
2006-01-117:46:5551
2006-01-117:48:500
2006-01-117:49:58340
2006-01-117:50:4317
2006-01-117:51:5334
2006-01-117:52:5934
2006-01-117:53:59663
2006-01-117:54:520
2006-01-117:55:5851
2006-01-117:56:511547
2006-01-117:57:59459
2006-01-117:58:5934
2006-01-117:59:590
2006-01-118:00:570
2006-01-118:01:320
2006-01-118:02:59527
2006-01-118:04:5917
2006-01-118:05:48391
2006-01-118:06:5068
2006-01-118:07:570
2006-01-118:09:59969
2006-01-118:10:59136
2006-01-118:11:5585
2006-01-118:12:59680
2006-01-118:13:5951
2006-01-118:14:5634
2006-01-118:15:580
2006-01-118:16:52476
2006-01-118:17:52102
2006-01-118:18:5785
2006-01-118:19:59221
2006-01-118:20:57765
2006-01-118:21:591326
2006-01-118:22:591819
2006-01-118:23:59510
2006-01-118:24:59425
2006-01-118:25:59255
2006-01-118:26:58170
2006-01-118:28:59714
2006-01-118:29:59102
2006-03-277:30:310
2006-03-277:31:500
2006-03-277:32:540
2006-03-277:33:4021
2006-03-277:34:594
2006-03-277:35:565
2006-03-277:36:520
2006-03-277:37:5928
2006-03-277:38:5918
2006-03-277:39:550
2006-03-277:40:570
2006-03-277:41:590
2006-03-277:42:570
2006-03-277:43:490
2006-03-277:44:5410
2006-03-277:45:550
2006-03-277:46:522
2006-03-277:47:5810
2006-03-277:48:5711
2006-03-277:50:5710
2006-03-277:51:450
2006-03-277:52:5928
2006-03-277:53:481
2006-03-277:54:380
2006-03-277:55:505
2006-03-277:56:599
2006-03-277:58:593
2006-03-277:59:590
2006-03-278:00:457
2006-03-278:01:430
2006-03-278:02:592
2006-03-278:03:5616
2006-03-278:04:476
2006-03-278:05:540
2006-03-278:06:550
2006-03-278:07:470
2006-03-278:08:590
2006-03-278:10:592
2006-03-278:11:5025
2006-03-278:12:571
2006-03-278:13:330
2006-03-278:15:290
2006-03-278:16:560
2006-03-278:17:590
2006-03-278:18:5930
2006-03-278:19:590
2006-03-278:20:5629
2006-03-278:21:573
2006-03-278:22:530
2006-03-278:23:5129
2006-03-278:24:561
2006-03-278:25:5814
2006-03-278:26:5924
2006-03-278:27:5814
2006-03-278:28:5923
2006-03-278:29:5946

I ran your code, because my dataset is really large, I got back something that looks pretty much the same as what I started with. Eg on Jan 11th, I expect to find 7:33 with a volume of 153 and so on and so forth.

Thank you again for continuing to help.

Ksharp
Super User

Oh . Your time have the seconds which is not like your original data 7:30 .

data have;

input date & anydtdte20. time : time9. Volume;

time1=hms(hour(time),minute(time),1);

format date date9. time time8.;

cards;

2006-01-11 7:30:55 85

2006-01-11 7:31:59 170

2006-01-11 7:32:59 153

2006-01-11 7:34:45 0

2006-01-11 7:36:59 85

2006-01-11 7:37:52 51

2006-01-11 7:38:38 119

2006-01-11 7:39:59 969

2006-01-11 7:40:57 170

2006-01-11 7:41:59 561

2006-01-11 7:42:58 153

2006-01-11 7:43:53 85

2006-01-11 7:45:59 2499

2006-01-11 7:46:55 51

2006-01-11 7:48:50 0

2006-01-11 7:49:58 340

2006-01-11 7:50:43 17

2006-01-11 7:51:53 34

2006-01-11 7:52:59 34

2006-01-11 7:53:59 663

2006-01-11 7:54:52 0

2006-01-11 7:55:58 51

2006-01-11 7:56:51 1547

2006-01-11 7:57:59 459

2006-01-11 7:58:59 34

2006-01-11 7:59:59 0

2006-01-11 8:00:57 0

2006-01-11 8:01:32 0

2006-01-11 8:02:59 527

2006-01-11 8:04:59 17

2006-01-11 8:05:48 391

2006-01-11 8:06:50 68

2006-01-11 8:07:57 0

2006-01-11 8:09:59 969

2006-01-11 8:10:59 136

2006-01-11 8:11:55 85

2006-01-11 8:12:59 680

2006-01-11 8:13:59 51

2006-01-11 8:14:56 34

2006-01-11 8:15:58 0

2006-01-11 8:16:52 476

2006-01-11 8:17:52 102

2006-01-11 8:18:57 85

2006-01-11 8:19:59 221

2006-01-11 8:20:57 765

2006-01-11 8:21:59 1326

2006-01-11 8:22:59 1819

2006-01-11 8:23:59 510

2006-01-11 8:24:59 425

2006-01-11 8:25:59 255

2006-01-11 8:26:58 170

2006-01-11 8:28:59 714

2006-01-11 8:29:59 102

2006-03-27 7:30:31 0

2006-03-27 7:31:50 0

2006-03-27 7:32:54 0

2006-03-27 7:33:40 21

2006-03-27 7:34:59 4

2006-03-27 7:35:56 5

2006-03-27 7:36:52 0

2006-03-27 7:37:59 28

2006-03-27 7:38:59 18

2006-03-27 7:39:55 0

2006-03-27 7:40:57 0

2006-03-27 7:41:59 0

2006-03-27 7:42:57 0

2006-03-27 7:43:49 0

2006-03-27 7:44:54 10

2006-03-27 7:45:55 0

2006-03-27 7:46:52 2

2006-03-27 7:47:58 10

2006-03-27 7:48:57 11

2006-03-27 7:50:57 10

2006-03-27 7:51:45 0

2006-03-27 7:52:59 28

2006-03-27 7:53:48 1

2006-03-27 7:54:38 0

2006-03-27 7:55:50 5

2006-03-27 7:56:59 9

2006-03-27 7:58:59 3

2006-03-27 7:59:59 0

2006-03-27 8:00:45 7

2006-03-27 8:01:43 0

2006-03-27 8:02:59 2

2006-03-27 8:03:56 16

2006-03-27 8:04:47 6

2006-03-27 8:05:54 0

2006-03-27 8:06:55 0

2006-03-27 8:07:47 0

2006-03-27 8:08:59 0

2006-03-27 8:10:59 2

2006-03-27 8:11:50 25

2006-03-27 8:12:57 1

2006-03-27 8:13:33 0

2006-03-27 8:15:29 0

2006-03-27 8:16:56 0

2006-03-27 8:17:59 0

2006-03-27 8:18:59 30

2006-03-27 8:19:59 0

2006-03-27 8:20:56 29

2006-03-27 8:21:57 3

2006-03-27 8:22:53 0

2006-03-27 8:23:51 29

2006-03-27 8:24:56 1

2006-03-27 8:25:58 14

2006-03-27 8:26:59 24

2006-03-27 8:27:58 14

2006-03-27 8:28:59 23

2006-03-27 8:29:59 46

;

run;

data want;

merge have have(firstobs=2 rename=(date=_date time1=_time1) keep=date time1);

t=time1;

output;

if date eq _date then do;

  do i=time1+60 to _time1-60 by 60;

   t=i;output;

  end;

end;

format t time1 time9.;

drop i _: time1 time;

run;

Xia Keshan

Brian_Chee
Calcite | Level 5

Hello Xia thanks for your continuous assistance. However, it wasn't the seconds that was giving me the problem. Anyway, I found what I was looking for using proc expand, method=step, and extrapolate. Don't have the code ready at the moment but it seems promising.


Thanks for your help.

Brian_Chee
Calcite | Level 5

Ran the code, but not getting what I want. There are still missing minutes. Here's a small chunk of my original dataset

datetimeVolume
2006-01-117:30:5585
2006-01-117:31:59170
2006-01-117:32:59153
2006-01-117:34:450
2006-01-117:36:5985
2006-01-117:37:5251
2006-01-117:38:38119
2006-01-117:39:59969
2006-01-117:40:57170
2006-01-117:41:59561
2006-01-117:42:58153
2006-01-117:43:5385
2006-01-117:45:592499
2006-01-117:46:5551
2006-01-117:48:500
2006-01-117:49:58340
2006-01-117:50:4317
2006-01-117:51:5334
2006-01-117:52:5934
2006-01-117:53:59663
2006-01-117:54:520
2006-01-117:55:5851
2006-01-117:56:511547
2006-01-117:57:59459
2006-01-117:58:5934
2006-01-117:59:590
2006-01-118:00:570
2006-01-118:01:320
2006-01-118:02:59527
2006-01-118:04:5917
2006-01-118:05:48391
2006-01-118:06:5068
2006-01-118:07:570
2006-01-118:09:59969
2006-01-118:10:59136
2006-01-118:11:5585
2006-01-118:12:59680
2006-01-118:13:5951
2006-01-118:14:5634
2006-01-118:15:580
2006-01-118:16:52476
2006-01-118:17:52102
2006-01-118:18:5785
2006-01-118:19:59221
2006-01-118:20:57765
2006-01-118:21:591326
2006-01-118:22:591819
2006-01-118:23:59510
2006-01-118:24:59425
2006-01-118:25:59255
2006-01-118:26:58170
2006-01-118:28:59714
2006-01-118:29:59102
2006-03-277:30:310
2006-03-277:31:500
2006-03-277:32:540
2006-03-277:33:4021
2006-03-277:34:594
2006-03-277:35:565
2006-03-277:36:520
2006-03-277:37:5928
2006-03-277:38:5918
2006-03-277:39:550
2006-03-277:40:570
2006-03-277:41:590
2006-03-277:42:570
2006-03-277:43:490
2006-03-277:44:5410
2006-03-277:45:550
2006-03-277:46:522
2006-03-277:47:5810
2006-03-277:48:5711
2006-03-277:50:5710
2006-03-277:51:450
2006-03-277:52:5928
2006-03-277:53:481
2006-03-277:54:380
2006-03-277:55:505
2006-03-277:56:599
2006-03-277:58:593
2006-03-277:59:590
2006-03-278:00:457
2006-03-278:01:430
2006-03-278:02:592
2006-03-278:03:5616
2006-03-278:04:476
2006-03-278:05:540
2006-03-278:06:550
2006-03-278:07:470
2006-03-278:08:590
2006-03-278:10:592
2006-03-278:11:5025
2006-03-278:12:571
2006-03-278:13:330
2006-03-278:15:290
2006-03-278:16:560
2006-03-278:17:590
2006-03-278:18:5930
2006-03-278:19:590
2006-03-278:20:5629
2006-03-278:21:573
2006-03-278:22:530
2006-03-278:23:5129
2006-03-278:24:561
2006-03-278:25:5814
2006-03-278:26:5924
2006-03-278:27:5814
2006-03-278:28:5923
2006-03-278:29:5946

Thanks for continuing to help.

Brian_Chee
Calcite | Level 5

I am pretty sure there is a much efficient way of doing it, however this is what I came up with

proc expand data = last out = addmissobs to = minute method=none; *identify missing minutes;

  by date;

  id time;

data addmissobs; *observations are 'misplaced' after proc expand command. The following 3 data steps are to realign observations back to its original time period, and keep the relevant observations;

  merge addmissobs last;

  by date time;

  output;

data addmissobs;

  set addmissobs;

  time = timepart(time);

  hour = hour(time);

  min = minute(time);

data addmissobs;

  set addmissobs;

  by date hour min;

  first=first.min;

  last=last.min;

  if first=1 and last=0 then delete;

  drop first last;

run;

I am still very new to SAS and is very much interested in learning more. If someone has a much efficient code please do post.

Xian, the merge statement is definitely something that I've overlooked. Thanks again.

sas-innovate-2024.png

Join us for SAS Innovate April 16-19 at the Aria in Las Vegas. Bring the team and save big with our group pricing for a limited time only.

Pre-conference courses and tutorials are filling up fast and are always a sellout. Register today to reserve your seat.

 

Register now!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 14 replies
  • 1124 views
  • 3 likes
  • 2 in conversation