Conditionally insert rows in a table

Accepted Solution Solved
Reply
Occasional Contributor
Posts: 5
Accepted Solution

Conditionally insert rows in a table

Hi everyone,

I need some help. I have a massive dataset (tick by tick data). The columns of the table are: Dates(i.e.different dates), Hour, Minutes, Price, Volume. My problem is that some minutes are missing, so I want to insert them in the data. For instance I have data for minute 25 26 28 30 35 and I want to add rows for minute 27, 29, 31, 32,34, 33 etc for each hour and each date. The data for the price and volume  in the new rows should be empty. Could you please give any suggestion? Thanks


Accepted Solutions
Solution
‎05-21-2015 08:47 AM
Super User
Super User
Posts: 7,720

Re: Conditionally insert rows in a table

You have two options.  One is to go through the dataset, and do do-loops where gaps exist.  The other option is to create a template and merge that back to your data.  I will give an example of the second approach:

data have;

  id=1; date="01jan15"d; hour=3; minute=4; price=12; volume=16; output;

  id=1; date="01jan15"d; hour=3; minute=5; price=16; volume=20; output;

  id=1; date="01jan15"d; hour=3; minute=10; price=8; volume=4; output;

  id=1; date="01jan15"d; hour=5; minute=10; price=7; volume=25; output;

  format date date9.;

run;

/* Create a dataset expanded for each ID from minimum to maximum */

proc sql;

  create table TEMPLATE as

  select  ID,

          min(dhms(date,hour,minute,0)) as LOW,

          max(dhms(date,hour,minute,0)) as HIGH

  from    HAVE

  group by ID;

quit;

data template (drop=low high i);

  set template;

  do i=low to high by 60;

    date=datepart(i);

    hour=hour(i);

    minute=minute(i);

    output;

  end;

run;

/* Merge the template onto the data */

proc sql;

  create table WANT as

  select  COALESCE(A.ID,B.ID) as ID,

          COALESCE(A.DATE,B.DATE) as DATE format=date9.,

          COALESCE(A.HOUR,B.HOUR) as HOUR,

          COALESCE(A.MINUTE,B.MINUTE) as MINUTE,

          A.PRICE,

          A.VOLUME

  from    WORK.HAVE A

  full join WORK.TEMPLATE B

  on      A.ID=B.ID

  and     A.DATE=B.DATE

  and     A.HOUR=B.HOUR

  and     A.MINUTE=B.MINUTE;

quit;

View solution in original post


All Replies
Solution
‎05-21-2015 08:47 AM
Super User
Super User
Posts: 7,720

Re: Conditionally insert rows in a table

You have two options.  One is to go through the dataset, and do do-loops where gaps exist.  The other option is to create a template and merge that back to your data.  I will give an example of the second approach:

data have;

  id=1; date="01jan15"d; hour=3; minute=4; price=12; volume=16; output;

  id=1; date="01jan15"d; hour=3; minute=5; price=16; volume=20; output;

  id=1; date="01jan15"d; hour=3; minute=10; price=8; volume=4; output;

  id=1; date="01jan15"d; hour=5; minute=10; price=7; volume=25; output;

  format date date9.;

run;

/* Create a dataset expanded for each ID from minimum to maximum */

proc sql;

  create table TEMPLATE as

  select  ID,

          min(dhms(date,hour,minute,0)) as LOW,

          max(dhms(date,hour,minute,0)) as HIGH

  from    HAVE

  group by ID;

quit;

data template (drop=low high i);

  set template;

  do i=low to high by 60;

    date=datepart(i);

    hour=hour(i);

    minute=minute(i);

    output;

  end;

run;

/* Merge the template onto the data */

proc sql;

  create table WANT as

  select  COALESCE(A.ID,B.ID) as ID,

          COALESCE(A.DATE,B.DATE) as DATE format=date9.,

          COALESCE(A.HOUR,B.HOUR) as HOUR,

          COALESCE(A.MINUTE,B.MINUTE) as MINUTE,

          A.PRICE,

          A.VOLUME

  from    WORK.HAVE A

  full join WORK.TEMPLATE B

  on      A.ID=B.ID

  and     A.DATE=B.DATE

  and     A.HOUR=B.HOUR

  and     A.MINUTE=B.MINUTE;

quit;

Occasional Contributor
Posts: 5

Re: Conditionally insert rows in a table

Thank you so much RW9 for the quick reply. I will try it and come back to you whether it works for my dataset. I run your example and works.

Occasional Contributor
Posts: 5

Re: Conditionally insert rows in a table

Just a quick question. How can I get the ID for each date?Should I do a loop? Thanks

Super User
Super User
Posts: 7,720

Re: Conditionally insert rows in a table

Not sure I am following you?  This section will get you distinct ID's with their first and last dates, its used in the loop over, so if min date is 01jan15 and 05jan15 is max, then it will do each day.

Occasional Contributor
Posts: 5

Re: Conditionally insert rows in a table

I mean in my set I do not have ID column. So, I want first to generate this column for each date. For example I have 01/02/2007, 01/02/2007 etc, 02/02/2007 etc. I understand I should have an ID 1 for 01/02/2007, ID 2 for 02/02/2007 etc in order to implement the code.

Super User
Super User
Posts: 7,720

Re: Conditionally insert rows in a table

Oh, just drop the id bit then:

proc sql;

  create table TEMPLATE as

  select  min(dhms(date,hour,minute,0)) as LOW,

          max(dhms(date,hour,minute,0)) as HIGH

  from    HAVE;

quit;

Occasional Contributor
Posts: 5

Re: Conditionally insert rows in a table

Thank you so much RW9. It worked perfectly. Smiley Happy  I did the necessary adjustments and now I have exactly what I wanted.  I am new to SAS and I have tried everything by myself, so your help saved me hours Smiley Wink. I have been trying to fix this problem for the past two days. :smileyplain:

Super User
Posts: 9,875

Re: Conditionally insert rows in a table

data have;
  id=1; date="01jan15"d; hour=3; minute=4; price=12; volume=16; output;
  id=1; date="01jan15"d; hour=3; minute=5; price=16; volume=20; output;
  id=1; date="01jan15"d; hour=3; minute=10; price=8; volume=4; output;
  id=1; date="01jan15"d; hour=5; minute=10; price=7; volume=25; output;
  format date date9.;
run;
data want;
 merge have have(firstobs=2 rename=(id=_id date=_date hour=_hour minute=_minute));
 output;
 if id=_id then do;
   do i=dhms(date,hour,minute,0)+60 to dhms(_date,_hour,_minute,0)-60 by 60;
    date=datepart(i);hour=hour(timepart(i));minute=minute(timepart(i));price=.;volume=.;
     output;
   end;
 end; format i datetime.;
drop _:;
run;

Xia Keshan

🔒 This topic is solved and locked.

Need further help from the community? Please ask a new question.

Discussion stats
  • 8 replies
  • 740 views
  • 3 likes
  • 3 in conversation