How to eliminate observations between two tables based on a formula?

Accepted Solution Solved
Reply
Regular Contributor
Posts: 166
Accepted Solution

How to eliminate observations between two tables based on a formula?

[ Edited ]

Hello everybody,

I have two tables. First table has name, date, time and intraday price variables. It means there is an intraday price for each name in a specific date and time. Second table has name, date and daily price and the daily price is intraday price aggregation for each name and date.

 

I try to write a program which performs the procedure below:

1- It can find same observations by name and date in two tables and then;

2- If first and last intraday price be out of 0.962 and 1.0398 times of daily price in last day; then deletes all data related to that specific name and date in table 1.

 

The statement is:

IF first AND last (intraday price for specific name & date) NOT IN [0.962*(daily price of yesterday), 1.0398*(daily price of yesterday)] THEN DELETE.

 

For instance, consider two tables which are below:

 

table 1

namedatetimeintraday price
A113
A122
A131
B113.1
B121
B134
B142.9
C113
C122
A213.95
A223
A236
A244.01
B216
B221
C216.1
C223
C232
B315
B327
C318
C322
C333
C342

 

And table 2:

namedatedaily price
A05
B03
C07
A14
B14
C16
B23
C25


Then, based on the procedure, The result is:

namedatetimeintraday price
B113.1
B121
B134
B142.9
A213.95
A223
A236
A244.01
C216.1
C223
C232

 

Would you please tell me how I can do that?

 

Thanks in advance.


Accepted Solutions
Solution
‎11-21-2017 03:33 AM
Regular Contributor
Posts: 166

Re: How to eliminate observations between two tables based on a formula?

[ Edited ]

Thanks @Shmuel and @KurtBremser

The statement is:

 

 IF first AND last (intraday price for specific name & date) NOT IN [0.962*(daily price of yesterday), 1.0398*(daily price of yesterday)] THEN DELETE.

 

There are still two problems:

1- lines 

 if last.date then last_price = intraday_price;

and

 

 

not (min_price le last_price le max_price)

should be added.

 

 

2- There is lack of matching for last data in table 1 because of merging. So the calculation doesn't work well and I add data for 09 MAY 2008 to solve it.

 

Based on their works, the result is:

proc sort data=table1; by name date time; run;

proc sort data=table2; by name date; run;

proc sql;
create table table3 as
select * from table1, table2
where table1.name=table2.name and table1.date=table2.date;
quit;

data table2_new;
set table2;
by name;
/* save price of yesterday */
lag_Price = lag(Price);
if first.name then lag_Price = .;
run;

data to_delete(keep = name date);
merge table3 (in=in1)
table2_new (in=in2);
by name date;
retain start_price last_price;

if in1 and in2; /* deal with obs on both tables only */
if first.date then start_price = intradayprice;
if last.date then last_price = intradayprice;
if last.date then do;
min_price = 0.962 * lag_Price;
max_price = 1.0398 * lag_Price;
if not (min_price le start_price le max_price) and not (min_price le last_price le max_price)
then output;
end;
run;

data want;
merge table3 /* table2 */
to_delete (in=indel);
by name date;
if not indel;
run;

 

View solution in original post


All Replies
Super User
Posts: 9,611

Re: How to eliminate observations between two tables based on a formula?

Posted in reply to aminkarimid

Sort your table1 by name, date and time.

Sort table2 by name and date.

Create a new table from merging table1 and table2 (by name date), with a new variable start_price, which is retained. Set start_price at first.date. At last.date, compare start_price and current price with the aggregated price according to your formula, and output if condition is met. Keep only name and date.

Now, merge table1 with the new table (again by name and date), and keep or delete if a match is found/not found.

---------------------------------------------------------------------------------------------
Maxims of Maximally Efficient SAS Programmers
How to convert datasets to data steps
How to post code
Regular Contributor
Posts: 166

Re: How to eliminate observations between two tables based on a formula?

Posted in reply to KurtBremser
Thanks @KurtBremser;
Would you please write the code?
because I am new with SAS and I cannot do it on my own.
Trusted Advisor
Posts: 1,826

Re: How to eliminate observations between two tables based on a formula?

Posted in reply to aminkarimid

Here is the code based on @KurtBremser design:

proc sort data=table1; by name date time; run;

proc sort data=table2; by name date; run;

data to_delete(keep = name date);
merge table1 (in=in1) 
           table2 (in=in2);
  by name date;
       retain start_price;
       if in1 and in2;     /* deal with obs on both tables only */

       if first.date then start_price = intraday_price;
       if last.date then do;
          min_price = 0.962 * daily_price;
          max_price = 1.0398 * daily_price;
          if not (min_price le start_price le max_price)
             then output; 
       end;
run;

data want;
merge table 1 /* table2 */
           to_delete (in=indel);
  by name date;
       if not indel;
run;
Regular Contributor
Posts: 166

Re: How to eliminate observations between two tables based on a formula?

[ Edited ]

Thanks @Shmuel

However the code doesn't work correct.

Here is an example of the table 1:

 

 

data WORK.TABLE1;
infile datalines dsd truncover;
input name:$3. date:DATE9. time:TIME8. intraday_price:32.;
format date DATE9. time TIME8.;
label name="name" date="date" time="time" intraday_price="intraday price";
datalines4;
A,07MAY2008,11:32:41,3
A,07MAY2008,12:32:41,2
A,07MAY2008,13:32:41,1
A,08MAY2008,11:32:41,3.95
A,08MAY2008,12:32:41,3
A,08MAY2008,13:32:41,6
A,08MAY2008,14:32:41,4.01
B,07MAY2008,11:32:41,3.1
B,07MAY2008,12:32:41,1
B,07MAY2008,13:32:41,4
B,07MAY2008,14:32:41,2.9
B,08MAY2008,11:32:41,6
B,08MAY2008,12:32:41,1
B,09MAY2008,11:32:41,5
B,09MAY2008,12:32:41,7
C,07MAY2008,11:32:41,3
C,07MAY2008,12:32:41,2
C,08MAY2008,11:32:41,6.1
C,08MAY2008,12:32:41,3
C,08MAY2008,13:32:41,2
C,09MAY2008,11:32:41,8
C,09MAY2008,12:32:41,2
C,09MAY2008,13:32:41,3
C,09MAY2008,14:32:41,2
;;;;

 

And the table 2 is:

 

data WORK.TABLE2;
infile datalines dsd truncover;
input name:$3. date:DATE9. daily_price:32.;
format date DATE9.;
label name="name" date="date" daily_price="daily price";
datalines4;
A,05MAY2008,3
B,05MAY2008,6
C,05MAY2008,5
A,06MAY2008,5 A,07MAY2008,4 B,06MAY2008,3 B,07MAY2008,4 B,08MAY2008,3 C,06MAY2008,7 C,07MAY2008,6 C,08MAY2008,5 ;;;;

 

Please consider that the daily price of yesterday should be used in formula.

Regular Contributor
Posts: 166

Re: How to eliminate observations between two tables based on a formula?

[ Edited ]
Posted in reply to aminkarimid

So the result should be:

namedatetimeintraday price
B7-May-0811:32:413.1
B7-May-0812:32:411
B7-May-0813:32:414
B7-May-0814:32:412.9
A8-May-0811:32:413.95
A8-May-0812:32:413
A8-May-0813:32:416
A8-May-0814:32:414.01
C8-May-0811:32:416.1
C8-May-0812:32:413
C8-May-0813:32:412

 

Please notice that the date variable in table 1 starts from 07MAY2008, however the table 2 begins from 05MAY2008.

Maybe I made a mistake to explain it. 

The statement is:

IF first AND last (intraday price for specific name & date) NOT IN [0.962*(daily price of yesterday), 1.0398*(daily price of yesterday)] THEN DELETE.

Trusted Advisor
Posts: 1,826

Re: How to eliminate observations between two tables based on a formula?

Posted in reply to aminkarimid

@aminkarimid, do you understand the coed? 

I hoped you'll be able to check it and addapt it, more precisely, to your needs.

 

Maybe if you change the next lines - defining which data to remove - :

if not (min_price le start_price le max_price)
             then output; 

into:

if not (min_price le start_price le max_price) and
   not (min_price le intraday_price le max_price) 
then output;

it will fit your demnd; 

Regular Contributor
Posts: 166

Re: How to eliminate observations between two tables based on a formula?

[ Edited ]

Thanks @Shmuel;
I can almost understand your code.
I think there is a problem:

min_price = 0.962 * daily_price;
max_price = 1.0398 * daily_price;

The daily_price which is used in these lines should be the daily price of specific name and date in yesterday.

Super User
Posts: 9,611

Re: How to eliminate observations between two tables based on a formula?

Posted in reply to aminkarimid

aminkarimid wrote:

Thanks @Shmuel;
I can almost understand your code.
I think there is a problem:

min_price = 0.962 * daily_price;
max_price = 1.0398 * daily_price;

The daily_price which is used in these lines should be the daily price of specific name and date in yesterday.


Then you should add 1 to the date in table2 before merging. Note that SAS dates are counts of days.

---------------------------------------------------------------------------------------------
Maxims of Maximally Efficient SAS Programmers
How to convert datasets to data steps
How to post code
Regular Contributor
Posts: 166

Re: How to eliminate observations between two tables based on a formula?

Posted in reply to KurtBremser
Would you please tell me how I can do that?
Super User
Posts: 9,611

Re: How to eliminate observations between two tables based on a formula?

Posted in reply to aminkarimid

aminkarimid wrote:
Would you please tell me how I can do that?
date = date + 1;

in a data step.

---------------------------------------------------------------------------------------------
Maxims of Maximally Efficient SAS Programmers
How to convert datasets to data steps
How to post code
Trusted Advisor
Posts: 1,826

Re: How to eliminate observations between two tables based on a formula?

Posted in reply to aminkarimid

@aminkarimid, I got your point.

I have added a step to save the previous day daily_price in table2 using lag() fubction;

Pay attention - you cannot use lag function conditionally (under if statement).

 

Here is the full tested code:

data WORK.TABLE1;
infile datalines dsd truncover;
input name:$3. date:DATE9. time:TIME8. intraday_price:32.;
format date DATE9. time TIME8.;
label name="name" date="date" time="time" intraday_price="intraday price";
datalines4;
A,07MAY2008,11:32:41,3
A,07MAY2008,12:32:41,2
A,07MAY2008,13:32:41,1
A,08MAY2008,11:32:41,3.95
A,08MAY2008,12:32:41,3
A,08MAY2008,13:32:41,6
A,08MAY2008,14:32:41,4.01
B,07MAY2008,11:32:41,3.1
B,07MAY2008,12:32:41,1
B,07MAY2008,13:32:41,4
B,07MAY2008,14:32:41,2.9
B,08MAY2008,11:32:41,6
B,08MAY2008,12:32:41,1
B,09MAY2008,11:32:41,5
B,09MAY2008,12:32:41,7
C,07MAY2008,11:32:41,3
C,07MAY2008,12:32:41,2
C,08MAY2008,11:32:41,6.1
C,08MAY2008,12:32:41,3
C,08MAY2008,13:32:41,2
C,09MAY2008,11:32:41,8
C,09MAY2008,12:32:41,2
C,09MAY2008,13:32:41,3
C,09MAY2008,14:32:41,2
;;;;
run;

data WORK.TABLE2;
infile datalines dsd truncover;
input name:$3. date:DATE9. daily_price:32.;
format date DATE9.;
label name="name" date="date" daily_price="daily price";
datalines4;
A,05MAY2008,3
B,05MAY2008,6
C,05MAY2008,5
A,06MAY2008,5
A,07MAY2008,4
B,06MAY2008,3
B,07MAY2008,4
B,08MAY2008,3
C,06MAY2008,7
C,07MAY2008,6
C,08MAY2008,5
;;;;
run;

proc sort data=table1; by name date time; run;

proc sort data=table2; by name date; run;
data table2_new;
 set table2;
  by name;
     /* save price of yesterday */
     lag_price = lag(daily_price);
     if first.name then lag_price = .;
run;

data to_delete(keep = name date);
merge table1 (in=in1) 
      table2_new (in=in2);
  by name date;
       retain start_price;
	   
       if in1 and in2;     /* deal with obs on both tables only */

       if first.date then start_price = intraday_price;
       if last.date then do;  		      
          min_price = 0.962  * lag_price;
          max_price = 1.0398 * lag_price;
       if not (min_price le start_price le max_price) and
          not (min_price le intraday_price le max_price)   
		  then output; 
       end;
run;

data want;
merge table1 /* table2 */
           to_delete (in=indel);
  by name date;
       if not indel;
run;

 

Solution
‎11-21-2017 03:33 AM
Regular Contributor
Posts: 166

Re: How to eliminate observations between two tables based on a formula?

[ Edited ]

Thanks @Shmuel and @KurtBremser

The statement is:

 

 IF first AND last (intraday price for specific name & date) NOT IN [0.962*(daily price of yesterday), 1.0398*(daily price of yesterday)] THEN DELETE.

 

There are still two problems:

1- lines 

 if last.date then last_price = intraday_price;

and

 

 

not (min_price le last_price le max_price)

should be added.

 

 

2- There is lack of matching for last data in table 1 because of merging. So the calculation doesn't work well and I add data for 09 MAY 2008 to solve it.

 

Based on their works, the result is:

proc sort data=table1; by name date time; run;

proc sort data=table2; by name date; run;

proc sql;
create table table3 as
select * from table1, table2
where table1.name=table2.name and table1.date=table2.date;
quit;

data table2_new;
set table2;
by name;
/* save price of yesterday */
lag_Price = lag(Price);
if first.name then lag_Price = .;
run;

data to_delete(keep = name date);
merge table3 (in=in1)
table2_new (in=in2);
by name date;
retain start_price last_price;

if in1 and in2; /* deal with obs on both tables only */
if first.date then start_price = intradayprice;
if last.date then last_price = intradayprice;
if last.date then do;
min_price = 0.962 * lag_Price;
max_price = 1.0398 * lag_Price;
if not (min_price le start_price le max_price) and not (min_price le last_price le max_price)
then output;
end;
run;

data want;
merge table3 /* table2 */
to_delete (in=indel);
by name date;
if not indel;
run;

 

Super User
Posts: 9,611

Re: How to eliminate observations between two tables based on a formula?

Posted in reply to aminkarimid

aminkarimid wrote:
Thanks @KurtBremser;
Would you please write the code?
because I am new with SAS and I cannot do it on my own.

Then you should start to learn immediately here: https://support.sas.com/edu/schedules.html?ctry=us&crs=PROG1

---------------------------------------------------------------------------------------------
Maxims of Maximally Efficient SAS Programmers
How to convert datasets to data steps
How to post code
Regular Contributor
Posts: 166

Re: How to eliminate observations between two tables based on a formula?

Posted in reply to KurtBremser
Thanks @KurtBremser,
I have experience of work with MATLAB, Excel, Eviews, STATA and R in fields of economics and finance. I have been working with SAS for a few months and I think it is an efficient software for statistical and econometrics analysis.
I'm trying to improve my skill and I think this community is one of the strength aspects of this software as a kind of practical learning.
Thank you so much for your attention and help.
☑ This topic is solved.

Need further help from the community? Please ask a new question.

Discussion stats
  • 14 replies
  • 264 views
  • 0 likes
  • 3 in conversation