BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
aminkarimid
Lapis Lazuli | Level 10

Hello everybody,

I have two tables. First table has name, date, time and intraday price variables. It means there is an intraday price for each name in a specific date and time. Second table has name, date and daily price and the daily price is intraday price aggregation for each name and date.

 

I try to write a program which performs the procedure below:

1- It can find same observations by name and date in two tables and then;

2- If first and last intraday price be out of 0.962 and 1.0398 times of daily price in last day; then deletes all data related to that specific name and date in table 1.

 

The statement is:

IF first AND last (intraday price for specific name & date) NOT IN [0.962*(daily price of yesterday), 1.0398*(daily price of yesterday)] THEN DELETE.

 

For instance, consider two tables which are below:

 

table 1

namedatetimeintraday price
A113
A122
A131
B113.1
B121
B134
B142.9
C113
C122
A213.95
A223
A236
A244.01
B216
B221
C216.1
C223
C232
B315
B327
C318
C322
C333
C342

 

And table 2:

namedatedaily price
A05
B03
C07
A14
B14
C16
B23
C25


Then, based on the procedure, The result is:

namedatetimeintraday price
B113.1
B121
B134
B142.9
A213.95
A223
A236
A244.01
C216.1
C223
C232

 

Would you please tell me how I can do that?

 

Thanks in advance.

1 ACCEPTED SOLUTION

Accepted Solutions
aminkarimid
Lapis Lazuli | Level 10

Thanks @Shmuel and @Kurt_Bremser

The statement is:

 

 IF first AND last (intraday price for specific name & date) NOT IN [0.962*(daily price of yesterday), 1.0398*(daily price of yesterday)] THEN DELETE.

 

There are still two problems:

1- lines 

 if last.date then last_price = intraday_price;

and

 

 

not (min_price le last_price le max_price)

should be added.

 

 

2- There is lack of matching for last data in table 1 because of merging. So the calculation doesn't work well and I add data for 09 MAY 2008 to solve it.

 

Based on their works, the result is:

proc sort data=table1; by name date time; run;

proc sort data=table2; by name date; run;

proc sql;
create table table3 as
select * from table1, table2
where table1.name=table2.name and table1.date=table2.date;
quit;

data table2_new;
set table2;
by name;
/* save price of yesterday */
lag_Price = lag(Price);
if first.name then lag_Price = .;
run;

data to_delete(keep = name date);
merge table3 (in=in1)
table2_new (in=in2);
by name date;
retain start_price last_price;

if in1 and in2; /* deal with obs on both tables only */
if first.date then start_price = intradayprice;
if last.date then last_price = intradayprice;
if last.date then do;
min_price = 0.962 * lag_Price;
max_price = 1.0398 * lag_Price;
if not (min_price le start_price le max_price) and not (min_price le last_price le max_price)
then output;
end;
run;

data want;
merge table3 /* table2 */
to_delete (in=indel);
by name date;
if not indel;
run;

 

View solution in original post

14 REPLIES 14
Kurt_Bremser
Super User

Sort your table1 by name, date and time.

Sort table2 by name and date.

Create a new table from merging table1 and table2 (by name date), with a new variable start_price, which is retained. Set start_price at first.date. At last.date, compare start_price and current price with the aggregated price according to your formula, and output if condition is met. Keep only name and date.

Now, merge table1 with the new table (again by name and date), and keep or delete if a match is found/not found.

aminkarimid
Lapis Lazuli | Level 10
Thanks @Kurt_Bremser;
Would you please write the code?
because I am new with SAS and I cannot do it on my own.
Shmuel
Garnet | Level 18

Here is the code based on @Kurt_Bremser design:

proc sort data=table1; by name date time; run;

proc sort data=table2; by name date; run;

data to_delete(keep = name date);
merge table1 (in=in1) 
           table2 (in=in2);
  by name date;
       retain start_price;
       if in1 and in2;     /* deal with obs on both tables only */

       if first.date then start_price = intraday_price;
       if last.date then do;
          min_price = 0.962 * daily_price;
          max_price = 1.0398 * daily_price;
          if not (min_price le start_price le max_price)
             then output; 
       end;
run;

data want;
merge table 1 /* table2 */
           to_delete (in=indel);
  by name date;
       if not indel;
run;
aminkarimid
Lapis Lazuli | Level 10

Thanks @Shmuel

However the code doesn't work correct.

Here is an example of the table 1:

 

 

data WORK.TABLE1;
infile datalines dsd truncover;
input name:$3. date:DATE9. time:TIME8. intraday_price:32.;
format date DATE9. time TIME8.;
label name="name" date="date" time="time" intraday_price="intraday price";
datalines4;
A,07MAY2008,11:32:41,3
A,07MAY2008,12:32:41,2
A,07MAY2008,13:32:41,1
A,08MAY2008,11:32:41,3.95
A,08MAY2008,12:32:41,3
A,08MAY2008,13:32:41,6
A,08MAY2008,14:32:41,4.01
B,07MAY2008,11:32:41,3.1
B,07MAY2008,12:32:41,1
B,07MAY2008,13:32:41,4
B,07MAY2008,14:32:41,2.9
B,08MAY2008,11:32:41,6
B,08MAY2008,12:32:41,1
B,09MAY2008,11:32:41,5
B,09MAY2008,12:32:41,7
C,07MAY2008,11:32:41,3
C,07MAY2008,12:32:41,2
C,08MAY2008,11:32:41,6.1
C,08MAY2008,12:32:41,3
C,08MAY2008,13:32:41,2
C,09MAY2008,11:32:41,8
C,09MAY2008,12:32:41,2
C,09MAY2008,13:32:41,3
C,09MAY2008,14:32:41,2
;;;;

 

And the table 2 is:

 

data WORK.TABLE2;
infile datalines dsd truncover;
input name:$3. date:DATE9. daily_price:32.;
format date DATE9.;
label name="name" date="date" daily_price="daily price";
datalines4;
A,05MAY2008,3
B,05MAY2008,6
C,05MAY2008,5
A,06MAY2008,5 A,07MAY2008,4 B,06MAY2008,3 B,07MAY2008,4 B,08MAY2008,3 C,06MAY2008,7 C,07MAY2008,6 C,08MAY2008,5 ;;;;

 

Please consider that the daily price of yesterday should be used in formula.

aminkarimid
Lapis Lazuli | Level 10

So the result should be:

namedatetimeintraday price
B7-May-0811:32:413.1
B7-May-0812:32:411
B7-May-0813:32:414
B7-May-0814:32:412.9
A8-May-0811:32:413.95
A8-May-0812:32:413
A8-May-0813:32:416
A8-May-0814:32:414.01
C8-May-0811:32:416.1
C8-May-0812:32:413
C8-May-0813:32:412

 

Please notice that the date variable in table 1 starts from 07MAY2008, however the table 2 begins from 05MAY2008.

Maybe I made a mistake to explain it. 

The statement is:

IF first AND last (intraday price for specific name & date) NOT IN [0.962*(daily price of yesterday), 1.0398*(daily price of yesterday)] THEN DELETE.

Shmuel
Garnet | Level 18

@aminkarimid, do you understand the coed? 

I hoped you'll be able to check it and addapt it, more precisely, to your needs.

 

Maybe if you change the next lines - defining which data to remove - :

if not (min_price le start_price le max_price)
             then output; 

into:

if not (min_price le start_price le max_price) and
   not (min_price le intraday_price le max_price) 
then output;

it will fit your demnd; 

aminkarimid
Lapis Lazuli | Level 10

Thanks @Shmuel;
I can almost understand your code.
I think there is a problem:

min_price = 0.962 * daily_price;
max_price = 1.0398 * daily_price;

The daily_price which is used in these lines should be the daily price of specific name and date in yesterday.

Kurt_Bremser
Super User

@aminkarimid wrote:

Thanks @Shmuel;
I can almost understand your code.
I think there is a problem:

min_price = 0.962 * daily_price;
max_price = 1.0398 * daily_price;

The daily_price which is used in these lines should be the daily price of specific name and date in yesterday.


Then you should add 1 to the date in table2 before merging. Note that SAS dates are counts of days.

aminkarimid
Lapis Lazuli | Level 10
Would you please tell me how I can do that?
Shmuel
Garnet | Level 18

@aminkarimid, I got your point.

I have added a step to save the previous day daily_price in table2 using lag() fubction;

Pay attention - you cannot use lag function conditionally (under if statement).

 

Here is the full tested code:

data WORK.TABLE1;
infile datalines dsd truncover;
input name:$3. date:DATE9. time:TIME8. intraday_price:32.;
format date DATE9. time TIME8.;
label name="name" date="date" time="time" intraday_price="intraday price";
datalines4;
A,07MAY2008,11:32:41,3
A,07MAY2008,12:32:41,2
A,07MAY2008,13:32:41,1
A,08MAY2008,11:32:41,3.95
A,08MAY2008,12:32:41,3
A,08MAY2008,13:32:41,6
A,08MAY2008,14:32:41,4.01
B,07MAY2008,11:32:41,3.1
B,07MAY2008,12:32:41,1
B,07MAY2008,13:32:41,4
B,07MAY2008,14:32:41,2.9
B,08MAY2008,11:32:41,6
B,08MAY2008,12:32:41,1
B,09MAY2008,11:32:41,5
B,09MAY2008,12:32:41,7
C,07MAY2008,11:32:41,3
C,07MAY2008,12:32:41,2
C,08MAY2008,11:32:41,6.1
C,08MAY2008,12:32:41,3
C,08MAY2008,13:32:41,2
C,09MAY2008,11:32:41,8
C,09MAY2008,12:32:41,2
C,09MAY2008,13:32:41,3
C,09MAY2008,14:32:41,2
;;;;
run;

data WORK.TABLE2;
infile datalines dsd truncover;
input name:$3. date:DATE9. daily_price:32.;
format date DATE9.;
label name="name" date="date" daily_price="daily price";
datalines4;
A,05MAY2008,3
B,05MAY2008,6
C,05MAY2008,5
A,06MAY2008,5
A,07MAY2008,4
B,06MAY2008,3
B,07MAY2008,4
B,08MAY2008,3
C,06MAY2008,7
C,07MAY2008,6
C,08MAY2008,5
;;;;
run;

proc sort data=table1; by name date time; run;

proc sort data=table2; by name date; run;
data table2_new;
 set table2;
  by name;
     /* save price of yesterday */
     lag_price = lag(daily_price);
     if first.name then lag_price = .;
run;

data to_delete(keep = name date);
merge table1 (in=in1) 
      table2_new (in=in2);
  by name date;
       retain start_price;
	   
       if in1 and in2;     /* deal with obs on both tables only */

       if first.date then start_price = intraday_price;
       if last.date then do;  		      
          min_price = 0.962  * lag_price;
          max_price = 1.0398 * lag_price;
       if not (min_price le start_price le max_price) and
          not (min_price le intraday_price le max_price)   
		  then output; 
       end;
run;

data want;
merge table1 /* table2 */
           to_delete (in=indel);
  by name date;
       if not indel;
run;

 

aminkarimid
Lapis Lazuli | Level 10

Thanks @Shmuel and @Kurt_Bremser

The statement is:

 

 IF first AND last (intraday price for specific name & date) NOT IN [0.962*(daily price of yesterday), 1.0398*(daily price of yesterday)] THEN DELETE.

 

There are still two problems:

1- lines 

 if last.date then last_price = intraday_price;

and

 

 

not (min_price le last_price le max_price)

should be added.

 

 

2- There is lack of matching for last data in table 1 because of merging. So the calculation doesn't work well and I add data for 09 MAY 2008 to solve it.

 

Based on their works, the result is:

proc sort data=table1; by name date time; run;

proc sort data=table2; by name date; run;

proc sql;
create table table3 as
select * from table1, table2
where table1.name=table2.name and table1.date=table2.date;
quit;

data table2_new;
set table2;
by name;
/* save price of yesterday */
lag_Price = lag(Price);
if first.name then lag_Price = .;
run;

data to_delete(keep = name date);
merge table3 (in=in1)
table2_new (in=in2);
by name date;
retain start_price last_price;

if in1 and in2; /* deal with obs on both tables only */
if first.date then start_price = intradayprice;
if last.date then last_price = intradayprice;
if last.date then do;
min_price = 0.962 * lag_Price;
max_price = 1.0398 * lag_Price;
if not (min_price le start_price le max_price) and not (min_price le last_price le max_price)
then output;
end;
run;

data want;
merge table3 /* table2 */
to_delete (in=indel);
by name date;
if not indel;
run;

 

aminkarimid
Lapis Lazuli | Level 10
Thanks @Kurt_Bremser,
I have experience of work with MATLAB, Excel, Eviews, STATA and R in fields of economics and finance. I have been working with SAS for a few months and I think it is an efficient software for statistical and econometrics analysis.
I'm trying to improve my skill and I think this community is one of the strength aspects of this software as a kind of practical learning.
Thank you so much for your attention and help.

sas-innovate-2024.png

Join us for SAS Innovate April 16-19 at the Aria in Las Vegas. Bring the team and save big with our group pricing for a limited time only.

Pre-conference courses and tutorials are filling up fast and are always a sellout. Register today to reserve your seat.

 

Register now!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 14 replies
  • 863 views
  • 0 likes
  • 3 in conversation