Help using Base SAS procedures

second by second dataset

Accepted Solution Solved
Reply
Occasional Contributor
Posts: 11
Accepted Solution

second by second dataset

Hi,

I am a beginner in SAS. I have a dataset with stock prices quoted in seconds during10 days. I need to compute the return of the stock for every minute of the day. To do so I need to use the last price quoted for every minute. I used the following commands to compute the returns for each second

data return;

set data;

return=log(price/lag(price));

run;

My dataset looks like this:


date                              time                 price

01MAY2012                 8:02:01          1.101

01MAY2012                 8:02:15          1.258

01MAY2012                 8:02:29            .

01MAY2012                 8:02:48          1.100

01MAY2012                 8:04:04          1.225

01MAY2012                 8:05:01          1.001

01MAY2012                 8.44.00          1.225

I am struggling to compute the return of the stock for every minute using the last price available for each minute. Any idea in how to do this? Thank you!


Accepted Solutions
Solution
‎07-03-2015 02:02 PM
Respected Advisor
Posts: 4,919

Re: second by second dataset

Check your input date and time variables. My code assumes that they are proper SAS date and proper SAS time values. If they are anything else, such as character or SAS datetime, the code will have to account for that.

PG

PG

View solution in original post


All Replies
Super User
Posts: 19,770

Re: second by second dataset

Step 1 - Convert time to minute. Since time variables are stored in seconds you could divide by 60 and floor the number. Or you could use an appropriate time format.

Step 2 - Sort by Stock and minute

Step 3 - Use BY processing and take the last of each minute. So By stock minute; if last.minute;

Occasional Contributor
Posts: 11

Re: second by second dataset

Reeza thank you so much for your help since I am a beginner and I was not sure how to divide the time variables by 60 and floor the number I attempted PG's approach.

PG I also want to thank you for your help. I ran the codes you passed me. The outputs of the first part look good to me but I need to keep the dates and the minutes (they appear as .) because once I have the results for this stock I need to merge these returns with others from other stocks. The fields that I will use to merge them will be date and minute. I tried to amend this by adding time and date to the command keep but it provides me the headings (date time minute return) but the rest of cells are all full with dots (.).

data returns;

set prices; where price is not missing;

lastPrice = lag(price);

if not missing(lastPrice) then return = log(price/lastPrice);

minute = dhms(date, hour(time), minute(time), 0);

format minute datetime15.;

keep date time minute return;

run;

Another issue I am encountering is that when I ran the second part of the code in your message SAS returns me only 1 observation and 2 variables. This is what the log says:

NOTE: There were 152493 observations read from the data set WORK.RETURN.

NOTE: The data set WORK.LASTMINUTERETURNS has 1 observations and 2 variables.

NOTE: DATA statement used (Total process time)

      real time           0.29 second

      cpu time            0.04 seconds

Any ideas in how to solve these issues? Thank you in advance!! I really appreciate your help!

Occasional Contributor
Posts: 11

Re: second by second dataset

Hi PG,

As I mentioned earlier I am a beginner in SAS so I am trying to get used to it. I have to say that the first code works properly. I have just realised that I made a mistake when running it. The mistake was that in my dataset  the variable date is called date_L_ and time is called time_L_ and that is the reason I was getting the dots in the output. I changed that and the code works properly. So thank you so much for that. Unfortunately, the second code still gives me the same message when I run it. I looks to me correct to me because it uses the variable minute, which has been defined in the first part of the code, but somehow it does not work. So any ideas on how to fix that will be very much appreciated.

Thank you!

Solution
‎07-03-2015 02:02 PM
Respected Advisor
Posts: 4,919

Re: second by second dataset

Check your input date and time variables. My code assumes that they are proper SAS date and proper SAS time values. If they are anything else, such as character or SAS datetime, the code will have to account for that.

PG

PG
Occasional Contributor
Posts: 11

Re: second by second dataset

PG,

I did change my date/time as you suggested and it worked. Thank you so much for your help. I really appreciate it! Now I am having another issue. I have another data set that has data in minutes (from 0 to 59 minutes) for few days. The problem is that while I have the date (day/month/year), the time provided by this dataset consists only in minutes  (ranging from 0 to 59 and again). I know that I have 1440 minutes of data every day. Therefore, I computed the following: 1440minutes/60minutes=24 hours. So now I know that I have the 24 hour of data recorded for each day. I need to create a new variable for the hours since I need the complete date and time to merge this dataset with the one in seconds that has been converted to minutes. I need to assign the first 60 observations the hour 00.00, from 61 to 120 the hour 01.00 and so on. And once I have this sorted I need to do it for the 10 days of data that I am analysing. I tried different commands but none of them does what I need. The commands I tried for the first part to convert the minutes in hours are the ones below. I tried only for the first 60 observations but none of them worked.

data intra4;
SET intra ;
do obsnum=60 by 1;
IF 0.00>time_L_>=59.00 then hour=1;

end;

run;


data intra5;
SET intra ;

options firstobs=1 obs=60;

0.00<=time_L_<=59.00 then hour=1.00;

hour time8.;

run;



intra9;
SET intra ;

do i=1 to 60;

hours=1;

format hours hour2.;

end;

run;


data intra2;

set intra;

format time_L_ mmss5.;

minutos=minute(time_L_);

run;

data intra10;
SET intra2;

firstobs=1 obs=60;

0=<minutos<=59 then j=1;

j hour2.;

run;

So just wondering if you have any ideas on how to do this.

Thank you very much in advance.

C


Respected Advisor
Posts: 4,919

Re: second by second dataset

Keep it simple:

data prices;

input date :date9. time :time8. price;

datalines;

01MAY2012                 8:02:01          1.101

01MAY2012                 8:02:15          1.258

01MAY2012                 8:02:29            .

01MAY2012                 8:02:48          1.100

01MAY2012                 8:04:04          1.225

01MAY2012                 8:05:01          1.001

01MAY2012                 8.44.00          1.225

;

/* Assuming chronological order, compute returns and minute datetimes */

data returns;

set prices; where price is not missing;

lastPrice = lag(price);

if not missing(lastPrice) then return = log(price/lastPrice);

minute = dhms(date, hour(time), minute(time), 0);

format minute datetime15.;

keep minute return;

run;

/* Keep last return for every minute */

data lastMinuteReturns;

set returns;

by minute;

if last.minute;

run;

proc print data=lastMinuteReturns; run;

PG

PG
Occasional Contributor
Posts: 11

Re: second by second dataset

Hi PG,

The first code works properly. So thank you so much for that. Unfortunately, the second code still gives me the same message when I run it. I looks to me correct to me because it uses the variable minute, which has been defined in the first part of the code, but somehow it does not work. I created a new variable which I called min and it contains only the minutes from the time and then I tried to sort by min, date and hour but it did not work either. Any ideas on how to solve this issue? Thank you in advance.

🔒 This topic is solved and locked.

Need further help from the community? Please ask a new question.

Discussion stats
  • 7 replies
  • 388 views
  • 8 likes
  • 3 in conversation