BookmarkSubscribeRSS Feed
MonsterGnom
Calcite | Level 5

Hey people,

I'm new to SAS and have a question that is hopefully easy to answer. The problem is that I have a dataset with a time variable and corresponding price values. Some timestamps appear more than once others only once. I'm asked to produce a dataset with all timestamps only occuring once and therefore to calculate the mean of the corresponding price variables for every timestamp that occures more than once. We're supposed to do it with a DoW-loop but any solutions would help me. By now I'm pretty desperate because I dont really have an idea how to do it. I only created a variable containing the i+1-th timestamp to be able to compare the time variable with its next value. But counting the number of observations with the same timestamp starting over for every new timestamp that occures multiple times seems quite difficult to me. If I missed necessary informations just ask for them and I wil post them in a reply. Thanks for your help.

8 REPLIES 8
PaigeMiller
Diamond | Level 26

We're supposed to do it with a DoW-loop but any solutions would help me.

 

So, you are required to do this in the most difficult way possible. Interesting. Now @MonsterGnom please don't be offended, I have made this comment to many people in these forums, and its not specifically about you but about the request to use DoW. The whole idea of writing your own code to compute a mean is nonsense, SAS has already done this for you, tested it debugged it and proved that it works in a bazillion real-world applications. In addition, you (or your company or university) are paying for all of this, you are paying for SAS testing it, debugging it and proving that it works. Writing your own leads to errors, even for something as simple as computing a mean, we have seen in this forum that sometimes people don't get it right when they have to write their own code to compute a mean. Plus, writing you own code to compute a mean is just plain old unnecessary work.

 

Here's my solution

proc summary data=have nway;
    class timestamp;
    var price;
    output out=means mean=;
run;

 

--
Paige Miller
MonsterGnom
Calcite | Level 5
Thanks a lot for your answer. I also felt like using a DoW-loop isn't the most efficient way to do it and after you said it is pretty much the most complicated way to do it I think I will complain about it. I used your solution and it works perfectly fine as well as beeing understandable and really short. Thanks a lot again.:)
ballardw
Super User

Small set of example data with both types of cases

And what the expected output for that example data.

MonsterGnom
Calcite | Level 5
So the data looks like:

i | Ask-Price | Bid-Price | Transaction-Price | Time
1 2275 2273.6 2275 12.04.20 12:30:17
2 2276 2270 2276.58 12.04.20 12:30:22
3 2278 2270 2277 12.04.20 12:30:22
4 2277.5 2271 2273.7 12.04.20 12:30:25

If you have the motivation to find a solution using a DoW-loop I will not complain about it but as PaigeMiller stated above it is definetly not a smart way to do it. I still would be very thankfull.;)
Astounding
PROC Star

For this sample case, @PaigeMiller is correct in his comments.  But there could easily be cases where a DOW loop is needed, and it pays to learn the technique in a simple case.  So here is an approach (just using the transaction price ... if you want to add the bid price to the calculations, you can do that as well):

 

data want;
   transaction_numerator = 0;
   transaction_denominator = 0;
   do until (last.time);
      set have;
      by time;
      transaction_numerator + price;
      transaction_denominator + 1;
   end;
   calculated_mean = transaction_numerator / transaction_denominator;
run;

The code assumes that your data is already sorted by TIME.  If that's the case, you don't really need the variable you constructed using i+1.

If it's possible that your data might have missing values for PRICE, you would have to be more careful with the numerator and denominator calculations.

PaigeMiller
Diamond | Level 26

I agree with @Astounding that there are legitimate uses for DoW loops.

 

I also earlier pointed out that people writing their own code to compute means sometimes get it wrong, even though a mean is a very simple calculation. As an example, the code from @Astounding does not produce the right answer if there are missing values in the data, which he correctly points out; but other people have done this without realizing that it doesn't work in the presence of missing values, and they have gotten the wrong answer and may not even realize it.

--
Paige Miller
ballardw
Super User

@MonsterGnom wrote:
So the data looks like:

i | Ask-Price | Bid-Price | Transaction-Price | Time
1 2275 2273.6 2275 12.04.20 12:30:17
2 2276 2270 2276.58 12.04.20 12:30:22
3 2278 2270 2277 12.04.20 12:30:22
4 2277.5 2271 2273.7 12.04.20 12:30:25

If you have the motivation to find a solution using a DoW-loop I will not complain about it but as PaigeMiller stated above it is definetly not a smart way to do it. I still would be very thankfull.;)

Actually I was having a hard time telling what you wanted with the "only one time" verbiage.

 

If Proc summary as @PaigeMiller provided then your description was way overly complicated as that code does "summary for each level of time". No one only, no more than one involved. Your description was making me think that you wanted two different summaries: the summary of the cases where there was only one "time" all together and then a different summary for the cases where there were at least two times. And that wasn't making much sense. And with "price" there usually comes something that has a price attached. So I was trying to see if you actually intended to do these means by product or such.

Reeza
Super User
One reason for using DoW loops is usually cause you're doing more than one calculation at a time, and it can be more efficient to have one pass of the data if you require any statistics that for some reason cannot be summarized via a PROC MEANS.

sas-innovate-2024.png

Available on demand!

Missed SAS Innovate Las Vegas? Watch all the action for free! View the keynotes, general sessions and 22 breakouts on demand.

 

Register now!

Mastering the WHERE Clause in PROC SQL

SAS' Charu Shankar shares her PROC SQL expertise by showing you how to master the WHERE clause using real winter weather data.

Find more tutorials on the SAS Users YouTube channel.

Discussion stats
  • 8 replies
  • 1828 views
  • 8 likes
  • 5 in conversation