DATA Step, Macro, Functions and more

Mean calculation based on 2nd data time range?

Reply
Occasional Contributor
Posts: 13

Mean calculation based on 2nd data time range?

How to do data1 stat (for example mean) calculation based on data2 time range?

Data1

time x y
11:00 2 3
11:02 7 8
11:03 4 5
11:07 10 30
12:00 30 20
12:03 21 4
..............

Data2
TimeStart TimeEnd
11:00 11:07
11:03 12:00
12:00 12:03
....................

Want to get results
TimeStart TimeEnd MeanX meanY
11:00 11:07 ? ?
11:03 12:00 ? ?
12:00 12:03 ? ?
............................
Valued Guide
Posts: 2,174

Re: Mean calculation based on 2nd data time range?

Have a look at PROC SQL
Select starttime, endtime, mean(x) as meanX, mean(y) as meanY
from data1 a, data2 b
where a.time between starttime and endtime
Group by starttime, endtime

Well, something like that......

peterC
(not used to SAS-ing on iPhone standing on a crowded commuter train)

new defn of unstable code ;-)
Needed group by
and "time" names reversed
and a new name for the mean() stat variables

Message was edited by: Peter.C
Occasional Contributor
Posts: 13

Re: Mean calculation based on 2nd data time range?

Thanks! PeterC,

Could you please give me an example? I am not familar with Proc SQL.
Super Contributor
Super Contributor
Posts: 3,174

Re: Mean calculation based on 2nd data time range?

There is much information on the SAS support website, with technical reference, conference papers, and SAS-hosted documentation. Suggest using the SEARCH facility there or consider a Google advanced search argument, such as:

proc sql example site:sas.com

Scott Barry
SBBWorks, Inc.
Valued Guide
Posts: 2,174

Re: Mean calculation based on 2nd data time range?

JeffNcsu

sorry you didn't understand:
examples you can find in the manual
examples you can find in online-doc

what you got with my posting was the sql you need
(apart from the top and tail
proc sql ;
create table results as

Select TimeStart, TimeEnd, mean(x) as meanX, mean(y) as meanY
from data1 a, data2 b
where a.time between TimeStart and TimeEnd
Group by TimeStart, TimeEnd

;
quit ;
That produced these work.RESULTS, based on your sample data[pre]+FSVIEW: WORK.RESULTS (B)-------------------------------+
| Obs TimeStart TimeEnd meanX meanY |
| |
| 1 11:00 11:07 5.75 11.5 |
| 2 11:03 12:00 14.666666667 18.333333333 |
| 3 12:00 12:03 25.5 12 |
| |
+--------------------------------------------------------+[/pre]since these are the results you seemed to be asking for, it is a bit disappointed you didn't try

peterC
Regular Contributor
Posts: 241

Re: Mean calculation based on 2nd data time range?

This would have been easier if the time ranges did not overlap. Having no overlaps would have allowed us to utilize various table look-up techniques like formats, hash tables, or arrays.

My macro solution is unnecessarily complicated and inefficient. See Joe's or data _null_'s.

(my code is little too long for this forum. I have posted the code on sas-l:
http://www.listserv.uga.edu/cgi-bin/wa?A2=ind1010a&L=sas-l&D=1&O=D&P=43780 )

Look at Joe Matise's succinct proc sql solution at:
http://www.listserv.uga.edu/cgi-bin/wa?A2=ind1010a&L=sas-l&D=1&O=D&P=44045

Or data _null_'s using the multi-label format:
http://www.listserv.uga.edu/cgi-bin/wa?A2=ind1010a&L=sas-l&D=1&O=D&F=&S=&P=44173
Ask a Question
Discussion stats
  • 5 replies
  • 160 views
  • 0 likes
  • 4 in conversation