I am not sure if this is the right forum to ask this question but here it goes.
I am trying to write code for reducing data. The data is from an activity study that was conducted by my mentor at the University of Utah. I have read in the data to a SAS data set and now get to manipulate it. The data is currently in long columns that are about 27000 cases/observations long. Each observation/case represents 30 seconds of data and I want to reduce to 60 seconds of data or one minute per case/observation. Is there a function in SAS to aggregate two observations/cases in a sequential fashion and then drop even or odd time values without aggregating them? Ideas on this? I have been toying with the idea of transposing the data and using an array for this. Thanks for any help or direction to the right place to discuss this issue.
This forum is not the right place for your question. You may not need PROC TRANSPOSE. You may be able to reduce your data by using a PROC MEANS/PROC SUMMARY step. What if your data looked like this:
In the above output, you can see that the 609.53, the value of sum_minute for the first observation above, is the sum of the measure value for the 2 obs for minute 0 for BLUEFISH on 01/01/2007. This was done with the following PROC MEANS code:
proc means data=fishes sum nway noprint;
class case date hour minute;
output out=fishedout sum=sum_minute;
If, however, you did not want the sum of the 2 obs (is this what you meant by "aggregating"), then you could probably use a DATA step program to delete every other observation without doing any aggregation. Or, they could help you tweak the PROC MEANS program to get the output data set that you wanted.
Tech Support would be the best place to take this question. To contact Tech Support, go to http://support.sas.com and on the left hand navigation pane, click on the link that says "Submit a Problem"
Thank you for the tips. I did think about proc transpose but I ended up developing a more elegant solution.
What I did was use the lag function to duplicate the activity counts column and added my original activity counts with the new column created by the lag function (of course I had to set missing values to zero). I created an index variable by using my datetime variable (a sas date of both date and time to the second) and subtracted my starttime (which was a macro). Then I used a do-loop to incremented by 2 and compared my index variable to the do-loop index multipled by my epoch (which was a macro equaling 30seconds). The result is that my do loop would only keep even observations with the use of an explicit output statement.
my code looked like this:
data Bbb.sub215_actmin_1 ;
set Bbb.sub215_actmin ;
if Act=. then Act=0 ;
if Ped=. then Ped=0 ;
retain datetime ;
index=(datetime - &sdt) ;
data Bbb.sub215_actmin_2 ;
set Bbb.sub215_actmin_1 ;
do i= 2 to 100000 by 2 ;
if index = i * &epo then output Bbb.sub215_actmin_2 ;
keep pedometer activity ped act datetime index ;