05-23-2013 03:39 AM
I having issue by comparing the valid date to I got a dataset n ssd7bdata file format exported a small chuck of the data which looks like the attachment i. now the question I would like to have extra column with the duration in between each catogories such as:
Then 4th column is the duration in between each row, so the 1st row don have value of previous value will become missing value or .
5th volumn are retain column 3rd for indication total duration of them all, end 4th and 5th column only both rows table value are equal.
I having trouble to select previous or next row of those value and unable to use + or - to get in between time duration.
there are few more calculation need to be done later if these steps are completed.
any idea? and thanks first.
05-23-2013 08:11 AM
Your data seems like you want to calculate difference between time for each category...But the way you presanted your query, i am not able understand what excatly you want to do...
Like you wrote 4th and 5th column, but in your sample data there is only 3 columns...So it will be good if you convey your query in meaningful way so that we can solve your query...
Though, you simply need difference b/w time then please have look at following code...
checkin_time = dif(time);
Then you can calaulate more based on the value of CHECKIN_TIME variable...
05-24-2013 03:13 PM
Here is sample code that calculcates time difference and previous checkin values. You didn't say about across different categories, so I guessed and reset the previous values to missing.
The test_table appears to be tab-delimited, so I allowed for that, too.
The key is to write the observation before assigning the current values to the previous value variables.
infile 'c:\temp\test_table.txt' truncover firstobs=2 dlm='09'x;
input Time :time5. Categories :$5. Checkin :$1. ;
length prev_category $5;
retain prev_checkin ' ' prev_time . prev_category ' ';
if categories ne prev_category then do; /* start of new category? */
prev_time = .;
prev_checkin = ' ';
timediff = time - prev_time; /* calculate difference */
output; /* write observation */
prev_checkin = checkin; /* save values for comparison with next row */
prev_time = time;
prev_category = categories;
format time prev_time timediff time5.;
keep time categories checkin prev_checkin timediff;
proc print data=test; run;
05-27-2013 05:44 AM
I used lag() fucntion which work fine at the momeny, I still having some issue from the input with alot matchmaking to get the end result, just a quick question about the datasets if the categories always random generated from SAS do I need to sort before create a new datasets because I wanted to calculate the time duration in between each rows with the same categories, does the retain & lag function work the same?
05-28-2013 10:18 AM
The LAG function works from neighboring observations, so you want to have all the observations for a category together in the input data set.
If the data set is very large, you may find it more efficient to use PROC DATASETS, MODIFY and INDEX CREATE to avoid having to physically rearrange the observations.