Hi
I have a panel data set where I want to make 2 new variables. The data set is of people who run 5k runs. I want a variable that specifies how many days it has been since their last run, and one that specifies how much they have improved. I have made an overview in Excel of what I would like
Days_Between_Run is runDate_t - runDate_t-1 and the same goes for improvement which is just time_t - time_t-1
Hope you guys can help me and please let me know if i need to elaborate
data DataIM;
set DataAW;
prevrunnerid=lag(runnerid);
prevruntime=lag(runtime);
prevrundate=lag(rundate);
if prevRunnerID = RunnerID then Improvement = prevRunTime-RunTime;
if prevRunnerID = RunnerID then DBR = RunDate-prevRunDate;
drop prev:;
run;
Use the data step LAG function to determine the days between the previous run; and also to determine the previous time.
Examples here:
@YoLohse wrote:
Hi
I have a panel data set where I want to make 2 new variables. The data set is of people who run 5k runs. I want a variable that specifies how many days it has been since their last run, and one that specifies how much they have improved. I have made an overview in Excel of what I would like
Days_Between_Run is runDate_t - runDate_t-1 and the same goes for improvement which is just time_t - time_t-1
Hope you guys can help me and please let me know if i need to elaborate
Are your SAS variables actually date values, type is numeric and a SAS format such as ddmmyy10.? or character? Similar with the times. I strongly suspect the time ariables may not. What UNITS is your "Improvement" in? Seconds?
Again from your first post: Instructions here: https://communities.sas.com/t5/SAS-Communities-Library/How-to-create-a-data-step-version-of-your-dat... will show how to turn an existing SAS data set into data step code that can be pasted into a forum code box using the {i} icon or attached as text to show exactly what you have and that we can test code against.
data WORK.DATAWAW;
infile datalines dsd truncover;
input RunDate:MMDDYY10. RunnerID:BEST. Gender:$1. Age:BEST. RunTime:BEST. Placement:BEST. After_Woman:32.;
format RunDate MMDDYY10. RunnerID BEST. Age BEST. RunTime BEST. Placement BEST.;
label RunDate="RunDate" RunnerID="RunnerID" Gender="Gender" Age="Age" RunTime="RunTime" Placement="Placement";
datalines;
10/22/2011 198111 1 39 1102 1 0
10/22/2011 33415 1 44 1134 2 0
10/22/2011 196982 1 44 1164 3 0
10/22/2011 92330 1 24 1182 4 0
10/22/2011 33809 1 59 1188 5 0
;;;;
That's great, thank you for the data in this format.
However, no runner has a previous time. And in any event, as I said, if they did have a previous time, you would use the data step LAG function, after things are properly sorted by runner and date.
I see what you mean. The sample I sent you have no reoccurring runners. but in the sample below, you can see that runner '1674' has participated in 2 runs for instance. I guess i should sort it like I have done here by RunnerID and RunDate right?
data WORK.ALLDATA;
infile datalines dsd truncover;
input RunDate:MMDDYY10. RunnerID:BEST. Gender:$1. Age:BEST. RunTime:BEST. Placement:BEST.;
format RunDate MMDDYY10. RunnerID BEST. Age BEST. RunTime BEST. Placement BEST.;
label RunDate="RunDate" RunnerID="RunnerID" Gender="Gender" Age="Age" RunTime="RunTime" Placement="Placement";
datalines;
09/16/2017 398 1 59 1440 38
10/22/2016 1623 0 64 1897 86
05/25/2013 1674 1 54 1322 15
05/26/2018 1674 1 54 1195 16
08/11/2018 1867 1 39 1666 76
08/18/2018 1867 1 39 1529 65
10/31/2015 1876 1 29 1279 14
01/02/2016 1876 1 29 1093 1
;;;;
So now you can modify our code to use the LAG function and determine the deltas, but only if the runnerID is the same as the previous runnerID.
Hi Paige
I almost got it working. My problem is that my code seems to lag the variable 2 periods and not 1. When i get two observations with the same RunnerID it seems to take the current run time and subtract the runtime two obsercations above and not one. furthermore it seems that it does not catch all of the cases where the RunnerID are the same as the previous. for instance observation 3 and 4 have the same RunnerID but observation 4 has no Improvement or DBR variable.
My code is:
proc sort
data=DataAW;
by runnerid rundate;
run;
data DataIM;
set DataAW;
by RunnerID;
retain prevRunnerID;
if prevRunnerID = RunnerID then Improvement = lag(RunTime)-RunTime;
if prevRunnerID = RunnerID then DBR = RunDate-lag(RunDate);
prevRunnerID = RunnerID;
drop prevRunnerID;
run;
my input data is
data WORK.DATAAW;
infile datalines dsd truncover;
input RunDate:MMDDYY10. RunnerID:BEST. Gender:$1. Age:BEST. RunTime:BEST. Placement:BEST. After_Woman:32.;
format RunDate MMDDYY10. RunnerID BEST. Age BEST. RunTime BEST. Placement BEST.;
label RunDate="RunDate" RunnerID="RunnerID" Gender="Gender" Age="Age" RunTime="RunTime" Placement="Placement";
datalines;
09/16/2017 398 1 59 1440 38 0
10/22/2016 1623 0 64 1897 86 .
05/25/2013 1674 1 54 1322 15 0
05/26/2018 1674 1 54 1195 16 0
08/11/2018 1867 1 39 1666 76 1
08/18/2018 1867 1 39 1529 65 1
10/31/2015 1876 1 29 1279 14 0
01/02/2016 1876 1 29 1093 1 0
06/10/2017 2288 1 59 1199 8 0
11/25/2017 2577 1 29 1004 1 0
;;;;
and my output data atm looks like:
data WORK.DATAIM;
infile datalines dsd truncover;
input RunDate:MMDDYY10. RunnerID:BEST. Gender:$1. Age:BEST. RunTime:BEST. Placement:BEST. After_Woman:32. Improvement:32. DBR:32.;
format RunDate MMDDYY10. RunnerID BEST. Age BEST. RunTime BEST. Placement BEST.;
label RunDate="RunDate" RunnerID="RunnerID" Gender="Gender" Age="Age" RunTime="RunTime" Placement="Placement";
datalines;
09/16/2017 398 1 59 1440 38 0 . .
10/22/2016 1623 0 64 1897 86 . . .
05/25/2013 1674 1 54 1322 15 0 . .
05/26/2018 1674 1 54 1195 16 0 . .
08/11/2018 1867 1 39 1666 76 1 . .
08/18/2018 1867 1 39 1529 65 1 -334 84
10/31/2015 1876 1 29 1279 14 0 . .
01/02/2016 1876 1 29 1093 1 0 436 -959
06/10/2017 2288 1 59 1199 8 0 . .
11/25/2017 2577 1 29 1004 1 0 . .
;;;;
data DataIM;
set DataAW;
prevrunnerid=lag(runnerid);
prevruntime=lag(runtime);
prevrundate=lag(rundate);
if prevRunnerID = RunnerID then Improvement = prevRunTime-RunTime;
if prevRunnerID = RunnerID then DBR = RunDate-prevRunDate;
drop prev:;
run;
Join us for SAS Innovate 2025, our biggest and most exciting global event of the year, in Orlando, FL, from May 6-9.
Early bird rate extended! Save $200 when you sign up by March 31.
Need to connect to databases in SAS Viya? SAS’ David Ghan shows you two methods – via SAS/ACCESS LIBNAME and SAS Data Connector SASLIBS – in this video.
Find more tutorials on the SAS Users YouTube channel.