Hello,
I have some data in which each examee may take several level exams many times. The data like
data test;
input id $4. examlevel $1. examdate $10. grade $4. times 2;
datalines;
id1 A 4/20/2013 Fail 3
id1 A 2/23/2014 Fail 5
id1 A 5/18/2015 Pass 8
id1 B 10/12/2013 Fail 2
id1 B 6/20/2014 Fail 4
id1 B 7/10/2015 Fail 6
id1 B 2/12/2016 Pass 8
......
Because data are missing or somebody did the wrong calculation. The exam times are not correct. I need to calculation how many times the examee took for each examlevel but I can not change those data before 12/31/2014, i.e. I need the results like belows:
id1 A 4/20/2013 Fail 3
id1 A 2/23/2014 Fail 5
id1 A 5/18/2015 Pass 6
id1 B 10/12/2013 Fail 2
id1 B 6/20/2014 Fail 4
id1 B 7/10/2015 Fail 5
id1 B 2/12/2016 Pass 6
................
Part of my code is
proc sort data=test;
by id examlevel examdate;
run;
data want;
set test;
by id examlevel;
if date<=20088(12/31/2014) or first.id and first.examlevel then attempt=times;
else attempt=lag(times)+1;
run;
but the code did not work well. Please help me.
You want to leave TIMES unchanged for dates <='31dec2014'd, but want to increment times by 1 for all later dates within an ID1/examlevel group.
Thank you for providing a data step for TEST, but you should have tested it before submitting it to the forum. In particular, reading the examdate as a character variable would defeat any attempt to solve your request without first converting it to a sas date value. There were also error messages generated by the input statement.
This is a task where RETAINing the prior value of TIMES is beneficial. I put the retained value in variable _TIMES.
data test;
input id $4. examlevel :$1. examdate :mmddyy10. grade :$4. times :2.;
datalines;
id1 A 4/20/2013 Fail 3
id1 A 2/23/2014 Fail 5
id1 A 5/18/2015 Pass 8
id1 B 10/12/2013 Fail 2
id1 B 6/20/2014 Fail 4
id1 B 7/10/2015 Fail 6
id1 B 2/12/2016 Pass 8
run;
data want (drop=_:);
set test;
by id examlevel;
retain _times;
if examdate<='31dec2014'd then _times=times;
else if first.examlevel=0 then do;
times=_times+1;
_times=times;
end;
run;
You will need to clarify your logic for me. Why does the third row change and where does the 6 come from? Note you can't use lag in this instance because you are changing the lag() value. You would need to use a retained variable, e.g
data want; set have; retain lsttimes; by id examlevel; if first.examlevel then lsttimes=times; else do; if...then times=lsttimes-2; end; run;
Thanks for reply. The third row times variable changes because of former employee's mistakes and I can not change the those data because it is in the systems. What I can do is the make the correct times after 2014 based on the times of examees took
You want to leave TIMES unchanged for dates <='31dec2014'd, but want to increment times by 1 for all later dates within an ID1/examlevel group.
Thank you for providing a data step for TEST, but you should have tested it before submitting it to the forum. In particular, reading the examdate as a character variable would defeat any attempt to solve your request without first converting it to a sas date value. There were also error messages generated by the input statement.
This is a task where RETAINing the prior value of TIMES is beneficial. I put the retained value in variable _TIMES.
data test;
input id $4. examlevel :$1. examdate :mmddyy10. grade :$4. times :2.;
datalines;
id1 A 4/20/2013 Fail 3
id1 A 2/23/2014 Fail 5
id1 A 5/18/2015 Pass 8
id1 B 10/12/2013 Fail 2
id1 B 6/20/2014 Fail 4
id1 B 7/10/2015 Fail 6
id1 B 2/12/2016 Pass 8
run;
data want (drop=_:);
set test;
by id examlevel;
retain _times;
if examdate<='31dec2014'd then _times=times;
else if first.examlevel=0 then do;
times=_times+1;
_times=times;
end;
run;
Thank you very much Mkeintz! That is what I want.
Join us for SAS Innovate April 16-19 at the Aria in Las Vegas. Bring the team and save big with our group pricing for a limited time only.
Pre-conference courses and tutorials are filling up fast and are always a sellout. Register today to reserve your seat.
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.