BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
daisy6
Quartz | Level 8

Hello,

 

I have some data in which each examee may take several level exams many times. The data like

data test;

input id $4. examlevel $1. examdate $10. grade $4. times 2;

datalines;

id1  A 4/20/2013 Fail 3

id1  A  2/23/2014  Fail 5

id1  A  5/18/2015  Pass 8

id1  B  10/12/2013 Fail 2

id1  B 6/20/2014  Fail   4

id1  B 7/10/2015  Fail 6

id1  B 2/12/2016  Pass 8

......

 

Because data are missing or somebody did the wrong calculation. The exam times are not correct. I need to calculation how many times the examee took for each examlevel but I can not change those data before 12/31/2014, i.e.  I need the results like belows:

 

id1  A 4/20/2013 Fail 3

id1  A  2/23/2014  Fail 5

id1  A  5/18/2015  Pass 6

id1  B  10/12/2013 Fail 2

id1  B 6/20/2014  Fail   4

id1  B 7/10/2015  Fail 5

id1  B 2/12/2016  Pass 6

 

................

 

Part of my code is 

proc sort data=test;
by id examlevel examdate;
run;

 

data want;
set test;
by id examlevel;
if date<=20088(12/31/2014) or first.id and first.examlevel then attempt=times;
else attempt=lag(times)+1;
run;

 

but the code did not work well.  Please help me.

1 ACCEPTED SOLUTION

Accepted Solutions
mkeintz
PROC Star

You want to leave TIMES unchanged for dates <='31dec2014'd, but want to increment times by 1 for all later dates within an ID1/examlevel group.

 

Thank you for providing a data step for TEST, but you should have tested it before submitting it to the forum.  In particular, reading the examdate as a character variable would defeat any attempt to solve your request without first converting it to a sas date value.  There were also error messages generated by the input statement.

 

This is a task where RETAINing the prior value of TIMES is beneficial.  I put the retained value in variable _TIMES.

 

data test;
  input id $4. examlevel :$1. examdate :mmddyy10. grade :$4. times :2.;
datalines;
id1  A 4/20/2013 Fail 3
id1  A  2/23/2014  Fail 5
id1  A  5/18/2015  Pass 8
id1  B  10/12/2013 Fail 2
id1  B 6/20/2014  Fail   4
id1  B 7/10/2015  Fail 6
id1  B 2/12/2016  Pass 8
run;

data want (drop=_:);
  set test;
  by id examlevel;
  retain _times;
  if examdate<='31dec2014'd then _times=times;
  else if first.examlevel=0 then do;
    times=_times+1;
    _times=times;
  end;
run;
--------------------------
The hash OUTPUT method will overwrite a SAS data set, but not append. That can be costly. Consider voting for Add a HASH object method which would append a hash object to an existing SAS data set

Would enabling PROC SORT to simultaneously output multiple datasets be useful? Then vote for
Allow PROC SORT to output multiple datasets

--------------------------

View solution in original post

4 REPLIES 4
RW9
Diamond | Level 26 RW9
Diamond | Level 26

You will need to clarify your logic for me.  Why does the third row change and where does the 6 come from?  Note you can't use lag in this instance because you are changing the lag() value.  You would need to use a retained variable, e.g

data want;
  set have;
  retain lsttimes;
  by id examlevel;
  if first.examlevel then lsttimes=times;
  else do;
    if...then times=lsttimes-2;
  end;
run;
daisy6
Quartz | Level 8

Thanks for reply. The third row times variable changes because of former employee's mistakes and I can not change the those data because it is in the systems. What I can do is the make the correct times after 2014 based on the times of examees took

mkeintz
PROC Star

You want to leave TIMES unchanged for dates <='31dec2014'd, but want to increment times by 1 for all later dates within an ID1/examlevel group.

 

Thank you for providing a data step for TEST, but you should have tested it before submitting it to the forum.  In particular, reading the examdate as a character variable would defeat any attempt to solve your request without first converting it to a sas date value.  There were also error messages generated by the input statement.

 

This is a task where RETAINing the prior value of TIMES is beneficial.  I put the retained value in variable _TIMES.

 

data test;
  input id $4. examlevel :$1. examdate :mmddyy10. grade :$4. times :2.;
datalines;
id1  A 4/20/2013 Fail 3
id1  A  2/23/2014  Fail 5
id1  A  5/18/2015  Pass 8
id1  B  10/12/2013 Fail 2
id1  B 6/20/2014  Fail   4
id1  B 7/10/2015  Fail 6
id1  B 2/12/2016  Pass 8
run;

data want (drop=_:);
  set test;
  by id examlevel;
  retain _times;
  if examdate<='31dec2014'd then _times=times;
  else if first.examlevel=0 then do;
    times=_times+1;
    _times=times;
  end;
run;
--------------------------
The hash OUTPUT method will overwrite a SAS data set, but not append. That can be costly. Consider voting for Add a HASH object method which would append a hash object to an existing SAS data set

Would enabling PROC SORT to simultaneously output multiple datasets be useful? Then vote for
Allow PROC SORT to output multiple datasets

--------------------------
daisy6
Quartz | Level 8

Thank you very much Mkeintz! That is what I want. 

sas-innovate-2024.png

Join us for SAS Innovate April 16-19 at the Aria in Las Vegas. Bring the team and save big with our group pricing for a limited time only.

Pre-conference courses and tutorials are filling up fast and are always a sellout. Register today to reserve your seat.

 

Register now!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 4 replies
  • 672 views
  • 0 likes
  • 3 in conversation