BookmarkSubscribeRSS Feed
Sandhya
Fluorite | Level 6
Hi,

I have a dataset which has an ID, ADATE and SERVICE.

ID ADATE SERVICE

1 12/7/2009 I
2 1/12/2009 I
2 1/13/2009 OP
3 6/5/2009 I
4 7/29/2009 LA
5 3/23/2009 OP
...

Now I need to check the Service that are not 'I'. Once I find out which are the records that are not equal to 'I', I need to find out if there are any records for the same id which has the same date or 1 day earlier to that day. If yes, then I need to convert the SERVICE from OP to 'I', else do not change the SERVICE value.

Since I have a fairly huge dataset, I do not have the luxury of performing multiple iterations. So I'm looking at doing this in minimal number of iterations.

Please help.

Thanks in advance,
Sandhya.
10 REPLIES 10
ArtC
Rhodochrosite | Level 12
If you sort
[pre] by id descending adate;[/pre]
you can check a the service if not 1 then the next obs is the one to check for your date range. For this DATA step approach you will need to use either a LAG function or a RETAIN.
SPR
Quartz | Level 8 SPR
Quartz | Level 8
Hello AtrC,

The OP mentioned that besides 1 day earlier date the same date is also possible. In thsi case the algorithm should be more complicated.

Sincerely,
SPR
SPR
Quartz | Level 8 SPR
Quartz | Level 8
Hello Sandhya,

If I understood you correctly then this is solution (not simple one):
[pre]
data i;
input ID ADATE MMDDYY10. SERVICE $;
format adate date7.;
date=adate;
datalines;
1 12/7/2009 I
2 1/12/2009 I
2 1/13/2009 OP
2 1/13/2009 OP
3 6/4/2009 OP
3 6/5/2009 OP
4 7/29/2009 LA
5 3/23/2009 OP
;
run;
proc sort data=i out=s;
by ID adate;
run;
proc SQL noprint;
select COUNT(distinct ID) into :n from i;
%let n=%TRIM(&n);
select distinct ID into :id1-:id&n from i;
quit;
%macro a;
%local i j k k1;
%do i=1 %to &n;
proc sql noprint;
select count(*) into :nd from s where ID=&&ID&i;
%let nd=%TRIM(&nd);
select date into :d1-:d&n from s where ID=&&ID&i;
select service into :s1-:s&n from s where ID=&&ID&i;
quit;
data dr&i;
length service $2;
%do j=1 %to &nd;
%if &&s&j NE I %then %do;
%do k=&j %to 1 %by -1;
%if &k GT 1 %then %do;
if &&d&j = &&d&k OR %EVAL(&&d&j - &&d&k) = 1
then %let s&j=I;
%end;
%end;
%end;
id=&&ID&i;
adate=&&d&j;
service="&&s&j";
format adate date7.;
output;
%end;
run;
%if &i = 1 %then %do; data r; retain id adate service; set dr&i; run; %end;
%else %do; data r; set r dr&i; run; %end;
%end;
%mend a;
%a;
[/pre]
Sincerely,
SPR
polingjw
Quartz | Level 8
If the data set is large, I doubt that a macro solution will be efficient. Does this solution, which still assumes the data set is sorted by ID and ADATE, work?

[pre]
data have;
input ID ADATE MMDDYY10. SERVICE $;
format adate date7.;
date=adate;
datalines;
1 12/7/2009 I
2 1/12/2009 I
2 1/13/2009 OP
2 1/13/2009 OP
3 6/4/2009 OP
3 6/5/2009 OP
4 7/29/2009 LA
5 3/23/2009 OP
;
run;
proc sort data=have;
by ID adate;
run;

data want(keep=id adate service);
set have;
by id;
lastdate = lag(adate);
if not last then
set have(firstobs=2 keep=adate rename=(adate=nextdate)) end=last;
if service ne 'I' and ((not last.id and nextdate=adate) or (not first.id and adate-lastdate in (0 1))) then service = 'I';
run;

[pre]
polingjw
Quartz | Level 8
If you have SAS 9.2, here is a multidata hash object solution that does not require sorting. However, the code is a little messy.

[pre]
data have;
input ID ADATE MMDDYY10. SERVICE $;
format adate date7.;
datalines;
1 12/7/2009 I
2 1/12/2009 I
2 1/13/2009 OP
2 1/13/2009 OP
3 6/4/2009 OP
3 6/5/2009 OP
4 7/29/2009 LA
5 3/23/2009 OP
;
run;


data want(keep=id adate service);
if _n_=1 then do;
declare hash h(dataset:'have(keep=id adate rename=(adate=date))', multidata: 'y');
h.defineKey('id');
h.defineData('id', 'date');
h.definedone();
end;
set have;
foundself = 0;
if service ne 'I' then do;
h.find();
if adate = date and foundself = 0 then foundself = 1;
else if (adate = date and foundself = 1) or date = adate-1 then service = 'I';
h.has_next(result: r);
do while(service ne 'I' and r ne 0);
h.find_next();
if adate = date and foundself = 0 then foundself = 1;
else if (adate = date and foundself = 1) or date = adate-1 then service = 'I';
h.has_next(result: r);
end;
end;
run;
[/pre] Message was edited by: polingjw
Ksharp
Super User
Hi.
For the sake of your complex logic , I am afraid you need more iterations.



[pre]
data i;
input ID $ ADATE : MMDDYY10. SERVICE $;
format adate mmddyy10.;
datalines;
1 12/7/2009 I
2 1/12/2009 I
2 1/13/2009 OP
3 6/5/2009 I
4 7/29/2009 LA
5 3/23/2009 OP
;

proc sort data=i nodupkey;
by id adate service;
run;

data index;
set i;
where service not eq 'I';
run;

proc sql feedback; *createing table contained observations which is needed to change into I;
create table flag as
select a.*
from index as a left join i as b on a.id eq b.id
where ( b.adate eq a.adate ) or ( b.adate eq (a.adate - 1))
group by a.id,a.service
having count(*) ge 2;
quit;

proc sort data=flag nodupkey;
by id adate service;
run;

data flag;
set flag;
service='I';
run;

proc sort data=i;
by id adate;
run;

data result;
merge i flag;
by id adate;
run;

[/pre]



Ksharp Message was edited by: Ksharp
Sandhya
Fluorite | Level 6
Thanks guys, it was very helpful.

Sandhya.
LewisC_sas
SAS Employee
Hello Sandhya,

From reading, and rereading, the messages in this thread, I think that I understand your problem well enough now for me to contribute the following code that may be of some use. Of course, there may have been some subtlety that I missed.

During testing, I used the original dataset that you provided and also the modified one that SPR provided. In both cases, the mainly PROC SQL code below generated the results that I expected. However, of course, that is no guarantee that there are no bugs lurking somewhere in it.

The code below is just something additional that you may want to look at, or consider. It complements the nice responses that others in this thread have contributed.

==========================================================
options ls=max;

data work.ds;
input id $ adate:mmddyy10. service $;
format adate mmddyy10.;
datalines;
1 12/7/2009 I
2 1/12/2009 I
2 1/13/2009 OP
2 1/13/2009 OP
3 6/4/2009 OP
3 6/5/2009 OP
4 7/29/2009 LA
5 3/23/2009 OP
;


/*--
data work.ds;
input id $ adate:mmddyy10. service $;
format adate mmddyy10.;
datalines;
1 12/7/2009 I
2 1/12/2009 I
2 1/13/2009 OP
3 6/5/2009 I
4 7/29/2009 LA
5 3/23/2009 OP
;
--*/

proc sql;

create table work.sI as
select *
from work.ds
where service = 'I';

create table work.sNI as
select *
from work.ds
where service not = 'I';


/*---------------------------------------------------------*/
/*-- Identify those records in work.sNI that have the --*/
/*-- same id as one or more records in work.sI and whose --*/
/*-- adates are either the same as one of these records --*/
/*-- or corresponds to a record in work.sI that has a --*/
/*-- date one day earlier than the corresponding work.sNI--*/
/*-- record. The records in the result set for this --*/
/*-- query are those that need to have their service --*/
/*-- changed from 'OP' to 'I'. --*/
/*---------------------------------------------------------*/
create table work.needToChange as
select *
from work.sNI
where id in (select id
from work.sI
where ((sNI.adate - sI.adate) in (0,1)))
and service = 'OP';

/*-------------------------------------------------------*/
/*-- Identify the work.sNI records that do NOT need to --*/
/*-- have their service values change from 'OP' to 'I'.--*/
/*-------------------------------------------------------*/
create table work.dontNeedToChange as
select * from work.sNI
except all
select * from work.needToChange;


/*------------------------------------------------------*/
/*-- Create the new dataset, sorted if need be. --*/
/*------------------------------------------------------*/
create table work.result as
select id, adate, service from work.sI
union all
select id, adate, 'I' from work.needToChange
union all
select id, adate, service from work.dontNeedToChange
order by id, adate, service; /* if ordering is desired */

quit;

proc print; run;
==========================================================

Sincerely,
qseries@sas
polingjw
Quartz | Level 8
If you want to use SQL, here is a really quick solution to this problem. I don't know if it would be efficient for a large data set due to the correlated sub-queries that are used.

[pre]
data work.ds;
input id $ adate:mmddyy10. service $;
format adate mmddyy10.;
datalines;
1 12/7/2009 I
2 1/12/2009 I
2 1/13/2009 OP
2 1/13/2009 OP
3 6/4/2009 OP
3 6/5/2009 OP
4 7/29/2009 LA
5 3/23/2009 OP
;


proc sql;
create table want as
select id, adate, case
when service = 'I' then 'I'
when (select count(*) from ds as ds1 where ds1.id=ds3.id and ds1.adate = ds3.adate) ge 2 then 'I'
when (select count(*) from ds as ds2 where ds2.id=ds3.id and ds2.adate = ds3.adate - 1) ge 1 then 'I'
else service
end as service
from ds as ds3;
quit;
[/pre]
Ksharp
Super User
Sorry.I found some problem in my code;
The following is corrected code.

[pre]
data i;
input ID $ ADATE : MMDDYY10. SERVICE $;
format adate mmddyy10.;
datalines;
1 12/7/2009 I
2 1/12/2009 I
2 1/13/2009 OP
3 6/5/2009 I
4 7/29/2009 LA
5 3/23/2009 OP
;

proc sort data=i nodupkey;
by id adate service;
run;

data index;
set i;
where service not eq 'I';
run;

proc sql feedback; *createing table contained observations which is needed to change into I;
create table flag as
select a.*
from index as a left join i as b on a.id eq b.id
where ( b.adate eq a.adate ) or ( b.adate eq (a.adate - 1))
group by a.id,a.service
having count(*) ge 2;
quit;

proc sort data=flag nodupkey;
by id adate service;
run;

data flag;
set flag;
_service='I';
drop service;

run;

proc sort data=i;
by id adate;
run;

data result;
merge i flag;
by id adate;
run;
data result;
set result;
if _service eq 'I' then service=_service;
drop _service;
run;

sas-innovate-2024.png

Don't miss out on SAS Innovate - Register now for the FREE Livestream!

Can't make it to Vegas? No problem! Watch our general sessions LIVE or on-demand starting April 17th. Hear from SAS execs, best-selling author Adam Grant, Hot Ones host Sean Evans, top tech journalist Kara Swisher, AI expert Cassie Kozyrkov, and the mind-blowing dance crew iLuminate! Plus, get access to over 20 breakout sessions.

 

Register now!

What is Bayesian Analysis?

Learn the difference between classical and Bayesian statistical approaches and see a few PROC examples to perform Bayesian analysis in this video.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 10 replies
  • 1217 views
  • 0 likes
  • 6 in conversation