SQL question

Accepted Solution Solved
Reply
Super Contributor
Super Contributor
Posts: 444
Accepted Solution

SQL question

hi guys,'

i have a table of patient-id,date,and test_value and i need to write a SQL  that would count the number of individuals   who have the same test value on at least 3 consecutive test dates. any idea please?

22 29/12/1999 20

22 29/12/1999 20

23 01/11/1999 10

23 01/12/1999 10

23 08/12/1999 10

23 19/04/2000 50

23 20/06/2000 100

23 15/08/2000 120

23 10/09/2000 180

23 11/9/2000 180

23 12/9/2000 180

24 01/8/2001 160

24 01/8/2001 160

24 15/8/2001 160

24 01/1/2002 160

25 1/6/1999 10

25 2/6/1999 10

25 3/6/1999 10

25 19/4/2000 50

25 20/6/2000 100

25 15/8/2000 120

25 31/10/2000 180

25 28/11/2000 80

25 29/11/2000 80

25 30/11/2000 80

25 15/08/2001 160

25 01/1/2001 190


Accepted Solutions
Solution
‎05-04-2013 01:03 AM
Super User
Posts: 10,048

Re: SQL question

OK. SQL can handle this.

data have;
input id $ date :ddmmyy10. score;
format date ddmmyy10.;
cards;
22 29/12/1999 20
22 29/12/1999 20
23 01/11/1999 10
23 01/12/1999 20
23 08/12/1999 10
23 19/04/2000 10
23 20/06/2000 100
23 15/08/2000 120
23 10/09/2000 180
23 11/9/2000 110
23 12/9/2000 180
24 01/8/2001 160
24 01/8/2001 160
24 15/8/2001 160
24 01/1/2002 160
25 1/6/1999 10
25 2/6/1999 10
25 3/6/1999 10
25 19/4/2000 50
25 20/6/2000 100
25 15/8/2000 120
25 31/10/2000 180
25 28/11/2000 80
25 29/11/2000 80
25 30/11/2000 80
25 15/08/2001 160
25 01/1/2001 190
;
run;
 
proc sql;
create table x as
 select *,
(select count(distinct date) from have 
  where have.id=h.id and have.score=h.score and have.date between h.date-1 and h.date+1) as n
  from have as h ;
create table want as
 select distinct id from x where n eq 3;
quit;

Ksharp

View solution in original post


All Replies
Regular Contributor
Posts: 222

Re: SQL question

this should give you what you are looking for

proc sql;

    select distinct catx('_',patient_id,test_value) as patient_testValue length=8

           ,count(catx('_',patient_id,test_value)) as count

     from patients

     group by 1

     having calculated count gt 2

    ;

quit;

Super User
Posts: 19,878

Re: SQL question

Posted in reply to AhmedAl_Attar

Does that factor in same consecutive dates?

Regular Contributor
Posts: 222

Re: SQL question

,

Yes, the query ignores the date and focuses on the patient_testValue combination. This is might be a good thing if you want to look for double/triple charging (Fraud!!)

Alternatively, we can an inner query to remove duplicate dates, like this

proc sql;

    select distinct catx('_',patient_id,test_value) as patient_testValue length=8

           ,count(catx('_',patient_id,test_value)) as count

     from (select distinct patient_id, test_value, date from patients)

     group by 1

     having calculated count gt 2

    ;

quit;

Ahmed

Super Contributor
Posts: 578

Re: SQL question

Posted in reply to AhmedAl_Attar

I think the fact that it ignores the date means that it misses one of the requirements from the original question.

Respected Advisor
Posts: 3,156

Re: SQL question

I concur after .

I also believe Proc SQL will be a lousy choice in dealing with anything requiring sequential processing (quote from OP, " at least 3 consecutive test dates"). In below I am presenting an example that   's code would have different outcome.

data have;

input id $ date :ddmmyy10. score;

cards;

22 29/12/1999 20

22 29/12/1999 20

23 01/11/1999 10

23 01/12/1999 20

23 08/12/1999 10

23 19/04/2000 10

23 20/06/2000 100

23 15/08/2000 120

23 10/09/2000 180

23 11/9/2000 110

23 12/9/2000 180

24 01/8/2001 160

24 01/8/2001 160

24 15/8/2001 160

24 01/1/2002 160

25 1/6/1999 10

25 2/6/1999 10

25 3/6/1999 10

25 19/4/2000 50

25 20/6/2000 100

25 15/8/2000 120

25 31/10/2000 180

25 28/11/2000 80

25 29/11/2000 80

25 30/11/2000 80

25 15/08/2001 160

25 01/1/2001 190

;

proc sort data=have;

by id date;

run;

data Count;

  if _n_=1 then do;

    if 0 then set have;

    declare hash h(ordered:'y');

  h.definekey('id');

  h.definedata('id');

  h.definedone();

  end;

  set have end=last;

     by id date;

  lagdate=lag(date);

      if score eq lag(score) then  _c+(date ne lagdate);

  else _c=0;

  if _c>=2 then rc=h.replace();

  if first.id then _c=0;

     if last then do;Count=h.num_items;output; rc=h.output(dataset:'wanted_id');end;

  keep count;

run;

Table 'count' will have the number that OP requested, while table 'wanted_id' will have all of the qualified 'ID's. In this release, multiple same dates will be treated as '1 day', and that can be easily tweaked to reflect OP's intention otherwise.

Haikuo

Super Contributor
Posts: 275

Re: SQL question

The same patient has twice of at least 3 consecutive test dates, is it counted once or twice? 

data have;

input id date ddmmyy10. dose;

format date ddmmyy10.;

cards;

22 29/12/1999 20

22 29/12/1999 20

23 01/11/1999 10

23 01/12/1999 10

23 08/12/1999 10

23 19/04/2000 50

23 20/06/2000 100

23 15/08/2000 120

23 10/09/2000 180

23 11/9/2000 180

23 12/9/2000 180

24 01/8/2001 160

24 01/8/2001 160

24 15/8/2001 160

24 01/1/2002 160

25 1/6/1999   10

25 2/6/1999   10

25 3/6/1999   10

25 19/4/2000  50

25 20/6/2000 100

25 15/8/2000 120

25 31/10/2000 180

25 28/11/2000 80

25 29/11/2000 80

25 30/11/2000 80

25 15/08/2001 160

25 01/1/2001 190

;

run;

proc sort;

by id date;

run;

data want;

  merge have have(firstobs=2 rename=(date=newdate id=nid));

  if id=nid then flag=newdate-date;

  if lag(flag)=flag=1 then output;

  drop nid newdate;

  run;

proc print;

run;

Solution
‎05-04-2013 01:03 AM
Super User
Posts: 10,048

Re: SQL question

OK. SQL can handle this.

data have;
input id $ date :ddmmyy10. score;
format date ddmmyy10.;
cards;
22 29/12/1999 20
22 29/12/1999 20
23 01/11/1999 10
23 01/12/1999 20
23 08/12/1999 10
23 19/04/2000 10
23 20/06/2000 100
23 15/08/2000 120
23 10/09/2000 180
23 11/9/2000 110
23 12/9/2000 180
24 01/8/2001 160
24 01/8/2001 160
24 15/8/2001 160
24 01/1/2002 160
25 1/6/1999 10
25 2/6/1999 10
25 3/6/1999 10
25 19/4/2000 50
25 20/6/2000 100
25 15/8/2000 120
25 31/10/2000 180
25 28/11/2000 80
25 29/11/2000 80
25 30/11/2000 80
25 15/08/2001 160
25 01/1/2001 190
;
run;
 
proc sql;
create table x as
 select *,
(select count(distinct date) from have 
  where have.id=h.id and have.score=h.score and have.date between h.date-1 and h.date+1) as n
  from have as h ;
create table want as
 select distinct id from x where n eq 3;
quit;

Ksharp

Super Contributor
Super Contributor
Posts: 444

Re: SQL question

Thanks Ksharp,

Thanks guys for your help

A SQL Code is what  i needed .

Respected Advisor
Posts: 3,156

Re: SQL question

OK, , Apparently I read it wrong regarding "3 consecutive test dates." So it is literal "consecutive dates" instead of " dates from 3 consecutive records". While my comments still hold true in term of the latter. It becomes very awkward, if possible, to apply SQL for that sort of scenarios, unless involving multiple steps (ODS output) or some undocumented features (eg. monotonic()).

Haikuo

🔒 This topic is solved and locked.

Need further help from the community? Please ask a new question.

Discussion stats
  • 9 replies
  • 311 views
  • 7 likes
  • 7 in conversation