how to select subjects have all 3 visits

Accepted Solution Solved
Reply
Contributor
Posts: 23
Accepted Solution

how to select subjects have all 3 visits

Hello,everyone,

I am a greenhand and would appreciate your help! I have a sample from a cohort study,each subject is supposed to have 3 visits.But actually,some of them only have the first and the third,or the first and the second.Now I am going to select those who have all three visits(like the ID 6238).I tried but failed.Thank you very much for helping me this out.

Thanks a lot!

part of the data set:

                    Obs        ID        visit

                          1      2448       1
                          2      2448       3
                          3      6238       1
                          4      6238       2
                          5      6238       3
                          6      9428       1
                          7      9428       2
                          8     10552       1
                          9     10552       2
                         10     11252       1
                         11     11252       2
                         12     11252       3
                         13     11263       1
                         14     11263       2
                         15     11263       3
                         16     12629       1
                         17     12629       2
                         18     12806       1
                         19     12806       2
                         20     12806       3


Accepted Solutions
Solution
‎11-25-2012 09:40 AM
PROC Star
Posts: 7,360

Re: how to select subjects have all 3 visits

If you want to insure that each case not only has three records, but that they are for visits 1, 2 and 3, then I would use a double DOW.  e.g.:

data have;

  input ID visit;

  cards;

2448       1

2448       3

6238       1

6238       2

6238       3

9428       1

9428       2

10552       1

10552       2

11252       1

11252       2

11252       3

11263       1

11263       2

11263       3

12629       1

12629       2

12806       1

12806       2

12806       3

;

proc sort data=have;

  by id;

run;

data want (drop=checkSmiley Happy;

  do until (last.id);

    set have;

    by id;

    array check(3);

    check(visit)=1;

  end;

  do until (last.id);

    set have;

    by id;

    if sum(of check(*)) eq 3 then output;

  end;

run;

View solution in original post


All Replies
Solution
‎11-25-2012 09:40 AM
PROC Star
Posts: 7,360

Re: how to select subjects have all 3 visits

If you want to insure that each case not only has three records, but that they are for visits 1, 2 and 3, then I would use a double DOW.  e.g.:

data have;

  input ID visit;

  cards;

2448       1

2448       3

6238       1

6238       2

6238       3

9428       1

9428       2

10552       1

10552       2

11252       1

11252       2

11252       3

11263       1

11263       2

11263       3

12629       1

12629       2

12806       1

12806       2

12806       3

;

proc sort data=have;

  by id;

run;

data want (drop=checkSmiley Happy;

  do until (last.id);

    set have;

    by id;

    array check(3);

    check(visit)=1;

  end;

  do until (last.id);

    set have;

    by id;

    if sum(of check(*)) eq 3 then output;

  end;

run;

Contributor
Posts: 23

Re: how to select subjects have all 3 visits

Dear Dr.Arthur,

I tried with small sample,it works.Thanks!

But My whole sample is like more than 4000 subjects and more than 10000 observations,it takes very long time.

Jing

Contributor
Posts: 23

Re: how to select subjects have all 3 visits

And I try to assign a new variable 'f' to indicate if all three visits then f=1,else f=0,then I want to output f=1.But keep failing...

Jing

Super Contributor
Posts: 1,636

Re: how to select subjects have all 3 visits

HI Jing,

Is the code below helpful?

data have;

  input ID visit;

  cards;

2448       1

2448       3

6238       1

6238       2

6238       3

9428       1

9428       2

10552       1

10552       2

11252       1

11252       2

11252       3

11263       1

11263       2

11263       3

12629       1

12629       2

12806       1

12806       2

12806       3

;

proc sql;

  create table temp as

    select id,count(id) as t

  from have

  group by id

   having t>2;

  create table want as

    select id,visit, 1 as f

   from have where id in (select id from temp);

  quit;

  proc print data=want;run;

Good luck!

Linlin

Contributor
Posts: 23

Re: how to select subjects have all 3 visits

Dear Linlin,

I have not learned the procedure SQL.Do you have other idea?

Thanks.

Jing

Super Contributor
Posts: 1,636

Re: how to select subjects have all 3 visits

Hi Jing,

The code Art provided is very efficient. Maybe my code is easier for you to understand.

data have;

  input ID visit;

  cards;

2448       1

2448       3

6238       1

6238       2

6238       3

9428       1

9428       2

10552       1

10552       2

11252       1

11252       2

11252       3

11263       1

11263       2

11263       3

12629       1

12629       2

12806       1

12806       2

12806       3

;

/* output a dataset with IDs having three visits */

data temp;

  set have;

  by id notsorted;

  if first.id then count=0;

  count+1;

  if count=3;

data want;/*(assume your dataset already sorted)*/

merge have(in=a) temp(in=b keep=id);

by id;

if a and b;

proc print;run;

Linlin

Contributor
Posts: 23

Re: how to select subjects have all 3 visits

Dear Lin,

I need to learn Dr.Arthur's coding.The try yours as another method.Then I can learn both.

Thanks a lot.

Jing

Contributor
Posts: 23

Re: how to select subjects have all 3 visits

Dear Dr.Arthur,

This is my coding.randid=id,period=visit,I try your coding in this small sample,it shows what I want.But when I ran in whole sample with many other variables,it takes long time.Would you please help me checking if there are problems after I revise your coding in my pgm.If it is correct,maybe I just need to wait...

Thanks.

Jing

proc sort data=fr;
by randid period;
run;
data fr1;
set fr;
by randid period;
do until (last.randid);
array check(3);
check(period)=1;
end;
do until (last.randid);
set fr;
by randid;
if sum(of check(*)) eq 3 then output;
end;
run;

proc print data=fr1(ods=20);
run;

PROC Star
Posts: 7,360

Re: how to select subjects have all 3 visits

Jing,

I have to think that you entered something incorrectly.  In the following code, I expanded your example data to represent 20,000 observations.  On my computer it took 0.04 seconds to create the file, 0.04 seconds to sort it, and 0.04 seconds to run through the data step with the double DOW loop.

It would help if you posted the code that you actually ran.

PROC Star
Posts: 1,090

Re: how to select subjects have all 3 visits

I agree with Art; given the characteristics that you provide, there should absolutely be no performance problems.

Please post your code, and the error will probably be apparent.

Tom

PROC Star
Posts: 7,360

Re: how to select subjects have all 3 visits

Your last post confuses me.  As I understood your original post, you wanted one file that only contained the records that met the condition.

The following creates two files, the original one, plus a version of the raw data with the extra field added:

data want (drop=f checkSmiley Happy rawdatawithextrafield (drop=checkSmiley Happy;

  do until (last.id);

    set have;

    by id;

    array check(3);

    check(visit)=1;

  end;

  do until (last.id);

    set have;

    by id;

    if sum(of check(*)) eq 3 then do;

      f=1;

      output want;

    end;

    else f=0;

    output rawdatawithextrafield;

  end;

run;

Contributor
Posts: 23

Re: how to select subjects have all 3 visits


Dear Dr,Arthur,

this works.

I need to think your coding.

Thank you very much!

Jing

PROC Star
Posts: 7,360

Re: how to select subjects have all 3 visits

It is a somewhat difficult concept to pick up early on in your learning curve.  However, that said, the code is taking control away from the way SAS normally works.

The two do until loops read each record within a by group.  Thus, the first loop reads the records:

2448       1

2448       3

and creates the array, check, with the values:

check1 1

check2 .

check3 1

Then, the second do loop again reads those same two records, but this time knows the values in the array check.  Since the sum of check will only equal 3 if there were at least three records for id 2448 and, between those records, contained values 1, 2 and 3 for visit.

Then the first loop reads the records for the next id, etc., etc.

☑ This topic is SOLVED.

Need further help from the community? Please ask a new question.

Discussion stats
  • 13 replies
  • 519 views
  • 6 likes
  • 4 in conversation