Help using Base SAS procedures

question about specific selection

Accepted Solution Solved
Reply
Occasional Contributor
Posts: 16
Accepted Solution

question about specific selection

Following up the question in my previous post, I would like to ask if any sas expert can help me on this simple question.

I’d like to know how many total excise times for each person six months before the 1st test_date when 2<=score<=4.

BTW, thanks again Haikuo and Arthur for your help on my previous question.  Much appreciate.

Regards.

data test;

input @1 id @4 test_date yymmdd10. @13 score;

format test_date yymmdd10.;

cards;

004 20050615 9

001 20021120 2

002 20030128 3

005 20040917 9

003 20040314 6

001 20030523 7

004 20001225 5

005 20061225 4

002 20010807 7

001 20031010 4

run;

data exercise;

input @1 id @5 exercise_date yymmdd10.;

format exercise_date yymmdd10.;

cards;

001 20011206

001 20030806

010 20030526

001 20020728

008 20051223

002 20021127

003 20040527

002 20020901

005 20050728

005 20050925

002 20041002

006 20061112

001 20020928

001 20030905

run;


Accepted Solutions
Solution
‎08-14-2012 05:31 PM
PROC Star
Posts: 7,491

Re: question about specific selection

If you only want to narrow the results of your previous run to only the earliest test_date, and you've already run the code, couldn't you just run something like the following on the file you originally produced:

proc sql;

  create table really_want as

    select * from want

      group by id

        having test_date eq min(test_date)

;

quit;

View solution in original post


All Replies
Solution
‎08-14-2012 05:31 PM
PROC Star
Posts: 7,491

Re: question about specific selection

If you only want to narrow the results of your previous run to only the earliest test_date, and you've already run the code, couldn't you just run something like the following on the file you originally produced:

proc sql;

  create table really_want as

    select * from want

      group by id

        having test_date eq min(test_date)

;

quit;

Respected Advisor
Posts: 4,173

Re: question about specific selection

What about:

proc sql;

  select e.id, count(*) as exercises_6monthPrev_cnt

  from

    (select * from test where 2<=score<=4 group by id having min(test_date)=test_date) t

    , exercise e

  where t.id=e.id and t.test_date >= intnx('month',e.exercise_date,6,'s')

  group by e.id

  ;

quit;

Trusted Advisor
Posts: 1,022

Re: question about specific selection

You can use PROC MEANS with the min stat to get the earliest date with 2<=socre<=4 for each id.  Not presorting required:


proc means data=test (where=(2<=score<=4)) min noprint nway;
   class id;
   var_test_date;
   output out=tdates (keep=id test_date)  min=;
run;

** But for most non-hash solutions, the exercise data should be sorted to interleave with the TDATES dataset above: **;

proc sort data=exercise  out=exer_srt;

  by id exercise_date;

run;

** Note: the "set   xxx yyy" with a BY statement always has the XXX data precede the YYY data for a given BY value.**:

data want (keep=id test_date cut_date n_dates);
  do until (last.id);

    set tdates  (in=int)  exer_srt;

    by id;

    if first.id then n_dates=0;

    if int then do;

      cut_date=intnx('month',test_date,-6,'end');

      tdate=test_date;

    end;

    else if exercise_date <= cut_date then n_dates=n_dates+1;

  end;

  if cut_date^=. then do;

    test_date=tdate;

    output;

  end;

  format cut_date date9.;

run;

Occasional Contributor
Posts: 16

Re: question about specific selection

Hi, thanks very much, Arthur, Patrick and Mkeintz, for your prompt response and help!

Regards.

🔒 This topic is solved and locked.

Need further help from the community? Please ask a new question.

Discussion stats
  • 4 replies
  • 201 views
  • 6 likes
  • 4 in conversation