BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
tmcrouse
Calcite | Level 5

I am racking my brain trying to figure out how to get this right. I have tried proc transpose to put the various fst srvc dts in one row thinking that would help. But it did not. Then tried do statements but that has not worked. Any suggestions? I need to look at whenever there is a maj restudy, look at the date of the restudy and then look for a proc done after that restudy but it has to be between fst_srvc_dt+1 and fst_srvc_dt+90. Anotherwords, taking the first example, the restudy was done 4/30/2012. So I begin counting on 5/1/2012 and go no more than 90days out. That means the procedure on 6/4/2012 qualifies and it was a perf. I know how to so a statement to say if all these conditions are met, create a column called sub_table and if the condition is the perf I will call it MAJPERF. If the condition is a cath it will be MAJCATH, pet is MAJPET and ct is MAJCT. I hope I am making sense. I put the do code I tried but that did not work.

Data Have:

MBR_SYS_IDFST_SRVC_DTProv_tin_maj_restudyPROV_TINnat_perfnat_cathnat_petnat_ct
11114/22/2012 593415206
11114/30/20121593415206 1
11116/4/2012 5937339391
11116/25/2012 593733939
22226/22/2012 5934152061
22227/26/2012 593415206
22228/28/2012 591726273 1
22228/28/2012 593415206
22229/4/20121593415206
222210/8/2012 593415206 1
33337/30/20121593415206 1
333310/10/2012 5934152061
44442/26/2012 262157500 1
44442/27/2012 5934152061
44442/28/20121593415206 1
44445/18/2012 590724459 1
44445/18/2012 593415206

Data Need:

MBR_SYS_IDFST_SRVC_DTProv_tin_maj_restudyPROV_TINnat_perfnat_cathnat_petnat_ctsub_table
11116/4/2012 5937339391 MAJPERF
222210/8/2012 593415206 1 MAJCATH
333310/10/2012 5934152061 MAJPERF
44445/18/2012 590724459 1 MAJCATH

data maj_restudyA;

set maj_restudy nobs=last_obs;

by mbr_sys_id;

if prov_tin_pci=1  and prov_tin_maj_restudy = 1 and fst_srvc_dt<=&enddt3-90 then do;

  wb = fst_srvc_dt + 1;

  we = fst_srvc_dt + 90;

  do pt = _N_ + 1 to last_obs;

    set cardiac.maj_restudy(keep= mbr_sys_id fst_srvc_dt nat_cath nat_ct nat_perf nat_pet

         rename=(mbr_sys_id=mem fst_srvc_dt=dt))

         point=pt;

     if dt >= wb and dt<=we and nat_cath = 1 then 1 end as 'MAJCATH';

       else if dt>=wb and dt<=we and nat_ct = 1 then 1 end as 'MAJCT';

       else if dt>=wb and dt<=we and nat_perf = 1 then 1 end as 'MAJPERF';

       else if dt>=wb and dt<=we and nat_pet = 1 then 1 end as 'MAJPET';

        end;

  output;

end;

   if we =. then we=fst_srvc_dt;

      if wb=' ' then wb=fst_srvc_dt;

      if we=wb  then output;

drop wb we mem dt pt;

run;

1 ACCEPTED SOLUTION

Accepted Solutions
Vince28_Statcan
Quartz | Level 8

data want;

     if _N_=1 then do;

     declare hash h(dataset: "maj_restudy(where=(prov_tin_maj_restudy=1) rename=(fst_srvc_dt=major_study_start_date))");

     h.defineKey('mbr_sys_id');

     h.defineData(major_study_start_date);

     h.defineDone();

     end;

     set maj_restudy;

     rc=h.find();

     if rc=0 then do;

     if (1 LE fst_srvc_dt - major_study_start_date LE 90) and (nat_perf=1 or nat_cath=1 or nat_pet=1 or nat_ct=1) then do

          if nat_perf=1 then sub_table="MAJPERF";

          else if nat_cath=1    then subtable="MAJCATH";

          else if nat_pet=1 then subtable="MAJPET";

          else if nat_pet=1 then subtable="MAJCT";

          output;

          end;

     end;

     drop rc major_study_start_date;

run;

This is an alternative based on your data example. However, if there are multiple Major Studies in any given MBR_SYS_ID, it won't work as is since the hash table defined this way only supports 1 occurence of the key. You may need to play with formats a little, I assumed your dates were naturally numeric values with a date format and not strings as well as I assumed that your indicator columns were numeric.

Basically, it uses the lookup power of hash tables to obtain the start date of the unique major study for each MBR_SYS_ID (if there is one) and puts it in the data vector to allow computation. This would give you multiple rows for a given MBR_SYS_ID if there had been more than one of the 4 columns with 1 for the same MBR_SYS_ID within the following 90 days.

Vince

P.S. When you say "but it did not work". It is best to either give an example of the bad output or a copy of your error log for us to help. For instance, I can't tell from your post if it was a logic or syntax error and reading through someone elses' thoughts/code looking for both is harder. Especially when I don't know your data structure variable types and formats.

A quick glance tells me that

   if we =. then we=fst_srvc_dt;

      if wb=' ' then wb=fst_srvc_dt;

      if we=wb  then output;

is basically outputing every single row that didn't have a major study in addition to what should've achieved your desired result in the big outer do;end; block

View solution in original post

7 REPLIES 7
Vince28_Statcan
Quartz | Level 8

data want;

     if _N_=1 then do;

     declare hash h(dataset: "maj_restudy(where=(prov_tin_maj_restudy=1) rename=(fst_srvc_dt=major_study_start_date))");

     h.defineKey('mbr_sys_id');

     h.defineData(major_study_start_date);

     h.defineDone();

     end;

     set maj_restudy;

     rc=h.find();

     if rc=0 then do;

     if (1 LE fst_srvc_dt - major_study_start_date LE 90) and (nat_perf=1 or nat_cath=1 or nat_pet=1 or nat_ct=1) then do

          if nat_perf=1 then sub_table="MAJPERF";

          else if nat_cath=1    then subtable="MAJCATH";

          else if nat_pet=1 then subtable="MAJPET";

          else if nat_pet=1 then subtable="MAJCT";

          output;

          end;

     end;

     drop rc major_study_start_date;

run;

This is an alternative based on your data example. However, if there are multiple Major Studies in any given MBR_SYS_ID, it won't work as is since the hash table defined this way only supports 1 occurence of the key. You may need to play with formats a little, I assumed your dates were naturally numeric values with a date format and not strings as well as I assumed that your indicator columns were numeric.

Basically, it uses the lookup power of hash tables to obtain the start date of the unique major study for each MBR_SYS_ID (if there is one) and puts it in the data vector to allow computation. This would give you multiple rows for a given MBR_SYS_ID if there had been more than one of the 4 columns with 1 for the same MBR_SYS_ID within the following 90 days.

Vince

P.S. When you say "but it did not work". It is best to either give an example of the bad output or a copy of your error log for us to help. For instance, I can't tell from your post if it was a logic or syntax error and reading through someone elses' thoughts/code looking for both is harder. Especially when I don't know your data structure variable types and formats.

A quick glance tells me that

   if we =. then we=fst_srvc_dt;

      if wb=' ' then wb=fst_srvc_dt;

      if we=wb  then output;

is basically outputing every single row that didn't have a major study in addition to what should've achieved your desired result in the big outer do;end; block

tmcrouse
Calcite | Level 5

I don't believe there is more than 1 occurence per member for the restudies. I tried this but it gave me this error:

It looks like it does not understand the h.definedone:

455  data maj_restudy2;

456       if _N_=1 then do;

457       declare hash h(datasets: "maj_restudy(where=(prov_tin_maj_restudy=1)

457! rename=(fst_srvc_dt=major_study_start_date))");

ERROR: Unknown argument tag datasets.

458       h.defineKey('mbr_sys_id');

459       h.defineData(major_study_start_date);

460       h.defineDone;

          -------------

          559         22

                      76

ERROR 559-185: Invalid object attribute reference h.defineDone.

ERROR 22-322: Syntax error, expecting one of the following: (, +, =.

ERROR 76-322: Syntax error, statement will be ignored.

461       end;

462       set maj_restudy;

463       rc=h.find();

ERROR: DATA STEP Component Object failure.  Aborted during the COMPILATION phase.

NOTE: The SAS System stopped processing this step because of errors.

NOTE: DATA statement used (Total process time):

      real time           0.00 seconds

      cpu time            0.01 seconds

464       if rc=0 then do;

465       if (1 LE fst_srvc_dt - major_study_start_date LE 90) and (nat_perf=1 or nat_cath=1 or

465! nat_pet=1 or nat_ct=1) then do

466            if nat_perf=1 then sub_table="MAJPERF";

467            else if nat_cath=1    then sub_table="MAJCATH";

468            else if nat_pet=1 then sub_table="MAJPET";

469            else if nat_ct=1 then sub_table="MAJCT";

470            output;

471            end;

472       end;

473       drop rc major_study_start_date;

474  run;

Vince28_Statcan
Quartz | Level 8

my bad, it's <

(dataset:

and not

(datasets:

updating original post.

tmcrouse
Calcite | Level 5

It still does not like that h.definedone; statement

Vince28_Statcan
Quartz | Level 8

Sigh, sorry, updated my reply again.

defineDone is a method of the hash table and SAS requires () syntax at the end

h.defineDone();

I always omit it by mistake when I don't have data to test my syntax.

Edit:

If this can be of any additionnal diagnosis of your original post code. Since you are not keeping a new variable  for your MBR_SYS_ID each time you find a major study case, your do pt=_N_+1 ... end; statement loops on all studies content that is sorted below your current mbr_sys_id so if some studies overlap, each major study will also cause other studies' procedures that occured within the appropriate timeframe to be output including even the record for a new study if it has an associated procedure applied that day.

tmcrouse
Calcite | Level 5


Awesome. That worked like a charm

Haikuo
Onyx | Level 15

Besides Hash() approach, Proc SQL is also handy for your problem:

proc sql;

  select distinct a.* from have a ,

  (select MBR_SYS_ID, FST_SRVC_DT from have where Prov_tin_maj_restudy=1) b

  where a.MBR_SYS_ID=b.MBR_SYS_ID

  and b.FST_SRVC_DT < a.FST_SRVC_DT < b.FST_SRVC_DT+90

and sum( nat_perf, nat_cath, nat_pet, nat_ct) >=1;

quit;

Haikuo

Update: Sub-query is not needed since the condition is straightforward.

proc sql;

  select distinct a.* from have a ,

  have(where= (Prov_tin_maj_restudy=1))b

  where a.MBR_SYS_ID=b.MBR_SYS_ID

  and b.FST_SRVC_DT < a.FST_SRVC_DT < b.FST_SRVC_DT+90

and sum( a.nat_perf, a.nat_cath, a.nat_pet, a.nat_ct) >=1;

quit;

SAS Innovate 2025: Save the Date

 SAS Innovate 2025 is scheduled for May 6-9 in Orlando, FL. Sign up to be first to learn about the agenda and registration!

Save the date!

What is Bayesian Analysis?

Learn the difference between classical and Bayesian statistical approaches and see a few PROC examples to perform Bayesian analysis in this video.

Find more tutorials on the SAS Users YouTube channel.

SAS Training: Just a Click Away

 Ready to level-up your skills? Choose your own adventure.

Browse our catalog!

Discussion stats
  • 7 replies
  • 1633 views
  • 0 likes
  • 3 in conversation