BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
rlafond
Obsidian | Level 7

Hello all,

 

I have a data set setup like this:

 

IDDateCode
1005/13/2019HCC-5
1005/13/2019HCC-9
1005/13/2019HCC-36
1014/10/2019HCC-4
1014/10/2019HCC-101

 

For every combination of ID and Date, I have a variable number of Code values.

 

For every combination of ID and Date, I need to review the list of Code values present to see if they satisfy a rule. A rule is whether two specific Code values are present for a combination of ID and Date. I have 37 rules total.

 

If a rule is satisfied, I need to write a new record for that ID and Date with the rule number. Multiple rules can be satisfied for each combination of ID and Date.

 

EXAMPLE:

 

Supposed one of my rules is the presence of HCC-9 and HCC-36 = HCC-900.

 

In the above table, when ID=100 and Date = 5/13/2019, this rule is satisfied. So my new data set should look like this:

 

IDDateCode
1005/13/2019HCC-5
1005/13/2019HCC-9
1005/13/2019HCC-36
1005/13/2019HCC-900
1014/10/2019HCC-4
1014/10/2019HCC-101

 

I have not written the rules table yet because I'm not sure what the best way is to compose it (wide? long?).

 

Any thoughts? Not sure how to approach this.

 

 

1 ACCEPTED SOLUTION

Accepted Solutions
FreelanceReinh
Jade | Level 19

Hello @rlafond,

 

You could write the codes of each BY group into a SAS hash object and at the end of each BY group go through the list of rules (stored in another hash object which receives its content from your rules table) and output the additional record(s) for each satisfied rule. Finally, clear the first hash object before processing the next BY group.

 

Example:

data rules;
input rule (code1 code2 codec) ($);
cards;
1 HCC-9 HCC-36 HCC-900
2 HCC-X HCC-Y  HCC-Z
;

data have;
input id $ date :mmddyy. code $;
format date mmddyy10.;
cards;
100 5/13/2019 HCC-5
100 5/13/2019 HCC-9
100 5/13/2019 HCC-36
101 4/10/2019 HCC-4
101 4/10/2019 HCC-101
101 4/11/2019 HCC-Y
101 4/11/2019 HCC-Y
101 4/11/2019 HCC-4
101 4/11/2019 HCC-101
102 4/11/2019 HCC-Y
102 4/11/2019 HCC-9
102 4/11/2019 HCC-Y
102 4/11/2019 HCC-4
102 4/11/2019 HCC-36
102 4/11/2019 HCC-X
;

data want(keep=id date code);
if _n_=1 then do;
  dcl hash h();
  h.definekey('code');
  h.definedone();
  dcl hash r(dataset:'rules', ordered:'y');
  r.definekey('rule');
  r.definedata('code1', 'code2', 'codec');
  r.definedone();
  dcl hiter ri('r');
  if 0 then set rules;
end;
do until(last.date);
  set have;
  by id date; /* add "notsorted" if data are only grouped */
  rc=h.add();
  output;
end;
do while(ri.next()=0);
  if h.check(key:code1)=0 & h.check(key:code2)=0 then do;
    code=codec;
    output;
  end;
end;
h.clear();
run;

View solution in original post

2 REPLIES 2
FreelanceReinh
Jade | Level 19

Hello @rlafond,

 

You could write the codes of each BY group into a SAS hash object and at the end of each BY group go through the list of rules (stored in another hash object which receives its content from your rules table) and output the additional record(s) for each satisfied rule. Finally, clear the first hash object before processing the next BY group.

 

Example:

data rules;
input rule (code1 code2 codec) ($);
cards;
1 HCC-9 HCC-36 HCC-900
2 HCC-X HCC-Y  HCC-Z
;

data have;
input id $ date :mmddyy. code $;
format date mmddyy10.;
cards;
100 5/13/2019 HCC-5
100 5/13/2019 HCC-9
100 5/13/2019 HCC-36
101 4/10/2019 HCC-4
101 4/10/2019 HCC-101
101 4/11/2019 HCC-Y
101 4/11/2019 HCC-Y
101 4/11/2019 HCC-4
101 4/11/2019 HCC-101
102 4/11/2019 HCC-Y
102 4/11/2019 HCC-9
102 4/11/2019 HCC-Y
102 4/11/2019 HCC-4
102 4/11/2019 HCC-36
102 4/11/2019 HCC-X
;

data want(keep=id date code);
if _n_=1 then do;
  dcl hash h();
  h.definekey('code');
  h.definedone();
  dcl hash r(dataset:'rules', ordered:'y');
  r.definekey('rule');
  r.definedata('code1', 'code2', 'codec');
  r.definedone();
  dcl hiter ri('r');
  if 0 then set rules;
end;
do until(last.date);
  set have;
  by id date; /* add "notsorted" if data are only grouped */
  rc=h.add();
  output;
end;
do while(ri.next()=0);
  if h.check(key:code1)=0 & h.check(key:code2)=0 then do;
    code=codec;
    output;
  end;
end;
h.clear();
run;
ballardw
Super User

@rlafond wrote:

Hello all,

 

I have a data set setup like this:

 

ID Date Code
100 5/13/2019 HCC-5
100 5/13/2019 HCC-9
100 5/13/2019 HCC-36
101 4/10/2019 HCC-4
101 4/10/2019 HCC-101

 

For every combination of ID and Date, I have a variable number of Code values.

 

For every combination of ID and Date, I need to review the list of Code values present to see if they satisfy a rule. A rule is whether two specific Code values are present for a combination of ID and Date. I have 37 rules total.

 

If a rule is satisfied, I need to write a new record for that ID and Date with the rule number. Multiple rules can be satisfied for each combination of ID and Date.

 

EXAMPLE:

 

Supposed one of my rules is the presence of HCC-9 and HCC-36 = HCC-900.

 

In the above table, when ID=100 and Date = 5/13/2019, this rule is satisfied. So my new data set should look like this:

 

ID Date Code
100 5/13/2019 HCC-5
100 5/13/2019 HCC-9
100 5/13/2019 HCC-36
100 5/13/2019 HCC-900
101 4/10/2019 HCC-4
101 4/10/2019 HCC-101

 

I have not written the rules table yet because I'm not sure what the best way is to compose it (wide? long?).

 

Any thoughts? Not sure how to approach this.

 

 


Your final decisions will likely have to be based on how you expect the data to actually be used.

From your description I might think that a wide format, because you may end up with 37 variables with values of 1 for true and 0 for false but only one row per id/date combination.

 

It may be that since you are apparently going to be doing a lot of comparisons across values in rows as currently structured that an intermediate step may be to transpose your existing data so all of the codes are available on one row.

 

You need to clarify what

Supposed one of my rules is the presence of HCC-9 and HCC-36 = HCC-900. means. I do not see any what that HCC-36 can equal HCC-900.

If that is inserting a new row with a new code value I strongly suspect that the reason for that insertion has not be carefully thought out.

 

 

One way I might approach this:

data have;
input ID Date :mmddyy10. Code $;
format date mmddyy10.;
dummy=1;
datalines;
100 5/13/2019 HCC-5 
100 5/13/2019 HCC-9 
100 5/13/2019 HCC-36 
101 4/10/2019 HCC-4 
101 4/10/2019 HCC-101 
;
run;

proc transpose data=have out=trans;
by id date;
id code;
var dummy;
run;

data example;
   set trans;
   rule1 = hcc_9 and hcc_36;
run;

The above sets the value of rule1 to 1 (true) when both hcc_9 and hcc_36 are equal to one. Which happens when the code is HCC-9 and HCC-36 in the by group.

hackathon24-white-horiz.png

The 2025 SAS Hackathon has begun!

It's finally time to hack! Remember to visit the SAS Hacker's Hub regularly for news and updates.

Latest Updates

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

SAS Training: Just a Click Away

 Ready to level-up your skills? Choose your own adventure.

Browse our catalog!

Discussion stats
  • 2 replies
  • 878 views
  • 1 like
  • 3 in conversation