DATA Step, Macro, Functions and more

How to count and split as group by using merging hash

Accepted Solution Solved
Reply
New Contributor
Posts: 3
Accepted Solution

How to count and split as group by using merging hash

Hi, I have data as example following I try to merging with hash.

But I can not count exactly.

Also I try split as below table as group of ndcID but only get only once value for every group of ndcID.

 

  id           sum      max        count              linkid     ndcID

1102781 (should 4)1AA 
210181(should 2)1AB 
32015(should 2)2AA 
41428(should 2)2AB 
53028(should 2)2AC

 

 

options validvarname=any;
data ex2;
length link_id 8;
input link_id study_entry_date :date9.;
format study_entry_date yymmddn8.; 
datalines;
1 05JAN2012
2 10FEB2012
run;
data claim;
length link_id 8;
input claim_id ndc_id $ link_id rx_dos_dt :date9. days_supply;
format rx_dos_dt yymmddn8.; 
datalines;
1 AA 1 05JAN2014 2
2 AA 1 08JAN2014 3
3 AB 1 14JAN2014 5
4 AD 1 05JAN2013 12
5 AA 2 05JAN2013 10
6 AB 2 10FEB2015 7
7 AC 2 27FEB2015 15
8 AC 3 27JAN2015 15
9 AC 4 27FEB2015 15
10 AA 1 05APR2014 2
11 AA 1 08JAN2015 3
12 AB 1 14JUL2014 5
13 AD 1 25JAN2013 12
14 AA 2 20JAN2013 10
15 AB 2 10MAR2015 7
16 AC 2 27MAR2015 15
17 AC 3 27JUN2015 15
18 AC 4 27AUG2015 15
run;
data ndc;
input ndc_id $;
datalines;
AA
AB
AC
run;
proc sort data=claim;
by link_id ndc_id;
run;
data ndcmerge(keep = link_id ndc_id sum max count);
length sum 8 max 8 count 8;

dcl hash bene_hash(dataset:'ex2', ordered:'a');
bene_hash.definekey('link_id');
bene_hash.definedone();
dcl hash ndc_hash(dataset:'ndc', ordered:'a');
ndc_hash.definekey('ndc_id');
ndc_hash.definedone();
* define dynamic hash object;
declare hash a_groups_data;
declare hash groups (ordered:'a');
groups.defineKey('link_id', 'ndc_id');
groups.defineData('link_id', 'ndc_id', 'a_groups_data');
groups.defineDone();

do count = 0 by 0 until (last.link_id);
do sum = 0 by 0 until (last.ndc_id);
set claim end=last_obs;
by link_id ndc_id;
if bene_hash.find() = 0 and ndc_hash.find() = 0 then do;
prevVal = ifn(first.link_id or first.ndc_id, . , lag(rx_dos_dt));
format prevVal yymmddn8.;
diff = rx_dos_dt - prevVal;
diffPre = ifn(first.link_id or first.ndc_id, . , lag(diff));
if diff > diffPre then
max = diff;
else 
max = diffPre;

sum ++ days_supply;
if last.link_id or last.ndc_id then
count = count + 1;
if last.ndc_id then do;
output;

if groups.find() ne 0 then do;
put 'Not Found group of NDC then ADD';
a_groups_data = _new_ hash(ordered:'a');
a_groups_data.defineKey('_n_');
a_groups_data.defineData('link_id', 'ndc_id', 'sum', 'max', 'count');
a_groups_data.defineDone();
groups.add();
end;
else if groups.find() eq 0 then do;;
put 'Found group of NDC';
end;
a_groups_data.add(); 
end; 
end;
end;
end;

dcl hiter groups_iterator ('groups');
do while (groups_iterator.next() = 0);
tablename = cats('group_eq_',ndc_id);
a_groups_data.output (dataset:tablename);
end;
run;

Accepted Solutions
Solution
‎11-19-2017 06:52 AM
Super User
Posts: 2,516

Re: How to count and split as group by using merging hash

[ Edited ]
Posted in reply to hksonngan

This may do want you want:

          
data NDCMERGE(keep = LINK_ID NDC_ID SUM MAX COUNT);
  retain MAX;
  dcl hash LINK_HASH(dataset:'EX2');
  LINK_HASH.definekey('LINK_ID');
  LINK_HASH.definedone();
  dcl hash NDC_HASH(dataset:'NDC');
  NDC_HASH.definekey('NDC_ID');
  NDC_HASH.definedone();
  set CLAIM end=LAST_OBS;
  by LINK_ID NDC_ID;
  PREVVAL=lag(RX_DOS_DT);
  if LINK_HASH.check()=0 and NDC_HASH.check()=0;
  if first.NDC_ID then call missing(SUM, COUNT, PREVVAL, MAX);
  COUNT + 1;
  SUM   + DAYS_SUPPLY;
  MAX   = max(MAX, RX_DOS_DT-PREVVAL);
  if last.NDC_ID then output;
run;

 

MAX link_id ndc_id SUM COUNT
278 1 AA 10 4
181 1 AB 10 2
15 2 AA 20 2
28 2 AB 14 2
28 2 AC 30 2

 

 

View solution in original post


All Replies
Super User
Posts: 2,516

Re: How to count and split as group by using merging hash

Posted in reply to hksonngan

Please show the desired outcome.

New Contributor
Posts: 3

Re: How to count and split as group by using merging hash

I want get the result of COUNT as following:

 

IDSUMMAX DURATIONCOUNTLINK_IDNDC_ID
11027841AA
21018121AB
3201522AA
4142822AB
5302822AC

 

But the result COUNT incorrect as program return:

IDSUMMAX DURATIONCOUNTLINK_IDNDC_ID
11027811AA
21018121AB
3201512AA
4142822AB
5302832AC

 

Also I want split above table to three group AA, AB, and AC as:

 

 GROUP_AA:

IDSUMMAX DURATIONCOUNTLINK_IDNDC_ID
11027811AA
3201512AA

 

GROUP_AB:

IDSUMMAX DURATIONCOUNTLINK_IDNDC_ID
21018121AB
4142822AB

 

GROUP_AC:

 

IDSUMMAX DURATIONCOUNTLINK_IDNDC_ID
5302832AC

 

But program return only once row for every three group as: 

 GROUP_AA:

IDSUMMAX DURATIONCOUNTLINK_IDNDC_ID
11027811AA

 

GROUP_AB:

IDSUMMAX DURATIONCOUNTLINK_IDNDC_ID
21018121AB

 

GROUP_AC:

 

IDSUMMAX DURATIONCOUNTLINK_IDNDC_ID
5302832AC
Super User
Posts: 2,516

Re: How to count and split as group by using merging hash

Posted in reply to hksonngan

Also using the lag() function inside an if clause is a good recipe for unexpected results.

 

Solution
‎11-19-2017 06:52 AM
Super User
Posts: 2,516

Re: How to count and split as group by using merging hash

[ Edited ]
Posted in reply to hksonngan

This may do want you want:

          
data NDCMERGE(keep = LINK_ID NDC_ID SUM MAX COUNT);
  retain MAX;
  dcl hash LINK_HASH(dataset:'EX2');
  LINK_HASH.definekey('LINK_ID');
  LINK_HASH.definedone();
  dcl hash NDC_HASH(dataset:'NDC');
  NDC_HASH.definekey('NDC_ID');
  NDC_HASH.definedone();
  set CLAIM end=LAST_OBS;
  by LINK_ID NDC_ID;
  PREVVAL=lag(RX_DOS_DT);
  if LINK_HASH.check()=0 and NDC_HASH.check()=0;
  if first.NDC_ID then call missing(SUM, COUNT, PREVVAL, MAX);
  COUNT + 1;
  SUM   + DAYS_SUPPLY;
  MAX   = max(MAX, RX_DOS_DT-PREVVAL);
  if last.NDC_ID then output;
run;

 

MAX link_id ndc_id SUM COUNT
278 1 AA 10 4
181 1 AB 10 2
15 2 AA 20 2
28 2 AB 14 2
28 2 AC 30 2

 

 

New Contributor
Posts: 3

Re: How to count and split as group by using merging hash

Thanks, this is right and concise code.

Do you know how to use hash to split this table to three group?

I know we can use SQL to split it but just to learn about hash.

☑ This topic is solved.

Need further help from the community? Please ask a new question.

Discussion stats
  • 5 replies
  • 174 views
  • 1 like
  • 2 in conversation