Hi, I have data as example following I try to merging with hash.
But I can not count exactly.
Also I try split as below table as group of ndcID but only get only once value for every group of ndcID.
id sum max count linkid ndcID
1 | 10 | 278 | 1 (should 4) | 1 | AA |
2 | 10 | 181 | 2 (should 2) | 1 | AB |
3 | 20 | 15 | 1 (should 2) | 2 | AA |
4 | 14 | 28 | 2 (should 2) | 2 | AB |
5 | 30 | 28 | 3 (should 2) | 2 | AC |
options validvarname=any;
data ex2;
length link_id 8;
input link_id study_entry_date :date9.;
format study_entry_date yymmddn8.;
datalines;
1 05JAN2012
2 10FEB2012
run;
data claim;
length link_id 8;
input claim_id ndc_id $ link_id rx_dos_dt :date9. days_supply;
format rx_dos_dt yymmddn8.;
datalines;
1 AA 1 05JAN2014 2
2 AA 1 08JAN2014 3
3 AB 1 14JAN2014 5
4 AD 1 05JAN2013 12
5 AA 2 05JAN2013 10
6 AB 2 10FEB2015 7
7 AC 2 27FEB2015 15
8 AC 3 27JAN2015 15
9 AC 4 27FEB2015 15
10 AA 1 05APR2014 2
11 AA 1 08JAN2015 3
12 AB 1 14JUL2014 5
13 AD 1 25JAN2013 12
14 AA 2 20JAN2013 10
15 AB 2 10MAR2015 7
16 AC 2 27MAR2015 15
17 AC 3 27JUN2015 15
18 AC 4 27AUG2015 15
run;
data ndc;
input ndc_id $;
datalines;
AA
AB
AC
run;
proc sort data=claim;
by link_id ndc_id;
run;
data ndcmerge(keep = link_id ndc_id sum max count);
length sum 8 max 8 count 8;
dcl hash bene_hash(dataset:'ex2', ordered:'a');
bene_hash.definekey('link_id');
bene_hash.definedone();
dcl hash ndc_hash(dataset:'ndc', ordered:'a');
ndc_hash.definekey('ndc_id');
ndc_hash.definedone();
* define dynamic hash object;
declare hash a_groups_data;
declare hash groups (ordered:'a');
groups.defineKey('link_id', 'ndc_id');
groups.defineData('link_id', 'ndc_id', 'a_groups_data');
groups.defineDone();
do count = 0 by 0 until (last.link_id);
do sum = 0 by 0 until (last.ndc_id);
set claim end=last_obs;
by link_id ndc_id;
if bene_hash.find() = 0 and ndc_hash.find() = 0 then do;
prevVal = ifn(first.link_id or first.ndc_id, . , lag(rx_dos_dt));
format prevVal yymmddn8.;
diff = rx_dos_dt - prevVal;
diffPre = ifn(first.link_id or first.ndc_id, . , lag(diff));
if diff > diffPre then
max = diff;
else
max = diffPre;
sum ++ days_supply;
if last.link_id or last.ndc_id then
count = count + 1;
if last.ndc_id then do;
output;
if groups.find() ne 0 then do;
put 'Not Found group of NDC then ADD';
a_groups_data = _new_ hash(ordered:'a');
a_groups_data.defineKey('_n_');
a_groups_data.defineData('link_id', 'ndc_id', 'sum', 'max', 'count');
a_groups_data.defineDone();
groups.add();
end;
else if groups.find() eq 0 then do;;
put 'Found group of NDC';
end;
a_groups_data.add();
end;
end;
end;
end;
dcl hiter groups_iterator ('groups');
do while (groups_iterator.next() = 0);
tablename = cats('group_eq_',ndc_id);
a_groups_data.output (dataset:tablename);
end;
run;
This may do want you want:
data NDCMERGE(keep = LINK_ID NDC_ID SUM MAX COUNT);
retain MAX;
dcl hash LINK_HASH(dataset:'EX2');
LINK_HASH.definekey('LINK_ID');
LINK_HASH.definedone();
dcl hash NDC_HASH(dataset:'NDC');
NDC_HASH.definekey('NDC_ID');
NDC_HASH.definedone();
set CLAIM end=LAST_OBS;
by LINK_ID NDC_ID;
PREVVAL=lag(RX_DOS_DT);
if LINK_HASH.check()=0 and NDC_HASH.check()=0;
if first.NDC_ID then call missing(SUM, COUNT, PREVVAL, MAX);
COUNT + 1;
SUM + DAYS_SUPPLY;
MAX = max(MAX, RX_DOS_DT-PREVVAL);
if last.NDC_ID then output;
run;
MAX | link_id | ndc_id | SUM | COUNT |
---|---|---|---|---|
278 | 1 | AA | 10 | 4 |
181 | 1 | AB | 10 | 2 |
15 | 2 | AA | 20 | 2 |
28 | 2 | AB | 14 | 2 |
28 | 2 | AC | 30 | 2 |
Please show the desired outcome.
I want get the result of COUNT as following:
ID | SUM | MAX DURATION | COUNT | LINK_ID | NDC_ID |
1 | 10 | 278 | 4 | 1 | AA |
2 | 10 | 181 | 2 | 1 | AB |
3 | 20 | 15 | 2 | 2 | AA |
4 | 14 | 28 | 2 | 2 | AB |
5 | 30 | 28 | 2 | 2 | AC |
But the result COUNT incorrect as program return:
ID | SUM | MAX DURATION | COUNT | LINK_ID | NDC_ID |
1 | 10 | 278 | 1 | 1 | AA |
2 | 10 | 181 | 2 | 1 | AB |
3 | 20 | 15 | 1 | 2 | AA |
4 | 14 | 28 | 2 | 2 | AB |
5 | 30 | 28 | 3 | 2 | AC |
Also I want split above table to three group AA, AB, and AC as:
GROUP_AA:
ID | SUM | MAX DURATION | COUNT | LINK_ID | NDC_ID |
1 | 10 | 278 | 1 | 1 | AA |
3 | 20 | 15 | 1 | 2 | AA |
GROUP_AB:
ID | SUM | MAX DURATION | COUNT | LINK_ID | NDC_ID |
2 | 10 | 181 | 2 | 1 | AB |
4 | 14 | 28 | 2 | 2 | AB |
GROUP_AC:
ID | SUM | MAX DURATION | COUNT | LINK_ID | NDC_ID |
5 | 30 | 28 | 3 | 2 | AC |
But program return only once row for every three group as:
GROUP_AA:
ID | SUM | MAX DURATION | COUNT | LINK_ID | NDC_ID |
1 | 10 | 278 | 1 | 1 | AA |
GROUP_AB:
ID | SUM | MAX DURATION | COUNT | LINK_ID | NDC_ID |
2 | 10 | 181 | 2 | 1 | AB |
GROUP_AC:
ID | SUM | MAX DURATION | COUNT | LINK_ID | NDC_ID |
5 | 30 | 28 | 3 | 2 | AC |
Also using the lag() function inside an if clause is a good recipe for unexpected results.
This may do want you want:
data NDCMERGE(keep = LINK_ID NDC_ID SUM MAX COUNT);
retain MAX;
dcl hash LINK_HASH(dataset:'EX2');
LINK_HASH.definekey('LINK_ID');
LINK_HASH.definedone();
dcl hash NDC_HASH(dataset:'NDC');
NDC_HASH.definekey('NDC_ID');
NDC_HASH.definedone();
set CLAIM end=LAST_OBS;
by LINK_ID NDC_ID;
PREVVAL=lag(RX_DOS_DT);
if LINK_HASH.check()=0 and NDC_HASH.check()=0;
if first.NDC_ID then call missing(SUM, COUNT, PREVVAL, MAX);
COUNT + 1;
SUM + DAYS_SUPPLY;
MAX = max(MAX, RX_DOS_DT-PREVVAL);
if last.NDC_ID then output;
run;
MAX | link_id | ndc_id | SUM | COUNT |
---|---|---|---|---|
278 | 1 | AA | 10 | 4 |
181 | 1 | AB | 10 | 2 |
15 | 2 | AA | 20 | 2 |
28 | 2 | AB | 14 | 2 |
28 | 2 | AC | 30 | 2 |
Thanks, this is right and concise code.
Do you know how to use hash to split this table to three group?
I know we can use SQL to split it but just to learn about hash.
Don't miss out on SAS Innovate - Register now for the FREE Livestream!
Can't make it to Vegas? No problem! Watch our general sessions LIVE or on-demand starting April 17th. Hear from SAS execs, best-selling author Adam Grant, Hot Ones host Sean Evans, top tech journalist Kara Swisher, AI expert Cassie Kozyrkov, and the mind-blowing dance crew iLuminate! Plus, get access to over 20 breakout sessions.
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.