BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
sasg
Calcite | Level 5

Hi All,

 

Here is the sample data. I need to derive two variables llt_code and comments. Below is the rules for derivation.

 

Year    Study    Term             LLT

2000   123         nausea          A

2000   456         nausea          B

2001    789         nausea         C

2001    789         nausea         A

 

If TERM is same but has different values in LLT then populate LLT_CODE = “Codes do not match” for all these records. Or, set LLT_CODE to “Match”.

 

Comment: Concatenate unique different LLT_TERM within same MHTERM with ‘; ‘ delimeter.

                  Ex: LLT_CODE =’ Codes do not match’ and comment =’Comparing with A; B; C’

 

Final Data:

 

Year    Study    Term             LLT   LLT_TERM                             comments

2000   123         nausea          A     Codes do not match                Comparing with A; B; C

2000   456         nausea          B     Codes do not match                 Comparing with A; B; C

2001    789         nausea         C     Codes do not match                  Comparing with A; B; C

2001    789         nausea         A     Codes do not match                  Comparing with A; B; C

 

Could someone please help me with the logic.

 

Thanks a Lot.

 

1 ACCEPTED SOLUTION

Accepted Solutions
Ksharp
Super User
OK. No problem. Just assign a  value to missing value.
Note: the value stand for missing value can't be the same as LLT, here I use 'NA'.


data have;
input Year    Study    Term  : $20.           LLT $;
if missing(LLT) then LLT='NA';
cards;
2000   123         nausea          A
2000   456         nausea          B
2001    789         nausea         C
2001    442         nausea         A
2000    234      headache       ache
2001 423     headache       ache
2000    123         toothpain      pain
2000    234      toothpain       ache
2001 423         toothpain       ache
2001   23       headache      pain
2001   234       headache       pain
2002   123       headache       pain
2002   456        headache       .
;
run;
data want;
length list $ 2000  LLT_TERM comments $ 80;
n=0;
 do until(last.term);
  set have;
  by term notsorted;
  n+1;
  if not findw(list,strip(llt),'; ') then list=catx(';',list,llt);
 end;
 flag=countw(list,';');
 do until(last.term);
  set have;
  by term notsorted;  
  if flag=1 then do;comments=' '; LLT_TERM='matched'; end;
   else do;comments='Comparing with '||list; LLT_TERM=' Codes do not match';end;
  output;
 end;
drop n flag list;
 run;


View solution in original post

13 REPLIES 13
PGStats
Opal | Level 21

All your example cases are the same. Please show a more complete example with some matches. Tell us what happens when some records match and others don't.

PG
sasg
Calcite | Level 5

Year    Study    Term             LLT
2000   123         nausea          A
2000   456         nausea          B
2001    789         nausea         C
2001    442         nausea         A
2000    234      headache       ache
2001 423     headache       ache
2000    123         toothpain      pain
2000    234      toothpain       ache
2001 423         toothpain       ache

output should look like this:

Final Data: based on the term we need to apply the logic.
 
Year    Study    Term             LLT   LLT_TERM                             comments
2000   123         nausea          A     Codes do not match                Comparing with A; B; C
2000   456         nausea          B     Codes do not match                 Comparing with A; B; C
2001    789         nausea         C     Codes do not match                  Comparing with A; B; C
2001    442         nausea         A     Codes do not match                  Comparing with A; B; C

2000    234      headache       ache   match
2001 423     headache       ache   match
2000    123         toothpain      pain   Codes do not match   Comparing with pain; ache
2000    234      toothpain       ache  Codes do not match   Comparing with pain; ache
2001 423         toothpain       ache  Codes do not match   Comparing with pain; ache

Ksharp
Super User

Assuming I understand what you mean.

 

data have;
input Year    Study    Term  : $20.           LLT $;
cards;
2000   123         nausea          A
2000   456         nausea          B
2001    789         nausea         C
2001    442         nausea         A
2000    234      headache       ache
2001 423     headache       ache
2000    123         toothpain      pain
2000    234      toothpain       ache
2001 423         toothpain       ache
;
run;
data want;
length list $ 20000  LLT_TERM comments $ 80;
n=0;
 do until(last.term);
  set have;
  by term notsorted;
  n+1;
  if not findw(list,strip(llt),'; ') then list=catx(';',list,llt);
 end;
 flag=countw(list,';');
 do until(last.term);
  set have;
  by term notsorted;  
  if flag=1 then do;comments=' '; LLT_TERM='matched'; end;
   else do;comments='Comparing with '||list; LLT_TERM=' Codes do not match';end;
  output;
 end;
drop n flag list;
 run;
PGStats
Opal | Level 21

Great code @Ksharp! Could be improved by putting the assignments to comments and LLT_TERM before the second do until loop. You could also get rid of flag and n.

PG
Ksharp
Super User

@PGStats , Yes. As you wish . Glad to see you come back .

sasg
Calcite | Level 5

Thank You Very Much. Working as expected.

sasg
Calcite | Level 5

Hi ,

 

have issue with the code for  below scenarios, Where llt is missing for 2000 or 2001. It is showing as count 1 so assigning as match,but this scenario should be treated as codes do not match.

 

2000   123         nausea          A
2001    442        nausea          .

 

have another scenario: where in 2001 same term but llt value is different for 2001  records, in this scenario also getting match, but it should come as do not match.

 

2000   123         nausea          A

2000   234         nausea          A
2001    442        nausea         A

2001    442        nausea          .

 

 

sasg
Calcite | Level 5

Hi This solution is working for all other scenarios, but facing issues in the below scenarios. Could you please suggest how to fix this issue.

 

study        year       term             llt

123        2001         headache      pain

234        2001         headache       pain

123        2002         headache       pain

456        2002         headache       .

 

234        2001         neckpain       injury

456        2002         neckpain       .

 

In the above scenarios flag as 1, because there is no other code in this scenario I want to display codes does not match but it is showing as match. Please suggest me the fix for this scenario.

 

Thanks,

sasg

 

Ksharp
Super User
OK. No problem. Just assign a  value to missing value.
Note: the value stand for missing value can't be the same as LLT, here I use 'NA'.


data have;
input Year    Study    Term  : $20.           LLT $;
if missing(LLT) then LLT='NA';
cards;
2000   123         nausea          A
2000   456         nausea          B
2001    789         nausea         C
2001    442         nausea         A
2000    234      headache       ache
2001 423     headache       ache
2000    123         toothpain      pain
2000    234      toothpain       ache
2001 423         toothpain       ache
2001   23       headache      pain
2001   234       headache       pain
2002   123       headache       pain
2002   456        headache       .
;
run;
data want;
length list $ 2000  LLT_TERM comments $ 80;
n=0;
 do until(last.term);
  set have;
  by term notsorted;
  n+1;
  if not findw(list,strip(llt),'; ') then list=catx(';',list,llt);
 end;
 flag=countw(list,';');
 do until(last.term);
  set have;
  by term notsorted;  
  if flag=1 then do;comments=' '; LLT_TERM='matched'; end;
   else do;comments='Comparing with '||list; LLT_TERM=' Codes do not match';end;
  output;
 end;
drop n flag list;
 run;


sasg
Calcite | Level 5

Thank a Lot.

sasg
Calcite | Level 5

one more last question if I want to display LLT_CODE in the comments as alphabetical order.

 

ex:

2000   123         nausea         vomiting
2001   456         nausea         nausea

2000   123         headache       injury
2001   456        headache         pain

 

comments should be 'Comparing nausea ; vomiting'

 'Comparing injury;  pain'

Ksharp
Super User

OK. No problem.

 


data have;
input Year    Study    Term  : $20.           LLT $;
if missing(LLT) then LLT='NA';
cards;
2000   123         nausea          C
2000   456         nausea          B
2001    789         nausea         A
2001    442         nausea         A
2000    234      headache       ache
2001 423     headache       ache
2000    123         toothpain      pain
2000    234      toothpain       ache
2001 423         toothpain       ache
2001   23       headache      pain
2001   234       headache       pain
2002   123       headache       pain
2002   456        headache       .
;
run;
data want;
length list $ 2000  LLT_TERM comments $ 80;
n=0;
 do until(last.term);
  set have;
  by term notsorted;
  n+1;
  if not findw(list,strip(llt),'; ') then list=catx(';',list,llt);
 end;
 flag=countw(list,';');
 do until(last.term);
  set have;
  by term notsorted;  
  if flag=1 then do;comments=' '; LLT_TERM='matched'; end;
   else do;comments=list; LLT_TERM=' Codes do not match';end;
  output;
 end;
drop n flag list;
 run;



 data final_want;
  set want;
  length new_comments $ 2000;
  array x{999} $ 40;
  do i=1 to countw(comments,';');
    x{i}=scan(comments,i,';');
	if x{i}='NA' then x{i}=' ';
  end;
 call sortc(of x{*});
 new_comments='Comparing with: '||catx(';',of x{*});
 drop comments x1-x999 i;
run;
error_prone
Barite | Level 11
Problem can be solved by using "look ahead and back"-technique. You will find papers and posts explaining it in detail.

sas-innovate-2024.png

Join us for SAS Innovate April 16-19 at the Aria in Las Vegas. Bring the team and save big with our group pricing for a limited time only.

Pre-conference courses and tutorials are filling up fast and are always a sellout. Register today to reserve your seat.

 

Register now!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 13 replies
  • 1429 views
  • 1 like
  • 4 in conversation