BookmarkSubscribeRSS Feed
☑ This topic is solved. Need further help from the community? Please sign in and ask a new question.
CathyVI
Lapis Lazuli | Level 10

Hi,

I have SAS dataset with list of names. I would like to accomplish the following. If any name in List A is present in any row in List B, then find=1, else find=0. The find column will mean that For example, Edison is present in list B (any row). If you want to further assist me, please you can help me identify which rows in List B Edison is. i don't know how to do that either. A name in List A could be in single or multiple rows in List B.

I have tried the if then else, findc and cat functions but no progress.

Thanks in advance!

 

List A List B Find
EDISON Agan, B. K., Macalino, G. E., Nsouli-Maktabi, H., Wang, X., Gaydos, J. C., Ganesan, A., Kortepeter, M. G., & Sanchez, J. L. 1
AGAN Masel J, Deiss RG, Wang X, Sanchez JL, Ganesan A, Macalino GE, Gaydos JC, Kortepeter MG, Agan BK 1
WANG Sakaue T, Dorayappan KDP, Zingarelli R, Khadraoui W, Anbalagan M, Wallbillich J, Bognar B, Wanner R, Cosgrove C, Suarez A, Koga H, Maxwell GL, O'Malley DM, Cohn DE, Selvendiran K. 1
BURGESS Said Y, Filippatou A, Tran C, Rezavi L, Guo K, Smith MD, Resto Y, Chen JJ, Calabresi PA, Caturegli P, Pittock SJ, Flanagan EP, Sotirchos ES. 1
BURGESS Yang J, Wang D, Wang W, Wu C, Li C, Shi W, Qian J, Xie F, Shen H, Tang Y. 1
TAYLOR Yang C, Hou P, Wang D, Wang Z, Duan W, Liu J, Yu S, Fu F, Jin Z. 1
RUSIECKI Laing ED, Sterling SL, Richard SA, Phogat S, Samuels EC, Epsi NJ, Yan L, Moreno N, Coles C, Mehalko J, Drew M, English C, Chung KK, Clifton GT, Munster VJ, de Wit E, Tribble D, Agan BK, Esposito D, Lanteri C, Mitre E, Burgess TH, Broder CC. 1
TINTLE Huang H, Rusiecki J, Zhao N, Chen Y, Ma S, Yu H, Ward MH, Udelsman R, Zhang Y. 0
MALDONADO Brenner AV, Inskip PD, Rusiecki J, Rabkin CS, Engels J, Pfeiffer RM. 1
SIMONS Rockwell EM, Abernathy HA, Evans LM, Bhowmik R, Giandomenico DA, Salzer JS, Maldonado CJ, Choi YS, Boyce RM. 0
TINTLE Bettner LF, Peterson RA, Bergstedt DT, Kelmenson LB, Demoruelle MK, Mikuls TR, Edison JD, Parish MC, Feser ML, Frazer-Abel AA, Moss LK, Mahler M, Holers VM, Deane KD. 0
MALDONADO Bergstedt DT, Tarter WJ, Peterson RA, Feser ML, Parish MC, Striebich CC, Demoruelle MK, Moss L, Bemis EA, Norris JM, Holers VM, Edison JD, Thiele GM, Mikuls TR, Deane KD. 1
BOGGS Lee JA, Mikuls TR, Deane KD, Sayles HR, Thiele GM, Edison JD, Wagner BD, Feser ML, Moss LK, Kelmenson LB, Robinson WH, Payne JB. 1
SHRIVER Goff SH, Bergstedt DT, Feser ML, Moss L, Mikuls TR, Edison JD, Holers VM, Martinez-Prat L, Aure MAR, Mahler M, Deane KD. 1
  Ezekwe EAD Jr, Weskamp AL, Rahim R, Makiya MA, Wetzler L, Ware JM, Nelson C, Castillo PA, Riley CA, Brown T, Penrod L, Constantine GM, Khoury P, Boggs NA, Klion AD.  
  Lee JY, Shi T, Petyuk VA, Schepmoes AA, Fillmore TL, Wang YT, Cardoni W, Coppit G, Srivastava S, Goodman JF, Shriver CD, Liu T, Rodland KD.  
  Madda R, Petyuk VA, Wang YT, Shi T, Shriver CD, Rodland KD, Liu T.  
  Taylor KM, Ricks KM, Kuehnert PA, Eick-Cost AA, Scheckelhoff MR, Wiesen AR, Clements TL, Hu Z, Zak SE, Olschner SP, Herbert AS, Bazaco SL, Creppage KE, Fan MT, Sanchez JL.  
1 ACCEPTED SOLUTION

Accepted Solutions
Tom
Super User Tom
Super User

Sounds like you want to treat this as if you had two independent datasets.

One dataset with just the variable A.  And a second dataset with just the variable B.

So if you really have that mixed_up dataset

Spoiler
data mixed_up;
  infile cards dsd dlm='|' truncover;
  input A :$20. B :$300. FIND ;
cards;
EDISON|Agan, B. K., Macalino, G. E., Nsouli-Maktabi, H., Wang, X., Gaydos, J. C., Ganesan, A., Kortepeter, M. G., & Sanchez, J. L.|1
AGAN|Masel J, Deiss RG, Wang X, Sanchez JL, Ganesan A, Macalino GE, Gaydos JC, Kortepeter MG, Agan BK|1
WANG|Sakaue T, Dorayappan KDP, Zingarelli R, Khadraoui W, Anbalagan M, Wallbillich J, Bognar B, Wanner R, Cosgrove C, Suarez A, Koga H, Maxwell GL, O'Malley DM, Cohn DE, Selvendiran K.|1
BURGESS|Said Y, Filippatou A, Tran C, Rezavi L, Guo K, Smith MD, Resto Y, Chen JJ, Calabresi PA, Caturegli P, Pittock SJ, Flanagan EP, Sotirchos ES.|1
BURGESS|Yang J, Wang D, Wang W, Wu C, Li C, Shi W, Qian J, Xie F, Shen H, Tang Y.|1
TAYLOR|Yang C, Hou P, Wang D, Wang Z, Duan W, Liu J, Yu S, Fu F, Jin Z.|1
RUSIECKI|Laing ED, Sterling SL, Richard SA, Phogat S, Samuels EC, Epsi NJ, Yan L, Moreno N, Coles C, Mehalko J, Drew M, English C, Chung KK, Clifton GT, Munster VJ, de Wit E, Tribble D, Agan BK, Esposito D, Lanteri C, Mitre E, Burgess TH, Broder CC.|1
TINTLE|Huang H, Rusiecki J, Zhao N, Chen Y, Ma S, Yu H, Ward MH, Udelsman R, Zhang Y.|0
MALDONADO|Brenner AV, Inskip PD, Rusiecki J, Rabkin CS, Engels J, Pfeiffer RM.|1
SIMONS|Rockwell EM, Abernathy HA, Evans LM, Bhowmik R, Giandomenico DA, Salzer JS, Maldonado CJ, Choi YS, Boyce RM.|0
TINTLE|Bettner LF, Peterson RA, Bergstedt DT, Kelmenson LB, Demoruelle MK, Mikuls TR, Edison JD, Parish MC, Feser ML, Frazer-Abel AA, Moss LK, Mahler M, Holers VM, Deane KD.|0
MALDONADO|Bergstedt DT, Tarter WJ, Peterson RA, Feser ML, Parish MC, Striebich CC, Demoruelle MK, Moss L, Bemis EA, Norris JM, Holers VM, Edison JD, Thiele GM, Mikuls TR, Deane KD.|1
BOGGS|Lee JA, Mikuls TR, Deane KD, Sayles HR, Thiele GM, Edison JD, Wagner BD, Feser ML, Moss LK, Kelmenson LB, Robinson WH, Payne JB.|1
SHRIVER|Goff SH, Bergstedt DT, Feser ML, Moss L, Mikuls TR, Edison JD, Holers VM, Martinez-Prat L, Aure MAR, Mahler M, Deane KD.|1
 |Ezekwe EAD Jr, Weskamp AL, Rahim R, Makiya MA, Wetzler L, Ware JM, Nelson C, Castillo PA, Riley CA, Brown T, Penrod L, Constantine GM, Khoury P, Boggs NA, Klion AD.| 
 |Lee JY, Shi T, Petyuk VA, Schepmoes AA, Fillmore TL, Wang YT, Cardoni W, Coppit G, Srivastava S, Goodman JF, Shriver CD, Liu T, Rodland KD.| 
 |Madda R, Petyuk VA, Wang YT, Shi T, Shriver CD, Rodland KD, Liu T.| 
 |Taylor KM, Ricks KM, Kuehnert PA, Eick-Cost AA, Scheckelhoff MR, Wiesen AR, Clements TL, Hu Z, Zak SE, Olschner SP, Herbert AS, Bazaco SL, Creppage KE, Fan MT, Sanchez JL.| 
;

Then first turn the mixed_up dataset into two separate datasets.

data A;
  set mixed_up(keep=A FIND);
  where A ne ' ';
run;     

proc sort data=A nodupkey;
  by A ; 
run;

data B;
  set mixed_up(keep=B);
run;

Then just combine them and take the MAX of the results of the FINDW() function call.

proc sql;
 create table want as 
 select A, max(0<findw(B,A,' ,.','it')) as FIND
 from A,B
 group by A
 ;
run;

Results:

Tom_0-1760585157659.png

And if you really do have to use that mixed_up dataset you could use subqueries to do it in one SQL query.

proc sql;
 create table want as 
 select A, max(0<findw(B,A,' ,.','it')) as FIND
 from (select distinct A from mixed_up where A ne ' ') A
    , (select distinct B from mixed_up where B ne ' ') B
 group by A
 ;
run;

View solution in original post

10 REPLIES 10
Cynthia_sas
Diamond | Level 26

Hi:
More information is needed. Why are there duplicates in List A. For example, is there any difference in the 2 occurrences of the name TINTLE or BURGESS in List A? Can you explain how you expect these duplicates to be handled in terms of the lookup? Also, please clarify whether you have 2 separate datasets -- List A with the list of names you want to look up and then List B with the string that seems to contain multiple names that you want to search in. What happens if a row in List B has 2 names -- for example, List B has both Edison and Burgess -- is this a possible scenario? Is there any reason for the unsorted order of the names in List A (in addition to the duplicates)? Is this order important or in need of being kept?
If there was a better description of the structure of your data and the constraints of the data forum members might be able to make more constructive suggestions.
The structure of the data and the code that you have already tried might answer some of these questions. You indicated which functions you'd tried, but the functions don't work in a vacuum. What programming approach did you try? DATA step with IF, DATA step with IF and MERGE, SQL? HASH table? Can you provide more information?
Cynthia

CathyVI
Lapis Lazuli | Level 10
thank you for your response. The data are actually two separate dataset. List A and List B. The duplicates can be removed but I only presented the data as it is.
List B has both Edison and Burgess - yes it is a possible scenario but It doesn't really matter as long as both names have been identified on separate occasions. The order is not important. As I said I only posted the dataset without altering it. duplicates in List A can be removed. Thank you
Tom
Super User Tom
Super User

Sounds like you want to treat this as if you had two independent datasets.

One dataset with just the variable A.  And a second dataset with just the variable B.

So if you really have that mixed_up dataset

Spoiler
data mixed_up;
  infile cards dsd dlm='|' truncover;
  input A :$20. B :$300. FIND ;
cards;
EDISON|Agan, B. K., Macalino, G. E., Nsouli-Maktabi, H., Wang, X., Gaydos, J. C., Ganesan, A., Kortepeter, M. G., & Sanchez, J. L.|1
AGAN|Masel J, Deiss RG, Wang X, Sanchez JL, Ganesan A, Macalino GE, Gaydos JC, Kortepeter MG, Agan BK|1
WANG|Sakaue T, Dorayappan KDP, Zingarelli R, Khadraoui W, Anbalagan M, Wallbillich J, Bognar B, Wanner R, Cosgrove C, Suarez A, Koga H, Maxwell GL, O'Malley DM, Cohn DE, Selvendiran K.|1
BURGESS|Said Y, Filippatou A, Tran C, Rezavi L, Guo K, Smith MD, Resto Y, Chen JJ, Calabresi PA, Caturegli P, Pittock SJ, Flanagan EP, Sotirchos ES.|1
BURGESS|Yang J, Wang D, Wang W, Wu C, Li C, Shi W, Qian J, Xie F, Shen H, Tang Y.|1
TAYLOR|Yang C, Hou P, Wang D, Wang Z, Duan W, Liu J, Yu S, Fu F, Jin Z.|1
RUSIECKI|Laing ED, Sterling SL, Richard SA, Phogat S, Samuels EC, Epsi NJ, Yan L, Moreno N, Coles C, Mehalko J, Drew M, English C, Chung KK, Clifton GT, Munster VJ, de Wit E, Tribble D, Agan BK, Esposito D, Lanteri C, Mitre E, Burgess TH, Broder CC.|1
TINTLE|Huang H, Rusiecki J, Zhao N, Chen Y, Ma S, Yu H, Ward MH, Udelsman R, Zhang Y.|0
MALDONADO|Brenner AV, Inskip PD, Rusiecki J, Rabkin CS, Engels J, Pfeiffer RM.|1
SIMONS|Rockwell EM, Abernathy HA, Evans LM, Bhowmik R, Giandomenico DA, Salzer JS, Maldonado CJ, Choi YS, Boyce RM.|0
TINTLE|Bettner LF, Peterson RA, Bergstedt DT, Kelmenson LB, Demoruelle MK, Mikuls TR, Edison JD, Parish MC, Feser ML, Frazer-Abel AA, Moss LK, Mahler M, Holers VM, Deane KD.|0
MALDONADO|Bergstedt DT, Tarter WJ, Peterson RA, Feser ML, Parish MC, Striebich CC, Demoruelle MK, Moss L, Bemis EA, Norris JM, Holers VM, Edison JD, Thiele GM, Mikuls TR, Deane KD.|1
BOGGS|Lee JA, Mikuls TR, Deane KD, Sayles HR, Thiele GM, Edison JD, Wagner BD, Feser ML, Moss LK, Kelmenson LB, Robinson WH, Payne JB.|1
SHRIVER|Goff SH, Bergstedt DT, Feser ML, Moss L, Mikuls TR, Edison JD, Holers VM, Martinez-Prat L, Aure MAR, Mahler M, Deane KD.|1
 |Ezekwe EAD Jr, Weskamp AL, Rahim R, Makiya MA, Wetzler L, Ware JM, Nelson C, Castillo PA, Riley CA, Brown T, Penrod L, Constantine GM, Khoury P, Boggs NA, Klion AD.| 
 |Lee JY, Shi T, Petyuk VA, Schepmoes AA, Fillmore TL, Wang YT, Cardoni W, Coppit G, Srivastava S, Goodman JF, Shriver CD, Liu T, Rodland KD.| 
 |Madda R, Petyuk VA, Wang YT, Shi T, Shriver CD, Rodland KD, Liu T.| 
 |Taylor KM, Ricks KM, Kuehnert PA, Eick-Cost AA, Scheckelhoff MR, Wiesen AR, Clements TL, Hu Z, Zak SE, Olschner SP, Herbert AS, Bazaco SL, Creppage KE, Fan MT, Sanchez JL.| 
;

Then first turn the mixed_up dataset into two separate datasets.

data A;
  set mixed_up(keep=A FIND);
  where A ne ' ';
run;     

proc sort data=A nodupkey;
  by A ; 
run;

data B;
  set mixed_up(keep=B);
run;

Then just combine them and take the MAX of the results of the FINDW() function call.

proc sql;
 create table want as 
 select A, max(0<findw(B,A,' ,.','it')) as FIND
 from A,B
 group by A
 ;
run;

Results:

Tom_0-1760585157659.png

And if you really do have to use that mixed_up dataset you could use subqueries to do it in one SQL query.

proc sql;
 create table want as 
 select A, max(0<findw(B,A,' ,.','it')) as FIND
 from (select distinct A from mixed_up where A ne ' ') A
    , (select distinct B from mixed_up where B ne ' ') B
 group by A
 ;
run;
CathyVI
Lapis Lazuli | Level 10

@Tom Thank you. This is very helpful but one more thing.

FIND is actually what I want. I don't have variable FIND in my dataset. I have two separate dataset List A and List B. How can I build the "if then statement" or what line of code can I add to your code to get the FIND variable. The table I presented was the output I wanted. Am sorry I should have stated that in my first post. 

Tom
Super User Tom
Super User

@CathyVI wrote:

@Tom Thank you. This is very helpful but one more thing.

FIND is actually what I want. I don't have variable FIND in my dataset. I have two separate dataset List A and List B. How can I build the "if then statement" or what line of code can I add to your code to get the FIND variable. The table I presented was the output I wanted. Am sorry I should have stated that in my first post. 


I don't understand your first question.  The code I showed creates the FIND variable.  I only left it in the A dataset I made from your example so I could check that the results matched.

                          The SAS System                              26
                                         13:54 Tuesday, October 14, 2025

                      The COMPARE Procedure
               Comparison of WORK.A with WORK.WANT
                          (Method=EXACT)

                        Data Set Summary

   Dataset             Created          Modified  NVar    NObs

   WORK.A     15OCT25:23:25:00  15OCT25:23:25:00     2      11
   WORK.WANT  15OCT25:23:29:30  15OCT25:23:29:30     2      11


                        Variables Summary

              Number of Variables in Common: 2.


                       Observation Summary

                  Observation      Base  Compare

                  First Obs           1        1
                  Last  Obs          11       11

 Number of Observations in Common: 11.
 Total Number of Observations Read from WORK.A: 11.
 Total Number of Observations Read from WORK.WANT: 11.

 Number of Observations with Some Compared Variables Unequal: 0.
 Number of Observations with All Compared Variables Equal: 11.

 NOTE: No unequal values were found. All values compared are exactly
       equal.

To make that mixed_up dataset from the WANT and B you could use a MERGE statement with no BY statement.

data results;
  merge want b ;
run;

But I have no idea how that would be useful for anything.

Tom_0-1760626300513.png

 

Quentin
Super User

@CathyVI wrote:

@Tom Thank you. This is very helpful but one more thing.

FIND is actually what I want. I don't have variable FIND in my dataset. I have two separate dataset List A and List B. How can I build the "if then statement" or what line of code can I add to your code to get the FIND variable. The table I presented was the output I wanted. Am sorry I should have stated that in my first post. 


Look again at Tom's answer.  The FIND variable is created in the PROC SQL step:

select A, max(0<findw(B,A,' ,.','it')) as FIND

PROC SQL is very helpful in this case because it makes it easy to create a cartesian product.  If you're not familiar with PROC SQL, it's possible to write a cartesian product in DATA step, but it's more code.

Ksharp
Super User

If you have SAS/IML model, you could try this code which could give you which obs include this word.

options parmcards=x;
filename x temp;
parmcards;
EDISON	Agan, B. K., Macalino, G. E., Nsouli-Maktabi, H., Wang, X., Gaydos, J. C., Ganesan, A., Kortepeter, M. G., & Sanchez, J. L.	1
AGAN	Masel J, Deiss RG, Wang X, Sanchez JL, Ganesan A, Macalino GE, Gaydos JC, Kortepeter MG, Agan BK	1
WANG	Sakaue T, Dorayappan KDP, Zingarelli R, Khadraoui W, Anbalagan M, Wallbillich J, Bognar B, Wanner R, Cosgrove C, Suarez A, Koga H, Maxwell GL, O'Malley DM, Cohn DE, Selvendiran K.	1
BURGESS	Said Y, Filippatou A, Tran C, Rezavi L, Guo K, Smith MD, Resto Y, Chen JJ, Calabresi PA, Caturegli P, Pittock SJ, Flanagan EP, Sotirchos ES.	1
BURGESS	Yang J, Wang D, Wang W, Wu C, Li C, Shi W, Qian J, Xie F, Shen H, Tang Y.	1
TAYLOR	Yang C, Hou P, Wang D, Wang Z, Duan W, Liu J, Yu S, Fu F, Jin Z.	1
RUSIECKI	Laing ED, Sterling SL, Richard SA, Phogat S, Samuels EC, Epsi NJ, Yan L, Moreno N, Coles C, Mehalko J, Drew M, English C, Chung KK, Clifton GT, Munster VJ, de Wit E, Tribble D, Agan BK, Esposito D, Lanteri C, Mitre E, Burgess TH, Broder CC.	1
TINTLE	Huang H, Rusiecki J, Zhao N, Chen Y, Ma S, Yu H, Ward MH, Udelsman R, Zhang Y.	0
MALDONADO	Brenner AV, Inskip PD, Rusiecki J, Rabkin CS, Engels J, Pfeiffer RM.	1
SIMONS	Rockwell EM, Abernathy HA, Evans LM, Bhowmik R, Giandomenico DA, Salzer JS, Maldonado CJ, Choi YS, Boyce RM.	0
TINTLE	Bettner LF, Peterson RA, Bergstedt DT, Kelmenson LB, Demoruelle MK, Mikuls TR, Edison JD, Parish MC, Feser ML, Frazer-Abel AA, Moss LK, Mahler M, Holers VM, Deane KD.	0
MALDONADO	Bergstedt DT, Tarter WJ, Peterson RA, Feser ML, Parish MC, Striebich CC, Demoruelle MK, Moss L, Bemis EA, Norris JM, Holers VM, Edison JD, Thiele GM, Mikuls TR, Deane KD.	1
BOGGS	Lee JA, Mikuls TR, Deane KD, Sayles HR, Thiele GM, Edison JD, Wagner BD, Feser ML, Moss LK, Kelmenson LB, Robinson WH, Payne JB.	1
SHRIVER	Goff SH, Bergstedt DT, Feser ML, Moss L, Mikuls TR, Edison JD, Holers VM, Martinez-Prat L, Aure MAR, Mahler M, Deane KD.	1
. 	Ezekwe EAD Jr, Weskamp AL, Rahim R, Makiya MA, Wetzler L, Ware JM, Nelson C, Castillo PA, Riley CA, Brown T, Penrod L, Constantine GM, Khoury P, Boggs NA, Klion AD.	 
. 	Lee JY, Shi T, Petyuk VA, Schepmoes AA, Fillmore TL, Wang YT, Cardoni W, Coppit G, Srivastava S, Goodman JF, Shriver CD, Liu T, Rodland KD.	 
. 	Madda R, Petyuk VA, Wang YT, Shi T, Shriver CD, Rodland KD, Liu T.	 
. 	Taylor KM, Ricks KM, Kuehnert PA, Eick-Cost AA, Scheckelhoff MR, Wiesen AR, Clements TL, Hu Z, Zak SE, Olschner SP, Herbert AS, Bazaco SL, Creppage KE, Fan MT, Sanchez JL.	 
;


data have;
infile x expandtabs truncover lrecl=800;
input ListA	: $20. ListB $600.;
run;



proc iml;
use have ;
read all var{ListA ListB} ;
close;
do i=1 to nrow(ListA);
flag=findw(ListB,strip(ListA[i]),,'psi' );
  if any(flag^=0) then do;
    loc=loc(flag^=0);
	obs=obs//t(loc);
	word=word//repeat(ListA[i],ncol(loc));
  end;
  else do;obs=obs//.;word=word//ListA[i];end;
end;
create want var{word obs};
append;
close;
quit;


proc print noobs;run;

Ksharp_0-1760600783399.png

 

Kurt_Bremser
Super User

Here's the code for my suggestion:

data list_a;
input name $20.;
datalines;
EDISON
AGAN
WANG
BURGESS
BURGESS
TAYLOR
RUSIECKI
TINTLE
MALDONADO
SIMONS
TINTLE
MALDONADO
BOGGS
SHRIVER
;

data list_b;
infile datalines truncover;
input name_list $300.;
datalines;
Agan, B. K., Macalino, G. E., Nsouli-Maktabi, H., Wang, X., Gaydos, J. C., Ganesan, A., Kortepeter, M. G., & Sanchez, J. L.	1
Masel J, Deiss RG, Wang X, Sanchez JL, Ganesan A, Macalino GE, Gaydos JC, Kortepeter MG, Agan BK	1
Sakaue T, Dorayappan KDP, Zingarelli R, Khadraoui W, Anbalagan M, Wallbillich J, Bognar B, Wanner R, Cosgrove C, Suarez A, Koga H, Maxwell GL, O'Malley DM, Cohn DE, Selvendiran K.
Said Y, Filippatou A, Tran C, Rezavi L, Guo K, Smith MD, Resto Y, Chen JJ, Calabresi PA, Caturegli P, Pittock SJ, Flanagan EP, Sotirchos ES.
Yang J, Wang D, Wang W, Wu C, Li C, Shi W, Qian J, Xie F, Shen H, Tang Y.
Yang C, Hou P, Wang D, Wang Z, Duan W, Liu J, Yu S, Fu F, Jin Z.
Laing ED, Sterling SL, Richard SA, Phogat S, Samuels EC, Epsi NJ, Yan L, Moreno N, Coles C, Mehalko J, Drew M, English C, Chung KK, Clifton GT, Munster VJ, de Wit E, Tribble D, Agan BK, Esposito D, Lanteri C, Mitre E, Burgess TH, Broder CC.
Huang H, Rusiecki J, Zhao N, Chen Y, Ma S, Yu H, Ward MH, Udelsman R, Zhang Y.	
Brenner AV, Inskip PD, Rusiecki J, Rabkin CS, Engels J, Pfeiffer RM.
Rockwell EM, Abernathy HA, Evans LM, Bhowmik R, Giandomenico DA, Salzer JS, Maldonado CJ, Choi YS, Boyce RM.
Bettner LF, Peterson RA, Bergstedt DT, Kelmenson LB, Demoruelle MK, Mikuls TR, Edison JD, Parish MC, Feser ML, Frazer-Abel AA, Moss LK, Mahler M, Holers VM, Deane KD.
Bergstedt DT, Tarter WJ, Peterson RA, Feser ML, Parish MC, Striebich CC, Demoruelle MK, Moss L, Bemis EA, Norris JM, Holers VM, Edison JD, Thiele GM, Mikuls TR, Deane KD.
Lee JA, Mikuls TR, Deane KD, Sayles HR, Thiele GM, Edison JD, Wagner BD, Feser ML, Moss LK, Kelmenson LB, Robinson WH, Payne JB.
Goff SH, Bergstedt DT, Feser ML, Moss L, Mikuls TR, Edison JD, Holers VM, Martinez-Prat L, Aure MAR, Mahler M, Deane KD.
Ezekwe EAD Jr, Weskamp AL, Rahim R, Makiya MA, Wetzler L, Ware JM, Nelson C, Castillo PA, Riley CA, Brown T, Penrod L, Constantine GM, Khoury P, Boggs NA, Klion AD.	 
Lee JY, Shi T, Petyuk VA, Schepmoes AA, Fillmore TL, Wang YT, Cardoni W, Coppit G, Srivastava S, Goodman JF, Shriver CD, Liu T, Rodland KD.	 
Madda R, Petyuk VA, Wang YT, Shi T, Shriver CD, Rodland KD, Liu T.	 
Taylor KM, Ricks KM, Kuehnert PA, Eick-Cost AA, Scheckelhoff MR, Wiesen AR, Clements TL, Hu Z, Zak SE, Olschner SP, Herbert AS, Bazaco SL, Creppage KE, Fan MT, Sanchez JL.	 
;

data list_b_long;
set list_b;
length name $20;
do i = 1 to countw(name_list,",");
  name = compress(upcase(scan(scan(name_list,i,","),1," ")),"","ka");
  output;
end;
keep name;
run;

proc sort data=list_a nodupkey;
by name;
run;

proc sort data=list_b_long nodupkey;
by name;
run;

data want;
merge
  list_a (in=a)
  list_b_long (in=b)
;
by name;
if a;
find = b;
run;

Please do always present data in this way (DATA steps with DATALINES). Copy/pasting from Excel leaves all kinds of trash that needs to be filtered out before we can work with the data.

My footnotes contain a link to a macro that converts datasets to code automatically.

Tom
Super User Tom
Super User

Note that there is an easier macro for generating data in form that is easy to post:

https://github.com/sasutils/macros/blob/master/ds2post.sas

 

This call

%ds2post(sashelp.class)

Will write this code to the SAS log

data work.class (label='Student Data');
  infile datalines dsd dlm='|' truncover;
  input Name :$8. Sex :$1. Age Height Weight ;
datalines4;
Alfred|M|14|69|112.5
Alice|F|13|56.5|84
Barbara|F|13|65.3|98
Carol|F|14|62.8|102.5
Henry|M|14|63.5|102.5
James|M|12|57.3|83
Jane|F|12|59.8|84.5
Janet|F|15|62.5|112.5
Jeffrey|M|13|62.5|84
John|M|12|59|99.5
Joyce|F|11|51.3|50.5
Judy|F|14|64.3|90
Louise|F|12|56.3|77
Mary|F|15|66.5|112
Philip|M|16|72|150
Robert|M|12|64.8|128
Ronald|M|15|67|133
Thomas|M|11|57.5|85
William|M|15|66.5|112
;;;;

hackathon24-white-horiz.png

The 2025 SAS Hackathon has begun!

It's finally time to hack! Remember to visit the SAS Hacker's Hub regularly for news and updates.

Latest Updates

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

SAS Training: Just a Click Away

 Ready to level-up your skills? Choose your own adventure.

Browse our catalog!

Discussion stats
  • 10 replies
  • 603 views
  • 6 likes
  • 6 in conversation