BookmarkSubscribeRSS Feed
David_Billa
Rhodochrosite | Level 12

Assume I've two datasets Have1 and Have2. In Have1 dataset I've two variables ERR_MSG_TXT and FLT_ID. In Have2 dataset I've two variables called RUN_ID and STAT_CD.

 

Now I want to create a new dataset WANT which should have a variables called RUN_ID and FLT_ID.So in the new dataset WANT, I've to look for ERR_MSG_TXT in dataset Have1 and STAT_CD variable in dataset Have2. If it matches, then I've to get the respective variables RUN_ID and FLT_ID from datasets Have1 and Have2. I'm not certain which method (look up or join?) works better here. Appericiate if someone of you help me with the code to accomplish this Task.

 

Dataset Have1:

FLT_IDERR_MSG_TXT
148Syntax error
149Not a valid procedure
150Warning

Dataset Have2:

RUN_IDSTAT_CD
1Syntax error
2Warning
3Not a valid procedure

Desired Output: Dataset WANT

RUN_IDFLT_ID
1148
2150
3149

 

 

7 REPLIES 7
PeterClemmensen
Tourmaline | Level 20

@David_Billa hi and welcome to the SAS Community 🙂

 

There are multiple ways to do this, but here is one

 

data have1;
input FLT_ID ERR_MSG_TXT $50.;
datalines;
148 Syntax error
149 Not a valid procedure
150 Warning
;

data have2;
input RUN_ID STAT_CD $50.;
datalines;
1 Syntax error
2 Warning
3 Not a valid procedure
;

data want(keep=RUN_ID FLT_ID);
    if _N_=1 then do;
        declare hash h(dataset:'have1(rename=(ERR_MSG_TXT=STAT_CD))');
        h.definekey('STAT_CD');
        h.definedata('FLT_ID');
        h.definedone();
    end;

    set have2;
    FLT_ID=.;

    rc=h.find();
run;
David_Billa
Rhodochrosite | Level 12

Great! May I know why you've assigned missing value for FLT_ID?

 

Also could you please tell me about the method using join to complete this task? If not join, then using any other method.

Patrick
Opal | Level 21

Here a few more ways how this can be done.

Please note: In the moment there is a 1:1 relationship between the tables in your sample data. If that's different and especially if there are also non-matching cases possible then you will eventually need to amend the logic to get to your desired result.

data have1;
input FLT_ID ERR_MSG_TXT $50.;
datalines;
148 Syntax error
149 Not a valid procedure
150 Warning
;

data have2;
input RUN_ID STAT_CD $50.;
datalines;
1 Syntax error
2 Warning
3 Not a valid procedure
;

/*** option 1 ***/
proc sql;
/*  create table want1 as*/
  select r.run_id, l.flt_id
  from have1 l inner join have2 r
  on l.ERR_MSG_TXT=r.STAT_CD
  ;
quit;

/*** option 2 ***/
/* create format from data set */
data cntl;
  set have1(rename=(ERR_MSG_TXT=start FLT_ID=label));
  retain fmtname '$FLT_ID' type 'c';
run;
proc format cntlin=cntl;
run;

/* option 2, variant 1: Use format to populate a new variable */
data want2;
  set have2;
  FLT_ID=put(STAT_CD,$FLT_ID3.);
  keep run_id FLT_ID;
run;
proc print;
run;

/* option 2, variant 2: Apply format for printing values of existing variable */
proc print data=have2;
  format STAT_CD $FLT_ID3.;
run;
David_Billa
Rhodochrosite | Level 12
May I know how to amend the logic for non matching cases? One simple
example should be OK
Patrick
Opal | Level 21

@David_Billa wrote:
May I know how to amend the logic for non matching cases? One simple
example should be OK

That really depends how your desired result needs to look like, i.e. do you want to include or exclude non-matching cases, do you want to include all non matching rows from one table or from both ...and so on.

Using the code already provided why don't you give it a go yourself first and then eventually come back and ask targeted questions for things where you get stuck with your code.

You need of course first to create additional sample data for such cases. 

With SQL: Look up in the documentation SQL Left Join, Right Join, Full Outer Join.... and for formats how to add an OTHER case 

s_lassen
Meteorite | Level 14

Yet another way to do this, and to get the non-matches output as well.

 

Assuming that HAVE1 is your lookup dataset, and HAVE2 is the result of various runs, which you want to check:

data have1;
input FLT_ID ERR_MSG_TXT $50.;
datalines;
148 Syntax error
149 Not a valid procedure
150 Warning
;

data have2;
input RUN_ID STAT_CD $50.;
datalines;
1 Syntax error
2 Warning
3 Not a valid procedure
;

proc sql;
  create index ERR_MSG_TXT on have1(ERR_MSG_TXT);
quit;

data want(keep=RUN_ID FLT_ID) notfound(keep=RUN_ID STAT_CD);
  set have2;
  ERR_MSG_TXT=STAT_CD;
  set have1 key=ERR_MSG_TXT/unique;
  if _iorc_ then do;
    output notfound;
    _error_=0;
    end;
  else output want;
run;

If you omit the "_error_=0" line, you get a message in the log when a value is not in the lookup dataset, which may be nice.

PGStats
Opal | Level 21

Left join can handle mismatches in a reasonable manner:

 

data have1;
input FLT_ID ERR_MSG_TXT $50.;
datalines;
148 Syntax error
149 Not a valid procedure
150 Warning
999 Unknown message
;

data have2;
input RUN_ID STAT_CD $50.;
datalines;
1 Syntax error
2 Warning
3 Not a valid procedure
4 Bravo!
;

proc sql;
create table want as
select 
	a.RUN_ID,
	coalesce(b.FLT_ID, 999) as FLT_ID
from 
	have2 as a left join 
	have1 as b on b.ERR_MSG_TXT = a.STAT_CD;
select * from want;
quit;
RUN_ID 	FLT_ID
4 	999
3 	149
1 	148
2 	150
PG

SAS Innovate 2025: Register Today!

 

Join us for SAS Innovate 2025, our biggest and most exciting global event of the year, in Orlando, FL, from May 6-9. Sign up by March 14 for just $795.


Register now!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

SAS Training: Just a Click Away

 Ready to level-up your skills? Choose your own adventure.

Browse our catalog!

Discussion stats
  • 7 replies
  • 1041 views
  • 4 likes
  • 5 in conversation