DATA Step, Macro, Functions and more

lookup table

Accepted Solution Solved
Reply
New Contributor
Posts: 3
Accepted Solution

lookup table

Hi, 

This is probably a question that's been discussed many times, but I will appreciate if anyone out there can kindly shed light for me. 

 

I have a dataset (primary) contains following columns (5000 rows)

ID Proc1 Proc2 Proc3 Proc4 (........) Proc55

1  288-3 199-2  211-3 133-2            254-8 (the value for Proc1-Proc55 are string characters)

2  274-2  292-2 213-2 493-2            280-2

.

.

.

 

and another dataset (lookup) contains following (70 rows)

Proc

288-3

292-2

381-2

.

.

.

 

The aim is to create a new variable in the "primary_new" dataset: flag. If any one of the rows in the "primary" file contain a matching data as the one in the "lookup", flag=1, else flag=0.

e.g. 

ID Proc1 Proc2 Proc3 Proc4 (........) Proc55  flag

1  288-3 199-2  211-3 133-2            254-8     1

2  274-2  292-2 213-2 493-2            280-2     1

3  777-2  777-1    .          .                    .          0

I was looking at the array approach, but I am not sure whether I am in the right direction? an example code will be much appreciated. 


Accepted Solutions
Solution
‎10-09-2017 12:51 AM
Super Contributor
Posts: 441

Re: lookup table

Posted in reply to error_prone

The first thing I'd want to do is transpose that data - it will make it much easier to work with especially when you want to do further work on it. Once you've done that a simple merge will work nicely i.e.

 

data have;
	length id 8 proc1 $5 proc2 $5 proc3 $5 proc4 $5;
	infile datalines;
	input id proc1 proc2 proc3 proc4;
	datalines;
1  288-3 199-2  211-3 133-2
2  274-2  292-2 213-2 493-2
;
run;

data lookup;
	length proc $5;
	infile datalines;
	input Proc;
	datalines;
288-3
292-2
381-2
;
run;

proc transpose data=have out=transposed(rename=(col1=proc _name_=procno));
	by id;
	var proc1-proc4;
run;

proc sort data=transposed;
	by proc;
run;

proc sort data=lookup;
	by proc;
run;

data want;
	merge lookup(in=a) transposed(in=b);
	by proc;
	if a and b;
run;

View solution in original post


All Replies
Contributor
Posts: 29

Re: lookup table

Posted in reply to newusernew

data yo;

infile datalines dsd;

length ID 8.;

length PR1 PR2 PR3 $6.;

input ID PR1 PR2 PR3;

datalines;

1,AA,JJ,SS

2,BB,KK,TT

3,CC,LL,UU

4,DD,MM,VV

5,EE,NN,WW

6,FF,AA,XX

7,GG,PP,YY

8,HH,QQ,ZZ

9,II,RR,AA

run;

 

DATA YO2;

SET YO;

ARRAY PR[9] PR1-PR9;

DO I=1 TO 9;

IF PR[I]='AA' THEN FLG='1';

END;

RUN;

Contributor
Posts: 29

Re: lookup table

Posted in reply to ali_jooan

well .. you can change the array size to three, run the do loop to 3 instead of 9 and drop the iterator 'i' from the data step and should be all good.

Frequent Contributor
Posts: 149

Re: lookup table

Posted in reply to newusernew
Posting from m mobile phone, so the following code is untested an may contain typos.

data primary_new;
Set primary;
If _n_ = 1 then do;
Declare hash h(dataset: "lookup"):
h.defineKey("proc");
h.defineDone();
End;

array p proc1-proc55;

do i = 1 to dim(p);
flag = not h.check(key: p[i]);
if flag then continue;
end;
Run;
Solution
‎10-09-2017 12:51 AM
Super Contributor
Posts: 441

Re: lookup table

Posted in reply to error_prone

The first thing I'd want to do is transpose that data - it will make it much easier to work with especially when you want to do further work on it. Once you've done that a simple merge will work nicely i.e.

 

data have;
	length id 8 proc1 $5 proc2 $5 proc3 $5 proc4 $5;
	infile datalines;
	input id proc1 proc2 proc3 proc4;
	datalines;
1  288-3 199-2  211-3 133-2
2  274-2  292-2 213-2 493-2
;
run;

data lookup;
	length proc $5;
	infile datalines;
	input Proc;
	datalines;
288-3
292-2
381-2
;
run;

proc transpose data=have out=transposed(rename=(col1=proc _name_=procno));
	by id;
	var proc1-proc4;
run;

proc sort data=transposed;
	by proc;
run;

proc sort data=lookup;
	by proc;
run;

data want;
	merge lookup(in=a) transposed(in=b);
	by proc;
	if a and b;
run;
New Contributor
Posts: 3

Re: lookup table

Posted in reply to ChrisBrooks
thanks for the suggestion Chris. Your code works perfect. I changed the merge step to proc sql left join instead. seems to work fine without the need to sort the data.
Regular Contributor
Posts: 191

Re: lookup table

Posted in reply to newusernew

A format will do the work as well Smiley Happy

 

data have;

length id 8 proc1 $5 proc2 $5 proc3 $5 proc4 $5;

input id proc1 proc2 proc3 proc4;

datalines;

1 288-3 199-2 211-3 133-2

2 274-2 292-2 213-2 493-2

;

run;

data lookup;

length proc $5;

infile datalines;

input Proc;

datalines;

288-3

292-2

381-2

;

run;

data lu;

set lookup(rename=(proc=start));

length fmtname type $32 label 8;

fmtname = 'lu';

type = 'C';

label = 1;

if _n_ = 1 then do;

output;

hlo = 'O';

label = 0;

output;

end;

else output;

run;

proc format cntlin=lu;

run;

data want;

set have;

array procs[4] proc1 - proc4;

length flag 8;

flag = 0;

do i = 1 to 4;

flag = ifn(flag=0,put(procs(i),lu.),flag);

end;

run;

 

//Fredrik

New Contributor
Posts: 3

Re: lookup table

hi FredrikE,

 

your code works perfectly fine (and much more efficient). I tried and it works. It just that I am a new user of SAS, and I don't quite understand how the code starting from ---data lu--- work.

 

Pardon me for the naivety as proc format is a very unfamiliar procedure for me.

 

Whereas the other method suggested by Chris, although it involves a bit more coding, I can see the logic behind it. 

 

having said that, your approach is brilliant, and I would have accepted your code as solution also if the forum rules allow. 

 

also, thank you everyone else for your contribution.

 

 

☑ This topic is solved.

Need further help from the community? Please ask a new question.

Discussion stats
  • 7 replies
  • 212 views
  • 5 likes
  • 5 in conversation