Solved: Re: Look up a value in a separate table based on which interval the ke...

RTao · Posted 11-07-2023 03:33 PM

Hi, basically what I need is based on where the Key falls in the interval for each quarter, a column indicating the proper group it belongs should be created from have1. Left join won't work since I don't really have a unique key, this is simply a lookup based on a range. Any help is appreciated!

Example:

have1:

Quarter	Key
1	31.39715
1	53.59118
1	84.20986
1	81.27638
2	58.62042
2	12.07567
2	91.42742
3	69.06109
4	53.8079
4	33.30056

have 2:

Quarter	Group	Cutoff
1	A	20
1	B	50
1	C	90
1	D	100
2	A	10
2	B	20
2	C	30
2	D	100
3	A	60
3	B	70
3	C	80
3	D	100
4	A	10
4	B	60
4	C	99
4	D	100

want:

Quarter	Key	Group
1	31.39715	B
1	53.59118	C
1	84.20986	C
1	81.27638	C
2	58.62042	D
2	12.07567	B
2	91.42742	D
3	69.06109	A
4	53.8079	A
4	33.30056	A

SASJedi · Posted 11-13-2023 11:14 AM

Here is a possible approach using formats. First, create the data HAVE sets:

data have1;
infile datalines dsd dlm='|';
input Quarter Key;
datalines;
1|31.39715
1|53.59118
1|84.20986
1|81.27638
2|58.62042
2|12.07567
2|91.42742
3|69.06109
4|53.8079
4|33.30056
;


data have2;
infile datalines dsd dlm='|';
input Quarter Group:$1. Cutoff;
datalines;
1|A|20
1|B|50
1|C|90
1|D|100
2|A|10
2|B|20
2|C|30
2|D|100
3|A|60
3|B|70
3|C|80
3|D|100
4|A|10
4|B|60
4|C|99
4|D|100
;

Next, use the HAVE2 dataset to create a series of formats, one for each Quarter:

data cntlin;
	set have2 (rename=(Group=Label));
	by quarter ;
	drop quarter cutoff;
	fmtname=cats('qtr',quarter,'_');
    Start=put(lag1(Cutoff),32.);
	End=put(Cutoff,32.);
	if first.quarter then Start='LOW';
run;

proc format cntlin=cntlin fmtlib;
run;

Finally, apply the appropriate format to the Key values using PUTN, which allows assigning the format at execution time:

data want;
	set have1;
	Group=putn(Key,cats('qtr',quarter,'_'));
run;

This produces the desired results:

Obs	Quarter	Key	Group
1	1	31.3972	B
2	1	53.5912	C
3	1	84.2099	C
4	1	81.2764	C
5	2	58.6204	D
6	2	12.0757	B
7	2	91.4274	D
8	3	69.0611	B
9	4	53.8079	B
10	4	33.3006	B

Note that the values for Quarter 4 different from your "Want" data, but these values are correct based on the cutoff values provided in your HAVE2 dataset.

May the SAS be with you!
Mark

Check out my Jedi SAS Tricks for SAS Users

View solution in original post

SASJedi · Posted 11-13-2023 11:14 AM

Here is a possible approach using formats. First, create the data HAVE sets:

data have1;
infile datalines dsd dlm='|';
input Quarter Key;
datalines;
1|31.39715
1|53.59118
1|84.20986
1|81.27638
2|58.62042
2|12.07567
2|91.42742
3|69.06109
4|53.8079
4|33.30056
;


data have2;
infile datalines dsd dlm='|';
input Quarter Group:$1. Cutoff;
datalines;
1|A|20
1|B|50
1|C|90
1|D|100
2|A|10
2|B|20
2|C|30
2|D|100
3|A|60
3|B|70
3|C|80
3|D|100
4|A|10
4|B|60
4|C|99
4|D|100
;

Next, use the HAVE2 dataset to create a series of formats, one for each Quarter:

data cntlin;
	set have2 (rename=(Group=Label));
	by quarter ;
	drop quarter cutoff;
	fmtname=cats('qtr',quarter,'_');
    Start=put(lag1(Cutoff),32.);
	End=put(Cutoff,32.);
	if first.quarter then Start='LOW';
run;

proc format cntlin=cntlin fmtlib;
run;

Finally, apply the appropriate format to the Key values using PUTN, which allows assigning the format at execution time:

data want;
	set have1;
	Group=putn(Key,cats('qtr',quarter,'_'));
run;

This produces the desired results:

Obs	Quarter	Key	Group
1	1	31.3972	B
2	1	53.5912	C
3	1	84.2099	C
4	1	81.2764	C
5	2	58.6204	D
6	2	12.0757	B
7	2	91.4274	D
8	3	69.0611	B
9	4	53.8079	B
10	4	33.3006	B

Note that the values for Quarter 4 different from your "Want" data, but these values are correct based on the cutoff values provided in your HAVE2 dataset.

May the SAS be with you!
Mark

Check out my Jedi SAS Tricks for SAS Users

Patrick · Posted 11-13-2023 08:34 PM

And here a SQL way using the source data created by @SASJedi

The proposed data step with formats will perform better so I'd certainly be using this approach for any bigger data volumes.

The advantage of pure SQL is that it could execute in-database (with a few amendments to the current SAS SQL flavor) if both of your source tables are in a database.

proc sql;
  select 
    h1.quarter
    ,h1.key
    ,h2.group
  from have1 h1 left join have2 h2
    on h1.quarter=h2.quarter
      and h1.key <= h2.Cutoff
  group by h1.quarter, h1.key
  having h2.Cutoff-h1.key = min(h2.Cutoff-h1.key)
  ;
quit;

Look up a value in a separate table based on which interval the key falls

Re: Look up a value in a separate table based on which interval the key falls

Re: Look up a value in a separate table based on which interval the key falls

Re: Look up a value in a separate table based on which interval the key falls

Look up a value in a separate table based on which interval the key falls

Re: Look up a value in a separate table based on which interval the key falls

Re: Look up a value in a separate table based on which interval the key falls

Re: Look up a value in a separate table based on which interval the key falls

Click image to register for webinar

Classroom Training Available!