Hi all,
Please excuse me if I don't explain this very well, I will do my best to outline what I'm after.
I have 2 datasets in SAS that look like this:
1)
2)
I would like a final dataset that takes dataset 2 and assigns it a band based on the min-max spread depicted by dataset 1, something that looks like the following:
Any help is much appreciated! I had thought about somehow forcing a one to many merge, where every business in dataset 2 was duplicated by the amount of bands, then simply running if minvalue<value<=maxvalue then dummyvariable="YES", then getting rid of any rows that aren't "YES".
Many thanks,
Declan
There are a few approaches ... here's just one.
The first data set is easy to transform into a format:
data form;
set dataset1;
fmtname = 'bandcat';
rename band=label min_value=start max_value=end;
run;
proc format cntlin=form;
run;
Once you have the format, you can apply it anywhere a format is legal. For example:
data want;
set dataset2;
if (0 <= value <= 900) then band = put(value, bandcat.);
run;
That would create BAND as a character variable. If you want it as numeric, it takes a slightly more complex formula:
if (0 <= value <= 900) then band = input(put(value, bandcat.), 1.);
There are a few approaches ... here's just one.
The first data set is easy to transform into a format:
data form;
set dataset1;
fmtname = 'bandcat';
rename band=label min_value=start max_value=end;
run;
proc format cntlin=form;
run;
Once you have the format, you can apply it anywhere a format is legal. For example:
data want;
set dataset2;
if (0 <= value <= 900) then band = put(value, bandcat.);
run;
That would create BAND as a character variable. If you want it as numeric, it takes a slightly more complex formula:
if (0 <= value <= 900) then band = input(put(value, bandcat.), 1.);
@DeclanBall if it is sets of 100's, math is fun:
data have;
input business $ value;
datalines;
A 50
B 70
C 360
D 780
E 356
F 126
G 800
H 200
I 150
;
data want;
set have;
if mod(value,100)=0 then Band=int(divide(value,100));
else Band=int(divide(value,100))+1;
run;
Try PROC SQL Cartesian product with where clause to exclude the records you don't need.
DATA value;
input band min max;
datalines ;
1 0 100
2 101 200
3 201 300
4 301 400
5 401 501
6 501 600
7 601 700
8 701 800
9 801 900
;
DATA HAVE;
INPUT Business $ value;
datalines ;
A 50
B 70
C 360
D 780
E 356
F 126
G 800
H 200
I 150
;
RUN;
PROC SQL;
CREATE TABLE WANT AS
SELECT Business,Value,Band
FROM HAVE,VALUE
WHERE VALUE>MIN AND VALUE<=MAX;
QUIT;
Join us for SAS Innovate April 16-19 at the Aria in Las Vegas. Bring the team and save big with our group pricing for a limited time only.
Pre-conference courses and tutorials are filling up fast and are always a sellout. Register today to reserve your seat.
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.