DATA Step, Macro, Functions and more

Create new variables based on a look up table

Reply
Occasional Contributor
Posts: 17

Create new variables based on a look up table

Hi there,

I need to create a one line per customer file that creates 8 'monthly premium quote' variables based on a persons gender and age (we create quotes for smokers and non-smokers as well but we don't have this in our dataset). So I have the data file that looks like:
cust_num age_yrs gender
1 33 F

I also have a quote table that looks like this:

Coverage age_grp female_ns female_s male_ns male_s
100000 30-34 20.3 25.7 30.5 35.6
200000 35-39 24.6 28.4 34.6 39.1

I was wondering if there is a way to pull in my 'lookup table' and create the quotes I need. I was thinking of transposing my look up table but then I still need to use it with my data to set up the new variables. I was also thinking I might be able to do this with formats somehow? Any advice would be greatly appreciated!

Thanks,
Christy
Super Contributor
Super Contributor
Posts: 3,174

Re: Create new variables based on a look up table

Yes, PROC FORMAT (using CNTLIN=) would work creating a numeric format for your age range (START and END variables) and the LABEL variable will be a character string with your "match condition" values. When you find a match using the PUT function in a DATA step, you will want to use the INPUT function and the SCAN function to parse the LABEL (returned by the PUT function) and convert the sub-fields to numeric.

Have a look at the SAS support http://support.sas.com/ website and search on phrase "PROC FORMAT" and include the CNTLIN keyword -- you prepare a SAS file with specific-named SAS variables (FMTNAME, START, END, HLO, LABEL) and pass the file to PROC FORMAT using CNTLIN=, which generates the format from your "quote table" data.

Scott Barry
SBBWorks, Inc.
Occasional Contributor
Posts: 17

Re: Create new variables based on a look up table

Hi Scott,

I'm just trying to get my data set up and I'm a bit confused... do I need to create a new variable with the match conditions and if so, how?

I've used the proc format cntlin code before so I should be ok with that, its just getting my data ready. Everything I've read is just setting up 1 condition, not multiple ones like I have...

Thanks for any help,
Christy
Super Contributor
Super Contributor
Posts: 3,174

Re: Create new variables based on a look up table

As I mentioned, use the PUT function in an assignment statement to retrieve the results from your format, based on age range. Code snippet shown below:

length fmtdata $nnn ; /* max size to hold data returned from format / put */
fmtdata = put(age,fmtname.);
if fmtdata ne '?' then do; /* test for OTHER='?' condition */
* code below presumes that the FORMAT RESULT variable is format with ;
* some number of sub-fields, separated by a blank, and using numeric ;
* length nnn when the format is generated. ;
f_s = input(scan(fmtdata,1,' '),nnn.);
f_ns = input(scan(fmtdata,2,' '),nnn.);
* ..etc..;
* at this point you have numeric variables with quote info from format. ;


Scott Barry
SBBWorks, Inc.
Occasional Contributor
Posts: 17

Re: Create new variables based on a look up table

Thanks Scott,

I guess I'm confused about even creating the format... when I try to create it I'm getting an error: ERROR: This range is repeated, or values overlap: 18-25.
because of the age range.

This is the code I have:

data ctrl;
set test_quotes (Rename=(age1=start age2=end)) end=last;
retain fmtname 'Quote' ;
length start end 8;

output;

if last then do;
hlo='O';
label='***ERROR***';
output;
end;
run;

proc format cntlin=ctrl;
run;
Super Contributor
Super Contributor
Posts: 3,174

Re: Create new variables based on a look up table

Did you debug the CNTLIN= data file to confirm the SAS error diagnostic being true? That would be the first step. Yes, the presumption is that you have discrete non-overlapping START and END values to populate your format. Per the SAS documentation, this is a requirement, unless you are using other PROC FORMAT features beyond the scope of this post.

Also, your LENGTH statement is unnecessary - anyway, if it did matter for numeric variable granularity, it would need to be coded ahead of the SET statement, otherwise your input variables "could" be truncated, depending on the declared length. Though not germaine to this post, regardless it's a SAS programming habit for you to consider.

I suggest you analyze your input data, focusing on the critical variables FMTNAME, HLO, START and END.

Scott Barry
SBBWorks, Inc.
Respected Advisor
Posts: 3,890

Re: Create new variables based on a look up table

Something like that?

proc format;
value agegrp
30-34 ='30-34'
35-39 ='35-39'
;
run;

data customer;
input cust_num age_yrs gender $;
datalines;
1 33 F
;

data quotes;
input Coverage age_grp $ female_ns female_s male_ns male_s;
datalines;
100000 30-34 20.3 25.7 30.5 35.6
200000 35-39 24.6 28.4 34.6 39.1
run;

proc sql;
select *
from customer left join quotes
on put(customer.age_yrs,agegrp.)=quotes.age_grp
;
quit;
Ask a Question
Discussion stats
  • 6 replies
  • 161 views
  • 0 likes
  • 3 in conversation