Hello,
I am using the following SAS code to search for premium datasets from a policies list. So if I have a policy number let's say 1111-1111, it search for this policy into all datasets available.
However, in the on condition, we have
on(a.agreement_nbr=b.agreement_nbr_dr and a.&varname1.=b.province and a.company_cd=b.legacy_underwriting_company);
In some old premium datasets, a.&varname1 eq 'PQ' and in b.province = 'QC', so we are loosing 4772 policies number from the original list. Is there a way to say to remap it's value to QC when it is PQ in the on statement
Here's the complete code
%macro searchpremium /minoperator;
Data dest1.found_agreements;
attrib
/*********************************** Agreement Number location information ***************************************************/
KEY label="Record to be anonymized key" length=$31. format=$char31. informat=$char31.
ORGNL_SAS_DS_ROW_NBR label="Original SAS dataset Row number" length=8 format=8. informat=8.
agreement_nbr label="Agreement Number" length=$20. format=$char20. informat=$char20.
cie label="company cd " length=$2. format=$char2. informat=$char2.
lob label="Line Of Business" length=$4. format=$char4. informat=$char4.
year label="Year" length=$4. format=$char4. informat=$char4.
month label="Month" length=$3. format=$char3. informat=$char3.
libname label="path" length=$100. format=$char100. informat=$char100.
filename label="Dataset Name" length=$25. format=$char25. informat=$char25.
company_cd label="Company CD" length=$1. format=$char1. informat=$char1.
policy_expiry_date label="Policy Expiry Date" length=8 format=yymmdd10. informat=yymmdd10.
/*********************************** Data Retention dataset information ***************************************************/
/* Modification: appending policy_number and policy_number_src then renaming agreement_nbr and removing duplicate
POLICY_NUMBER label="Policy Number" length=$25. format=$char25. informat=$char25.
POLICY_NUMBER_SRC label="Policy Number Source" length=$25. format=$char25. informat=$char25.
*/
AGREEMENT_NBR_DR label="Agreement Number Data Retention" length=$25. format=$char25. informat=$char25.
LINE_OF_BUSINESS label="Line of Business" length=$3. format=$char3. informat=$char3.
PROVINCE label="Province" length=$10. format=$char10. informat=$char10.
LEGACY_UNDERWRITING_COMPANY label="Legacy Underwriting Company" length=$10. format=$char10. informat=$char10.
UNDERWRITING_COMPANY label="Underwriting Company" length=$10. format=$char10. informat=$char10.
DISTRIBUTOR_NUMBER label="Distributor Number" length=$10. format=$char10. informat=$char10.
DISTRIBUTOR_NUMBER_SRC label="Distributor Number Source" length=$10. format=$char10. informat=$char10.
SOURCE_SYSTEM label="Source System" length=$15. format=$char15. informat=$char15.
DATASOURCE_BACKEND_REFERENCE label="Data Source Backend Reference" length=$10. format=$char10. informat=$char10.
LAST_TERM_EFFECTIVE_DATE label="Last Term Effective Date" length=8 format=YYMMDD10. informat=YYMMDD10.
LAST_TERM_EXPIRY_DATE label="Last Term Expiry Date" length=8 format=YYMMDD10. informat=YYMMDD10.
CONTRACT_UUID label="Contract UUID" length=$36. format=$char36. informat=$char36.
;
stop;
run;
%do h=1 %to %sysfunc(countw(&cielist.));
%if %tslit(%scan(&cielist.,&h)) eq 'je' %then %let value=2;
%else %if %tslit(%scan(&cielist.,&h)) eq 'gc' %then %let value=0;
%else %let value=1;
%put &=value;
%do i=1 %to %sysfunc(countw(&loblist.))-&value.;
%do k=1 %to %sysfunc(countw(&yearlist.));
%do j=1 %to %sysfunc(countw(&monthlist.));
%let fname=%scan(&cielist.,&h)_%scan(&loblist.,&i)_prm%scan(&monthlist.,&j)%scan(&yearlist.,&k);
%let libnm=/dwh_actuariat/sasdata/Data_Retention/sas%scan(&yearlist.,&k)/%scan(&cielist.,&h)/%scan(&subfolder.,&i)/;
%let cie=%scan(&cielist.,&h);
%let does_it_exist=%sysfunc(fileexist(&libnm.));
%put &=does_it_exist &=libnm.;
%if &does_it_exist eq 1 %then
%do;
%put &=libnm &=fname.;
libname src1 spde "&libnm.";
%put &libnm.&fname.;
%if %sysfunc(exist(src1.&fname.)) eq 1 %then
%do;
/****************** checking if the variable exist ****************/
%if %varexist(src1.&fname.,province_cd) %then %let varname1=province_cd;
%else %if %varexist(src1.&fname.,RISK_PROVINCE_CD) %then %let varname1=RISK_PROVINCE_CD;
%else %if %varexist(src1.&fname.,POLICY_PROVINCE_CD) %then %let varname1=POLICY_PROVINCE_CD;
proc sql;
create table table_info as
select distinct strip(agreement_nbr) as agreement_nbr length=20,
a.ORGNL_SAS_DS_ROW_NBR,
"%scan(&cielist.,&h)" as cie length=2,
"%scan(&loblist.,&i)" as lob length=4,
"%scan(&yearlist.,&k)" as year length=4,
"%scan(&monthlist.,&j)" as month length=3,
"&libnm" as libname length=100,
"&fname" as filename length=25,
a.company_cd,
catx('|',"%scan(&cielist.,&h)","%scan(&loblist.,&i)","%scan(&yearlist.,&k)","%scan(&monthlist.,&j)",a.ORGNL_SAS_DS_ROW_NBR) as key length=31,
input(a.policy_expiry_dt,YYMMDD10.) as policy_expiry_date length=8,
b.*
from src1.&fname. as a
inner join lookup as b
on(a.agreement_nbr=b.agreement_nbr_dr and a.&varname1.=b.province and a.company_cd=b.legacy_underwriting_company);
quit;
%if &sqlobs ne 0 %then
%do;
proc append base=dest1.found_agreements data=table_info force;
run;
%end;
proc sql;
drop table table_info;
quit;
%end;
%else
%do;
%put "the file &fname. does not exist";
%end;
%end;
%end;
%end
%end;
%end;
%end;
proc sort data=dest1.found_agreements out=dest1.found_agreements_in_premds;
by key ;
run;
data listofdatasetstocleanup (keep= libname filename);
set dest1.found_agreements_in_premds;
run;
proc sort data=listofdatasetstocleanup nodupkey out=dest1.listofdatasetstocleanup;
by libname filename;
run;
%mend searchpremium;
%searchpremium;
Hello @alepage,
I would also prefer the format approach suggested by ballardw (where the definition 'PQ' = 'QC' in the VALUE statement should be sufficient), especially if you need the same on-the-fly replacement ('PQ' → 'QC') in several places of your programs or if there are more pairs of values.
Otherwise, yes, you can use a CASE expression as you have suggested.
Example:
data have1;
input prov $ x;
cards;
AB 1
ON 2
PQ 3
BC 4
;
data have2;
input province $ y;
cards;
MB 555
QC 333
BC 444
ON 222
;
%let varname1=prov;
proc sql;
create table want as
select b.province, a.x, b.y
from have1 a inner join have2 b
on case when a.&varname1='PQ' then 'QC' else a.&varname1 end = b.province;
quit;
Or use the IFC function:
on ifc(a.&varname1='PQ','QC',a.&varname1) = b.province;
If I understand correctly you have described one case where you have a variable that may take values of 'PQ' and 'QC' and you want them to match a value of 'QC' in another data set.
Do you have any other values of these variables that also must match? A more complete description might change a suggestion.
Personally instead of trying to get a CASE statement to work I would provide an operation on the value. If this is a small list of values that need to change mapping for the comparison then a custom format may provide the alternate value:
proc format; value $alt_province 'QC','PQ' = 'QC' ; run;
and the comparison would become
and put(a.&varname1., $alt_province.) = b.province and
Caveat: if you have values of the "province" variables that may be longer than two characters you may have to set a default length of the format to be long enough to support them as the code I show will tend to default to a length of 2 a could truncate other values. That would require (default=6) for example to handle up to 6 character length values after the name of the format $alt_province in the code. Do note that any other value of the variable other than those on the left side of the = are not affected by the format and default to their value.
Hello @alepage,
I would also prefer the format approach suggested by ballardw (where the definition 'PQ' = 'QC' in the VALUE statement should be sufficient), especially if you need the same on-the-fly replacement ('PQ' → 'QC') in several places of your programs or if there are more pairs of values.
Otherwise, yes, you can use a CASE expression as you have suggested.
Example:
data have1;
input prov $ x;
cards;
AB 1
ON 2
PQ 3
BC 4
;
data have2;
input province $ y;
cards;
MB 555
QC 333
BC 444
ON 222
;
%let varname1=prov;
proc sql;
create table want as
select b.province, a.x, b.y
from have1 a inner join have2 b
on case when a.&varname1='PQ' then 'QC' else a.&varname1 end = b.province;
quit;
Or use the IFC function:
on ifc(a.&varname1='PQ','QC',a.&varname1) = b.province;
You should add a line at the end of your first step to remove these nasty notes:
NOTE: Variable KEY is uninitialized. NOTE: Variable ORGNL_SAS_DS_ROW_NBR is uninitialized. NOTE: Variable agreement_nbr is uninitialized. NOTE: Variable cie is uninitialized. NOTE: Variable lob is uninitialized. NOTE: Variable year is uninitialized. NOTE: Variable month is uninitialized. NOTE: Variable libname is uninitialized. NOTE: Variable filename is uninitialized. NOTE: Variable company_cd is uninitialized. NOTE: Variable policy_expiry_date is uninitialized. NOTE: Variable AGREEMENT_NBR_DR is uninitialized. NOTE: Variable LINE_OF_BUSINESS is uninitialized. NOTE: Variable PROVINCE is uninitialized. NOTE: Variable LEGACY_UNDERWRITING_COMPANY is uninitialized. NOTE: Variable UNDERWRITING_COMPANY is uninitialized. NOTE: Variable DISTRIBUTOR_NUMBER is uninitialized. NOTE: Variable DISTRIBUTOR_NUMBER_SRC is uninitialized. NOTE: Variable SOURCE_SYSTEM is uninitialized. NOTE: Variable DATASOURCE_BACKEND_REFERENCE is uninitialized. NOTE: Variable LAST_TERM_EFFECTIVE_DATE is uninitialized. NOTE: Variable LAST_TERM_EXPIRY_DATE is uninitialized. NOTE: Variable CONTRACT_UUID is uninitialized.
like this:
...
call missing(of _all_);
stop;
run;
April 27 – 30 | Gaylord Texan | Grapevine, Texas
Walk in ready to learn. Walk out ready to deliver. This is the data and AI conference you can't afford to miss.
Register now and lock in 2025 pricing—just $495!
Check out this tutorial series to learn how to build your own steps in SAS Studio.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.