Hello Everyone,
I have a strange situation in SAS that I’ve never come across before.
I have two data sets:
· 1. Data Dictionary (DATA A) – which defines the numbers or codes reference in DATA B
· 2. DATA B – the variable field(s) and the code(s) per each ID
DATA A:
NAME CODES CODEVALUES
Crayon 01 Blue
Crayon 02 Red
Crayon 03 Yellow
PayFine
Fruit 01 Apple
Fruit 02 Watermelon
ChargeTax 01 Yes
ChargeTax 02 No
………
DATA B:
ID Crayon PayFine Fruit ChargeTax ………
12345 01 Yes 02 01
67891 03 No 02 02
00001 02 01 01
Desired Results:
12345 67891 00001
Blue Yellow Red
Yes No
Watermelon Watermelon Apple
Yes No Yes
Things to notice:
The part that is confusing to me is the how to link the content of DATA A to the Variable headers of DATA B.
Notice how Crayon is defined in DATA A using NAME, CODES & CODEVALUES, then referenced as a column in DATA B calling on CODES to pull in the correct CODEVALUES.
Keep in mind DATA A & DATA B are very large tables and DATA B has 65 column headers.
Thanks for any assistance or guidance on this one.
:smileyconfused:
Well, linking a and b should be straight-forward:
proc sql;
create table WANT as
select B.*,
A.CODEVALUES
from DATAB B
full join DATAA A
on B.CRAYON=A.CODES;
quit;
This would then give you a dataset which look like data B, with all the observations from A merged in.
Then you would do a datastep to populate the information in a normalized format, however I would suggest that if you have many rows as you state then this is not a good idea as you will end up with lots and lots of columns and few rows.
Crayon PayFine Fruit ChargeTax
These are only 4 out of the 65 variables that need to be validated against DATA A to pull the CODE and CODEVALUES.
Once this logic is complete the same code is needed to run against another 40 tables with 65 to 80 variables.
Keep in mind my last blurb - there are 65 variables that need to be checked.
Keep in mind DATA A & DATA B are very large tables and DATA B has 65 column headers.
Look like format definitions to me. Use dataA to generate formats via PROC FORMAT with CNTLIN dataset.
Then attach the format to appropriate variable and PROC PRINT or any other proc will display the CODEVALUES for you .
DATA A;
length name $30 codes 8 codevalues $50 ;
input NAME CODES CODEVALUES ;
cards;
Crayon 01 Blue
Crayon 02 Red
Crayon 03 Yellow
PayFine . .
Fruit 01 Apple
Fruit 02 Watermelon
ChargeTax 01 Yes
ChargeTax 02 No
run;
proc format cntlin=A(rename=(name=fmtname codes=start codevalues=label) where=(start ne .)) ;
run;
proc sql noprint ;
select distinct catx(' ',name,cats(name,'.'))
into :fmtlist separated by ' '
from a
where codes ne .
;
quit;
DATA B;
input ID $ Crayon PayFine $ Fruit ChargeTax;
format &fmtlist ;
cards;
12345 01 Yes 02 01
67891 03 No 02 02
00001 02 . 01 01
run;
proc print data=b width=min;
run;
Hello Tom,
For HIPPA reasons I removed names, but you should understand the basics as to what i'm trying to accomplish.
DATA B;
In my situation was already created using a .SAS program supplied by vendor (60 seperate .SAS files that contain different column headers and data - below is one sample).
How do I use the above code to reference NAME, CODES and CODEVALUES found in DATA A?
The data sample above works great even though I had to change the code to look for CHAR instead of . when referencing the CODE field (field is CHAR/ has a mixture of numbers and text)
Sample load from vendor:
Look at PLANTYPE from both DATA A & DATA B and you'll see what my desired output is
48 MMP HMO
01 HMO
That's just one field for all of the fields they supplied that need to be mapped back to DATA A.
DATA B;
length
pbp_a_hnumber $5
pbp_a_plan_identifier $3
pbp_a_ben_cov $1
pbp_a_plan_type $2
orgtype $2
bid_id $13
pbp_b4a_maxenr_yn $1
pbp_b4a_maxenr_per $1
pbp_b4a_coins_yn $1
pbp_b4a_coins_wavdia_yn $1
pbp_b4a_coins_wavdia_dh $1
pbp_b4a_ded_yn $1
pbp_b4a_copay_yn $1
pbp_b4a_copay_wavdia_yn $1
pbp_b4a_copay_wavdia_mi_dh $1
pbp_b4a_ded_er_cstshrs_yn $1
pbp_b4a_ded_er_cstshrs $3
pbp_b4b_maxenr_yn $1
pbp_b4b_maxenr_type $1
pbp_b4b_maxenr_per $1
pbp_b4b_coins_yn $1
pbp_b4b_coins_wavdmc_yn $1
pbp_b4b_coins_wavd_mcdh $1
pbp_b4b_ded_yn $1
pbp_b4b_copay_yn $1
pbp_b4b_copay_wavdia_yn $1
pbp_b4b_copay_wavd_mcdh $1
pbp_b4c_bendesc_yn $1
pbp_b4c_bendesc_amo $1
pbp_b4c_wwc_maxplan_yn $1
pbp_b4c_wwc_maxplan_per $1
pbp_b4c_maxenr_yn $1
pbp_b4c_maxenr_per $1
pbp_b4c_coins_yn $1
pbp_b4c_coins_wavdww_yn $1
pbp_b4c_ded_yn $1
pbp_b4c_copay_yn $1
pbp_b4c_copay_wavdww_yn $1
;
infile '/windows/pbp_b4_emerg_urgent.txt' delimiter = '09'x
dsd lrecl = 30000 firstobs=2 missover;
input
pbp_a_hnumber
pbp_a_plan_identifier
segment_id
pbp_a_ben_cov
pbp_a_plan_type
orgtype
bid_id
version
pbp_b4a_maxenr_yn
pbp_b4a_maxenr_amt
pbp_b4a_maxenr_per
pbp_b4a_coins_yn
pbp_b4a_coins_pct_mc_min
pbp_b4a_coins_pct_mc_max
pbp_b4a_coins_wavdia_yn
pbp_b4a_coins_wavdia_dh
pbp_b4a_coins_wavd_dh
pbp_b4a_ded_yn
pbp_b4a_ded_amt
pbp_b4a_copay_yn
pbp_b4a_copay_amt_mc_min
pbp_b4a_copay_amt_mc_max
pbp_b4a_copay_wavdia_yn
pbp_b4a_copay_wavdia_mi_dh
pbp_b4a_copay_wavd_dh
pbp_b4a_ded_er_cstshrs_yn
pbp_b4a_ded_er_cstshrs
pbp_b4b_maxenr_yn
pbp_b4b_maxenr_type
pbp_b4b_maxenr_amt
pbp_b4b_maxenr_per
pbp_b4b_coins_yn
pbp_b4b_coins_pct_mc_min
pbp_b4b_coins_pct_mc_max
pbp_b4b_coins_wavdmc_yn
pbp_b4b_coins_wavd_mcdh
pbp_b4b_coins_wavdmc_dh
pbp_b4b_ded_yn
pbp_b4b_ded_amt
pbp_b4b_copay_yn
pbp_b4b_copay_amt_mc_min
pbp_b4b_copay_amt_mc_max
pbp_b4b_copay_wavdia_yn
pbp_b4b_copay_wavd_mcdh
pbp_b4b_copay_wavdmc_dh
pbp_b4c_bendesc_yn
pbp_b4c_bendesc_amo
pbp_b4c_wwc_maxplan_yn
pbp_b4c_wwc_maxplan_amt
pbp_b4c_wwc_maxplan_per
pbp_b4c_maxenr_yn
pbp_b4c_maxenr_amt
pbp_b4c_maxenr_per
pbp_b4c_coins_yn
pbp_b4c_coins_pct_ww
pbp_b4c_coins_wavdww_yn
pbp_b4c_ded_yn
pbp_b4c_ded_amt
pbp_b4c_copay_yn
pbp_b4c_copay_amt_ww
pbp_b4c_copay_wavdww_yn
;
label
pbp_a_hnumber = 'H Number'
pbp_a_plan_identifier = 'Plan ID'
segment_id = 'Segment ID'
pbp_a_ben_cov = 'Coverage Criteria'
pbp_a_plan_type = 'Plan Type'
orgtype = 'Organization Type'
bid_id = 'Bid ID'
version = 'Version Number'
pbp_b4a_maxenr_yn = 'Emerg Care Max Enr YN'
pbp_b4a_maxenr_amt = 'Emerg Care Max Enr Amt'
pbp_b4a_maxenr_per = 'Emerg Care Max Enr Per'
pbp_b4a_coins_yn = 'Emerg Care Coinsurance YN'
pbp_b4a_coins_pct_mc_min = 'Emerg Care MC Min Coins Pct'
pbp_b4a_coins_pct_mc_max = 'Emerg Care MC Max Coins Pct'
pbp_b4a_coins_wavdia_yn = 'Emerg Care MC Coins Waived YN'
pbp_b4a_coins_wavdia_dh = 'Emerg Care Coins Waived D/H'
pbp_b4a_coins_wavd_dh = 'Emerg Care Coins Waived D/H'
pbp_b4a_ded_yn = 'Emerg Care Deductible YN'
pbp_b4a_ded_amt = 'Emerg Care Deductible Amt'
pbp_b4a_copay_yn = 'Emerg Care Copayment YN'
pbp_b4a_copay_amt_mc_min = 'Emerg Care MC Min Copay Amt'
pbp_b4a_copay_amt_mc_max = 'Emerg Care MC Max Copay Amt'
pbp_b4a_copay_wavdia_yn = 'Emerg Care MC Copay Waived YN'
pbp_b4a_copay_wavdia_mi_dh = 'Emerg Care Copay Waived D/H'
pbp_b4a_copay_wavd_dh = 'Emerg Care Copay Waived D/H'
pbp_b4a_ded_er_cstshrs_yn = 'Emerg ER Cst Shr Deductible YN'
pbp_b4a_ded_er_cstshrs = 'Emerg ER Cost Shr Deductibles'
pbp_b4b_maxenr_yn = 'Urg Care Max Enr YN'
pbp_b4b_maxenr_type = 'Urg Care Max Enr Type'
pbp_b4b_maxenr_amt = 'Urg Care Max Enr Amt'
pbp_b4b_maxenr_per = 'Urg Care Max Enr Per'
pbp_b4b_coins_yn = 'Urg Care Coinsurance YN'
pbp_b4b_coins_pct_mc_min = 'Urg Care MC Min Coins Pct'
pbp_b4b_coins_pct_mc_max = 'Urg Care MC Max Coins Pct'
pbp_b4b_coins_wavdmc_yn = 'Urg Care MC Coins Waived YN'
pbp_b4b_coins_wavd_mcdh = 'Urg Care MC Coins Waived D/H'
pbp_b4b_coins_wavdmc_dh = 'Urg Care MC Coins Waived D/H'
pbp_b4b_ded_yn = 'Urg Care Deductible YN'
pbp_b4b_ded_amt = 'Urg Care Deductible Amt'
pbp_b4b_copay_yn = 'Urg Care Copayment YN'
pbp_b4b_copay_amt_mc_min = 'Urg Care MC Min Copay Amt'
pbp_b4b_copay_amt_mc_max = 'Urg Care MC Max Copay Amt'
pbp_b4b_copay_wavdia_yn = 'Urg Care Copay Waived (IA) YN'
pbp_b4b_copay_wavd_mcdh = 'Urg Care MC Copay Waived D/H'
pbp_b4b_copay_wavdmc_dh = 'Urg Care MC Copay Waived D/H'
pbp_b4c_bendesc_yn = 'Worldwide Cov Desc YN'
pbp_b4c_bendesc_amo = 'Worldwide Cov Desc AMO'
pbp_b4c_wwc_maxplan_yn = 'Worldwide Cov MaxPlan YN'
pbp_b4c_wwc_maxplan_amt = 'Worldwide Cov MaxPlan Amt'
pbp_b4c_wwc_maxplan_per = 'Worldwide Cov MaxPlan Per'
pbp_b4c_maxenr_yn = 'Emerg Care Max Enr YN'
pbp_b4c_maxenr_amt = 'Emerg Care Max Enr Amt'
pbp_b4c_maxenr_per = 'Emerg Care Max Enr Per'
pbp_b4c_coins_yn = 'Emerg Care Coinsurance YN'
pbp_b4c_coins_pct_ww = 'Emerg Care WW Coins Pct'
pbp_b4c_coins_wavdww_yn = 'WW Coins Waived YN'
pbp_b4c_ded_yn = 'Emerg Care Deductible YN'
pbp_b4c_ded_amt = 'Emerg Care Deductible Amt'
pbp_b4c_copay_yn = 'Emerg Care Copayment YN'
pbp_b4c_copay_amt_ww = 'Emerg Care WW Copay Amt'
pbp_b4c_copay_wavdww_yn = 'WW Copay Waived YN'
;
run;
SAMPLE of A:
NAME | CODES | CODEVALUES |
bid_id | ||
pbp_a_ben_cov | 1 | Part A and Part B |
pbp_a_ben_cov | 2 | Part B Only |
pbp_a_plan_type | 1 | HMO |
pbp_a_plan_type | 18 | 1876 Cost |
pbp_a_plan_type | 19 | HCPP - 1833 Cost |
pbp_a_plan_type | 2 | HMOPOS |
pbp_a_plan_type | 20 | National Pace |
pbp_a_plan_type | 29 | Medicare Prescription Drug Plan |
pbp_a_plan_type | 30 | Employer/Union Only Direct Contract PDP |
pbp_a_plan_type | 31 | Regional PPO |
pbp_a_plan_type | 32 | Fallback |
pbp_a_plan_type | 4 | Local PPO |
pbp_a_plan_type | 40 | Employer/Union Only Direct Contract PFFS |
pbp_a_plan_type | 42 | RFB HMO |
pbp_a_plan_type | 43 | RFB HMOPOS |
pbp_a_plan_type | 44 | RFB Local PPO |
pbp_a_plan_type | 45 | RFB PSO (State License) |
pbp_a_plan_type | 47 | Employer Direct PPO |
pbp_a_plan_type | 48 | MMP HMO |
pbp_a_plan_type | 49 | MMP HMOPOS |
pbp_a_plan_type | 5 | PSO (State License) |
pbp_a_plan_type | 7 | MSA |
pbp_a_plan_type | 8 | RFB PFFS |
pbp_a_plan_type | 9 | PFFS |
pbp_b4a_coins_pct_mc_max | ||
pbp_b4a_coins_pct_mc_min | ||
pbp_b4a_coins_wavd_dh | ||
pbp_b4a_coins_wavdia_dh | 1 | Days |
pbp_b4a_coins_wavdia_dh | 2 | Hours |
pbp_b4a_coins_wavdia_yn | 1 | Yes |
pbp_b4a_coins_wavdia_yn | 2 | No |
pbp_b4a_coins_yn | 1 | Yes |
pbp_b4a_coins_yn | 2 | No |
SAMPLE of B:
pbp_a_hnumber | pbp_a_plan_identifier | pbp_a_ben_cov | pbp_a_plan_type | orgtype | bid_id | pbp_b4a_maxenr_yn | pbp_b4a_maxenr_per | pbp_b4a_coins_yn | pbp_b4a_coins_wavdia_yn | pbp_b4a_coins_wavdia_dh | pbp_b4a_ded_yn | pbp_b4a_copay_yn | pbp_b4a_copay_wavdia_yn | pbp_b4a_copay_wavdia_mi_dh | pbp_b4a_ded_er_cstshrs_yn | pbp_b4a_ded_er_cstshrs | pbp_b4b_maxenr_yn | pbp_b4b_maxenr_type | pbp_b4b_maxenr_per | pbp_b4b_coins_yn | pbp_b4b_coins_wavdmc_yn | pbp_b4b_coins_wavd_mcdh | pbp_b4b_ded_yn | pbp_b4b_copay_yn | pbp_b4b_copay_wavdia_yn | pbp_b4b_copay_wavd_mcdh | pbp_b4c_bendesc_yn | pbp_b4c_bendesc_amo | pbp_b4c_wwc_maxplan_yn | pbp_b4c_wwc_maxplan_per | pbp_b4c_maxenr_yn | pbp_b4c_maxenr_per | pbp_b4c_coins_yn | pbp_b4c_coins_wavdww_yn | pbp_b4c_ded_yn | pbp_b4c_copay_yn | pbp_b4c_copay_wavdww_yn | segment_id | version | pbp_b4a_maxenr_amt | pbp_b4a_coins_pct_mc_min | pbp_b4a_coins_pct_mc_max | pbp_b4a_coins_wavd_dh | pbp_b4a_ded_amt | pbp_b4a_copay_amt_mc_min | pbp_b4a_copay_amt_mc_max | pbp_b4a_copay_wavd_dh | pbp_b4b_maxenr_amt | pbp_b4b_coins_pct_mc_min | pbp_b4b_coins_pct_mc_max | pbp_b4b_coins_wavdmc_dh | pbp_b4b_ded_amt | pbp_b4b_copay_amt_mc_min | pbp_b4b_copay_amt_mc_max | pbp_b4b_copay_wavdmc_dh | pbp_b4c_wwc_maxplan_amt | pbp_b4c_maxenr_amt | pbp_b4c_coins_pct_ww | pbp_b4c_ded_amt | pbp_b4c_copay_amt_ww |
555 | 001 | 1 | 48 | 05 | 555 | 2 | 2 | 2 | 2 | 2 | 2 | 2 | 2 | 2 | 0 | 4 | ||||||||||||||||||||||||||||||||||||||||||||
555 | 001 | 1 | 01 | 01 | 555 | 2 | 2 | 2 | 1 | 1 | 2 | 2 | 2 | 2 | 2 | 1 | 2 | 1 | 2 | 2 | 2 | 2 | 2 | 1 | 1 | 0 | 5 | 65 | 65 | 24 | 10 | 35 | 65 | |||||||||||||||||||||||||||
555 | 002 | 1 | 01 | 01 | 555 | 2 | 2 | 2 | 1 | 1 | 2 | 2 | 2 | 2 | 2 | 1 | 2 | 1 | 2 | 2 | 2 | 2 | 2 | 1 | 1 | 0 | 6 | 65 | 65 | 24 | 10 | 35 | 65 | |||||||||||||||||||||||||||
888 | 003 | 1 | 01 | 01 | 888 | 2 | 2 | 2 | 1 | 1 | 2 | 2 | 2 | 2 | 2 | 1 | 2 | 1 | 2 | 2 | 2 | 2 | 2 | 1 | 1 | 0 | 4 | 65 | 65 | 24 | 15 | 50 | 65 | |||||||||||||||||||||||||||
999 | 004 | 1 | 01 | 01 | 999 | 2 | 2 | 2 | 1 | 1 | 2 | 2 | 2 | 2 | 2 | 1 | 2 | 1 | 2 | 2 | 2 | 2 | 2 | 1 | 1 | 0 | 4 | 65 | 65 | 24 | 5 | 15 | 65 | |||||||||||||||||||||||||||
1010 | 005 | 1 | 01 | 01 | 1010 | 2 | 2 | 2 | 1 | 1 | 2 | 2 | 2 | 2 | 2 | 1 | 2 | 1 | 2 | 2 | 2 | 2 | 2 | 1 | 1 | 0 | 5 | 65 | 65 | 24 | 20 | 50 | 65 | |||||||||||||||||||||||||||
666 | 801 | 1 | 01 | 01 | 666 | 2 | 1 | 1 | 1 | 2 | 2 | 2 | 2 | 1 | 1 | 1 | 2 | 2 | 2 | 2 | 2 | 1 | 2 | 2 | 0 | 3 | 20 | 20 | 3 | 20 | 20 | 3 | ||||||||||||||||||||||||||||
666 | 802 | 1 | 01 | 01 | 666 | 2 | 1 | 1 | 1 | 2 | 2 | 2 | 2 | 1 | 1 | 1 | 2 | 2 | 2 | 2 | 2 | 1 | 2 | 2 | 0 | 3 | 20 | 20 | 3 | 20 | 20 | 3 | ||||||||||||||||||||||||||||
666 | 803 | 1 | 01 | 01 | 666 | 2 | 1 | 1 | 1 | 2 | 2 | 2 | 2 | 1 | 1 | 1 | 2 | 2 | 2 | 2 | 2 | 1 | 2 | 2 | 0 | 2 | 20 | 20 | 3 | 20 | 20 | 3 |
If you just want to attach the formats then the code doesn't really change much.
If you want to actually replace the code with the decodes then you will need new variables (even if the old was already character because of changes in length)
Basic syntax for one variable could be :
data new_b;
set b ;
new_pbp_a_plan_type = put(pbp_a_plan_type,$pbp_a_plan_type.);
drop pbp_a_plan_type ;
rename new_pbp_a_plan_type = pbp_a_plan_type;
run;
Now you generate those three lines of code from the list of variables with codelists.
Or you could create INFORMATS from the data in A and use those while reading the raw files and convert the code to the decodes on the way into the SAS dataset.
data a;
length fmtname $31 type $1 start $10 label $100. ;
infile cards truncover ;
input fmtname start label $100. ;
type='J';
if start ne ' ';
cards;
pbp_a_ben_cov 1 Part A and Part B
pbp_a_ben_cov 2 Part B Only
pbp_a_plan_type 1 HMO
pbp_a_plan_type 18 1876 Cost
pbp_a_plan_type 19 HCPP - 1833 Cost
pbp_a_plan_type 2 HMOPOS
pbp_a_plan_type 20 National Pace
pbp_a_plan_type 29 Medicare Prescription Drug Plan
pbp_a_plan_type 30 Employer/Union Only Direct Contract PDP
pbp_a_plan_type 31 Regional PPO
pbp_a_plan_type 32 Fallback
pbp_a_plan_type 4 Local PPO
pbp_a_plan_type 40 Employer/Union Only Direct Contract PFFS
pbp_a_plan_type 42 RFB HMO
pbp_a_plan_type 43 RFB HMOPOS
pbp_a_plan_type 44 RFB Local PPO
pbp_a_plan_type 45 RFB PSO (State License)
pbp_a_plan_type 47 Employer Direct PPO
pbp_a_plan_type 48 MMP HMO
pbp_a_plan_type 49 MMP HMOPOS
pbp_a_plan_type 5 PSO (State License)
pbp_a_plan_type 7 MSA
pbp_a_plan_type 8 RFB PFFS
pbp_a_plan_type 9 PFFS
pbp_b4a_coins_pct_mc_max . .
run;
proc format cntlin=a noprint ;
run;
proc sql noprint ;
select distinct catx(' ',fmtname,cats('$',fmtname,'.'))
into :infmtlist separated by ' '
from a
;
quit;
data b;
length id $5 pbp_a_ben_cov $17 pbp_a_plan_type $50 ;
input id $ pbp_a_ben_cov pbp_a_plan_type pbp_b4a_coins_pct_mc_max ;
informat &infmtlist ;
cards;
00001 1 1 50
00002 2 44 75
run;
proc print;
run;
The problem I'm having is DATA A set a is aleady defined and changes based on a selection made by another analyst who decides how many rows need to be included in the report.
This is done through an Excel grid that is imported in and filtered on one field. Right now there's only 86 fields to include in the report via DATA A, but once she is done defining the fileds that may change to 1,000.
Looking at the above examples means I would need to rebuild the program everytime she makes a change instead of referencing the data set created after the Excel file is imported.
Keep in mind - I am new to SAS...
Not sure I understand your concern. The program above is one that read in-line data to create an example of the "A" dataset. In your situation you would create that initially from the Excel file.
Excel is a poor data entry tool as it is designed to allow any type of value in any cell, but a database requires a single type of value in a single variable (column). So you might have some trouble getting the Excel file imported properly. Once you have it as a dataset you can use it to define INFORMATs or FORMATs using the example above.
Now in terms of automating the rest of the processing you might need to resort to some type of code generation.
One step that perhaps you should do is to take the list of values NAME column of the A spreadsheet and compare it to the list of variables in the B dataset.
For example:
proc contents data=B noprint out=contents ; run;
proc sql noprint ;
create table varlist as
select distinct c.name , c.type, case when (A.codes is null) then 0 else 1 end as formatted
from contents c left join A
on upcase(c.name) = upcase(a.name)
order by 1
;
quit;
Is the "B" data set
Hash Table :
DATA A; length name $ 30 codes $ 10 codevalues $50 ; input NAME CODES CODEVALUES ; cards; Crayon 01 Blue Crayon 02 Red Crayon 03 Yellow PayFine . . Fruit 01 Apple Fruit 02 Watermelon ChargeTax 01 Yes ChargeTax 02 No ; run; DATA B; input (ID Crayon PayFine Fruit ChargeTax ) ( : $40.); cards; 12345 01 Yes 02 01 67891 03 No 02 02 00001 02 . 01 01 ; run; data want; if _n_ eq 1 then do; if 0 then set a; declare hash h(dataset:'a'); h.definekey('name','codes'); h.definedata('codevalues'); h.definedone(); end; set b; array x{*} $ Crayon--ChargeTax ; do i=1 to dim(x); name=vname(x{i});codes=x{i}; rc=h.find(); if rc=0 then x{i}=coalescec(codevalues,x{i}); end; drop NAME CODES CODEVALUES rc i; run;
Xia Keshan
Dsrountree I agree with Tom but as you are new to SAS the concepts of formats is a hurdle.
- In the concetps manual it is not made clear. SAS(R) 9.4 Language Reference: Concepts, Second Edition (variable attributes)
- in the formats manual you are just having a list SAS(R) 9.4 Formats and Informats: Reference
- the "proc format" is showing how to define those Base SAS(R) 9.4 Procedures Guide, Second Edition using a dataset table as input is just one of the methods. SAS options are to make formats accessible for usage.
Go for some proceeding papers. I think this one does some explanation: http://www2.sas.com/proceedings/forum2008/174-2008.pdf
The idea:
Build a format librarie out of table A or a collection of A-child tables using sas sources (phase definition)
This can be repeated as long new updates are coming in. There are 3 objects - 1/ table input 2/ sas-code 3/ formats in catalogs
Only the members of type 3/ will be use at execution time processing all tables of B-type.
Prcessing tables of B-type you can find:
- wrong type of values or a new one within those tables. Just using that related format
- reusing a format when a same type of variable is used more often.
The format is reacting a dynamic table lookup having the allowed values being defined without actually changing any data.
It is all a presentation view to data.
Thanks Everyone - The problem is the input file that determines DATA A was provided to me by CMS (Government) so I can't change the input source.
This file (DATA A) is currently being validated by an internal analyst looking to validate which variables to include in the report.
I like both approaches, but it was easier to program the array since I have a list that will keep changing until she is done updating the template.
Thanks for the input Tom, Jaap and Ksharp.
Wait until I ask the next question - I'll post in a little while. "How to Flip/Transpose" my final data set
ID | Crayon | PayFine | Fruit | ChargeTax |
12345 | Blue | Yes | Watermelon | Yes |
67891 | Yellow | No | Watermelon | No |
00001 | Red | Apple | Yes |
ID | ID | ID |
12345 | 67891 | 00001 |
Crayon | Crayon | Crayon |
Blue | Yellow | Red |
PayFine | PayFine | PayFine |
Yes | No | |
Fruit | Fruit | Fruit |
Watermelon | Watermelon | Apple |
ChargeTax | ChargeTax | ChargeTax |
Yes | No | Yes |
Dsrountree I was already thinking on your directions that is why the formats concept can be that important.
It is the same setting of standards not being able to change by users afterwards. Should be something SAS institute is promoting./ knowing / aware off..
Are you ready for the spotlight? We're accepting content ideas for SAS Innovate 2025 to be held May 6-9 in Orlando, FL. The call is open until September 25. Read more here about why you should contribute and what is in it for you!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.