Good day
I want to merge the following datasets, what is the easiest way to end up having the final output:
Table 1:
Identity | GRADE | KEY |
1 | None | T1 |
2 | Mild | T1 |
3 | Moderate | T1 |
4 | Severe | T1 |
Table 2:
ID | GR | KEY |
1 | 0 | T2 |
2 | 1 | T2 |
3 | 2 | T2 |
4 | 3 | T2 |
Where GR 0=None 1=Mild 2=Moderate 3=Severe
Table 3:
UID | GRAD | KEY |
1 | 0(None) | T3 |
2 | 1(Mild) | T3 |
3 | 2(Moderate) | T3 |
4 | 3(Severe) | T3 |
Final Table:
Identity | GRADE | KEY |
1 | None | T1 |
2 | Mild | T1 |
3 | Moderate | T1 |
4 | Severe | T1 |
1 | None | T2 |
2 | Mild | T2 |
3 | Moderate | T2 |
4 | Severe | T2 |
1 | None | T3 |
2 | Mild | T3 |
3 | Moderate | T3 |
4 | Severe | T3 |
Please advise.
Thank you.
I would lean toward a DATA step:
data want;
set table_1
table_2 (rename=(id=identity gr=grade))
table_3 (rename=(uid=identity grad=grade));
if grade =: '0' then grade='None';
else if grade =: '1' then grade='Mild';
else if grade =: '2' then grade='Moderate';
else if grade =: '3' then grade='Severe';
run;
Being able to use =: makes the coding a little simpler.
Post test data in the form of a datastep. I can't tell from what is posted what the datatypes are, is GR in the second numeric? Also you say you want to merge these - but have not provided a join mechanism and from the text it looks like you want to set them together one under the other? If so:
proc sql; create table want as select identity, grade, key from table1 union all select id as identity, case when gr=0 then "None" when gr=1 then "Mild" when gr=2 then moderate else "Severe" end as grade, key from table2 union all select uid as identity, case when grad="0(None)" then "None" when grad=.... end as grade, /* Note I haven't type them all out, expand! */ key from table3; quit;
I wanted to set them together. Thank you for the response it will assist.
I created an example based on the datasets I am working on in a project, I haven't merged the datasets on the post per se they are used as an example based on what I have in the current project working on.
I used a datastep in my project and renamed the columns then used DO LOOPS and ARRAYS for formatting the values. Below is one of the statements I have in my datastep:
array Grad{5} $40 ('1' '2' '3' '4' '5');
array Tgrad{5} $40 ('GRADE 1 (MILD)' 'GRADE 2 (MODERATE)' 'GRADE 3 (SEVERE)' 'GRADE 4 (LIFE-THREATENING)' 'GRADE 5 (DEATH)');
array GR{5} $40 ('GRADE I (MILD)' 'GRADE II (MODERATE)' 'GRADE III (SEVERE)' 'GRADE IV (POTENTIALLY LIFE-THREATENING)' 'GRADE V (DEATH)');
If AESEV in ('1', '2','3','4','GRADE 1 (MILD)', 'GRADE 2 (MODERATE)' ,'GRADE 3 (SEVERE)' ,'GRADE 4 (LIFE-THREATENING)', 'GRADE 5 (DEATH)') then do;
do i=1 to 5;
*Creating numeric fields;
if AESEV = Grad[i] then AESEVC = GR[i];
ELSE if AESEV = Tgrad[i] then AESEVC = GR[i];
end;
IF AESEV in ('1', '2','3','4') THEN AESEVCN = input(AESEV,best12.);
End;
I would lean toward a DATA step:
data want;
set table_1
table_2 (rename=(id=identity gr=grade))
table_3 (rename=(uid=identity grad=grade));
if grade =: '0' then grade='None';
else if grade =: '1' then grade='Mild';
else if grade =: '2' then grade='Moderate';
else if grade =: '3' then grade='Severe';
run;
Being able to use =: makes the coding a little simpler.
Thank you for the response. Mind explaining the function of equal sign with ':' ?
The colon can follow any character comparison (not just an equal sign). It addresses what should happen when the two character strings being compared have different lengths. The normal action (without a colon) is to pad the shorter string with blanks so that the two strings have equal length before making the comparison. The colon reverses that: truncate the longer string down to the length of the shorter string before making the comparison.
Thanks so much for the response
If the purpose is to display a specific text value for a given numeric value then a common method in SAS is to create a custom display format.
Example:
proc format library=work; value react 0 = 'None' 1 = 'Mild' 2 = 'Moderate' 3 = 'Severe' ; run; data example; do x= 0 to 3; output; end; run; proc print data=example; format x react.; run;
Formats have a couple of advantages over merging data for lookups. First once the format is available it can be applied to multiple variables that have the same code meanings. See:
data example2; do x= 0 to 3; do y = 1 to 3; output; end; end; run; proc print data=example2; format x y react.; run;
Another example is that a change in format can create a different grouping for analysis. Suppose you want to get counts of None vs Some reaction (one interpretation of the mild/moderate/severe)
proc format library=work; value reactnone 0='None' other='Some' ; run; proc freq data=example2; tables x*y /nocol norow nopercent; format x y reactnone.; run;
Which completely avoids another merge step and/or adding new variables everytime you want look at different combinations of the values. Minor caution: other in the Proc format code will assing missing to 'Some', if you have missing data it is usually a good idea to consider a specific assignment with formats.
The information in your first two tables could be combined to create a data set that can be read by proc format with the cntlin option.
Thank you so much
It is good practice to mark the person who posted the correct answer as the correct answer, not your own response.
Good day
I see what you mean, I actually thought I mark on the responder. Can I edit this to mark correct one?
No probs, I see you got it. Just keeps the answer up with the question that way, and of course rewards the poster.
Thanks for highlighting, it helps, I am new in the community :). I rectified it.
SAS Innovate 2025 is scheduled for May 6-9 in Orlando, FL. Sign up to be first to learn about the agenda and registration!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.