DATA Step, Macro, Functions and more

Merging 3 datasets with different data formats

Accepted Solution Solved
Reply
Occasional Contributor
Posts: 8
Accepted Solution

Merging 3 datasets with different data formats

Good day

 

I want to merge the following datasets, what is the easiest way to end up having the final output:

 

Table 1:

Identity

GRADE

KEY

1

None

T1

2

Mild

T1

3

Moderate

T1

4

Severe

T1

 

Table 2:

ID

GR

KEY

1

0

T2

2

1

T2

3

2

T2

4

3

T2

Where GR 0=None 1=Mild 2=Moderate 3=Severe

 

 

Table 3:

UID

GRAD

KEY

 1

 0(None)

 T3

 2

 1(Mild)

 T3

 3

 2(Moderate)

 T3

 4

 3(Severe)

 T3

 

Final Table:

Identity

GRADE

KEY

1

None

T1

2

Mild

T1

3

Moderate

T1

4

Severe

T1

1

None

T2

2

Mild

T2

3

Moderate

T2

4

Severe

T2

1

None

T3

2

Mild

T3

3

Moderate

T3

4

Severe

T3

 

 

Please advise.

Thank you.


Accepted Solutions
Solution
‎04-12-2018 10:47 AM
Super User
Posts: 6,780

Re: Merging 3 datasets with different data formats

Posted in reply to Dinkepile

I would lean toward a DATA step:

 

data want;

set table_1

   table_2 (rename=(id=identity gr=grade))

   table_3 (rename=(uid=identity grad=grade));

if grade =: '0' then grade='None';

else if grade =: '1' then grade='Mild';

else if grade =: '2' then grade='Moderate';

else if grade =: '3' then grade='Severe';

run;

 

Being able to use =: makes the coding a little simpler.

View solution in original post


All Replies
Super User
Super User
Posts: 9,599

Re: Merging 3 datasets with different data formats

Posted in reply to Dinkepile

Post test data in the form of a datastep.  I can't tell from what is posted what the datatypes are, is GR in the second numeric?  Also you say you want to merge these - but have not provided a join mechanism and from the text it looks like you want to set them together one under the other?  If so:

proc sql;
  create table want as
  select identity,
         grade,
         key
  from   table1
  union all
  select id as identity,
         case when gr=0 then "None"
              when gr=1 then "Mild"
              when gr=2 then moderate
              else "Severe" end as grade,
         key
  from   table2
  union all
  select uid as identity,
         case when grad="0(None)" then "None"
              when grad=....  end as grade,   /* Note I haven't type them all out, expand! */ 
         key
  from   table3;
quit;
Occasional Contributor
Posts: 8

Re: Merging 3 datasets with different data formats

I wanted to set them together. Thank you for the response it will assist.

Occasional Contributor
Posts: 8

Re: Merging 3 datasets with different data formats

I created an example based on the datasets I am working on in a project, I haven't merged the datasets on the post per se they are used as an example based on what I have in the current project working on.

I used a datastep in my project and renamed the columns then used DO LOOPS and ARRAYS for formatting the values. Below is one of the statements I have in my datastep:

 

array Grad{5} $40 ('1' '2' '3' '4' '5');
  array Tgrad{5} $40 ('GRADE 1 (MILD)' 'GRADE 2 (MODERATE)' 'GRADE 3 (SEVERE)' 'GRADE 4 (LIFE-THREATENING)' 'GRADE 5 (DEATH)');
  array GR{5} $40 ('GRADE I (MILD)' 'GRADE II (MODERATE)' 'GRADE III (SEVERE)' 'GRADE IV (POTENTIALLY LIFE-THREATENING)' 'GRADE V (DEATH)');

  If AESEV in ('1', '2','3','4','GRADE 1 (MILD)', 'GRADE 2 (MODERATE)' ,'GRADE 3 (SEVERE)' ,'GRADE 4 (LIFE-THREATENING)', 'GRADE 5 (DEATH)') then do;
        do i=1 to 5;
      *Creating numeric fields;
         if AESEV = Grad[i] then AESEVC = GR[i];
    ELSE if AESEV = Tgrad[i] then AESEVC = GR[i];
       end;
  IF AESEV in ('1', '2','3','4') THEN AESEVCN = input(AESEV,best12.);
  End;

 

Solution
‎04-12-2018 10:47 AM
Super User
Posts: 6,780

Re: Merging 3 datasets with different data formats

Posted in reply to Dinkepile

I would lean toward a DATA step:

 

data want;

set table_1

   table_2 (rename=(id=identity gr=grade))

   table_3 (rename=(uid=identity grad=grade));

if grade =: '0' then grade='None';

else if grade =: '1' then grade='Mild';

else if grade =: '2' then grade='Moderate';

else if grade =: '3' then grade='Severe';

run;

 

Being able to use =: makes the coding a little simpler.

Occasional Contributor
Posts: 8

Re: Merging 3 datasets with different data formats

Posted in reply to Astounding

Thank you for the response. Mind explaining the function of equal sign with ':' ?

Super User
Posts: 6,780

Re: Merging 3 datasets with different data formats

Posted in reply to Dinkepile

The colon can follow any character comparison (not just an equal sign).  It addresses what should happen when the two character strings being compared have different lengths.  The normal action (without a colon) is to pad the shorter string with blanks so that the two strings have equal length before making the comparison.  The colon reverses that:  truncate the longer string down to the length of the shorter string before making the comparison.

Occasional Contributor
Posts: 8

Re: Merging 3 datasets with different data formats

Posted in reply to Astounding

Thanks so much for the response

Super User
Posts: 13,561

Re: Merging 3 datasets with different data formats

[ Edited ]
Posted in reply to Dinkepile

If the purpose is to display a specific text value for a given numeric value then a common method in SAS is to create a custom display format.

Example:

proc format library=work;
value react
0 = 'None'
1 = 'Mild'
2 = 'Moderate'
3 = 'Severe'
;
run;

data example;
   do x= 0 to 3;
   output;
   end;
run;
proc print data=example;
format x react.;
run;

Formats have a couple of advantages over merging data for lookups. First once the format is available it can be applied to multiple variables that have the same code meanings. See:

 

data example2;
   do x= 0 to 3;
      do y = 1 to 3;
         output;
      end;
   end;
run;
proc print data=example2;
format x y react.;
run;

Another example is that a change in format can create a different grouping for analysis. Suppose you want to get counts of None vs Some reaction (one interpretation of the mild/moderate/severe)

 

proc format library=work;
value reactnone
0='None'
other='Some'
;
run;
proc freq data=example2;
   tables x*y /nocol norow nopercent;
   format x y reactnone.;
run;

Which completely avoids another merge step and/or adding new variables everytime you want look at different combinations of the values. Minor caution: other in the Proc format code will assing missing to 'Some', if you have missing data it is usually a good idea to consider a specific assignment with formats.

 

The information in your first two tables could be combined to create a data set that can be read by proc format with the cntlin option.

Occasional Contributor
Posts: 8

Re: Merging 3 datasets with different data formats

Thank you so much

Super User
Super User
Posts: 9,599

Re: Merging 3 datasets with different data formats

Posted in reply to Dinkepile

It is good practice to mark the person who posted the correct answer as the correct answer, not your own response.

Occasional Contributor
Posts: 8

Re: Merging 3 datasets with different data formats

Good day

 

I see what you mean, I actually thought I mark on the responder. Can I edit this to mark correct one?

Super User
Super User
Posts: 9,599

Re: Merging 3 datasets with different data formats

Posted in reply to Dinkepile

No probs, I see you got it.  Just keeps the answer up with the question that way, and of course rewards the poster.

Occasional Contributor
Posts: 8

Re: Merging 3 datasets with different data formats

 Thanks for highlighting, it helps, I am new in the community Smiley Happy. I rectified it.

☑ This topic is solved.

Need further help from the community? Please ask a new question.

Discussion stats
  • 13 replies
  • 157 views
  • 4 likes
  • 4 in conversation