SAS Data Integration Studio, DataFlux Data Management Studio, SAS/ACCESS, SAS Data Loader for Hadoop and others

"Merging" 2 tables by transferring values of a column of a table into multiple columns of the other table

Accepted Solution Solved
Reply
Super Contributor
Posts: 413
Accepted Solution

"Merging" 2 tables by transferring values of a column of a table into multiple columns of the other table

Hi,

I have one file with names and categories, the categories ranging from 1 to 4:

name category
a1
b2
c3
d4
e1
f2

and I want to merge it with another file that contains the names, but here there are 4 category columns each for every category:

namecategory1category2category3category4
a1000
b0100
c0010
d0001
e1000
f0100

like this in the second table I get dummy variables and then do a regression analysis. So each category for each name in the first file I put a 1 into the column that corresponds to that category and zero in the other categories, for the same name.

Please note that the number of categories can change and therefore the code should be flexible to accommodate this

Thank you


Accepted Solutions
Solution
‎11-30-2014 07:55 PM
Respected Advisor
Posts: 3,124

Re: "Merging" 2 tables by transferring values of a column of a table into multiple columns of the other table

Then a little bit Hash would be handy:

data have;

     input

           name $     category;

     cards;

a 1

a 3

b 2

c 3

c 4

d 4

e 1

f 2

a 2

;

/*otherwise, this could do*/

data _null_;

     if _n_=1 then

           do;

                dcl hash h(ordered:'a');

                h.definekey('name');

                h.definedata('name', 'category1', 'category2', 'category3', 'category4');

                h.definedone();

           end;

     set have end=last;

     array cat(4) category1-category4;

     do i=1 to dim(cat);

           cat(i)=0;

     end;

     rc=h.find();

     cat(category)=1;

     rc=h.replace();

     if last then

           h.output(dataset:'want');

run;


Haikuo

View solution in original post


All Replies
Respected Advisor
Posts: 3,124

Re: "Merging" 2 tables by transferring values of a column of a table into multiple columns of the other table

data have;

     input

           name $     category;

     cards;

a 1

b 2

c 3

d 4

e 1

f 2

;

/*as a simple starter, if no duplicates existing for name, try the following*/

data want;

     set have;

     array cat category1-category4;

     do over cat;

           if _i_=category then

                cat=1;

           else cat=0;

     end;

run;

Super Contributor
Posts: 413

Re: "Merging" 2 tables by transferring values of a column of a table into multiple columns of the other table

Hi Hai and thanks for the quick reply!!!

Actually, you mentioned a very important point that I overlooked in the description and that is that I do have name duplicates in the second file because the second file contains the board of directors for a given company for a period of 10 years and therefore in most cases many names will be present for all the years of the given company.

Also, some directors can be present for different companies in the same year.

Thank you

Solution
‎11-30-2014 07:55 PM
Respected Advisor
Posts: 3,124

Re: "Merging" 2 tables by transferring values of a column of a table into multiple columns of the other table

Then a little bit Hash would be handy:

data have;

     input

           name $     category;

     cards;

a 1

a 3

b 2

c 3

c 4

d 4

e 1

f 2

a 2

;

/*otherwise, this could do*/

data _null_;

     if _n_=1 then

           do;

                dcl hash h(ordered:'a');

                h.definekey('name');

                h.definedata('name', 'category1', 'category2', 'category3', 'category4');

                h.definedone();

           end;

     set have end=last;

     array cat(4) category1-category4;

     do i=1 to dim(cat);

           cat(i)=0;

     end;

     rc=h.find();

     cat(category)=1;

     rc=h.replace();

     if last then

           h.output(dataset:'want');

run;


Haikuo

Respected Advisor
Posts: 3,124

Re: "Merging" 2 tables by transferring values of a column of a table into multiple columns of the other table

Also, if your output table is too large to fit into your memory (where Hash table resides), then the following will do the same thing by adding some steps:

data have;

     input

           name $     category;

     cards;

a 1

a 3

b 2

c 3

c 4

d 4

e 1

f 2

a 2

;

data h2;

     set have;

     array cat(4) category1-category4;

     cat(category)=1;

run;

proc sort data=h2 out=h3;

     by name;

run;

data want;

     update h3(obs=0) h3;

     by name;

     drop category;

run;

Super Contributor
Posts: 413

Re: "Merging" 2 tables by transferring values of a column of a table into multiple columns of the other table

Hi Hai,

I ran the code of #3 but unfortunately got an error message:

31   data _null_;

32        if _n _=1 then

                -

                22

ERROR 22-322: Syntax error, expecting one of the following: !, !!, &, (, *, **, +, -, /, ;, <, <=, <>, =, >, ><, >=, AND, EQ,

              GE, GT, IN, LE, LT, MAX, MIN, NE, NG, NL, NOTIN, OR, [, ^=, {, |, ||, ~=.

33              do;

34                   dcl hash h(ordered:'a');

ERROR: DATA STEP Component Object failure.  Aborted during the COMPILATION phase.

Respected Advisor
Posts: 3,124

Re: "Merging" 2 tables by transferring values of a column of a table into multiple columns of the other table

For some reason, there is an additional blank inserted into the variable name as _n _, where it should be _n_:

if _n_=1 then

Super Contributor
Posts: 413

Re: "Merging" 2 tables by transferring values of a column of a table into multiple columns of the other table

Hi Hai,

actually I managed to do the merge based on the very first code that you put because I realized that the categories file from which I had to make the new 4 category columns didn't have any duplicated since I DISTINCTed it in the very beggining, so I guess the first code is the best answer.

Thanks a lot for helping me!!!

SAS Employee
Posts: 15

Re: "Merging" 2 tables by transferring values of a column of a table into multiple columns of the other table

proc sql;

create table want as

select name,

category=1 as category1,

category=2 as category2,

category=3 as category3,

category=4 as category4

from have;

quit;

☑ This topic is SOLVED.

Need further help from the community? Please ask a new question.

Discussion stats
  • 8 replies
  • 423 views
  • 0 likes
  • 3 in conversation