BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
serena13lee
Quartz | Level 8

I am trying to join the following two data sets: 

data testA;
input categorical $3. value;
*order = _n_;
datalines;
Dog. 
M  7
F  5
Cat.
M  4
F  2
;
run;

data testA;
set testA;
order=_n_;
run;

data testB;
input categorical $2. value;
datalines;
Dog. 
F  3
Cat.
M  1
F  2
;
run;

proc sql;
create table final as
select a.*,b.* from testA a left join testB b on a.categorical=b.categorical
order by order;
quit;

My desire output is the following: 

data testA;
input categorical $ value value2;
datalines;
Dog . .
M 7 .
F 5 3
Cat . .
M 4 1
F 2 2
;
run;

 The problems I've encountered is that 1) the 'categorical' id is not sorted alphabetically and I do not want to change its order 2) Since there are two Ms and Fs I don't know how to join without renaming the M F so that it's unique 3) It could be an inner join since what may be in value may not be in value2

1 ACCEPTED SOLUTION

Accepted Solutions
PGStats
Opal | Level 21
/* Modify the way you read your data:  */
data testA;
retain cat1;
input cat2 $3. value;
if missing(value) then cat1 = cat2;
else output;
datalines;
Dog. 
M  7
F  5
Cat.
M  4
F  2
;

data testB;
retain cat1;
input cat2 $3. value;
if missing(value) then cat1 = cat2;
else output;
datalines;
Dog. 
F  3
Cat.
M  1
F  2
;

/* Then you can use:  */
proc sql;
create table final as
select 
    a.*,
    b.value as value2 
from 
    testA a left join 
    testB b on a.cat1=b.cat1 and a.cat2=b.cat2
order by cat1 desc, cat2 desc;
quit;
PG

View solution in original post

8 REPLIES 8
novinosrin
Tourmaline | Level 20

Must it be using proc sql? Coz any proc sql join would be product of keys in both tables and that means will do many to many based on key value and not on a position. The ideal is a merge. 

serena13lee
Quartz | Level 8
@novinosrin No it doesn't have to be proc sql! I just thought that would be a good method to start.
novinosrin
Tourmaline | Level 20
data testA;
input categorical $3. value;
order = _n_;
datalines;
Dog . 
M  7
F  5
Cat .
M   4
F   2
;
run;
data testB;
input categorical $3. value;
datalines;
Dog . 
F   3
Cat .
M   1
F   2
;
run;

proc sort data=testa out=_testa;
by categorical;
run;
proc sort data=testb out=_testb;
by categorical;
run;


data temp;
merge _testa _testb(rename=(value=value2));
by categorical;
run;

proc sort data=temp out=want(drop=order);
by order;
run;
novinosrin
Tourmaline | Level 20

Also look at modify statement, your datasets appear to be a good case for applying modify statement  as master dataset does not have to sorted.

If my understanding is right, modify statement with key categorical variable indexed look up would be a sound approach. 

serena13lee
Quartz | Level 8

Thanks so much for your reply! After running, everything looks except there is a 1 under value2 Dog (M). Can I ask how I can remove that in the case of a blank?

ShiroAmada
Lapis Lazuli | Level 10

Try this

data testA;
input categorical: $3. value: 8.;
*order = _n_;
datalines;
Dog . 
M  7
F  5
Cat .
M  4
F  2
;
run;


data testA;
set testA;
order=_n_;
run;

proc sort data=testa;
  by categorical order;
run;

data testa;
  set testa;
  by categorical;
  if first.categorical then 
  id=0;
  id+1;
run;

data testB;
input categorical: $3. value: 8.;
datalines;
Dog . 
F 3
Cat .
M  1
F  2
;
run;

data testb;
set testb;
order=_n_;
run;

proc sort data=testb;
  by categorical order;
run;

data testb;
  set testb;
  by categorical;
  if first.categorical then id=0;
  id+1;
run;

proc sql;
create table final as
select 
  coalescec(a.categorical,b.categorical) as categorical,
  a.value as valuea,
  b.value as valueb,
  a.order
  from 
 testA a left join testB b on a.categorical=b.categorical
 and a.id=b.id
order by a.order;
quit;

Double-check your desired output.  I think there are errors in it.

 

34reqrwe
Quartz | Level 8

Try restructuring your original datasets by adding a species(dog/cat) and gender (m/f) columns  . It will make your life easier .

PGStats
Opal | Level 21
/* Modify the way you read your data:  */
data testA;
retain cat1;
input cat2 $3. value;
if missing(value) then cat1 = cat2;
else output;
datalines;
Dog. 
M  7
F  5
Cat.
M  4
F  2
;

data testB;
retain cat1;
input cat2 $3. value;
if missing(value) then cat1 = cat2;
else output;
datalines;
Dog. 
F  3
Cat.
M  1
F  2
;

/* Then you can use:  */
proc sql;
create table final as
select 
    a.*,
    b.value as value2 
from 
    testA a left join 
    testB b on a.cat1=b.cat1 and a.cat2=b.cat2
order by cat1 desc, cat2 desc;
quit;
PG

Ready to join fellow brilliant minds for the SAS Hackathon?

Build your skills. Make connections. Enjoy creative freedom. Maybe change the world. Registration is now open through August 30th. Visit the SAS Hackathon homepage.

Register today!
How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 8 replies
  • 1428 views
  • 4 likes
  • 5 in conversation