BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
alex_philby
Obsidian | Level 7

Hello,

 

It should be pretty simple but I can't find a way to do this quickly.

 

I have table 1 that looks like this :

 

 abc
plumber21a121a121a1
fireman47b147b247b3
policeman47b147b247b3

 

and table 2 that looks like this :

 

occupationlevel
plumbera
plumbera
firemanc
policemanb
policemana

 

And I would like to add a column to table 2, like this :

 

occupationlevelcode
plumbera21a1
plumbera21a1
firemanc47b3
policemanb47b2
policemana47b1

 

I was thinking of a complicated solution with a merge but I'm sure there is some more efficient solution. Also, in my table 1, I have way more columns than just a, b, c. So i'm looking for some function that can read columns names and put it in relation with the modalities of the "level" variable in table 2.

1 ACCEPTED SOLUTION

Accepted Solutions
Kurt_Bremser
Super User

First step: transpose table1 to a long format:

proc sort data=table1;
by occupation;
run;

proc transpose
  data=table1
  out=table1_trans (
    rename=(col1=code _name_=level)
    where=(level ne 'occupation')
  )
;
by occupation notsorted;
var _all_;
run;

From this, you can either

  • merge with table2 by occupation and level (sort table2 by occupation and level)
  • load a hash table in a straight data step (no sorting of table 2 needed)
  • create a format for the combined values of occupation and level, and apply this on a combined value of occupation !! level in table2 (also no sorting needed)

A hash solution will look like this:

data table1;
input occupation :$15. a $ b $ c $;
cards;
plumber 21a1 21a1 21a1
fireman 47b1 47b2 47b3
policeman 47b1 47b2 47b3
;
run;

data table2;
input occupation :$15. level $;
cards;
plumber a
plumber a
fireman c
policeman b
policeman a
;
run;

proc sort data=table1;
by occupation;
run;

proc transpose
  data=table1
  out=table1_trans (
    rename=(col1=code _name_=level)
    where=(level ne 'occupation')
  )
;
by occupation;
var _all_;
run;

data want;
set table2;
length code $4;
if _n_ = 1
then do;
  declare hash codes (dataset:'table1_trans');
  rc = codes.definekey('occupation','level');
  rc = codes.definedata('code');
  rc = codes.definedone();
end;
rc = codes.find();
drop rc;
run;

Note the datasteps for the example datasets. They make it much easier for the helpers to recreate data.

 

View solution in original post

5 REPLIES 5
PeterClemmensen
Tourmaline | Level 20

Do something like this

 

data table1;
length occupation $20;
input occupation$ a$ b$ c$;
datalines;
plumber 21a1 21a1 21a1
fireman 47b1 47b2 47b3
policeman 47b1 47b2 47b3
;

data table2;
length occupation $20;
input occupation $ level $;
datalines;
plumber a
plumber a
fireman c
policeman b
policeman a
;

data table1_long(keep=occupation level code);
   set table1;
   array levels{*} a b c;
   do i=1 to dim(levels);
      level=vname(levels[i]);
      code=levels[i];
      output;
   end;
run;

proc sql;
   create table want as
   select a.*
         ,b.code
   from table2 as a, table1_long as b
   where a.occupation=b.occupation and a.level=b.level;
quit;

  

RW9
Diamond | Level 26 RW9
Diamond | Level 26

Post test data as a datastep!  I am not typing in test data.  As such this is untested:

data want;
  merge a b;
  by occupation;
  array v{3} a b c;
  do i=1 to 3;
    if level=vname(v{i}) then code=v{i};
  end;
run;

Thats using an array.  A simpler, more effective solution overall and for all your data storage needs is to store data in normalised form, i.e. more rows, less columns.  So change table 1 to look like:
plumber a  21a1

plumber b 21a1

...

 

then it is simply merging by occupation level.  You can simply transpose table 1 to get this format.  E.g

data table1;
  input occupation $ a $ b $ c $;
datalines;
plumber   21a1   21a1   21a1
fireman   47b1   47b2   47b3
policeman   47b1   47b2   47b3
;
run;

proc sort data=table1;
  by occupation;
run;

proc transpose data=table1 out=table1;
  by occupation;
  var a b c;
run;
Kurt_Bremser
Super User

First step: transpose table1 to a long format:

proc sort data=table1;
by occupation;
run;

proc transpose
  data=table1
  out=table1_trans (
    rename=(col1=code _name_=level)
    where=(level ne 'occupation')
  )
;
by occupation notsorted;
var _all_;
run;

From this, you can either

  • merge with table2 by occupation and level (sort table2 by occupation and level)
  • load a hash table in a straight data step (no sorting of table 2 needed)
  • create a format for the combined values of occupation and level, and apply this on a combined value of occupation !! level in table2 (also no sorting needed)

A hash solution will look like this:

data table1;
input occupation :$15. a $ b $ c $;
cards;
plumber 21a1 21a1 21a1
fireman 47b1 47b2 47b3
policeman 47b1 47b2 47b3
;
run;

data table2;
input occupation :$15. level $;
cards;
plumber a
plumber a
fireman c
policeman b
policeman a
;
run;

proc sort data=table1;
by occupation;
run;

proc transpose
  data=table1
  out=table1_trans (
    rename=(col1=code _name_=level)
    where=(level ne 'occupation')
  )
;
by occupation;
var _all_;
run;

data want;
set table2;
length code $4;
if _n_ = 1
then do;
  declare hash codes (dataset:'table1_trans');
  rc = codes.definekey('occupation','level');
  rc = codes.definedata('code');
  rc = codes.definedone();
end;
rc = codes.find();
drop rc;
run;

Note the datasteps for the example datasets. They make it much easier for the helpers to recreate data.

 

alex_philby
Obsidian | Level 7

Thank you for the explanations, I think I get the idea. Your program great because I don't have to type all the names of the columns in table 1.

 

 

Kurt_Bremser
Super User

proc transpose is one of the most useful tools to beat data into easily usable form. It allows you to have code that needs no explicit variable names where those are undetermined.

hackathon24-white-horiz.png

The 2025 SAS Hackathon has begun!

It's finally time to hack! Remember to visit the SAS Hacker's Hub regularly for news and updates.

Latest Updates

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

SAS Training: Just a Click Away

 Ready to level-up your skills? Choose your own adventure.

Browse our catalog!

Discussion stats
  • 5 replies
  • 1898 views
  • 4 likes
  • 4 in conversation