BookmarkSubscribeRSS Feed
Ronein
Meteorite | Level 14

Hello

What is the way to create data set want from data set have via proc transpose(or other way)?

data have;
input tbl $ field $;
cards;
tbl1 x
tbl1 y
tbl1 z
tbl2 w
tbl2 x
tbl2 v
tbl2 y
;
run;

data want;
input tbl1 $ tbl2 $ ;
cards;
x w
y x
z v
. y
;
Run;
12 REPLIES 12
Ksharp
Super User
data have;
input tbl $ field $;
cards;
tbl1 x
tbl1 y
tbl1 z
tbl2 w
tbl2 x
tbl2 v
tbl2 y
;
run;
proc sort data=have;
by tbl;
run;
proc freq data=have order=data noprint;
by tbl;
table field/out=temp;
run;
proc transpose data=temp out=temp2;
by tbl;
var field;
run;
proc transpose data=temp2 out=want(drop=_:);
var col:;
id tbl;
run;
ballardw
Super User

Are your real variable values for tbl sequentially numbered with common name? If not do you want the actual text of the "tbl" to be the name of the resulting variable? And if that is the case have you verified than all of the values of "tbl" are valid variable names prior to this step?

 

If so this may be closer to what you want: (Basically the same as @Ksharp but without a step that forces the numeric naming).

This assumes that the values of "field" are grouped by the actual "tbl", as in the likely output from requesting the names of all the tables and variables from a library from the dictionary tables.  Note that depending on the setting of VALIDMEMNAME and VALIDVARNAME when that library was created and this code is submitted you might have table names that are not acceptable as variable names.

proc transpose data=have out=trans1
   ;
  by tbl notsorted;
  var field;
run;

proc transpose data=trans1 out=want (drop=_name_)
     ;
  by _name_;
  var col: ;
  id tbl;
run;
Tom
Super User Tom
Super User

I would skip the transpose steps and just do my own code generation.

filename code temp;
data _null_;
  set have end=eof;
  by tbl;
  file code;
  if _n_=1 then put 'merge';
  if first.tbl then put
   '  have(rename=(field=' tbl ') where=(' tbl =:$quote. '))'
  ;
  if eof then put ';' ;
run;

data want;
%include code / source2;
  drop tbl ;
run;

Here is the step that is generated for this example:

13   data want;
14   %include code / source2;
NOTE: %INCLUDE (level 1) file CODE is (system-specific pathname).
15  +merge
16  +  have(rename=(field=tbl1 ) where=(tbl="tbl1" ))
17  +  have(rename=(field=tbl2 ) where=(tbl="tbl2" ))
18  +;
NOTE: %INCLUDE (level 1) ending.
19     drop tbl ;
20   run;

NOTE: The data set WORK.WANT has 4 observations and 2 variables.
NOTE: DATA statement used (Total process time):
      real time           0.02 seconds
      cpu time            0.00 seconds
Tom
Super User Tom
Super User

If you wanted to use PROC TRANSPOSE you would need to add the missing row identifier that uniquely identifies how to pair the values for field for TBL1 with the values of field for TBL2.  And then reorder the data by this row identifier.

data step1;
  do row=1 by 1 until(last.tbl);
    set have;
    by tbl;
    output;
  end;
run;

proc sort data=step1;
  by row tbl ;
run;

Result

OBS    row    tbl     field

 1      1     tbl1      x
 2      1     tbl2      w
 3      2     tbl1      y
 4      2     tbl2      x
 5      3     tbl1      z
 6      3     tbl2      v
 7      4     tbl2      y

Now you can transpose it.

proc transpose data=step1 out=want(drop=_name_);
  by row;
  id tbl;
  var field;
run;

To get

OBS    row    tbl1    tbl2

 1      1      x       w
 2      2      y       x
 3      3      z       v
 4      4              y
Ksharp
Super User

Tom,

No  need to make a SEQ variable, proc transpose would take care of it.

 

data have;
input tbl $ field $;
cards;
tbl1 x
tbl1 y
tbl1 z
tbl2 w
tbl2 x
tbl2 v
tbl2 y
;
run;

proc transpose data=have out=temp2;
by tbl;
var field;
run;
proc transpose data=temp2 out=want(drop=_:);
var col:;
id tbl;
run;

Ksharp_0-1747617596624.png

 

Tom
Super User Tom
Super User

Does it work when there are more than 10? 

Does the fact that the COL variables will be named COL1 ... COL9 COL10 COL11 (and so will sort into COL1 COL10 COL11 COL2 COL3 ... ) matter?

 

 

Ksharp
Super User

Tom,

I don't understand what you mean.

COLn has nothing to do with sorting 'COL1', 'COL2', ......

Here my last proc transpose is unlike ballardw 's code . I did not use "BY _NAME_;".

Check this:

data have;
input tbl $ field $;
cards;
tbl1 c
tbl1 b
tbl1 b
tbl1 a
tbl1 a
tbl1 a
tbl2 w
tbl2 x
tbl2 d
tbl2 a
tbl2 w
tbl2 x
tbl2 d
tbl2 a
tbl2 w
tbl2 x
tbl2 d
tbl2 a
tbl2 w
tbl2 x
tbl2 d
tbl2 a
;
run;

proc transpose data=have out=temp2;
by tbl;
var field;
run;
proc transpose data=temp2 out=want(drop=_:);
var col:;
id tbl;
run;
Kurt_Bremser
Super User

This is for the data you posted:

data want;
length tbl1 tbl2 $8;
call missing(of tbl:);
merge
  have (where=(tbl="tbl1") rename=(field=tbl1))
  have (where=(tbl="tbl2") rename=(field=tbl2))
;
drop tbl;
run;

For more values of tbl, run a preliminary step to get the list of values, and use macro %DO loops to create the code dynamically:

%macro want;
%local tables tbl i;

proc sql noprint;
select distinct tbl into :tables separated by " "
from have;
quit;
data want;
length
&tables
$8
;
call missing(of tbl:);
merge
%do i = 1 %to %sysfunc(countw(&tables));
  %let tbl = %scan(&tables,&i);
  have (where=(tbl="&tbl.") rename=(field=&tbl.))
%end;
;
drop tbl;
run;

%mend;

%want
mkeintz
PROC Star

If you already know the fixed values of variable TBL, then a merge of subsets of HAVE (one subset per TBL value), each with a rename, makes this into a single pass DATA step:

 

data have;
input tbl $ field $;
cards;
tbl1 x
tbl1 y
tbl1 z
tbl2 w
tbl2 x
tbl2 v
tbl2 y
run;

data want;
  merge have (where=(tbl='tbl1')  rename=(field=tlb1))
        have (where=(tbl='tbl2')  rename=(field=tlb2)) ;
  drop tbl;
run;

 

--------------------------
The hash OUTPUT method will overwrite a SAS data set, but not append. That can be costly. Consider voting for Add a HASH object method which would append a hash object to an existing SAS data set

Would enabling PROC SORT to simultaneously output multiple datasets be useful? Then vote for
Allow PROC SORT to output multiple datasets

--------------------------
LinusH
Tourmaline | Level 20
I can't see that you can do this without losing data consistency. It seems now that you put values on the same row based on the input data order.
What is the meaning of each row in the new table?
Data never sleeps
PaigeMiller
Diamond | Level 26

@LinusH wrote:
I can't see that you can do this without losing data consistency. It seems now that you put values on the same row based on the input data order.
What is the meaning of each row in the new table?

Good point. It is hard to make sense of the new table, without additional explanation.

--
Paige Miller
Kathryn_SAS
SAS Employee

Here is another possible way to do this:

data have;
input tbl $ field $;
cards;
tbl1 x
tbl1 y
tbl1 z
tbl2 w
tbl2 x
tbl2 v
tbl2 y
;
run;

data have;
set have;
by tbl;
if first.tbl then flag=0;
flag+1;
run;

proc sort data=have;
by flag tbl;
run;

proc transpose data=have out=want(drop=flag _name_);
by flag;
id tbl;
var field;
run;

proc print data=want;
run;

hackathon24-white-horiz.png

The 2025 SAS Hackathon Kicks Off on June 11!

Watch the live Hackathon Kickoff to get all the essential information about the SAS Hackathon—including how to join, how to participate, and expert tips for success.

YouTube LinkedIn

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

SAS Training: Just a Click Away

 Ready to level-up your skills? Choose your own adventure.

Browse our catalog!

Discussion stats
  • 12 replies
  • 2138 views
  • 7 likes
  • 9 in conversation