BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
eawhit10
Fluorite | Level 6

I have a rather large dataset in long format that I need wide. I tried proc transpose, but all my numeric variables were changed to character. 1) I don't even know how to change them back and don't want to do so individually anyway as there are a lot of variables 2) is there a way to transpose without this happening? 

 

Please see data (in reality there are about 63 variables, character and numeric, like vascular measures,  yes/no questions, more physical measures, etc). Any solution I find involving an array seems to require typing out all variables....

event_namestidincomeeducsexracehealthphtwtsbpdbpsmoke
baseline1001211566140.1144981
year_21001211466150.1144981
baseline2003324362.5130.2121732
year_22003324262.5125.8120862
baseline3001412164150.1129772
year_23001412364145.6114742

/* trying proc transpose*/

proc transpose data=longsort out=out1;
by stID event_name; 
var income -- dbp;
run;
proc transpose data=out1 delimiter=_ out=new2(drop=_name_);
by stid;
var col1;
id _name_ event_name;
run;
/* proc contents shows all are now character variables even the ones supposed to be numeric */

 

 

1 ACCEPTED SOLUTION

Accepted Solutions
Tom
Super User Tom
Super User

So to go from semi-wide to really wide you can use a double transpose.  If you have mixed numeric/character data then do it in two part and then merge them.

data have;
  input event_name :$32. stid $ income $ educ $ sex $ race $ healthp $ ht wt sbp dbp smoke $;
cards;
baseline 100 1 2 1 1 5 66 140.1 144 98 1
year_2 100 1 2 1 1 4 66 150.1 144 98 1
baseline 200 3 3 2 4 3 62.5 130.2 121 73 2
year_2 200 3 3 2 4 2 62.5 125.8 120 86 2
baseline 300 1 4 1 2 1 64 150.1 129 77 2
year_2 300 1 4 1 2 3 64 145.6 114 74 2
;
proc sort data=have;
  by stid event_name;
run;

proc transpose data=have out=num;
  by stid event_name ; 
  var _numeric_;
run;
proc transpose data=num out=num_wide delimiter=_;
  by stid ;
  where lowcase(_name_) not in ('stid'  'event_name');
  id _name_ event_name;
  var col1;
run;

proc transpose data=have out=char;
  by stid event_name ; 
  var _character_;
run;
proc transpose data=char out=char_wide delimiter=_;
  by stid ;
  where lowcase(_name_) not in ('stid'  'event_name');
  id _name_ event_name;
  var col1;
run;

data want;
  merge num_wide char_wide ;
  by stid ; 
  drop _name_;
run;

image.png

View solution in original post

9 REPLIES 9
PaigeMiller
Diamond | Level 26

Are you showing us the transposed data or the un-transposed data?

 

I have a rather large dataset in long format that I need wide

 

Usually converting long to wide is not a good idea, although there are exceptions. Why do you need it to be wide?

--
Paige Miller
eawhit10
Fluorite | Level 6

I thought that in order to run some longitudinal tests, I needed it in wide data? A simple example would be like to compare weight from baseline and year 2, I'd need 2 columns, instead of observation at baseline in 1 row and row 2 is the observation at year 2.

eawhit10
Fluorite | Level 6

I'm showing un-transposed. I wanted new columns with like ht_baseline, ht_year_2, bp_baseline, bp_year_2, etc.  But the transpose code i originally posted changes all numeric vars to character.

data_null__
Jade | Level 19
var firstvar-character-lastvar;
var firstvar-numeric-lastvar;

One TRANSPOSE for character and another for numeric. Read up on "SAS Variable Lists".

Tom
Super User Tom
Super User

But you are already starting with a WIDE dataset.  

What is it you want to create?  Something WIDER?  How?

Tom
Super User Tom
Super User

So to go from semi-wide to really wide you can use a double transpose.  If you have mixed numeric/character data then do it in two part and then merge them.

data have;
  input event_name :$32. stid $ income $ educ $ sex $ race $ healthp $ ht wt sbp dbp smoke $;
cards;
baseline 100 1 2 1 1 5 66 140.1 144 98 1
year_2 100 1 2 1 1 4 66 150.1 144 98 1
baseline 200 3 3 2 4 3 62.5 130.2 121 73 2
year_2 200 3 3 2 4 2 62.5 125.8 120 86 2
baseline 300 1 4 1 2 1 64 150.1 129 77 2
year_2 300 1 4 1 2 3 64 145.6 114 74 2
;
proc sort data=have;
  by stid event_name;
run;

proc transpose data=have out=num;
  by stid event_name ; 
  var _numeric_;
run;
proc transpose data=num out=num_wide delimiter=_;
  by stid ;
  where lowcase(_name_) not in ('stid'  'event_name');
  id _name_ event_name;
  var col1;
run;

proc transpose data=have out=char;
  by stid event_name ; 
  var _character_;
run;
proc transpose data=char out=char_wide delimiter=_;
  by stid ;
  where lowcase(_name_) not in ('stid'  'event_name');
  id _name_ event_name;
  var col1;
run;

data want;
  merge num_wide char_wide ;
  by stid ; 
  drop _name_;
run;

image.png

eawhit10
Fluorite | Level 6

Thank you! This is what I was looking for. I know it seems excessively wide, but I didn't know how else to prepare the data to compare baseline measures vs year 2.

Tom
Super User Tom
Super User

@eawhit10 wrote:

Thank you! This is what I was looking for. I know it seems excessively wide, but I didn't know how else to prepare the data to compare baseline measures vs year 2.


Change from baseline calculations are easier in the original format.

data want;
  set have;
  by id;
  if visit='Baseline' then wt_baseline=wt;
  if first.id the call missing(wt_baseline);
  retain wt_baseline;
  if not missing(wt_baseline) then wt_change=wt-wt_baseline;
run;
Ksharp
Super User
data have;
  input event_name :$32. stid $ income $ educ $ sex $ race $ healthp $ ht wt sbp dbp smoke $;
cards;
baseline 100 1 2 1 1 5 66 140.1 144 98 1
year_2 100 1 2 1 1 4 66 150.1 144 98 1
baseline 200 3 3 2 4 3 62.5 130.2 121 73 2
year_2 200 3 3 2 4 2 62.5 125.8 120 86 2
baseline 300 1 4 1 2 1 64 150.1 129 77 2
year_2 300 1 4 1 2 3 64 145.6 114 74 2
;
proc sort data=have;
  by stid event_name;
run;
proc transpose data=have(obs=0) out=vname;
var _all_;
run;
proc sql;
create table level as
select *
 from (select distinct event_name from have),
 (select * from vname where lowcase(_name_) not in ('stid'  'event_name'))
 order by 1;
quit;
data _null_;
 set level end=last;
 by event_name;
 if _n_=1 then call execute('data want;merge ');
 if first.event_name then call execute(catt('have(where=(event_name="',event_name,'")  rename=('));
 call execute(catt(_name_,'=',_name_,'_',event_name));
 if last.event_name then call execute('))');
 if last then call execute(';by stid;drop event_name;run;');
run;
How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

SAS Training: Just a Click Away

 Ready to level-up your skills? Choose your own adventure.

Browse our catalog!

Discussion stats
  • 9 replies
  • 4380 views
  • 4 likes
  • 5 in conversation