BookmarkSubscribeRSS Feed
nbora
Calcite | Level 5

Hi,

my have data is the following:

data have;
     input col_a col_b $ col_c :date9.;
datalines;
123 123456789ABC . 
123 123456789DEF 15AUG2014
123 123456789GHI 30SEP2014 
456 987654321ABC 20DEC2013
456 987654321DEF 31JAN2014
456 987654321GHI 02FEB2014
;​

I would like to reshape this data where the values of col_b are transposed to new columns and the minimum of the date column per col_a is taken as date column. I tried multiple solution suggestions I found here in the communities but they didn't work out.

data want;
     input col_a col_b_1 $ col_b_2 $ col_b_3 $ col_c :date9.;
datalines;
123 123456789ABC 123456789DEF 123456789GHI 15AUG2014
456 987654321ABC 987654321DEF 987654321GHI 20DEC2013
;​


Any help would be highly appreciated.

3 REPLIES 3
ballardw
Super User

First thing is do you really need that new data set? As in, how do you intend to use the result?

 

Often changing from "long" to "wide" is not a good choice for most analysis or reporting in SAS.

 

This works for your example data:

Proc transpose data=have out=trans (drop=_name_)
   prefix=Col_b_;
   by col_a;
   var col_b;
run;
proc summary data=have nway;
   by col_a;
   var col_c;
   output out=summary(drop=_:) min=;
run;

data want;
   merge trans summary ;
   by col_a;
run;

However there might be some issues around desired results if any of the Col_b are duplicated within the value of col_a of if the order of the Col_b values in the output needs to be in a specific order different than order of appearance in the Have dataset.

 

The Proc summary has a drop dataset option because there would be other variables included in the output set

PaigeMiller
Diamond | Level 26

Reshaping the data like this only make programming the next step(s) harder. Don't do it.

 

Tell us what you are going to do with this data after you re-shape it, and we can give you better ideas about how to handle this data.

--
Paige Miller
Tom
Super User Tom
Super User

PROC SUMMARY can do that directly (as long as the number of values per BY group is small enough).

proc summary data=have ;
  by col_a;
  output out=want(drop=_type_ _freq_)  idgroup(out[3] (col_b)=) min(col_c)=;
run;

If you don't know if 3 is the right number of new COL_B variables then count first and put the number into macro variable.

proc sql noprint;
 select max(nobs) into :nobs trimmed
   from (select count(*) as nobs from have group by col_a)
 ;
quit;

proc summary data=have ;
  by col_a;
  output out=want(drop=_type_ _freq_)  idgroup(out[&nobs] (col_b)=) min(col_c)=;
run;

Read this classic paper:  https://support.sas.com/resources/papers/proceedings10/102-2010.pdf

 

 

 

hackathon24-white-horiz.png

The 2025 SAS Hackathon Kicks Off on June 11!

Watch the live Hackathon Kickoff to get all the essential information about the SAS Hackathon—including how to join, how to participate, and expert tips for success.

YouTube LinkedIn

Mastering the WHERE Clause in PROC SQL

SAS' Charu Shankar shares her PROC SQL expertise by showing you how to master the WHERE clause using real winter weather data.

Find more tutorials on the SAS Users YouTube channel.

Discussion stats
  • 3 replies
  • 3073 views
  • 2 likes
  • 4 in conversation