BookmarkSubscribeRSS Feed
Ronein
Onyx | Level 15

Hello

What is the way to create data set "wanted"  from dataset "have" ?


Data have;
input id serial  Revenue  profit;
cards;
1 1 10 2
1 2 20 5
1 3 30 7
1 4 40 9
2 1 5 3
2 2 10 8
2 3 . .
2 4 . .
;
Run;

data wanted;
input ID Revenue1 Revenue2 Revenue3 Revenue4 profit1 profit2 profit3 profit4;
cards;
1 10 20 30 40 2 5 7 9
2 5 10 . . 3 8 . .
;
Run;
8 REPLIES 8
GraphGuy
Meteorite | Level 14

If you're looking for a solution for just this one specific case (and not trying to generalize it), then you could use brute force:

 

Data have;
input id serial  Revenue  profit;
cards;
1 1 10 2
1 2 20 5
1 3 30 7
1 4 40 9
2 1 5 3
2 2 10 8
2 3 . .
2 4 . .
;
Run;

data wanted (drop=serial revenue profit); set have;
retain revenue1 revenue2 revenue3 revenue4 profit1 profit2 profit3 profit4;
if serial=1 then do;
 revenue1=revenue;
 profit1=profit;
 end;
if serial=2 then do;
 revenue2=revenue;
 profit2=profit;
 end;
if serial=3 then do;
 revenue3=revenue;
 profit3=profit;
 end;
if serial=4 then do;
 revenue4=revenue;
 profit4=profit;
 end;
if serial=4 then output;
run;

proc print data=wanted; run;

table.png

Ronein
Onyx | Level 15

Thanks,

The real data including 20 million rows in long structure and 20 variables so I am looking for a more efficient solution

 

PaigeMiller
Diamond | Level 26

@Ronein wrote:

 

The real data including 20 million rows in long structure and 20 variables so I am looking for a more efficient solution


More efficient solution: work with the data in the long structure.

--
Paige Miller
Reeza
Super User

@Ronein wrote:

Thanks,

The real data including 20 million rows in long structure and 20 variables so I am looking for a more efficient solution

 


Array method is slightly more efficient, single pass of the data then. The tutorial below shows exactly what you need. 

But why? 

 

Transposing data tutorials:
Long to Wide:
https://stats.idre.ucla.edu/sas/modules/reshaping-data-long-to-wide-using-the-data-step/


Reeza
Super User
Two proc transposes plus a merge/join.
Or an array method.
acordes
Rhodochrosite | Level 12

this trick seen first time from @Ksharp 

 

proc sql;
select max(n) into :n from (select count(*) as n from have group by id);
quit;

proc summary data=have;
by id;
output out=wanted(drop=_:) idgroup(out[&n] (revenue profit)=);
run;
maguiremq
SAS Super FREQ
proc transpose
	data = have (keep = id revenue:)
	out = want_rev (drop = _name_)
	prefix = revenue;
	by id;
		var revenue;
run;

proc transpose
	data = have (keep = id profit:)
	out = want_prof (drop = _name_)
	prefix = profit;
	by id;
		var profit;
run;

data want;
	merge want_rev want_prof;
	by id;
run;
tc
Lapis Lazuli | Level 10 tc
Lapis Lazuli | Level 10

If your data's in a massively-parallel database like Oracle, Teradata, Snowflake, DB2, check out the LONG2WIDE macro in this old 2008 SGF paper:

It's a Bird, It's a Plane, It's SQL Transpose!
https://support.sas.com/resources/papers/proceedings/pdfs/sgf2008/089-2008.pdf

 

An SQL CASE statement approach still work with SAS datasets, but I'm thinking a DATA step using arrays or hard-coding would be faster, although SAS SQL has gotten much better at parallel SQL execution over the years.

 

Good luck!

 

p.s. As an alternative to using a SAS macro to generate SQL, I've also seen people use Excel formulas to generate transpose CASE statement logic from a list of column names and copy-and-paste that into their queries!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

SAS Training: Just a Click Away

 Ready to level-up your skills? Choose your own adventure.

Browse our catalog!

Discussion stats
  • 8 replies
  • 1980 views
  • 6 likes
  • 7 in conversation