BookmarkSubscribeRSS Feed
Ronein
Onyx | Level 15

Hello

In real life I have a big data set 17 million rows and many offer_ID's.

Is there a better way to perform convertion from long to wide .

Please see the current code


data have;
input offerID degem amnt;
cards;
111123 200 1000 
111123 300 2000
111123 400 3000
;
run;
PROC TRANSPOSE DATA=have OUT=wide1(DROP=_name_) prefix=degem_;
BY offerID; 
VAR degem;
RUN;
PROC TRANSPOSE DATA=have OUT=wide2(DROP=_name_) prefix=Amnt_;
BY offerID; 
VAR degem;
RUN;
Data want_Wide;
merge wide1 wide2;
BY offerID; 
Run;
9 REPLIES 9
PaigeMiller
Diamond | Level 26

IMHO

The best thing to do with this data set is to leave it as a long data set, and perform any analyses on the long data set. But you probably knew that already. The less processing you have to do on 17 million records, the better off you will be.

 

Please explain why you want it to be a wide data set.

--
Paige Miller
Ronein
Onyx | Level 15
At my work have task to create a panel data which has one row ( wide structure) instead of long structure ( multiple rows per offer id). I know it is strange but for most of people it is easier to look at wide structure data where each offer id has one row
PaigeMiller
Diamond | Level 26

And what analysis will you be doing?


When I look at the examples in PROC PANEL they are all long data sets, not wide data sets. Most SAS data analysis PROCs are designed to use long data sets.

--
Paige Miller
Kurt_Bremser
Super User

If people need to look at it, it's a report, not a dataset. Hand them the PROC REPORT code so they can run it on the intended subset (nobody will manually inspect 17 million observations in a single lifetime). 

Kurt_Bremser
Super User

And another thing: if there's just one ID with lots of observations, the resulting dataset will be enormous and consist mainly of missing values.

Kathryn_SAS
SAS Employee

This still requires multiple steps, but here is another idea.

data have;
input offerID degem amnt;
cards;
111123 200 1000 
111123 300 2000
111123 400 3000
;
run;

data have;
set have ;
by offerid;
if first.offerid then count=0;
count+1;
run;

proc transpose data=have out=wide;
by count offerid;
var degem amnt;
run;

data wide;
set wide;
varname=catx('_',_name_,count);
run;

proc transpose data=wide out=wide1(drop=_name_);
by offerid;
id varname;
var col1;
run;

proc print;
run;
data_null__
Jade | Level 19

Proc transpose has features that allow you to skip the step where you create VARNAME. 

 

proc transpose data=wide out=wide1(drop=_name_) delim=_;
   by offerid;
   *id varname;
   id _name_ count;
   var col1;
   run;
Tom
Super User Tom
Super User

If you know the maximum number of DEGEMs or AMTs per OFFERID then it is simple using ARRAYs.

data have;
  input offerID degem amnt;
cards;
111123 200 1000 
111123 300 2000
111123 400 3000
;

data want;
  set have;
  array _degem degem1-degem3 ;
  array _amnt amnt1-amnt3;
  do _n_=1 by 1 until(last.offerid);
    set have;
    by offerid;
    _degem[_n_]=degem;
    _amnt[_n_]=amnt;
  end;
  drop degem amnt;
run;

If you don't know the number you have two choices.  Just pick something that you know is too large.

Or calculate it.  For example you could use PROC SQL to create a macro variable.

proc sql noprint;
select max(count) into :max_count trimmed
from (select offerid,count(*) as count 
       from have group by offerid)
;
quit;

Then use that macro variable to generate the arrays' upper bounds.

  array _degem degem1-degem&max_count ;
  array _amnt amnt1-amnt&max_count ;

 

Ksharp
Super User

1)The most simple way is using PROC SUMMARY, but the limited thing is the max number of row in each 'offerID' group is 100 .


data have;
input offerID degem amnt;
cards;
111123 200 1000 
111123 300 2000
111123 400 3000
;
run;

proc sql noprint;
select max(n) into :n
 from (select count(*) as n from have group by offerID);
quit;

proc summary data=have;
by offerID;
output out=want(drop=_freq_ _type_) idgroup(out[&n.]  (degem amnt)=);
run;

 

 

 

 

 

2)My favorite way is using MERGE skill , but that need you to familiar with sas programming.

data have;
input offerID degem amnt;
cards;
111123 200 1000 
111123 300 2000
111123 400 3000
;
run;

data temp;
 set have;
 by offerID;
 if first.offerID then n=0;
 n+1;
run;

proc sql;
create table level as
select distinct n from temp;
quit;

data _null_;
 set level end=last;
 if _n_=1 then call execute('data want;merge ');
 call execute(catt('temp(where=(n=',n,') rename=(degem=degem_',n,' amnt=amnt_',n,'))'));
 if last then call execute(';by offerID; drop n; run;');
run;

sas-innovate-2026-white.png



April 27 – 30 | Gaylord Texan | Grapevine, Texas

Registration is open

Walk in ready to learn. Walk out ready to deliver. This is the data and AI conference you can't afford to miss.
Register now and lock in 2025 pricing—just $495!

Register now

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

SAS Training: Just a Click Away

 Ready to level-up your skills? Choose your own adventure.

Browse our catalog!

Discussion stats
  • 9 replies
  • 273 views
  • 13 likes
  • 7 in conversation