## Long to wide conversion

Solved
Occasional Contributor
Posts: 7

# Long to wide conversion

Dear SAS Users,

How can I achieve the following task in an efficient way? Let's say that I have a long data that have two sales figures for different firms and I would like to make it wide as below. Also please note that the original dataset is a large one.

Thank you in advance for any help.

Have

 Firm Date Sales1 Sales2 A 20/2/2012 100 50 A 21/2/2012 200 70 A 22/2/2012 300 90 B 21/2/2012 120 200 B 22/2/2012 300 50 B 23/2/2012 300 200 C 19/2/2012 50 40 C 20/2/2012 75 50 C 21/2/2012 80 30 C 22/2/2012 90 100

Want

 date A1 A2 B1 B2 C1 C2 19/2/2012 50 40 20/2/2012 100 50 75 50 21/2/2012 200 70 120 200 80 30 22/2/2012 300 90 300 50 90 100 23/2/2012 300 200

Accepted Solutions
Solution
‎02-22-2012 11:49 AM
PROC Star
Posts: 8,164

## Long to wide conversion

There is probably a less convoluted way to do this, but I think that the following achieves the desired result:

data Have;

informat date ddmmyy10.;

format date date9.;

input Firm \$ Date Sales1 Sales2;

cards;

A          20/2/2012          100          50

A          21/2/2012          200          70

A          22/2/2012          300          90

B          21/2/2012          120          200

B          22/2/2012          300          50

B          23/2/2012          300          200

C          19/2/2012          50          40

C          20/2/2012          75          50

C          21/2/2012          80          30

C          22/2/2012          90          100

;

proc sort data=have;

by date firm;

run;

data fortranspose(keep = date _name_ sales) / view=fortranspose;

set have;

by date;

array ss(2) sales1-sales2;

if first.date then obsnum = 0;

do i = 1 to 2;

_name_ = trim(firm) || put(i,1.);

sales = ss(i);

output;

end;

run;

proc transpose data=fortranspose out=want(drop=_name_);

by date;

var sales;

run;

data want;

retain date a1 a2 b1 b2 c1 c2;

set want;

run;

All Replies
Super User
Posts: 6,781

## Long to wide conversion

JohnK,

In theory, it's not so hard.  In practice, though, you mention that your data set is long.  I suspect that "Firm" is not really as simple as A, B, and C, but takes on many more complex values that don't easily translate into unique variable names.  For example, if your data contained both of these Firms:

George Washington Carver High School North

George Washington Carver High School South

I doubt that you could come up with a good way to assign what the new variable names would be.

If you were willing to come up with a modified solution, it might look look like this.  Instead of variable names like A1, A2, B1, B2, use names like Sales0001_1, Sales0001_2, Sales0002_1, Sales0002_2, etc.  The first 4 digits of the new variables indicate which Firm it is, and you would need a separate method to translate from the arbitrary firm number to the firm name.  (This would not be difficult using a format.)

Does this sound like an acceptable solution?

Solution
‎02-22-2012 11:49 AM
PROC Star
Posts: 8,164

## Long to wide conversion

There is probably a less convoluted way to do this, but I think that the following achieves the desired result:

data Have;

informat date ddmmyy10.;

format date date9.;

input Firm \$ Date Sales1 Sales2;

cards;

A          20/2/2012          100          50

A          21/2/2012          200          70

A          22/2/2012          300          90

B          21/2/2012          120          200

B          22/2/2012          300          50

B          23/2/2012          300          200

C          19/2/2012          50          40

C          20/2/2012          75          50

C          21/2/2012          80          30

C          22/2/2012          90          100

;

proc sort data=have;

by date firm;

run;

data fortranspose(keep = date _name_ sales) / view=fortranspose;

set have;

by date;

array ss(2) sales1-sales2;

if first.date then obsnum = 0;

do i = 1 to 2;

_name_ = trim(firm) || put(i,1.);

sales = ss(i);

output;

end;

run;

proc transpose data=fortranspose out=want(drop=_name_);

by date;

var sales;

run;

data want;

retain date a1 a2 b1 b2 c1 c2;

set want;

run;

Super User
Posts: 6,781

## Long to wide conversion

Art,

I like it.  So if you have a variable called _NAME_, is it implied that PROC TRANSPOSE should use it as an ID variable?

PROC Star
Posts: 8,164

## Long to wide conversion

Astounding: from the documentation: If you do not use an ID statement, PREFIX= option, or the SUFFIX= option, then PROC TRANSPOSE looks for an input variable called _NAME_ to get the names of the transposed variables.

Occasional Contributor
Posts: 7

## Long to wide conversion

Hi Art,

Would it be not efficient to complete the task by using macros? Algorithm is something like this:

1. Collect the firm names from the input file by using proc freq into the file names.

2. Read the names.sas7bdat and assign each firm name to a macro, so the names will be &firm1, &firm2 etc.

3. By using a loop extract data for each firm from the input file. For example

%do i=1 %to &maxfirmno;

data &&firm&i;

set inputfile;

where firm="&&firm&i";

run;

%end;

4. merge all &&firm&i data files by date.

Occasional Contributor
Posts: 7

## Re: Long to wide conversion

Art,

Two more questions if you don't mind?

What is the use of this line in your solution?

if first.date then obsnum=0

In the original data set (i.e., have), let's assume that we have another variable named temperature, which is the mean temperature for each date. How can I copy this variable to the output dataset?

Ksharp,

Can your code be modified the include this?

Thanks.

data test;

informat date ddmmyy10.;

format date date9.;

input Firm \$ Date Sales1 Sales2 temp;

cards;

A          20/2/2012          100          50        40

A          21/2/2012          200          70       50

A          22/2/2012          300          90       60

B          21/2/2012          120          200      50

B          22/2/2012          300          50      60

B          23/2/2012          300          200      62

C          19/2/2012          50          40      30

C          20/2/2012          75          50      40

C          21/2/2012          80          30      50

C          22/2/2012          90          100      60

;

run;

J.

PROC Star
Posts: 8,164

## Long to wide conversion

John,

The lines:

if first.date then obsnum=0

by date;

were not needed in that section of code.  They were artifacts of another approach I was trying.

As for adding one or more variables, probably the easiest one to code would be to include them with a 'copy' statement.  However, that also requires getting rid of extra lines that the procedure will produce as a result of including a copy statement.

I've shown update code, using the copy statement, below:

data Have;

informat date ddmmyy10.;

format date date9.;

input Firm \$ Date Sales1 Sales2 temp;

cards;

A          20/2/2012          100         50  40

A          21/2/2012          200         70  50

A          22/2/2012          300         90  60

B          21/2/2012          120        200  50

B          22/2/2012          300         50  60

B          23/2/2012          300        200  62

C          19/2/2012          50          40  30

C          20/2/2012          75          50  40

C          21/2/2012          80          30  50

C          22/2/2012          90          100 60

;

proc sort data=have;

by date firm;

run;

data fortranspose(keep = date _name_ sales temp)

/ view=fortranspose;

set have;

array ss(*) sales1-sales2;

do i = 1 to dim(ss);

_name_ = trim(firm) || put(i,1.);

sales = ss(i);

output;

end;

run;

proc transpose data=fortranspose out=want(drop=_name_);

by date;

var sales;

copy temp;

run;

data want;

retain date a1 a2 b1 b2 c1 c2 temp;

set want;

by date;

if first.date;

run;

Super User
Posts: 10,784

## Long to wide conversion

```data Have;
informat date ddmmyy10.;
format date date9.;
input Firm \$ Date Sales1 Sales2;
cards;
A          20/2/2012          100          50
A          21/2/2012          200          70
A          22/2/2012          300          90
B          21/2/2012          120          200
B          22/2/2012          300          50
B          23/2/2012          300          200
C          19/2/2012          50          40
C          20/2/2012          75          50
C          21/2/2012          80          30
C          22/2/2012          90          100
;
run;
proc sql ;
select distinct cats('have(where=(firm="',firm,'") rename=(sales1=',firm,'1 sales2=',firm,'2))') from have;
select distinct cats('have(where=(firm="',firm,'") rename=(sales1=',firm,'1 sales2=',firm,'2))') into : list1 - : list&sqlobs.
from have;
quit;

%macro merge;
data want(drop=firm);
merge %do i=1 %to &sqlobs.;
&&list&i
%end;
;
by date;
run;
%mend merge;
%merge

```

Ksharp

Super User
Posts: 10,784

## Long to wide conversion

Easy.

```data have;
informat date ddmmyy10.;
format date date9.;
input Firm \$ Date Sales1 Sales2 temp;
cards;
A          20/2/2012          100          50        40
A          21/2/2012          200          70       50
A          22/2/2012          300          90       60
B          21/2/2012          120          200      50
B          22/2/2012          300          50      60
B          23/2/2012          300          200      62
C          19/2/2012          50          40      30
C          20/2/2012          75          50      40
C          21/2/2012          80          30      50
C          22/2/2012          90          100      60
;
run;

proc sql ;
select distinct cats('have(where=(firm="',firm,'") rename=(sales1=',firm,'1 sales2=',firm,'2 temp=',firm,'_temp))') from have;
select distinct cats('have(where=(firm="',firm,'") rename=(sales1=',firm,'1 sales2=',firm,'2 temp=',firm,'_temp))') into : list1 - : list&sqlobs.
from have;
quit;

%macro merge;
data want(drop=firm);
merge %do i=1 %to &sqlobs.;
&&list&i
%end;
;
by date;
run;
%mend merge;
%merge

```

Ksharp

🔒 This topic is solved and locked.