## help with code

Solved
Frequent Contributor
Posts: 96

# help with code

Dear All,

I have a dataset with company name, year, dummy variable, and average_age. The dummy variable is 1 for period before 2003 and 0 for period after 2003. I want to create a new dataset which includes only those firms whose average age was below 0.3 when dummy variable was 0 and above 0.7 when dummy variable was 1. My dataset looks as follows:

Firm      Dummy     Avg_Age

A          0               0.2

A          1               0.76

B          0               0.5

B          1               0.8

C          0               0.2

C          1               0.25

D          0               0.1

D          1               0.6

E          0               0.15

E          1               0.75

Expected Output

Firm      Dummy     Avg_Age     Dumm0_Ave_age     Dumm1_Avg_age

A          0               0.2               0.2                              0.6

E          0               0.15              0.15                         0.75

I would greatly appreciate if someone help me with the code. Thank you.

Accepted Solutions
Solution
‎11-23-2012 08:14 PM
Super Contributor
Posts: 1,636

## Re: help with code

data have;

input Firm \$     Dummy     Avg_Age;

cards;

A          0               0.2

A          1               0.76

B          0               0.5

B          1               0.8

C          0               0.2

C          1               0.25

D          0               0.1

D          1               0.6

E          0               0.15

E          1               0.75

;

data temp1;

set have;

by firm notsorted;

if (dummy=0 and avg_age<0.3) or (dummy=1 and avg_age>0.7);

if first.firm then count=0;

count+1;

if count=2;

keep firm;

data temp2;

merge have(in=a) temp1(in=b);

by firm;

if a and b;

proc transpose data=temp2 out=want(drop=_ prefix=ave_age;

by firm;

var avg_age;

id dummy;

proc print;run;

All Replies
Solution
‎11-23-2012 08:14 PM
Super Contributor
Posts: 1,636

## Re: help with code

data have;

input Firm \$     Dummy     Avg_Age;

cards;

A          0               0.2

A          1               0.76

B          0               0.5

B          1               0.8

C          0               0.2

C          1               0.25

D          0               0.1

D          1               0.6

E          0               0.15

E          1               0.75

;

data temp1;

set have;

by firm notsorted;

if (dummy=0 and avg_age<0.3) or (dummy=1 and avg_age>0.7);

if first.firm then count=0;

count+1;

if count=2;

keep firm;

data temp2;

merge have(in=a) temp1(in=b);

by firm;

if a and b;

proc transpose data=temp2 out=want(drop=_ prefix=ave_age;

by firm;

var avg_age;

id dummy;

proc print;run;

PROC Star
Posts: 8,163

## Re: help with code

Similar to Linlin's suggestion, but does both in one step:

data have;

input Firm \$ Dummy Avg_Age;

cards;

A          0               0.2

A          1               0.76

B          0               0.5

B          1               0.8

C          0               0.2

C          1               0.25

D          0               0.1

D          1               0.6

E          0               0.15

E          1               0.75

;

data want (drop=got;

do until (last.firm);

set have;

by Firm;

if first.firm then do;

gotone=0;

gottwo=0;

end;

if Dummy eq 0 and Avg_Age lt 0.3 then gotone=1;

else if Dummy eq 1 and Avg_Age gt 0.7 then gottwo=1;

end;

do until (last.firm);

set have;

by Firm;

if gotone and gottwo then do;

if Dummy eq 0 then Dumm0_Avg_age=Avg_Age;

else if Dummy eq 1 then Dumm1_Avg_age=Avg_Age;

if last.firm then output;

end;

end;

run;

Super Contributor
Posts: 644

## Re: help with code

Another variation on the theme (but untested code)

Proc transpose

data = have

out  = resp (where = (col1 < 0.3 and col2 > 0.7)

;

By    Firm ;

Run ;

Data want ;

Merge want (keep = firm dummy)

resp (in = In_resp)

;

By  Firm ;

If  First.Firm ;

If  In_resp ;

Rename Col1 = Dumm0_Ave_age

Col1 = Dumm1_Avg_age

;

Drop _Name_ ;

Run ;

Richard in Oz

Super User
Posts: 8,069

## Re: help with code

You can do that directly with PROC TRANSPOSE provided your data is sorted.

data have;

input Firm \$ Dummy Avg_Age @@ ;

cards;

A 0 0.2 A 1 0.76 B 0 0.5 B 1 0.8 C 0 0.2 C 1 0.25

D 0 0.1 D 1 0.6 E 0 0.15 E 1 0.75

run;

proc transpose data=have prefix=Dumm suffix=_Avg_Age

out=want (drop=_: where=(dumm0_avg_age <.3 and dumm1_avg_age>.7));

by firm;

var avg_age ;

id dummy;

copy dummy avg_age ;

run;

proc print data=want; run;

Dumm0_     Dumm1_

Obs    Firm    Dummy    Avg_Age    Avg_Age    Avg_Age

1      A        0        0.20       0.20       0.76

2      E        0        0.15       0.15       0.75

Super Contributor
Posts: 644

## Re: help with code

Nobody (myself included) asked the question why the columns dummy and avg_age (both redundant) are required in the solution.  Leaving them out would simplify the code.

Richard in Oz

🔒 This topic is solved and locked.