BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
Calcite | Level 5

## Help with a creation of variable

Hello All,

I need your help. I have a pooled dataset with more than 5000 firm observations with the following information: year (yr) firm identifier (id), and Expert. EXPERT consists of three values (1, 2, 3). EXPERT takes value 1 if the CEO has a business degree, 2 if the CEO has an engineering degree, and 3 if the CEO has a medical degree.

I want to create a new variable Status that takes the following values:

(1) The first year of the firm and until the change in CEO, Status = “bb” if the EXPERT =1, “ee” if the EXPERT =2, and “mm” if the EXPERT =3.

(2) In the year of CEO change and the following years until any new change in CEO, Status takes the following values:

Status is “be” if the current year EXPERT = 2 and previous year EXPERT = 1.

Status is “bm” if the current year EXPERT = 3 and previous year EXPERT = 1.

Status is “eb” if the current year EXPERT = 1 and previous year EXPERT =2.

Status is “em” if the current year EXPERT = 3 and previous year EXPERT =2.

Status is “mb” if the current year EXPERT = 1 and previous year EXPERT =3.

Status is “me” if the current year EXPERT = 2 and previous year EXPERT =3.

data have;

input yr id \$ Expert;

cards;

1980 a 1

1981 a 1

1982 a 2

1983 a 2

1984 a 2

1985 a 2

1986 a 2

1987 a 3

1988 a 3

1980 b 2

1981 b 2

1982 b 2

1984 b 3

1985 b 3

1986 b 3

1987 b 3

1981 c 3

1982 c 3

1983 c 3

1984 c 3

1985 c 1

1986 c 1

1987 c 1

1988 c 1

1989 c 3

1990 c 3

1992 c 3

1993 c 3

1994 c 2

1995 c 2

1996 c 2

The final sample I want is the following:

yr    id Expert status

1980 a     1     bb

1981 a     1     bb

1982 a     2     be

1983 a     2     be

1984 a     2     be

1985 a     2     be

1986 a     2     be

1987 a     3     em

1988 a     3     em

1980 b     2     ee

1981 b     2     ee

1982 b     2     ee

1984 b     3     em

1985 b     3     em

1986 b     3     em

1987 b     3     em

1981 c     3     mm

1982 c     3     mm

1983 c     3     mm

1984 c     3     mm

1985 c     1     mb

1986 c     1     mb

1987 c     1     mb

1988 c     1     mb

1989 c     3     bm

1990 c     3     bm

1992 c     3     bm

1993 c     3     bm

1994 c     2     me

1995 c     2     me

1996 c     2     me

I would appreciate it if someone provides me with the codes.

Thank you for your time.

Best regards,

Gokul

1 ACCEPTED SOLUTION

Accepted Solutions
Opal | Level 21

## Re: Help with a creation of variable

Here you go.

``````data have;
input yr id \$ Expert;
cards;
1980 a 1
1981 a 1
1982 a 2
1983 a 2
1984 a 2
1985 a 2
1986 a 2
1987 a 3
1988 a 3
1980 b 2
1981 b 2
1982 b 2
1984 b 3
1985 b 3
1986 b 3
1987 b 3
1981 c 3
1982 c 3
1983 c 3
1984 c 3
1985 c 1
1986 c 1
1987 c 1
1988 c 1
1989 c 3
1990 c 3
1992 c 3
1993 c 3
1994 c 2
1995 c 2
1996 c 2
;

data want(drop=_:);
set have;
by id yr;
length status \$2;
retain status;
_lag_expert=lag(expert);
if first.id then
do;
if expert=1 then status='bb';
else if expert=2 then status='ee';
else if expert=3 then status='mm';
end;
else
do;
if EXPERT = 2 and _lag_expert = 1 then status='be';
else if EXPERT = 3 and _lag_expert = 1 then status='bm';
else if EXPERT = 1 and _lag_expert = 2 then status='eb';
else if EXPERT = 3 and _lag_expert = 2 then status='em';
else if EXPERT = 1 and _lag_expert = 3 then status='mb';
else if EXPERT = 2 and _lag_expert = 3 then status='me';
end;
run;``````
4 REPLIES 4
Opal | Level 21

## Re: Help with a creation of variable

Here you go.

``````data have;
input yr id \$ Expert;
cards;
1980 a 1
1981 a 1
1982 a 2
1983 a 2
1984 a 2
1985 a 2
1986 a 2
1987 a 3
1988 a 3
1980 b 2
1981 b 2
1982 b 2
1984 b 3
1985 b 3
1986 b 3
1987 b 3
1981 c 3
1982 c 3
1983 c 3
1984 c 3
1985 c 1
1986 c 1
1987 c 1
1988 c 1
1989 c 3
1990 c 3
1992 c 3
1993 c 3
1994 c 2
1995 c 2
1996 c 2
;

data want(drop=_:);
set have;
by id yr;
length status \$2;
retain status;
_lag_expert=lag(expert);
if first.id then
do;
if expert=1 then status='bb';
else if expert=2 then status='ee';
else if expert=3 then status='mm';
end;
else
do;
if EXPERT = 2 and _lag_expert = 1 then status='be';
else if EXPERT = 3 and _lag_expert = 1 then status='bm';
else if EXPERT = 1 and _lag_expert = 2 then status='eb';
else if EXPERT = 3 and _lag_expert = 2 then status='em';
else if EXPERT = 1 and _lag_expert = 3 then status='mb';
else if EXPERT = 2 and _lag_expert = 3 then status='me';
end;
run;``````
Calcite | Level 5

## Re: Help with a creation of variable

Thank you to both of you for providing me with the codes.

PROC Star

## Re: Help with a creation of variable

Let's reframe your task description.   You want a two-letter STATUS code, in which the first letter comes from the first obs for each ID, and the second letter comes from the current obs.  The letter is just a simple translation of the EXPERT code  (1--'b', 2-->'e', 3-->'m'):

``````data want (drop=_:);
set have;
by id expert notsorted;

if expert=1 then _single_char='b'; else
if expert=2 then _single_char='e'; else
if expert=3 then _single_char='m';

retain _first_single_char ' ';
if first.id then _first_single_char=_single_char;

length status \$2;
status=cats(_first_single_char,_single_char);
run;``````

But why not

1. Make a variable (EXPERT1) representing EXPERT for the first obs of an ID.
2. Make a second variable (EXPERT_PRIOR) represent the EXPERT value of the prior obs

Then by comparing EXPERT to EXPERT_PRIOR and/or EXPERT1 you have a precise indicator of type of transition of CEO expertise.

``````data want2 ;
set have;
by id expert notsorted;

if first.id then expert1=expert;
retain expert1;
expert_prior=lag(expert);
if first.id then expert_prior=.;
run;
``````

BTW, what if a given CEO has more than one type of expertise?

--------------------------
The hash OUTPUT method will overwrite a SAS data set, but not append. That can be costly. Consider voting for Add a HASH object method which would append a hash object to an existing SAS data set

Would enabling PROC SORT to simultaneously output multiple datasets be useful? Then vote for
Allow PROC SORT to output multiple datasets

--------------------------
Calcite | Level 5

## Re: Help with a creation of variable

The data is arranged in such a way that the CEO can have only one expertise. Thank you for the codes.

Discussion stats
• 4 replies
• 551 views
• 0 likes
• 3 in conversation