BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
GokulK
Calcite | Level 5

Hello All,

I need your help. I have a pooled dataset with more than 5000 firm observations with the following information: year (yr) firm identifier (id), and Expert. EXPERT consists of three values (1, 2, 3). EXPERT takes value 1 if the CEO has a business degree, 2 if the CEO has an engineering degree, and 3 if the CEO has a medical degree.

 

I want to create a new variable Status that takes the following values:

(1) The first year of the firm and until the change in CEO, Status = “bb” if the EXPERT =1, “ee” if the EXPERT =2, and “mm” if the EXPERT =3.

 

(2) In the year of CEO change and the following years until any new change in CEO, Status takes the following values:

Status is “be” if the current year EXPERT = 2 and previous year EXPERT = 1.

Status is “bm” if the current year EXPERT = 3 and previous year EXPERT = 1.

Status is “eb” if the current year EXPERT = 1 and previous year EXPERT =2.

Status is “em” if the current year EXPERT = 3 and previous year EXPERT =2.

Status is “mb” if the current year EXPERT = 1 and previous year EXPERT =3.

Status is “me” if the current year EXPERT = 2 and previous year EXPERT =3.

 

 

data have;

input yr id $ Expert;

cards;

1980 a 1

1981 a 1

1982 a 2

1983 a 2

1984 a 2

1985 a 2

1986 a 2

1987 a 3

1988 a 3

1980 b 2

1981 b 2

1982 b 2

1984 b 3

1985 b 3

1986 b 3

1987 b 3

1981 c 3

1982 c 3

1983 c 3

1984 c 3

1985 c 1

1986 c 1

1987 c 1

1988 c 1

1989 c 3

1990 c 3

1992 c 3

1993 c 3

1994 c 2

1995 c 2

1996 c 2

 

The final sample I want is the following:

yr    id Expert status

1980 a     1     bb

1981 a     1     bb

1982 a     2     be

1983 a     2     be

1984 a     2     be

1985 a     2     be

1986 a     2     be

1987 a     3     em

1988 a     3     em

 

1980 b     2     ee

1981 b     2     ee

1982 b     2     ee

1984 b     3     em

1985 b     3     em

1986 b     3     em

1987 b     3     em

 

1981 c     3     mm

1982 c     3     mm

1983 c     3     mm

1984 c     3     mm

1985 c     1     mb

1986 c     1     mb

1987 c     1     mb

1988 c     1     mb

1989 c     3     bm

1990 c     3     bm

1992 c     3     bm

1993 c     3     bm

1994 c     2     me

1995 c     2     me

1996 c     2     me

 

I would appreciate it if someone provides me with the codes.

Thank you for your time.

Best regards,

Gokul

1 ACCEPTED SOLUTION

Accepted Solutions
Patrick
Opal | Level 21

Here you go.

data have;
  input yr id $ Expert;
  cards;
1980 a 1
1981 a 1
1982 a 2
1983 a 2
1984 a 2
1985 a 2
1986 a 2
1987 a 3
1988 a 3
1980 b 2
1981 b 2
1982 b 2
1984 b 3
1985 b 3
1986 b 3
1987 b 3
1981 c 3
1982 c 3
1983 c 3
1984 c 3
1985 c 1
1986 c 1
1987 c 1
1988 c 1
1989 c 3
1990 c 3
1992 c 3
1993 c 3
1994 c 2
1995 c 2
1996 c 2
;

data want(drop=_:);
  set have;
  by id yr;
  length status $2;
  retain status;
  _lag_expert=lag(expert);
  if first.id then 
    do;
           if expert=1 then status='bb';
      else if expert=2 then status='ee';
      else if expert=3 then status='mm';
    end;
  else 
    do;
           if EXPERT = 2 and _lag_expert = 1 then status='be';
      else if EXPERT = 3 and _lag_expert = 1 then status='bm';
      else if EXPERT = 1 and _lag_expert = 2 then status='eb'; 
      else if EXPERT = 3 and _lag_expert = 2 then status='em';
      else if EXPERT = 1 and _lag_expert = 3 then status='mb';
      else if EXPERT = 2 and _lag_expert = 3 then status='me';
    end;
run;

View solution in original post

4 REPLIES 4
Patrick
Opal | Level 21

Here you go.

data have;
  input yr id $ Expert;
  cards;
1980 a 1
1981 a 1
1982 a 2
1983 a 2
1984 a 2
1985 a 2
1986 a 2
1987 a 3
1988 a 3
1980 b 2
1981 b 2
1982 b 2
1984 b 3
1985 b 3
1986 b 3
1987 b 3
1981 c 3
1982 c 3
1983 c 3
1984 c 3
1985 c 1
1986 c 1
1987 c 1
1988 c 1
1989 c 3
1990 c 3
1992 c 3
1993 c 3
1994 c 2
1995 c 2
1996 c 2
;

data want(drop=_:);
  set have;
  by id yr;
  length status $2;
  retain status;
  _lag_expert=lag(expert);
  if first.id then 
    do;
           if expert=1 then status='bb';
      else if expert=2 then status='ee';
      else if expert=3 then status='mm';
    end;
  else 
    do;
           if EXPERT = 2 and _lag_expert = 1 then status='be';
      else if EXPERT = 3 and _lag_expert = 1 then status='bm';
      else if EXPERT = 1 and _lag_expert = 2 then status='eb'; 
      else if EXPERT = 3 and _lag_expert = 2 then status='em';
      else if EXPERT = 1 and _lag_expert = 3 then status='mb';
      else if EXPERT = 2 and _lag_expert = 3 then status='me';
    end;
run;
GokulK
Calcite | Level 5

Thank you to both of you for providing me with the codes. 

mkeintz
PROC Star

Let's reframe your task description.   You want a two-letter STATUS code, in which the first letter comes from the first obs for each ID, and the second letter comes from the current obs.  The letter is just a simple translation of the EXPERT code  (1--'b', 2-->'e', 3-->'m'):

 

data want (drop=_:);
  set have;
  by id expert notsorted;

  if expert=1 then _single_char='b'; else
  if expert=2 then _single_char='e'; else
  if expert=3 then _single_char='m'; 

  retain _first_single_char ' ';
  if first.id then _first_single_char=_single_char;

  length status $2;
  status=cats(_first_single_char,_single_char);
run;

But why not

  1. Make a variable (EXPERT1) representing EXPERT for the first obs of an ID.
  2. Make a second variable (EXPERT_PRIOR) represent the EXPERT value of the prior obs

Then by comparing EXPERT to EXPERT_PRIOR and/or EXPERT1 you have a precise indicator of type of transition of CEO expertise.

 

data want2 ;
  set have;
  by id expert notsorted;

  if first.id then expert1=expert;
  retain expert1;
  expert_prior=lag(expert);
  if first.id then expert_prior=.;
run;

 

BTW, what if a given CEO has more than one type of expertise?

--------------------------
The hash OUTPUT method will overwrite a SAS data set, but not append. That can be costly. Consider voting for Add a HASH object method which would append a hash object to an existing SAS data set

Would enabling PROC SORT to simultaneously output multiple datasets be useful? Then vote for
Allow PROC SORT to output multiple datasets

--------------------------
GokulK
Calcite | Level 5

The data is arranged in such a way that the CEO can have only one expertise. Thank you for the codes.

sas-innovate-white.png

Special offer for SAS Communities members

Save $250 on SAS Innovate and get a free advance copy of the new SAS For Dummies book! Use the code "SASforDummies" to register. Don't miss out, May 6-9, in Orlando, Florida.

 

View the full agenda.

Register now!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

SAS Training: Just a Click Away

 Ready to level-up your skills? Choose your own adventure.

Browse our catalog!

Discussion stats
  • 4 replies
  • 1028 views
  • 0 likes
  • 3 in conversation