Programming the statistical procedures from SAS

Grab items that fits this condition.

Accepted Solution Solved
Reply
Contributor
Posts: 45
Accepted Solution

Grab items that fits this condition.

Hi all,

I'm quite new to SAS and I have this project that is due in two weeks. While I'm learning to program, what I need to do first is quite a big challenge. So I need your help!

Basically I have a data set that has variables "execid", which is the IDs for company executives, "gvkey", which is the IDs for the companies. I also have a variable "year" indicating different years.

The dataset looks like this:

GVKEYEXECIDYEAR
1055321996
1055321997
1055321998
1055321999
1055322000
13250221998
13250221999
13250222001
107831992
107831993
101361996
101361997
101361998
101361999
101362000
101362001

This says executive #2 works in the company 10553 from 1996 to 2000, he also works in the company 132502 from 1998 to 2001. Executive #3 works in the company 1078 from 1992 to 1993. Executive #6 works in the company 1013 from 1996 to 2001. Also, an executive may work in three or even four companies in the data period: 1992 to 2006.

Now the task is to select those executives that "are employed for at least three years in each of at least two different firms".

So in the case of the above example, executive 2 would qualify. #3 wouldn't qualify since he only works in 1078 for two years. #6 wouldn't qualify either since s/he only worked for one company.

While I can do some simple defining variables, sorting, compute frequencies, etc. This is a little too much for me. My whole data set has over 100,000 items and Iexpect only a small part of them would qualify.

Can someone please help me? I'm desperate.

Many thanks!


Accepted Solutions
Solution
‎08-11-2011 02:35 PM
PROC Star
Posts: 7,416

Grab items that fits this condition.

You are going to have to check a number of cases to ensure I correctly interpreted and coded for the additional requirements.  The following uses what is called a DOW loop:

data have;

  input GVKEY          EXECID          YEAR;

  cards;

10553          2          1996

10553          2          1997

10553          2          1998

10553          2          1999

10553          2          2000

132502          2          1998

132502          2          1999

132502          2          2000

132502          2          2001

1078          3          1992

1078          3          1993

1013          6          1996

1013          6          1997

1013          6          1998

1013          6          1999

1013          6          2000

1013          6          2001

;

data want (keep=gvkey execid year);

  do until(last.execid);

    set have;

    by EXECID;

    last_year=lag(year);

    last_gvkey=lag(gvkey);

    if first.execid then firms=0;

    if gvkey ne last_gvkey then do;

      counter=1;

      if firms eq 1 then do;

        start_year2=year;

        gvkey2=gvkey;

            end;

      else do;

        start_year=year;

        gvkey1=gvkey;

      end;

    end;

    else do;

      if year eq last_year+1 then do;

        counter+1;

        if counter eq 3 then do;

          firms+1;

        end;

        if counter ge 3 then do;

          if firms eq 1 then end_year=year;

          else end_year2=year;

        end;

      end;

      else do;

        counter=1;

        if firms eq 0 then do;

          start_year=year;

                gvkey1=gvkey;

              end;

              else if firms eq 1 then do;

          start_year2=year;

          gvkey2=gvkey;

        end;

      end;

    end;

  end;

  do until(last.execid);

    set have;

    by EXECID;

    if firms eq 2 and

     ((gvkey eq gvkey1 and start_year<=year<=end_year) or

     (gvkey eq gvkey2 and start_year2<=year<=end_year2))

     then output;

  end;

run;

View solution in original post


All Replies
PROC Star
Posts: 7,416

Grab items that fits this condition.

Dear Desperate,

It really isn't that difficult.  However, with the example data you provided, no execs would be selected as Exec 2 was missing a year with the second company.  I added that year in the following data and possible code:

data have;

  input GVKEY          EXECID          YEAR;

  cards;

10553          2          1996

10553          2          1997

10553          2          1998

10553          2          1999

10553          2          2000

132502          2          1998

132502          2          1999

132502          2          2000

132502          2          2001

1078          3          1992

1078          3          1993

1013          6          1996

1013          6          1997

1013          6          1998

1013          6          1999

1013          6          2000

1013          6          2001

;

data want (drop=last_year counter firms);

  set have;

  by EXECID GVKEY;

  last_year=lag(year);

  if first.execid then firms=0;

  if first.gvkey then counter=1;

  else do;

    if year eq last_year+1 then do;

      counter+1;

      if counter eq 3 then do;

         firms+1;

         if firms eq 2 then do;

           output;

           firms=-99;

         end;

      end;

    end;

    else counter=1;

  end;

run;

You, of course, have to review the code to ensure that it really does what you want.  And, moreso, to understand it so that you can modify it to accomodate other conditions you might be asked to evaluate.

Contributor
Posts: 45

Grab items that fits this condition.

Thank you so much for your reply!

I'm still studying your program but I've run it and it did select those executives.

I guess I didn't explain it clearly though. I would need all the years for the executives. So in the case of the executive #2, I would want the years in the firm 10553 from 1996 to 2000, and all the years in the firm 132502 from 1998 to 2001. This is to say, there are nine items for the executive #2 in the "want" data set (actually eight items if using the data set I provided, since #2 has three years in the second company, which satisfies the minimum required years conditionSmiley Happy).

Therefore, "want" would contain all the firm-years for each executive who satisfies the condition.

Sorry I didn't explain clearly. Would you show me how to do this again?

Thank you.

Super User
Posts: 9,769

Grab items that fits this condition.

OK. Just as Art.T said ,It is not too difficult.But I am curious that why this poster will appear at statistical forum. Smiley Happy

data have;
  input GVKEY          EXECID          YEAR;
  cards;
10553          2          1996
10553          2          1997
10553          2          1998
10553          2          1999
10553          2          2000
132502          2          1998
132502          2          1999
132502          2          2000
132502          2          2001
1078          3          1992
1078          3          1993
1013          6          1996
1013          6          1997
1013          6          1998
1013          6          1999
1013          6          2000
1013          6          2001
;
run;
proc sql noprint;
 create table want as
  select *
   from have
    group by execid
     having count(distinct gvkey) ge 2 and count(distinct year) ge 3;
quit;
     

Ksharp

Contributor
Posts: 45

Grab items that fits this condition.

Well this project I'm working on is a statistical one. But I guess you're right, this particular question is not a statistical one. Sorry!:smileysilly:

Solution
‎08-11-2011 02:35 PM
PROC Star
Posts: 7,416

Grab items that fits this condition.

You are going to have to check a number of cases to ensure I correctly interpreted and coded for the additional requirements.  The following uses what is called a DOW loop:

data have;

  input GVKEY          EXECID          YEAR;

  cards;

10553          2          1996

10553          2          1997

10553          2          1998

10553          2          1999

10553          2          2000

132502          2          1998

132502          2          1999

132502          2          2000

132502          2          2001

1078          3          1992

1078          3          1993

1013          6          1996

1013          6          1997

1013          6          1998

1013          6          1999

1013          6          2000

1013          6          2001

;

data want (keep=gvkey execid year);

  do until(last.execid);

    set have;

    by EXECID;

    last_year=lag(year);

    last_gvkey=lag(gvkey);

    if first.execid then firms=0;

    if gvkey ne last_gvkey then do;

      counter=1;

      if firms eq 1 then do;

        start_year2=year;

        gvkey2=gvkey;

            end;

      else do;

        start_year=year;

        gvkey1=gvkey;

      end;

    end;

    else do;

      if year eq last_year+1 then do;

        counter+1;

        if counter eq 3 then do;

          firms+1;

        end;

        if counter ge 3 then do;

          if firms eq 1 then end_year=year;

          else end_year2=year;

        end;

      end;

      else do;

        counter=1;

        if firms eq 0 then do;

          start_year=year;

                gvkey1=gvkey;

              end;

              else if firms eq 1 then do;

          start_year2=year;

          gvkey2=gvkey;

        end;

      end;

    end;

  end;

  do until(last.execid);

    set have;

    by EXECID;

    if firms eq 2 and

     ((gvkey eq gvkey1 and start_year<=year<=end_year) or

     (gvkey eq gvkey2 and start_year2<=year<=end_year2))

     then output;

  end;

run;

Contributor
Posts: 45

Re: Grab items that fits this condition.

It worked perfectly!!!!!!!!!! Thank you sooooo much!!!!

God, I would have never done this by myself.

🔒 This topic is solved and locked.

Need further help from the community? Please ask a new question.

Discussion stats
  • 6 replies
  • 170 views
  • 0 likes
  • 3 in conversation