Solved: Re: How do I count elements in one data column with criteria from anot...

ejarquejm · Posted 01-07-2020 04:49 AM

/* I have a dataset with 4 data columns, COL 1: firm identifier 1 COL 2: worker identifier 1 (period 1), COL3: firms identifer 2, COL 4 worker identifier2 (period 2).

It could look like this for 5 firms and 100 workers:

*/

data datain;

input Firmid1 $ w1 $ Firmid2 $ w2 $;

datalines;

030 32 010 11

030 45 010 79

030 64 020 05

040 07 030 21

040 21 030 32

040 72 030 45

050 12 040 07

050 13 040 72

050 87 040 81

050 99 050 13

050 98

;

/* Note that worker 21 switches from firm 4 to firm 3. Worker 64 i only employed in period 1 and worker 79 is only employed in period 2, and firms 1 and 2 do not exist in period 1.

I want to do the following: pick the firm indicator in period 1 (COL1) and count how many workers are attached to that firm in period 1, then go check for the existence of the same firm identifier in COL 3 and look in column 4 how many workers are in this particular firm in both periods.

The output of this routine would be data looking like this: firm i.d., number of workers in period 1, number of workers in period 2, number of workers that stayed in the firm from period 1 to period 2. For the example above (where we are counting also from the perspective of column 3 if we can't find the firm in column 1) we get

*/

data dataout;

input Firmid $ n1 $ n2 $ nstay $;

datalines;

010 0 2 0

020 0 2 0

030 3 3 2

040 3 3 2

050 4 2 1

;

/*

It is easy to do a frequency table for the periods separately. It is the counting of workers that remain across periods that is causing me trouble.

Regards

J

PS (SAS version 9.4)

*/

FreelanceReinh · Posted 01-07-2020 08:14 AM

Hello @ejarquejm and welcome to the SAS Support Communities!

You can derive n1, n2 and (in particular) nstay with PROC SQL queries and join the three resulting tables:

proc sql;
create table dataout as
select firmid1 as firmid, max(n1,0) as n1, max(n2,0) as n2, max(nstay,0) as nstay
from ((select firmid1, count(distinct w1) as n1
       from datain
       where firmid1
       group by firmid1)
         natural full join
      (select firmid2 as firmid1, count(distinct w2) as n2
       from datain
       where firmid2
       group by firmid2))
         natural left join
     (select firmid1, count(w1) as nstay from
       (select *
        from (select firmid1, w1 from datain)
               intersect
             (select firmid2, w2 from datain))
      group by firmid1);
quit;

I found it more natural to create n1, n2 and nstay as numeric variables, but feel free to convert them to character in the SELECT statement (using the PUT function), if needed.

Note that n2=1 (not 2) for firmid '020'.

View solution in original post

FreelanceReinh · Posted 01-07-2020 08:14 AM

Hello @ejarquejm and welcome to the SAS Support Communities!

You can derive n1, n2 and (in particular) nstay with PROC SQL queries and join the three resulting tables:

proc sql;
create table dataout as
select firmid1 as firmid, max(n1,0) as n1, max(n2,0) as n2, max(nstay,0) as nstay
from ((select firmid1, count(distinct w1) as n1
       from datain
       where firmid1
       group by firmid1)
         natural full join
      (select firmid2 as firmid1, count(distinct w2) as n2
       from datain
       where firmid2
       group by firmid2))
         natural left join
     (select firmid1, count(w1) as nstay from
       (select *
        from (select firmid1, w1 from datain)
               intersect
             (select firmid2, w2 from datain))
      group by firmid1);
quit;

I found it more natural to create n1, n2 and nstay as numeric variables, but feel free to convert them to character in the SELECT statement (using the PUT function), if needed.

Note that n2=1 (not 2) for firmid '020'.

ejarquejm · Posted 01-07-2020 09:35 AM

brilliant. i will give it a go and get back with the result.

ejarquejm · Posted 01-08-2020 04:56 AM

It of course works. Thank you.

How do I count elements in one data column with criteria from another column of data?

Re: How do I count elements in one data column with criteria from another column of data?

Re: How do I count elements in one data column with criteria from another column of data?

Re: How do I count elements in one data column with criteria from another column of data?

Re: How do I count elements in one data column with criteria from another column of data?

How do I count elements in one data column with criteria from another column of data?

Re: How do I count elements in one data column with criteria from another column of data?

Re: How do I count elements in one data column with criteria from another column of data?

Re: How do I count elements in one data column with criteria from another column of data?

Re: How do I count elements in one data column with criteria from another column of data?

SAS Innovate 2025: Save the Date