Solved
Contributor
Posts: 30

# dummy variable counting

hi guys,

I have a simple problem of creating dummy variables. here is my dataset and an example of what i want (dummy column). I have to create a variable describing total months of experience for each worker given by ID, i have the months they showed up for work, for example the first worker (ID=1) came for 21 months in 2 years ( out of 24),so the dummy column just counts from 1 to 21. I want to do the same for the entire dataset. thanks guys !

 id year month month_d dummy 1 2009 1 1 1 1 2009 2 2 2 1 2009 4 4 3 1 2009 5 5 4 1 2009 6 6 5 1 2009 7 7 6 1 2009 8 8 7 1 2009 9 9 8 1 2009 10 10 9 1 2009 11 11 10 1 2010 1 13 11 1 2010 2 14 12 1 2010 3 15 13 1 2010 4 16 14 1 2010 5 17 15 1 2010 7 19 16 1 2010 8 20 17 1 2010 9 21 18 1 2010 10 22 19 1 2010 11 23 20 1 2010 12 24 21 24 2009 6 6 1 24 2009 7 7 2 24 2009 8 8 3 24 2009 9 9 4 24 2009 10 10 5 24 2009 11 11 6 24 2010 1 13 7 24 2010 2 14 8 44 2009 1 1 1 44 2009 2 2 2 44 2009 4 4 3 44 2009 8 8 4 44 2009 9 9 5 44 2009 10 10 6 44 2009 11 11 7 44 2010 1 13 8 44 2010 2 14 9 44 2010 3 15 10 44 2010 4 16 11 44 2010 5 17 12 44 2010 6 18 13 44 2010 7 19 14 44 2010 8 20 15 44 2010 9 21 16 44 2010 10 22 17 44 2010 12 24 18

jessica

Accepted Solutions
Solution
‎01-12-2015 01:07 AM
Super Contributor
Posts: 324

## Re: dummy variable counting

Posted in reply to Jessica98

Another way:

Your data set is assumed to be SORTED by ID.

data want;

do dummy = 1 by 1 until(last.id);

set have;

by id;

output;

end;

run;

All Replies
Contributor
Posts: 37

## Re: dummy variable counting

Posted in reply to Jessica98

Hi

You could use a counter to increment for each record of a particular id

data

have

;

input id year mnth mnth_d;

cards ;

1 2009 1 1

1 2009 2 2

1 2009 4 4

1 2009 5 5

1 2009 6 6

1 2009 7 7

1 2009 8 8

1 2009 9 9

1 2009 10 10

1 2009 11 11

1 2010 1 13

1 2010 2 14

1 2010 3 15

1 2010 4 16

1 2010 5 17

1 2010 7 19

1 2010 8 20

1 2010 9 21

1 2010 10 22

1 2010 11 23

1 2010 12 24

24 2009 6 6

24 2009 7 7

24 2009 8 8

24 2009 9 9

24 2009 10 10

24 2009 11 11

24 2010 1 13

24 2010 2 14

44 2009 1 1

44 2009 2 2

44 2009 4 4

44 2009 8 8

44 2009 9 9

44 2009 10 10

44 2009 11 11

44 2010 1 13

44 2010 2 14

44 2010 3 15

44 2010 4 16

44 2010 5 17

44 2010 6 18

44 2010 7 19

44 2010 8 20

44 2010 9 21

44 2010 10 22

44 2010 12 24

;

run;

proc sort data = have ; by id year mnth; run;

data

want

;

set

have

;

by id year mnth ;

retain dummy 0;

if first.id then dummy = 1 ;

else dummy = dummy + 1 ;

run;

Thanks

Solution
‎01-12-2015 01:07 AM
Super Contributor
Posts: 324

## Re: dummy variable counting

Posted in reply to Jessica98

Another way:

Your data set is assumed to be SORTED by ID.

data want;

do dummy = 1 by 1 until(last.id);

set have;

by id;

output;

end;

run;

Posts: 3,167

## Re: dummy variable counting

Posted in reply to Jessica98

If your data is not SORTED, but CLUSTERED, meaning same ID stays together, then an option added to 's code should do:

data want;

do dummy = 1 by 1 until(last.id);

set have;

by id NOTSORTED;

output;

end;

run;

If your data is neither SORTED, nor CLUSTERED, here is a one step Proc SQL approach:

proc sql;

create table want as

select *, (select count(*) from have where id=a.id and mnth_d <= a.mnth_d) as dummy

from have a;

quit;

Haikuo

🔒 This topic is solved and locked.

Need further help from the community? Please ask a new question.

Discussion stats
• 3 replies
• 751 views
• 3 likes
• 4 in conversation