BookmarkSubscribeRSS Feed
☑ This topic is solved. Need further help from the community? Please sign in and ask a new question.
DocMartin
Quartz | Level 8

I have a data set WORKERS. The variables are the worker's ID, maximum number of days on job (MaxDays) and days that the worker showed up to work (DayAtWork). Here's some trivial SAS code:

data workers; input id maxDays DayAtWork;
cards;
125 8 1
125 8 2
125 8 5
125 8 6
125 8 8
150 3 1
150 3 2
150 3 3
162 4 2
;
run;

Here's the data set I'd like to get:

ID   Y  N

125 5 3

150 3 0

162 1 3 

 

The tricky part comes in because in my real data set each worker could have 100s of days with various combinations of days in which they worked and days that they were absent.

 

Any help much appreciated!

 

Andrew

1 ACCEPTED SOLUTION

Accepted Solutions
s_lassen
Meteorite | Level 14

The simplest is probably an SQL solution:

proc sql;     
  create table want as                                                                                                                          
  select id,y,max-y as n                                                                                                                
  from(select id,count(distinct DayAtWork) as y,max(maxDays) as max from workers group by id);                                          
quit;

I used COUNT(DISTINCT Dayatwork) in case the same day is repeated. The MAX(maxDays) is just to get a single value.

 

You can do it in a datastep (data must be sorted) like this:

data want;                                                                                                                              
  do until(last.id);                                                                                                                    
    set workers;                                                                                                                        
    by id DayAtWork;                                                                                                                    
    y=sum(y,first.DayAtWork);                                                                                                           
    end;                                                                                                                                
  n=maxDays-y;                                                                                                                          
  keep id y n;                                                                                                                          
run;

 

 

View solution in original post

6 REPLIES 6
Kurt_Bremser
Super User

I think I see what you want:

data want;
set have (rename=(maxdays=y));
by id;
retain n;
if first.id
then n = 1;
else n + 1;
if last.id;
n = y - n;
keep id y n;
run;

Untested, posted from my tablet.

DocMartin
Quartz | Level 8

I don't think so. Take the following two examples:
One guy works days 1,2, and 5 of 5 days. The other works days 1, 2, 3, 4, 5 of 5 days. 

Kurt_Bremser
Super User

Ah, we also need to re-calculate y:

data want;
set have (rename=(maxdays=y));
by id;
retain n;
if first.id
then n = 1;
else n + 1;
if last.id;
n = y - n;
y = y - n;
keep id y n;
run;
s_lassen
Meteorite | Level 14

The simplest is probably an SQL solution:

proc sql;     
  create table want as                                                                                                                          
  select id,y,max-y as n                                                                                                                
  from(select id,count(distinct DayAtWork) as y,max(maxDays) as max from workers group by id);                                          
quit;

I used COUNT(DISTINCT Dayatwork) in case the same day is repeated. The MAX(maxDays) is just to get a single value.

 

You can do it in a datastep (data must be sorted) like this:

data want;                                                                                                                              
  do until(last.id);                                                                                                                    
    set workers;                                                                                                                        
    by id DayAtWork;                                                                                                                    
    y=sum(y,first.DayAtWork);                                                                                                           
    end;                                                                                                                                
  n=maxDays-y;                                                                                                                          
  keep id y n;                                                                                                                          
run;

 

 

sas-innovate-wordmark-2025-midnight.png

Register Today!

Join us for SAS Innovate 2025, our biggest and most exciting global event of the year, in Orlando, FL, from May 6-9. Sign up by March 14 for just $795.


Register now!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

SAS Training: Just a Click Away

 Ready to level-up your skills? Choose your own adventure.

Browse our catalog!

Discussion stats
  • 6 replies
  • 830 views
  • 0 likes
  • 3 in conversation