<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>topic Re: How many days of employment for every ID in SAS Procedures</title>
    <link>https://communities.sas.com/t5/SAS-Procedures/How-many-days-of-employment-for-every-ID/m-p/232700#M54742</link>
    <description>Wow! So short and simple and amazingly fast! Thanks</description>
    <pubDate>Mon, 02 Nov 2015 09:26:13 GMT</pubDate>
    <dc:creator>C_andrade</dc:creator>
    <dc:date>2015-11-02T09:26:13Z</dc:date>
    <item>
      <title>How many days of employment for every ID</title>
      <link>https://communities.sas.com/t5/SAS-Procedures/How-many-days-of-employment-for-every-ID/m-p/232571#M54714</link>
      <description>&lt;P&gt;Hi, first time posting here! &lt;span class="lia-unicode-emoji" title=":slightly_smiling_face:"&gt;🙂&lt;/span&gt;&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;I don't know if this is the right location for the post...&amp;nbsp;&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;I have a dataset of&amp;nbsp;34 mio. records on a monthly basis.&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;For every record there is ID, payment, startdate and enddate. There are 3 mio. unique ID's. Startdates usually go from e.g. 1/1 to 31/1, 5/1 to 5/30 and the like and cover for what period the individual was paid. So an individual who was employed all year will have 12 records, one for every month and the dates will all in all cover the whole year. Some individuals can have more than 12 because they had jobs on the side or multiple employments - and there may be overlaps. Some have less than 12 because they were unemployed for some time or for other reasons, but may still have overlaps.&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;I would like to know for how many days in a year every indivdual was in employment.&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;So it is about counting the days of every individual.&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;My first solution was to create a dummy for every day of the year for every record.&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;I made a macro in the form of:&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;PRE&gt;&lt;CODE class=" language-sas"&gt;%macro days;

%do i=1 %to 365;
%let day=%eval(’1jan12’d + &amp;amp;i.)

data want;
set have;
if startdate le day le enddate then v&amp;amp;i.=1; else v&amp;amp;i.=0;
run;

%end;
%mend;
%days;&lt;/CODE&gt;&lt;/PRE&gt;&lt;P&gt;&amp;nbsp;But this is very time consuming and inefficient. Going through 34 mio. records creating&amp;nbsp;365 new variables would be something like 2-3 hours if not more.&amp;nbsp;&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;Any ideas?&lt;/P&gt;</description>
      <pubDate>Sun, 01 Nov 2015 08:55:19 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Procedures/How-many-days-of-employment-for-every-ID/m-p/232571#M54714</guid>
      <dc:creator>C_andrade</dc:creator>
      <dc:date>2015-11-01T08:55:19Z</dc:date>
    </item>
    <item>
      <title>Re: How many days of employment for every ID</title>
      <link>https://communities.sas.com/t5/SAS-Procedures/How-many-days-of-employment-for-every-ID/m-p/232576#M54715</link>
      <description>&lt;P&gt;To count dates, on each record the &lt;A href="http://support.sas.com/documentation/cdl/en/lefunctionsref/63354/HTML/default/viewer.htm#p1md4mx2crzfaqn14va8kt7qvfhr.htm" target="_self"&gt;INTCK function&lt;/A&gt; could be used — or simple substraction&lt;/P&gt;
&lt;P&gt;days_worked=end_date–start_date+1;&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;Then add up all the days worked by each individual (by year if desired) using PROC SUMMARY.&lt;/P&gt;</description>
      <pubDate>Sat, 31 Oct 2015 17:55:16 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Procedures/How-many-days-of-employment-for-every-ID/m-p/232576#M54715</guid>
      <dc:creator>PaigeMiller</dc:creator>
      <dc:date>2015-10-31T17:55:16Z</dc:date>
    </item>
    <item>
      <title>Re: How many days of employment for every ID</title>
      <link>https://communities.sas.com/t5/SAS-Procedures/How-many-days-of-employment-for-every-ID/m-p/232577#M54716</link>
      <description>Yes, the macro approach is quite ineffective. &lt;BR /&gt;Even if I can't see any easy solution, there are other ways.&lt;BR /&gt;Have the data sorted by id,  start and end date. Use a data step w by and first. A&lt;BR /&gt;and last. processing. Retain previous observations dates and adjust the current observation start date,  and conditionally output. The to remove any overlapping. Then you can use SQL to sum up all dates for each id. &lt;BR /&gt;&lt;BR /&gt;Another option which is probably easier to code, but less efficient is to use the data step to explicit output one observation for each date on the original observation. The again use SQL to sum up for each id,  but now you must use the count (distinct date) construct.</description>
      <pubDate>Sat, 31 Oct 2015 18:00:03 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Procedures/How-many-days-of-employment-for-every-ID/m-p/232577#M54716</guid>
      <dc:creator>LinusH</dc:creator>
      <dc:date>2015-10-31T18:00:03Z</dc:date>
    </item>
    <item>
      <title>Re: How many days of employment for every ID</title>
      <link>https://communities.sas.com/t5/SAS-Procedures/How-many-days-of-employment-for-every-ID/m-p/232624#M54730</link>
      <description>&lt;P&gt;Hi PaigeMiller,&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;Thank you for your answer!&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;The&amp;nbsp;&lt;SPAN&gt;days_worked=end_date–start_date+1 I already did. That was in fact my first and earliest go actually and I have the code giving me the variable of&amp;nbsp;the number of days in the period for every observation, but problem comes up for&amp;nbsp;individuals who have worked in overlapping periods. Say, someone worked 2/1 to 2/10, but also worked 2/5&amp;nbsp;to 2/15. That would give 10 + 10 days, but he really only worked for 15 days that month. And that is what I am interested in: how many days of the month in question is he actually employed.&lt;/SPAN&gt;&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;&lt;SPAN&gt;But maybe I could return to this approach&amp;nbsp;if I could somehow collapse the two records with days overlapping to one that says 1/2 to 15/2. So for every ID (individuals) there exist overlapping time periods the earliest start day and latest end date will be the output. That would mean maybe going from 34 mio. to 30 mio. and then simply just counting the days.&lt;/SPAN&gt;&lt;/P&gt;</description>
      <pubDate>Sun, 01 Nov 2015 09:08:06 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Procedures/How-many-days-of-employment-for-every-ID/m-p/232624#M54730</guid>
      <dc:creator>C_andrade</dc:creator>
      <dc:date>2015-11-01T09:08:06Z</dc:date>
    </item>
    <item>
      <title>Re: How many days of employment for every ID</title>
      <link>https://communities.sas.com/t5/SAS-Procedures/How-many-days-of-employment-for-every-ID/m-p/232625#M54731</link>
      <description>&lt;P&gt;Hi LinusH,&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;Thank you for your answer.&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;First solution sounds like something that is&amp;nbsp;workable. So if I understand it correctly you suggest finding all workperiods of consecutive days. An individual with 12 records may end up with 3 records that consist of exactly the days covering the year&amp;nbsp;he was in employment indicated by a startdate and a enddate for that period?&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;Sounds a bit tricky to code.&amp;nbsp;&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;Second solution sound like the macro approach, but without a macro, right? But that would give me 365 outputs? Not sure with using SQL to summing up for each ID, would that be faster than having SAS do it?&lt;/P&gt;</description>
      <pubDate>Sun, 01 Nov 2015 08:26:10 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Procedures/How-many-days-of-employment-for-every-ID/m-p/232625#M54731</guid>
      <dc:creator>C_andrade</dc:creator>
      <dc:date>2015-11-01T08:26:10Z</dc:date>
    </item>
    <item>
      <title>Re: How many days of employment for every ID</title>
      <link>https://communities.sas.com/t5/SAS-Procedures/How-many-days-of-employment-for-every-ID/m-p/232626#M54732</link>
      <description>&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;PRE&gt;&lt;BR /&gt;ID	Payment		Startdate	Enddate
1	  100		1/1-2012	1/31-2012
1	  100		2/1-2012	2/29-2012
1	  100		3/1-2012	3/31-2012
1	  120		4/1-2012	4/30-2012
1	  120		5/1-2012	5/31-2012
1	  80		6/1-2012	6/30-2012
1	  90		7/1-2012	7/31-2012
1	  100		8/1-2012	8/31-2012
1	  100		9/1-2012	9/30-2012
1	  100		10/1-2012      10/31-2012
1	  100		11/1-2012      11/30-2012
1	  200		12/1-2012      12/31-2012

2	  50	        3/3-2012	3/31-2012
2	  55		4/2-2012        4/30-2012
2	  55		5/1-2012	5/31-2012
2	  25		8/1-2012        8/31-2012

3	  17	        5/1-2012	5/17-2012
3	  13		5/5-2012	5/10-2012
3	  18	 	5/20-2012       5/28-2012
3	  20	        6/1-2012	6/22-2012
3         55            6/17-2012       6/26-2012&lt;BR /&gt;3         10            6/28-2012       6/30-2012 &lt;/PRE&gt;&lt;P&gt;Just to give you an idea of the data. 34 mio. records and main problem - so far - is the overlapping periods.&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;So either...&lt;/P&gt;&lt;P&gt;1. some way of counting the days but taking care of overlapping days of employment. So ID 3 would have 26 days in may and not 31 days, 29&amp;nbsp;days in june and not 35...&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;Or&amp;nbsp;&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;2. some way of collapsing records with consecutive days into on observation either on a) month basis or b) year basis: So ID 1 would be unleft or have a period running from 1/1-2012 to 12/31-2012. ID 3 then:&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;PRE&gt;3	  17	        5/1-2012	5/17-2012
3	  18	 	5/20-2012       5/28-2012
3	  20	        6/1-2012	6/26-2012
3         10            6/28-2012       6/30-2012 &lt;/PRE&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;</description>
      <pubDate>Sun, 01 Nov 2015 09:35:57 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Procedures/How-many-days-of-employment-for-every-ID/m-p/232626#M54732</guid>
      <dc:creator>C_andrade</dc:creator>
      <dc:date>2015-11-01T09:35:57Z</dc:date>
    </item>
    <item>
      <title>Re: How many days of employment for every ID</title>
      <link>https://communities.sas.com/t5/SAS-Procedures/How-many-days-of-employment-for-every-ID/m-p/232672#M54737</link>
      <description>&lt;P&gt;The easiest way is to run a tally of overlappnig periods to see if you are employed or not.&lt;/P&gt;&lt;P&gt;So you need to read the data&amp;nbsp;sorted by event date.&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;PRE&gt;&lt;CODE class=" language-sas"&gt;data HAVE;
  input ID  START : mmddyy.   END : mmddyy.;
  format START END date9.;
  cards;
3	  5/1/2012	5/17/2012
3	  5/5/2012	5/10/2012
3	 	5/15/2012 5/28/2012    &amp;lt;= I added another overlap here
3	  6/1/2012	6/22/2012
3   6/17/2012 6/26/2012
3   6/28/2012 6/30/2012 
run;
data SORTED(index=(A=(ID DATE)));
  set HAVE;
  by ID;
  DATE=START; PERIODS=+1; output;
  DATE=END  ; PERIODS=-1; output;
run;
data WANT;
  retain DAY_START TOTAL_DAYS 0;
  set SORTED;
  by ID DATE;      
  format DAY_START DATE date9.;
      *putlog ID DATE  date9. ' ' STATUS;   * uncomment to see intermediate data;
      *drop START END PERIODS STATUS;       * uncomment to see intermediate data;
  STATUS+PERIODS;
  if DAY_START=0 and STATUS=1 then DAY_START=DATE;
  if STATUS=0 then do;
    TOTAL_DAYS+DATE-DAY_START+1;
    DAY_START=0;
  end;
  if last.ID then do;
    output;                                 * comment out to see intermediate data;
    DAY_START =0;&lt;BR /&gt;    TOTAL_DAYS=0;
  end;
run;&lt;/CODE&gt;&lt;/PRE&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;Hth.&lt;/P&gt;</description>
      <pubDate>Mon, 02 Nov 2015 02:43:10 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Procedures/How-many-days-of-employment-for-every-ID/m-p/232672#M54737</guid>
      <dc:creator>ChrisNZ</dc:creator>
      <dc:date>2015-11-02T02:43:10Z</dc:date>
    </item>
    <item>
      <title>Re: How many days of employment for every ID</title>
      <link>https://communities.sas.com/t5/SAS-Procedures/How-many-days-of-employment-for-every-ID/m-p/232675#M54739</link>
      <description>&lt;P&gt;A simple and quick solution involves an array:&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;PRE&gt;&lt;CODE class=" language-sas"&gt;data have;
length d1 d2 $10;
input ID Payment d1 d2;
startDate = input(translate(d1,"-","/"), mmddyy10.);
endDate = input(translate(d2,"-","/"), mmddyy10.);
format startDate endDate yymmdd10.;
drop d1 d2;
datalines;
1	  100		1/1-2012	1/31-2012
1	  100		2/1-2012	2/29-2012
1	  100		3/1-2012	3/31-2012
1	  120		4/1-2012	4/30-2012
1	  120		5/1-2012	5/31-2012
1	  80		6/1-2012	6/30-2012
1	  90		7/1-2012	7/31-2012
1	  100		8/1-2012	8/31-2012
1	  100		9/1-2012	9/30-2012
1	  100		10/1-2012      10/31-2012
1	  100		11/1-2012      11/30-2012
1	  200		12/1-2012      12/31-2012
2	  50	        3/3-2012	3/31-2012
2	  55		4/2-2012        4/30-2012
2	  55		5/1-2012	5/31-2012
2	  25		8/1-2012        8/31-2012
3	  17	        5/1-2012	5/17-2012
3	  13		5/5-2012	5/10-2012
3	  15		5/15-2012	5/28-2012      &amp;lt;- ADDED
3	  18	 	5/20-2012       5/28-2012
3	  20	        6/1-2012	6/22-2012
3         55            6/17-2012       6/26-2012
3         10            6/28-2012       6/30-2012 
;

proc sql noprint;
select min(startDate), max(endDate) into :fromDate, :toDate
from have;
run;

data want;
array d{&amp;amp;fromDate. : &amp;amp;toDate.} _temporary_;
set have; by id;
if first.id then call missing(of d{*});
do i = startDate to endDate;
    d{i} = 1;
    end;
if last.id then do;
    workedDays = sum(of d{*});
    output;
    end;
keep id workedDays;
run;

proc print data=want noobs; run;&lt;/CODE&gt;&lt;/PRE&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;</description>
      <pubDate>Mon, 02 Nov 2015 04:07:44 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Procedures/How-many-days-of-employment-for-every-ID/m-p/232675#M54739</guid>
      <dc:creator>PGStats</dc:creator>
      <dc:date>2015-11-02T04:07:44Z</dc:date>
    </item>
    <item>
      <title>Re: How many days of employment for every ID</title>
      <link>https://communities.sas.com/t5/SAS-Procedures/How-many-days-of-employment-for-every-ID/m-p/232677#M54740</link>
      <description>&lt;P&gt;Nice!&lt;/P&gt;</description>
      <pubDate>Mon, 02 Nov 2015 04:45:48 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Procedures/How-many-days-of-employment-for-every-ID/m-p/232677#M54740</guid>
      <dc:creator>ChrisNZ</dc:creator>
      <dc:date>2015-11-02T04:45:48Z</dc:date>
    </item>
    <item>
      <title>Re: How many days of employment for every ID</title>
      <link>https://communities.sas.com/t5/SAS-Procedures/How-many-days-of-employment-for-every-ID/m-p/232700#M54742</link>
      <description>Wow! So short and simple and amazingly fast! Thanks</description>
      <pubDate>Mon, 02 Nov 2015 09:26:13 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Procedures/How-many-days-of-employment-for-every-ID/m-p/232700#M54742</guid>
      <dc:creator>C_andrade</dc:creator>
      <dc:date>2015-11-02T09:26:13Z</dc:date>
    </item>
  </channel>
</rss>

