Solved: Re: How to calculate the number of days person belonged in a group in ...

viollete · Posted 04-11-2017 06:49 AM

Hello,

I have this type of data:

ID start_study end_study

1 01APR2010 20OCT2010

2 01APR2010 01APR2014

3 01Nov2010 20OCT2015

4 01APR2010 20OCT2016

5 01APR2010 20APR2010

.......

The period I am looking in is from 01APR2010 till 01APR2017. I want to create a variable for each month: 84 varibles.

The start_study is the date when a person joined a group and end_study is the date when he left it. I want to calculate for each person how many days per month a person belonged for a group, for example:

ID=1 in first month (april 2010) was in group for 30 days, in second month (may 2010) was for 31 days , ...., in seventh month (octorber 2010) was for 20 days and after would be 0 cause he left the group.

what would be the best way to do this?

thank you

Kurt_Bremser · Posted 04-11-2017 07:08 AM

Instead of creating variables for months, it's better to write a record per month. Never keep data (months) in structure (variables/columns).

data have;
input ID start_study :date9. end_study :date9.;
format start_study end_study date9.;
cards;
1 01APR2010 20OCT2010
2 01APR2010 01APR2014
3 01Nov2010 20OCT2015
4 01APR2010 20OCT2016
5 01APR2010 20APR2010
;
run;

data want (keep=id year month yearmonth days);
set have;
length yearmonth $6;
do until (start_study > end_study);
  year = year(start_study);
  month = month(start_study);
  yearmonth = put(year,z4.) !! put(month,z2.);
  days = min(intnx('month',start_study,0,'end'),end_study) - start_study + 1;
  output;
  start_study = intnx('month',start_study,1,'begin');
end;
run;

Also note how I presented example data in a data step. This allows your assumed helpers to recreate your dataset with a simple copy/paste and run. See this as a simple courtesy toward those you want to help you.

In case you absolutely need columns per month, yearmonth can be used in a proc transpose to name the columns.

Maxims of Maximally Efficient SAS Programmers
How to convert datasets to data steps
The macro for direct download as ZIP
How to post code
Please vote for Provide Sequential Search Capability for Hash Objects
How to deal with locked files on UNIX

View solution in original post

Kurt_Bremser · Posted 04-11-2017 07:08 AM

Instead of creating variables for months, it's better to write a record per month. Never keep data (months) in structure (variables/columns).

data have;
input ID start_study :date9. end_study :date9.;
format start_study end_study date9.;
cards;
1 01APR2010 20OCT2010
2 01APR2010 01APR2014
3 01Nov2010 20OCT2015
4 01APR2010 20OCT2016
5 01APR2010 20APR2010
;
run;

data want (keep=id year month yearmonth days);
set have;
length yearmonth $6;
do until (start_study > end_study);
  year = year(start_study);
  month = month(start_study);
  yearmonth = put(year,z4.) !! put(month,z2.);
  days = min(intnx('month',start_study,0,'end'),end_study) - start_study + 1;
  output;
  start_study = intnx('month',start_study,1,'begin');
end;
run;

Also note how I presented example data in a data step. This allows your assumed helpers to recreate your dataset with a simple copy/paste and run. See this as a simple courtesy toward those you want to help you.

In case you absolutely need columns per month, yearmonth can be used in a proc transpose to name the columns.

Maxims of Maximally Efficient SAS Programmers
How to convert datasets to data steps
The macro for direct download as ZIP
How to post code
Please vote for Provide Sequential Search Capability for Hash Objects
How to deal with locked files on UNIX

viollete · Posted 04-11-2017 08:22 AM

Thanks! It works fine for small data size, but mine is more than 7 million of observations (sorry that i didnt mentioned that before!)

Kurt_Bremser · Posted 04-11-2017 08:31 AM

@viollete wrote:

Thanks! It works fine for small data size, but mine is more than 7 million of observations (sorry that i didnt mentioned that before!)

And?

7 Million * 38 bytes (4*8 for the numerics and 6 for yearmonth) equates to roughly 270 MB, which is puny.

And you can even reduce the number of bytes needed:

length year month days 3;

as the small numbers don't need more precision.

And keep in mind that forcing SAS to have columns for all months when most of the times the values will be missing is a colossal waste of resources.

Maxims of Maximally Efficient SAS Programmers
How to convert datasets to data steps
The macro for direct download as ZIP
How to post code
Please vote for Provide Sequential Search Capability for Hash Objects
How to deal with locked files on UNIX

viollete · Posted 04-11-2017 08:49 AM

At the end it work (it took some time to calculate). Thanks

How to calculate the number of days person belonged in a group in each month?

Re: How to calculate the number of days person belonged in a group in each month?

Re: How to calculate the number of days person belonged in a group in each month?

Re: How to calculate the number of days person belonged in a group in each month?

Re: How to calculate the number of days person belonged in a group in each month?

Re: How to calculate the number of days person belonged in a group in each month?

Registration is open

How to calculate the number of days person belonged in a group in each month?

Re: How to calculate the number of days person belonged in a group in each month?

Re: How to calculate the number of days person belonged in a group in each month?

Re: How to calculate the number of days person belonged in a group in each month?

Re: How to calculate the number of days person belonged in a group in each month?

Re: How to calculate the number of days person belonged in a group in each month?

Registration is open

SAS Training: Just a Click Away