Working with SAS date to create different data

Accepted Solution Solved
Reply
Super Contributor
Posts: 328
Accepted Solution

Working with SAS date to create different data

[ Edited ]

Hello -,

 

I have this data:

birthdate     Age

6/22/99       18.9

7/6/99          18.9

.                    .

.                    .

.                      .   

9/15/05       12.6      

 

run;

 

This are my question.

 

I want to create data sets where the different in age from the youngest and oldest is not more that 3.5 years using the birthdate. You can use the first birthday as first date

 

Thank you,

DA

 

 


Accepted Solutions
Solution
‎05-08-2018 05:08 PM
Esteemed Advisor
Posts: 5,541

Re: Working with SAS date to create different data

Posted in reply to desireatem

Here is a way to find the largest set of birthdates with a span <= 6 months

 

data have;
input BirthDate :mmddyy. Age;
id = _n_; /* You must have some kind of person id */
format birthdate yymmdd10.;
datalines; 
 6/22/99 18.9 
 7/6/99 18.8 
 6/30/99 18.8 
 7/8/99 18.8 
 7/8/99 18.8 
 6/21/99 18.9 
 7/15/99 18.8 
 7/12/99 18.8 
 4/25/99 19.0 
 8/6/99 18.7 
 7/9/99 18.8 
 8/20/99 18.7 
 7/22/99 18.8 
 3/2/99 19.2 
 7/11/99 18.8 
 8/2/99 18.7 
 7/9/99 18.8 
 8/22/99 18.7 
 6/16/99 18.9 
 7/29/99 18.8 
 8/17/99 18.7 
 8/17/99 18.7 
 1/27/99 19.3 
 5/29/99 18.9 
 7/16/99 18.8 
 8/23/99 18.7 
 9/2/99 18.7 
 2/10/99 19.2 
 8/7/99 18.7 
 1/19/99 19.3 
 9/24/99 18.6 
 7/10/99 18.8 
 5/31/99 18.9 
 9/25/99 18.6 
 6/3/99 18.9 
 5/20/99 19.0 
 6/26/99 18.9 
 8/28/99 18.7 
 9/27/99 18.6 
 9/9/99 18.6 
 7/18/99 18.8 
 8/6/99 18.7 
 10/17/99 18.5 
 4/22/99 19.0 
 1/29/99 19.3 
 10/15/99 18.5 
 10/18/99 18.5 
 7/15/99 18.8 
 10/21/99 18.5 
 7/29/99 18.8 
 8/20/99 18.7 
 8/20/99 18.7 
 5/25/99 18.9 
 9/15/99 18.6 
 8/14/99 18.7 
 9/28/99 18.6 
 10/28/99 18.5 
 5/9/99 19.0 
 10/24/99 18.5 
 8/11/99 18.7 
 5/4/99 19.0 
 10/2/99 18.6 
 11/26/99 18.4 
 11/9/99 18.5 
 4/17/99 19.0 
;

proc sql;
create table groups as
select
    a.id, 
    a.birthdate as startDate, 
    intnx("month",a.birthdate,6,"same") as endDate,
    count(*) as grpSize
from
    have as a inner join
    have as b 
        on b.birthdate between a.birthdate and intnx("month",a.birthdate,6,"same")
group by a.id, startDate, endDate
order by grpSize desc;
quit;

data want;
if _n_ = 1 then set groups;
set have;
if startDate <= birthdate <= endDate;
keep id birthdate age;
run; 

I don't understand your second request.

 

PG

View solution in original post


All Replies
PROC Star
Posts: 1,283

Re: Working with SAS date to create different data

Posted in reply to desireatem

I don't get this. What if the birth dates are spread out so that there are like 20 years from the first to the last one? Then two data sets will not cut it if there are not supposed to be more than 3,5 years difference between them?

 

Please be more specific and post some sample data and what you want the desired outcome to look like.

Super Contributor
Posts: 328

Re: Working with SAS date to create different data

[ Edited ]

Below is the birthdate and  age . I want to create a data such that the difference in  birthday between the youngest and oldest person in that data should not be more than 3.5 years. This is just a partial data, the entire data has age difference between the youngest and oldest at 18 years

 

Birth Date Age
 6/22/99 18.9
 7/6/99 18.8
 6/30/99 18.8
 7/8/99 18.8
 7/8/99 18.8
 6/21/99 18.9
 7/15/99 18.8
 7/12/99 18.8
 4/25/99 19.0
 8/6/99 18.7
 7/9/99 18.8
 8/20/99 18.7
 7/22/99 18.8
 3/2/99 19.2
 7/11/99 18.8
 8/2/99 18.7
 7/9/99 18.8
 8/22/99 18.7
 6/16/99 18.9
 7/29/99 18.8
 8/17/99 18.7
 8/17/99 18.7
 1/27/99 19.3
 5/29/99 18.9
 7/16/99 18.8
 8/23/99 18.7
 9/2/99 18.7
 2/10/99 19.2
 8/7/99 18.7
 1/19/99 19.3
 9/24/99 18.6
 7/10/99 18.8
 5/31/99 18.9
 9/25/99 18.6
 6/3/99 18.9
 5/20/99 19.0
 6/26/99 18.9
 8/28/99 18.7
 9/27/99 18.6
 9/9/99 18.6
 7/18/99 18.8
 8/6/99 18.7
 10/17/99 18.5
 4/22/99 19.0
 1/29/99 19.3
 10/15/99 18.5
 10/18/99 18.5
 7/15/99 18.8
 10/21/99 18.5
 7/29/99 18.8
 8/20/99 18.7
 8/20/99 18.7
 5/25/99 18.9
 9/15/99 18.6
 8/14/99 18.7
 9/28/99 18.6
 10/28/99 18.5
 5/9/99 19.0
 10/24/99 18.5
 8/11/99 18.7
 5/4/99 19.0
 10/2/99 18.6
 11/26/99 18.4
 11/9/99 18.5
 4/17/99 19.0

 

Super User
Posts: 13,583

Re: Working with SAS date to create different data

Posted in reply to desireatem

What is the actual smallest value of age in your data? The largest?

Are all of you values rounded to one decimal place?

 

Generally almost any process splitting data on values of a single variable are likely to be more difficult.

Suppose your data has the minimum age of 13.4 this time. Then the "first" data set, which you have not given any naming rule for would contain ages of 13.4 to 16.9.

Now suppose you are repeating this process with another data set later where the youngest age is 9.8. Then the "first" data set would have ages 9.8 to 13.3. Which has nothing in common with the first data set from the first process. So every bit of logic would have to change to arrange results differently, select the "correct" data set for any specific task.

 

What is the actual purpose behind splitting the data? Why is 3.5 critical?

Super Contributor
Posts: 328

Re: Working with SAS date to create different data

[ Edited ]

Let make it  easy, let us not split that data. Let just create a data set where the difference between the oldest person and the youngest person is not more than 3.5 years

Solution
‎05-08-2018 05:08 PM
Esteemed Advisor
Posts: 5,541

Re: Working with SAS date to create different data

Posted in reply to desireatem

Here is a way to find the largest set of birthdates with a span <= 6 months

 

data have;
input BirthDate :mmddyy. Age;
id = _n_; /* You must have some kind of person id */
format birthdate yymmdd10.;
datalines; 
 6/22/99 18.9 
 7/6/99 18.8 
 6/30/99 18.8 
 7/8/99 18.8 
 7/8/99 18.8 
 6/21/99 18.9 
 7/15/99 18.8 
 7/12/99 18.8 
 4/25/99 19.0 
 8/6/99 18.7 
 7/9/99 18.8 
 8/20/99 18.7 
 7/22/99 18.8 
 3/2/99 19.2 
 7/11/99 18.8 
 8/2/99 18.7 
 7/9/99 18.8 
 8/22/99 18.7 
 6/16/99 18.9 
 7/29/99 18.8 
 8/17/99 18.7 
 8/17/99 18.7 
 1/27/99 19.3 
 5/29/99 18.9 
 7/16/99 18.8 
 8/23/99 18.7 
 9/2/99 18.7 
 2/10/99 19.2 
 8/7/99 18.7 
 1/19/99 19.3 
 9/24/99 18.6 
 7/10/99 18.8 
 5/31/99 18.9 
 9/25/99 18.6 
 6/3/99 18.9 
 5/20/99 19.0 
 6/26/99 18.9 
 8/28/99 18.7 
 9/27/99 18.6 
 9/9/99 18.6 
 7/18/99 18.8 
 8/6/99 18.7 
 10/17/99 18.5 
 4/22/99 19.0 
 1/29/99 19.3 
 10/15/99 18.5 
 10/18/99 18.5 
 7/15/99 18.8 
 10/21/99 18.5 
 7/29/99 18.8 
 8/20/99 18.7 
 8/20/99 18.7 
 5/25/99 18.9 
 9/15/99 18.6 
 8/14/99 18.7 
 9/28/99 18.6 
 10/28/99 18.5 
 5/9/99 19.0 
 10/24/99 18.5 
 8/11/99 18.7 
 5/4/99 19.0 
 10/2/99 18.6 
 11/26/99 18.4 
 11/9/99 18.5 
 4/17/99 19.0 
;

proc sql;
create table groups as
select
    a.id, 
    a.birthdate as startDate, 
    intnx("month",a.birthdate,6,"same") as endDate,
    count(*) as grpSize
from
    have as a inner join
    have as b 
        on b.birthdate between a.birthdate and intnx("month",a.birthdate,6,"same")
group by a.id, startDate, endDate
order by grpSize desc;
quit;

data want;
if _n_ = 1 then set groups;
set have;
if startDate <= birthdate <= endDate;
keep id birthdate age;
run; 

I don't understand your second request.

 

PG
Esteemed Advisor
Posts: 5,541

Re: Working with SAS date to create different data

Here is a faster way to do the same, without SQL:

 

proc sort data=have out=sortedDates(keep=birthDate); by birthdate; run;

data grps;
set sortedDates(rename=birthDate=startDate);
endDate = intnx("month",startDate,6,"same");
do while(birthDate <= endDate and not done);
    set sortedDates end=done;
    last + 1;
    end;
grpSize = last - _n_ + 1;
output;
if done then stop;
keep startDate endDate grpSize;
run;

proc means data=grps noprint;
output out=maxGrp maxid(grpSize(startDate endDate))=startDate endDate;
run;

data want;
if _n_ = 1 then set maxGrp;
set have;
if startDate <= birthdate <= endDate;
keep id birthdate age;
run; 
PG
Super Contributor
Posts: 328

Re: Working with SAS date to create different data

I try to use the faster approach with sql but had issue;

 

proc sort data=DFW out=sortedDates; by birthdate; run;

data grps;
set sortedDates(rename=birthDate=startDate);
endDate = intnx("month",startDate,42,"same");
do while(birthDate <= endDate and not done);
set sortedDates end=done;
last + 1;
end;
grpSize = last - _n_ + 1;
output;
if done then stop;
*keep startDate endDate grpSize;
run;
data grps (rename=(PtMRN=id));
run;

proc means data=grps noprint;
output out=maxGrp maxid(grpSize(startDate endDate))=startDate endDate;
run;

data want;
if _n_ = 1 then set maxGrp;
set have;
if startDate <= birthdate <= endDate;
*keep PtMRN birthdate age;
run;

 

129 data grps (rename=(PtMRN=id));
130 run;

WARNING: The variable PtMRN in the DROP, KEEP, or RENAME list has never been referenced.
NOTE: The data set WORK.GRPS has 1 observations and 0 variables.
NOTE: DATA statement used (Total process time):
real time 0.01 seconds
cpu time 0.01 seconds


131
132 proc means data=grps noprint;
133 output out=maxGrp maxid(grpSize(startDate endDate))=startDate endDate;
ERROR: Variable GRPSIZE not found.
ERROR: Variable STARTDATE not found.
ERROR: Variable ENDDATE not found.
134 run;

NOTE: The SAS System stopped processing this step because of errors.
WARNING: The data set WORK.MAXGRP may be incomplete. When this step was stopped there were 0
observations and 0 variables.
WARNING: Data set WORK.MAXGRP was not replaced because this step was stopped.
NOTE: PROCEDURE MEANS used (Total process time):
real time 0.00 seconds
cpu time 0.00 seconds

 

Esteemed Advisor
Posts: 5,541

Re: Working with SAS date to create different data

Posted in reply to desireatem

You are not missing by much. Try it this way:

 

proc sort data=DFW out=sortedDates(keep=birthDate); by birthdate; run;

data grps;
set sortedDates(rename=birthDate=startDate);
endDate = intnx("month",startDate,42,"same");
do while(birthDate <= endDate and not done);
 set sortedDates end=done;
 last + 1;
 end;
grpSize = last - _n_ + 1;
output;
if done then stop;
keep startDate endDate grpSize;
run;

proc means data=grps noprint;
output out=maxGrp maxid(grpSize(startDate endDate))=startDate endDate;
run;

data want;
if _n_ = 1 then set maxGrp;
set DFW;
if startDate <= birthdate <= endDate;
keep PtMRN birthdate age;
run;
PG
Super Contributor
Posts: 328

Re: Working with SAS date to create different data

Thank you, works well!!!!

Super Contributor
Posts: 328

Re: Working with SAS date to create different data

So if I want the largest set of birthday with 3.5 years, i will replace 6 by 42 in this statements:

 

intnx("month",a.birthdate,6,"same") as endDate

 

and

 

on b.birthdate between a.birthdate and intnx("month",a.birthdate,6,"same")

???? 

Esteemed Advisor
Posts: 5,541

Re: Working with SAS date to create different data

Posted in reply to desireatem

Yes. I used 6 months as an example, since your sample data didn't cover 3.5 years.

PG
☑ This topic is solved.

Need further help from the community? Please ask a new question.

Discussion stats
  • 11 replies
  • 173 views
  • 2 likes
  • 4 in conversation