I have a data similar to one below
ID | Date |
---|---|
1 | 11jul2009 |
1 | 06jul2010 |
1 | 01jul2012 |
2 | 17jul2008 |
2 | 01jul2001 |
3 | 06jul2001 |
3 | 11jul2005 |
3 | 01jul2011 |
4 | 17jul2011 |
4 | 01jul2012 |
5 | 11jul2006 |
6 | 01jul2000 |
6 | 17jul2003 |
6 | 01jul2009 |
7 | 17jul2004 |
I am trying to create a data that looks like below
ID | Dt1 | Dt2 |
---|---|---|
1 | 11jul2009 | 01jul2012 |
2 | 17jul2008 | 01jul2011 |
3 | 06jul2001 | 01jul2011 |
4 | 17jul2011 | 01jul2012 |
5 | 11jul2006 | 11jul2006 |
6 | 01jul2000 | 01jul2009 |
7 | 17jul2004 | 17jul2004 |
I short I am trying to summarize the minimum and maximum dates by id. Note: the number of times an id repeats is not same. ID1 has 3 rows, ID2 has 2 rows, ID5 has 1 row. I tried the lag function but it's not giving the expected results.
Thanks.
hi ... maybe SQL ...
data x;
input id date :date9. @@;
format date date9.;
datalines;
1 11jul2009 1 06jul2010 1 01jul2012
2 17jul2008 2 01jul2001
3 06jul2001 3 11jul2005 3 01jul2011
4 17jul2011 4 01jul2012
5 11jul2006
6 01jul2000 6 17jul2003 6 01jul2009
7 17jul2004
;
proc sql;
create table minmax as
select id, min(date) as dt1 format=date9., max(date) as dt2 format=date9.
from x
group by id;
quit;
Thanks. But my data has dates in rows not in multiple columns.
Hi ... my solution creates a data set with dates in one variable named DATE (the way the raw data looks in the data step has no relationship to the way the data set looks)
(curious ... do you know how an INPUT statement that ends with "@@" works)
also, MISSING values would present no problem with the SQL solution I posted
data x;
input id date :date9. @@;
format date date9.;
datalines;
1 11jul2009 1 06jul2010 1 01jul2012
2 17jul2008 2 01jul2001
3 06jul2001 3 11jul2005 3 01jul2011
4 17jul2011 4 01jul2012
5 11jul2006
6 01jul2000 6 17jul2003 6 01jul2009
7 17jul2004
;
proc print data=x;
run;
Obs id date
1 1 11JUL2009
2 1 06JUL2010
3 1 01JUL2012
4 2 17JUL2008
5 2 01JUL2001
6 3 06JUL2001
7 3 11JUL2005
8 3 01JUL2011
9 4 17JUL2011
10 4 01JUL2012
11 5 11JUL2006
12 6 01JUL2000
13 6 17JUL2003
14 6 01JUL2009
15 7 17JUL2004
MikeZdeb,
My mistake. I overlooked the @@ in the INPUT statement. It worked.
Since you're using the lag function, I presume you are using a DATA step, so in that vein here is the classic DATA step solution ("classic" in the sense that it's fhe first technique I became acquainted with):
data want (drop=date);
set have ;
by id;
retain mindate;
if first.id then mindate=date;
if last.id then do;
maxdate=date;
output;
end;
run;
Note this assumes there are no missing date values and the data is sorted by ID DATE.
It worked. Lukcily I do not have missing data. Any suggestons if my data has missing values? Just curious?
Thanks.
Certainly. Works with missing numbers too:
data have;
input id date :date9.;
datalines;
1 11jul2009
1 06jul2010
1 01jul2012
2 17jul2008
2 01jul2001
3 .
3 .
3 .
4 17jul2011
4 .
5 11jul2006
6 01jul2000
6 .
6 .
7 17jul2004
;
run;
RESULTS:
1 | 11JUL2009 | 01JUL2012 |
2 | 17JUL2008 | 01JUL2001 |
3 | . | . |
4 | 17JUL2011 | . |
5 | 11JUL2006 | 11JUL2006 |
6 | 01JUL2000 | . |
7 | 17JUL2004 | 17JUL2004 |
If you have a missing value for date, and if the data are sorted by ID then DATE, it would mean that the first date(s) is missing (all missing values compare as lower than the most negative valid number).
So you could change the single line
if first.id then mindate=date;
to two lines:
if first.id then mindate=date;
if mindate=. then mindate=date;
...and a Hash solution:
data _null_;
if (1=2) then set have;
format date1 date2 date9.; call missing(date1,date2);
declare hash dt (ordered:'a');
dt.defineKey("id");
dt.defineData("id","date1","date2");
dt.defineDone();
do until(done);
set have end=done;
by id notsorted;
if first.id then do;
date1=date;
rc=dt.replace();
end;
if last.id then do;
date2=date;
rc=dt.replace();
end;
end;
dt.output(dataset:"want");
stop;
run;
Are you ready for the spotlight? We're accepting content ideas for SAS Innovate 2025 to be held May 6-9 in Orlando, FL. The call is open until September 25. Read more here about why you should contribute and what is in it for you!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.