DATA Step, Macro, Functions and more

proc sql and group to separate out instances chronologically

Reply
Contributor
Posts: 40

proc sql and group to separate out instances chronologically

Consider the following SAS code:

data test;
format date date9.
ctry_cd $2.
sn $2.;
input ctry_cd sn date;
datalines;
US 1 20000
US 1 20001
US 1 20002
CA 1 20003
CA 1 20004
US 1 20005
US 1 20006
US 1 20007
ES 2 20001
ES 2 20002
;
run;

proc sql;
create table check as
select
sn,
ctry_cd,
min(date) as begin_dt,
max(date) as end_dt
from test
group by sn, ctry_cd;
quit;

This returns:

1 CA 07OCT2014 08OCT2014
1 US 04OCT2014 11OCT2014
2 ES 05OCT2014 06OCT2014

I would like for the `proc sql` distinguish between the country moves; that is, return

1 US 04OCT2014 06OCT2014
1 CA 07OCT2014 08OCT2014
1 US 09OCT2014 11OCT2014
2 ES 05OCT2014 06OCT2014

So it still groups the instances by sn and ctry_nm but pays attention to the date so I have a timeline.

 

Super User
Posts: 19,862

Re: proc sql and group to separate out instances chronologically

Proc means to the rescue!

The NOTSORTED option will allow it to process the data in the order it encounters it, so this does assume your data is in the order you want it. 

 

proc means data=test min max;
by sn ctry_cd notsorted;
var date;
ods output summary=want;
run;

proc print data=want;
format date: date9.;
run;

Another option would be to create a group identifier for each group within the dataset, using a data step and then apply a proc sql step. 

Respected Advisor
Posts: 4,932

Re: proc sql and group to separate out instances chronologically

You need a data step or a data view to deal with sequential aspect of your processing. Here is how it can be done with a data view:

 

data test;
format date date9.
ctry_cd $2.
sn $2.;
input ctry_cd sn date;
datalines;
US 1 20000
US 1 20001
US 1 20002
CA 1 20003
CA 1 20004
US 1 20005
US 1 20006
US 1 20007
ES 2 20001
ES 2 20002
;

data testSeq / view=testSeq;
set test; by ctry_cd notsorted;
seq + first.ctry_cd;
run;

proc sql;
create table check as
select 
    sn, 
    ctry_cd,
    min(date) as begin_dt format=date9.,
    max(date) as end_dt format=date9.
from testSeq
group by seq, sn, ctry_cd;
drop view testSeq;
quit;
PG
Ask a Question
Discussion stats
  • 2 replies
  • 202 views
  • 0 likes
  • 3 in conversation