BookmarkSubscribeRSS Feed
s_manoj
Quartz | Level 8

Hi all,

 here's my dataset

 

data have;
input id event$ start end;
informat start end ddmmyy10.;
format start end date10.;
cards;
1 Headache 30-01-2017 08-02-2017
1 Headache 25-01-2017 03-02-2017
1 vomiting 04-02-2017 05-02-2017 4 Nausea 20-11-2017 31-12-2017 4 Nausea 03-11-2017 05-11-2017 ; run;

i am trying to get cartesian type product within a group i.e combination of all rows within a group(BY id event;).

 

 

Expecting output by using DATA STEP, NOT BY PROC SQL

 

id

Event

Start

End

Start_date

End_date

1

Headache

30JAN2017

08FEB2017

30JAN2017

08FEB2017

1

Headache

30JAN2017

08FEB2017

25JAN2017

03FEB2017

1

Headache

25JAN2017

03FEB2017

30JAN2017

08FEB2017

1

Headache

25JAN2017

03FEB2017

25JAN2017

03FEB2017

1

Vomiting 

04FEB2017

05FEB2017

04FEB2017

05FEB2017

4

Nausea

20NOV2017

31DEC2017

20NOV2017

31DEC2017

4

Nausea

20NOV2017

31DEC2017

03NOV2017

05NOV2017

4

Nausea

03NOV2017

05NOV2017

20NOV2017

31DEC2017

4

Nausea

03NOV2017

05NOV2017

03NOV2017

05NOV2017

 

Thank you in advance,

 

 

7 REPLIES 7
ballardw
Super User

Why do you want to specify the tool when Proc SQL does this much easier?

art297
Opal | Level 21

I agree with @ballardw! It would be much easier using proc sql. However, here is one way you could do it using a datastep:

 

data have;
  input id event $ start end;
  informat start end ddmmyy10.;
  format start end date10.;
  cards;
1 Headache 30-01-2017 08-02-2017
1 Headache 25-01-2017 03-02-2017
1 vomiting 04-02-2017 05-02-2017
4 Nausea 20-11-2017 31-12-2017
4 Nausea 03-11-2017 05-11-2017
;

data every_combination;
  do until (last.event);
    set have;
    by id event;
    i+1;
    if first.event then j=i;
    if last.event then n=i;
  end;
  do until (last.event);
    set have;
    by id event;
    output;
  end;
run;

data every_combination (drop=i j n);
  set every_combination;
  do k=j to n;
    set have (rename=(start=start_date end=end_date)) point=k;
    output;
  end;
run;

Art, CEO, AnalystFinder.com

 

novinosrin
Tourmaline | Level 20
data have;
infile cards truncover;
informat id $1. event $15. start end ddmmyy10.;
input id event  start end;
format start end date10.;
cards;
1 Headache 30-01-2017 08-02-2017
1 Headache 25-01-2017 03-02-2017 
1 vomiting 04-02-2017 05-02-2017
4 Nausea 20-11-2017 31-12-2017
4 Nausea 03-11-2017 05-11-2017
;
run;

data want;
set have;
do _n_=1 to nobs;
set have(rename=(id=_id event=_event start=start_date end=end_date)) nobs=nobs point=_n_;
if id=_id and event=_event then output;
end;
drop _:;
run;
art297
Opal | Level 21

@novinosrin: Methinks: bad suggestion! While your suggested solution is definitely less complex than the one I offered, it doesn't meet an acceptable performance level (unless the file only contains the five example records posted).

On a relatively small file (50,000) records your code runs 5,053 times slower than the code I proposed. That time grows exponentially with even more records in the dataset.

 

Art, CEO, AnalystFinder.com

 

novinosrin
Tourmaline | Level 20

Thank you Sir @art297, will this one compete with yours?

 

data have;
infile cards truncover;
informat id $1. event $15. start end ddmmyy10.;
input id event  start end;
format start end date10.;
cards;
1 Headache 30-01-2017 08-02-2017
1 Headache 25-01-2017 03-02-2017 
1 vomiting 04-02-2017 05-02-2017
4 Nausea 20-11-2017 31-12-2017
4 Nausea 03-11-2017 05-11-2017
;
run;
data want;
if _n_=1 then do;
if 0 then set have(rename=(start=start_date end=end_date));
 dcl hash h(dataset:'have(rename=(start=start_date end=end_date))', multidata: 'y', ordered: 'y');
 h.definekey('id','event');
 h.definedata(all:'y');
 h.definedone();
 end;
 set have;
 by id event;
 do while(h.do_over(key:id,key:event) eq 0);
  output;
end;
run;
art297
Opal | Level 21

Much better! Runs slightly slower than the DOW approach (.06 seconds vs .05 seconds).

 

FWIW: Here is the code I ran to compare the three methods:

data have;
  input id event $ start end;
  informat start end ddmmyy10.;
  format start end date10.;
  if id=1 then do id=1 to 10000;
    output;
  end;
  else if id=4 then do id=10001 to 20000;
    output;
  end;
  cards;
1 Headache 30-01-2017 08-02-2017
1 Headache 25-01-2017 03-02-2017
1 vomiting 04-02-2017 05-02-2017
4 Nausea 20-11-2017 31-12-2017
4 Nausea 03-11-2017 05-11-2017
;

proc sort data=have;
  by id event;
run;

data every_combination;
  do until (last.event);
    set have;
    by id event;
    i+1;
    if first.event then j=i;
    if last.event then n=i;
  end;
  do until (last.event);
    set have;
    by id event;
    output;
  end;
run;

data every_combination (drop=i j n);
  set every_combination;
  do k=j to n;
    set have (rename=(start=start_date end=end_date)) point=k;
    output;
  end;
run;

data want;
  set have;
  do _n_=1 to nobs;
  set have(rename=(id=_id event=_event start=start_date end=end_date)) nobs=nobs point=_n_;
  if id=_id and event=_event then output;
  end;
  drop _:;
run;

data want;
if _n_=1 then do;
if 0 then set have(rename=(start=start_date end=end_date));
 dcl hash h(dataset:'have(rename=(start=start_date end=end_date))', multidata: 'y', ordered: 'y');
 h.definekey('id','event');
 h.definedata(all:'y');
 h.definedone();
 end;
 set have;
 by id event;
 do while(h.do_over(key:id,key:event) eq 0);
  output;
end;
run;

Art, CEO, AnalystFinder.com

 

novinosrin
Tourmaline | Level 20

@art297 Thank you Sir for the interactive feedback. I am afraid  at Depaul college lab, we can't operate(test) with large datasets and so your inputs(if and when you have time) does matter big time.  But I think 50,000 records should be possible for me to test which I will do so in future.  Thanks as always. I can't appreciate enough

hackathon24-white-horiz.png

2025 SAS Hackathon: There is still time!

Good news: We've extended SAS Hackathon registration until Sept. 12, so you still have time to be part of our biggest event yet – our five-year anniversary!

Register Now

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

SAS Training: Just a Click Away

 Ready to level-up your skills? Choose your own adventure.

Browse our catalog!

Discussion stats
  • 7 replies
  • 1399 views
  • 1 like
  • 4 in conversation