BookmarkSubscribeRSS Feed
☑ This topic is solved. Need further help from the community? Please sign in and ask a new question.
stataq
Obsidian | Level 7

Hello,

I have a have dataset, I want to create a group seq based on:

if id is the same;

if yn=1 and seq are connected to each other, then they belong to same group;

How can I create new_seq? 

 

stataq_0-1704985902761.png

 

sample data:

data have ;
input id seq yn ;
cards;
1 1 .
1 2 .
1 3 1
1 4 1
1 5 .
1 6 .
1 7 .
1 8 1
1 9 1
1 10 1
1 11 .
1 12 .
1 13 1
1 14 1
1 15 .
1 16 1
1 17 1
2 1 1
2 2 .
2 3 1
2 4 1
2 5 .
2 6 .
2 7 1
2 8 1
2 9 .

;

 

1 ACCEPTED SOLUTION

Accepted Solutions
Quentin
Super User

If you can trust that your data are sorted by ID and seq, then you can can use by-group processing to detect the start of each group with yn=1, and increment a counter, e.g.:

 

data have ;
  input id seq yn ;
cards;
1 1 .
1 2 .
1 3 1
1 4 1
1 5 .
1 6 .
1 7 .
1 8 1
1 9 1
1 10 1
1 11 .
1 12 .
1 13 1
1 14 1
1 15 .
1 16 1
1 17 1
2 1 1
2 2 .
2 3 1
2 4 1
2 5 .
2 6 .
2 7 1
2 8 1
2 9 .
;         

data want ;
  set have ;
  by id yn notsorted ;
  if first.id then _seqnum=0 ;
  if first.yn and yn=1 then _seqnum++1 ;
  if yn=1 then new_seq=_seqnum ;
  * drop _: ;
run ;

 

BASUG is hosting free webinars Next up: Don Henderson presenting on using hash functions (not hash tables!) to segment data on June 12. Register now at the Boston Area SAS Users Group event page: https://www.basug.org/events.

View solution in original post

4 REPLIES 4
PaigeMiller
Diamond | Level 26
data want;
    set have;
    by id seq;
    prev_yn=lag(yn);
    prev_id=lag(id);
    if first.id then new_seq=.;
    if id=prev_id and yn=1 and prev_yn=. then new_seq+1;
    if yn=. then new_seq_plus_missing=.;
    else new_seq_plus_missing=new_seq;
    drop prev:;
run;
--
Paige Miller
Quentin
Super User

Hi @PaigeMiller ,

Looks like that misses the first record for ID2, which has yn=1 so is a new seq.  Maybe replace

    if first.id then new_seq=.;

with

    if first.id then new_seq=(yn=1);

?

BASUG is hosting free webinars Next up: Don Henderson presenting on using hash functions (not hash tables!) to segment data on June 12. Register now at the Boston Area SAS Users Group event page: https://www.basug.org/events.
Quentin
Super User

If you can trust that your data are sorted by ID and seq, then you can can use by-group processing to detect the start of each group with yn=1, and increment a counter, e.g.:

 

data have ;
  input id seq yn ;
cards;
1 1 .
1 2 .
1 3 1
1 4 1
1 5 .
1 6 .
1 7 .
1 8 1
1 9 1
1 10 1
1 11 .
1 12 .
1 13 1
1 14 1
1 15 .
1 16 1
1 17 1
2 1 1
2 2 .
2 3 1
2 4 1
2 5 .
2 6 .
2 7 1
2 8 1
2 9 .
;         

data want ;
  set have ;
  by id yn notsorted ;
  if first.id then _seqnum=0 ;
  if first.yn and yn=1 then _seqnum++1 ;
  if yn=1 then new_seq=_seqnum ;
  * drop _: ;
run ;

 

BASUG is hosting free webinars Next up: Don Henderson presenting on using hash functions (not hash tables!) to segment data on June 12. Register now at the Boston Area SAS Users Group event page: https://www.basug.org/events.
ballardw
Super User

"Connected" should be defined. If you mean the variable Seq appears in sequential numeric order then say so.

 

 

sas-innovate-2024.png

Available on demand!

Missed SAS Innovate Las Vegas? Watch all the action for free! View the keynotes, general sessions and 22 breakouts on demand.

 

Register now!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 4 replies
  • 440 views
  • 1 like
  • 4 in conversation