BookmarkSubscribeRSS Feed
☑ This topic is solved. Need further help from the community? Please sign in and ask a new question.
joshCRF
Fluorite | Level 6

I'm wanting to create a sequencing variable that can be used to distinguish two different entities within the same grouping (if that makes sense!)

Below is a screenshot of the data I'm using (will try upload some actual datalines when it works) and the column highlighted in yellow is how I would like the sequencing variable to be grouped. So the data is currently grouped by SUBJECT, TERM and START. I'd like the sequencing variable to indicate that there are effectively two different instances within the same group of SUBJECT and TERM.

CFLG indicates where these terms are linked, and as the data below shows there is a linking for the first 3 records, but then doesn't link again till the last two. So would like to indicate that there are two instances within this with the use of the NEW SEQUENCE VAR that I would like to create.

group seq example.JPG

 

The dataset I'm using is currently as follows (understand this is very barebones at the moment):

 

data example2;
set example;
by subject term start;

if first.subject then newseq=0;

run;


I've tried different variations of if first.cflg or first.term etc, just to absolutely no avail. It's probably the simplest thing I'm missing.

 

Any support would be greatly appreciated!

1 ACCEPTED SOLUTION

Accepted Solutions
Patrick
Opal | Level 21

Based on your picture something like below should work (not tested).

data example2;
  set example;
  by subject term start;

  lag_cflg=lag(cflg);

  if first.term then r_newseq=cflg;
  else if cflg=1 and missing(lag(cflg)) then r_newseq+1;

  if cflg=1 then newseq=r_newseq;

  drop lag_cflg r_newseq;
run;

View solution in original post

3 REPLIES 3
Patrick
Opal | Level 21

Based on your picture something like below should work (not tested).

data example2;
  set example;
  by subject term start;

  lag_cflg=lag(cflg);

  if first.term then r_newseq=cflg;
  else if cflg=1 and missing(lag(cflg)) then r_newseq+1;

  if cflg=1 then newseq=r_newseq;

  drop lag_cflg r_newseq;
run;
mkeintz
PROC Star

It appears to me that observations come in clusters of CFLG=1 and CFLG=..  For each cluster with CFLG=1 you want a constant NEW_SEQ_VAR value for every obs in the cluster.   You want the first such cluster to have NEW_SEQ_VAR=1, the second to have NEW_SEQ_VAR=2, etc.

 

And apparently, with each new SUBJECT, you want to reset the first CLFG=1 group to restart with NEW_SEQ_VAR=1.

 

If that's the case, then the BY statement needs only SUBJECT and CFLG, as below (untested in the absence of a working DATA step with sample data):

 

data want (drop=_:);
  set have;
  by subject cflg notsorted;
  if first.subject then _cflg_group=0;
  if first.cflg=1 and cflg=1 then _cflg_group+1;
  if cflg=1 then new_seq_var=_cflg_group;
run;

 

The data may be sorted by TERM and START within SUBJECT, but including TERM and START in the BY statement would be problematic.

--------------------------
The hash OUTPUT method will overwrite a SAS data set, but not append. That can be costly. Consider voting for Add a HASH object method which would append a hash object to an existing SAS data set

Would enabling PROC SORT to simultaneously output multiple datasets be useful? Then vote for
Allow PROC SORT to output multiple datasets

--------------------------
Ksharp
Super User
/*That would be better if you post some data by SAS code.*/
data have;
input subject id cflg;
cards;
1 1 1
1 2 1
1 3 1
1 4 .
1 5 .
1 6 .
1 7 .
1 8 .
1 9 .
1 10 .
1 11 1
1 12 1
2 1 1
2 2 1
2 3 1
2 4 .
2 5 .
;
data want(drop=count);
 set have;
 by subject;
 if first.subject then count=0;
 if first.subject or (cflg=1 and missing(lag(cflg)) ) then count+1;
 if cflg=1 then want=count;
run;

sas-innovate-2024.png

Available on demand!

Missed SAS Innovate Las Vegas? Watch all the action for free! View the keynotes, general sessions and 22 breakouts on demand.

 

Register now!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 3 replies
  • 357 views
  • 5 likes
  • 4 in conversation