BookmarkSubscribeRSS Feed
☑ This topic is solved. Need further help from the community? Please sign in and ask a new question.
raja777pharma
Fluorite | Level 6

Hi Team 

 

Assign Sequence Based on Record Count:

  • If there's only 1 record in a group (count = 1), assign seq = 1.
  • If there are 2 records in a group (count = 2), start from 2 and then 1.
  • If there are 3 or more records (count >= 3), start from 3, then 2, and then 1.

Like below 

Obs   group_id   record_id   seq
--------------------------------
1      1         1           3
2      1         2           2
3      1         3           1
4      1         4           2
5      1         5           1
6      2         1           2
7      2         2           1
8      3         1           2
9      3         2           1

Thank you,

Raja

1 ACCEPTED SOLUTION

Accepted Solutions
Patrick
Opal | Level 21

Below code returns seq_want as per your sample data.

data have;
  input Obs group_id record_id seq;
  datalines;
1 1 1 3
2 1 2 2
3 1 3 1
4 1 4 2
5 1 5 1
6 2 1 2
7 2 2 1
8 3 1 2
9 3 2 1
;
run;

data want(drop=max_record_id);
  do until(last.group_id);
    set have;
    by group_id;
    if last.group_id then
      max_record_id=record_id;
  end;
  
  seq_want=min(3,max_record_id);
  do until(last.group_id);
    set have;
    by group_id;
    output;
    if seq_want=1 then
      seq_want=min(3, max_record_id-record_id);
    else
      seq_want=seq_want-1;
  end;
run;

proc print data=want;
run;

Patrick_0-1740881483794.png

 

@raja777pharma And about providing a SAS data step that creates the sample data below how I converted what you shared using Copilot. The data have step I used above is a copy/paste from the Copilot response.

Patrick_0-1740881907909.png

 

View solution in original post

5 REPLIES 5
PaigeMiller
Diamond | Level 26

Please provide the sample data in a usable form, as shown in these instructions and examples, and as @Ksharp showed in your last thread. We shouldn't have to do this for you. Screen captures or copy and paste from Excel are not acceptable.

--
Paige Miller
ballardw
Super User

You need to expand on your rules about why you have a second (or apparently many  more) 2 and 1 in the Seq values for Groupid. Your rule as stated does not say anything about repeating 2 and 1 and whether you might have a 2 without a 1 such as with only 4 values of groupid:

Obs   group_id   record_id   seq
--------------------------------
1      1         1           3
2      1         2           2
3      1         3           1
4      1         4           2

Or if something other than 2 would be the Seq value for Obs 4. Complete rules are needed to get desired results.

 

Also, is your data already grouped by the values of Groupid? And is your Record_id actually a sequential value as shown all the time and sorted as shown? This question goes to one way to get the "count" without using an external procedure to generate the counts and merge back to the data.

 


@raja777pharma wrote:

Hi Team 

 

Assign Sequence Based on Record Count:

  • If there's only 1 record in a group (count = 1), assign seq = 1.
  • If there are 2 records in a group (count = 2), start from 2 and then 1.
  • If there are 3 or more records (count >= 3), start from 3, then 2, and then 1.

Like below 

Obs   group_id   record_id   seq
--------------------------------
1      1         1           3
2      1         2           2
3      1         3           1
4      1         4           2
5      1         5           1
6      2         1           2
7      2         2           1
8      3         1           2
9      3         2           1

Thank you,

Raja


 

Patrick
Opal | Level 21

Below code returns seq_want as per your sample data.

data have;
  input Obs group_id record_id seq;
  datalines;
1 1 1 3
2 1 2 2
3 1 3 1
4 1 4 2
5 1 5 1
6 2 1 2
7 2 2 1
8 3 1 2
9 3 2 1
;
run;

data want(drop=max_record_id);
  do until(last.group_id);
    set have;
    by group_id;
    if last.group_id then
      max_record_id=record_id;
  end;
  
  seq_want=min(3,max_record_id);
  do until(last.group_id);
    set have;
    by group_id;
    output;
    if seq_want=1 then
      seq_want=min(3, max_record_id-record_id);
    else
      seq_want=seq_want-1;
  end;
run;

proc print data=want;
run;

Patrick_0-1740881483794.png

 

@raja777pharma And about providing a SAS data step that creates the sample data below how I converted what you shared using Copilot. The data have step I used above is a copy/paste from the Copilot response.

Patrick_0-1740881907909.png

 

Tom
Super User Tom
Super User

You appear to want to create SUBGROUPs with a maximum size of 3.

 

First let's convert your LISTING into an actual dataset.  Let's use a different name for the variable you wanted to create, say EXPECT,

data have;
  input group_id record_id expect;
cards;
1 1 3
1 2 2
1 3 1
1 4 2
1 5 1
2 1 2
2 2 1
3 1 2
3 2 1
;

Now just create SUBGROUPS of maximum size of 3.  We can do it one data step by reading the data twice.  Once to discover the new subgroups and their size and then a second time to read in the data again so we can attach the new variable counting down instead of UP.

data want;
  do _n_ = 1 by 1;
    set have;
    by group_id;
    if first.group_id then subgroup=0;
    if _n_=3 or last.group_id then leave;
  end;
  subgroup+1;
  do seq=_n_ to 1 by -1 ;
    set have;
    output;
  end;
run;

Results:

                   record_
Obs    group_id       id      expect    subgroup    seq

 1         1          1          3          1        3
 2         1          2          2          1        2
 3         1          3          1          1        1
 4         1          4          2          2        2
 5         1          5          1          2        1
 6         2          1          2          1        2
 7         2          2          1          1        1
 8         3          1          2          1        2
 9         3          2          1          1        1
mkeintz
PROC Star

Pass through each GROUP_ID twice.  The first pass to generate a total count, the second pass to generate the descending WANT_SEQ:

 

data have;
  input Obs group_id record_id seq;
  datalines;
1 1 1 3
2 1 2 2
3 1 3 1
4 1 4 2
5 1 5 1
6 2 1 2
7 2 2 1
8 3 1 2
9 3 2 1
run;

data want (drop=_:) ;
  set have (in=firstpass)
      have (in=secondpass);
  by group_id;

  if first.group_id then call missing(_n_remaining,want_seq);
  if firstpass then _n_remaining+1;

  if secondpass;
  if want_seq in(.,1) then want_seq=min(_n_remaining,3);
  else want_seq +(-1);
  _n_remaining+(-1);
run;
--------------------------
The hash OUTPUT method will overwrite a SAS data set, but not append. That can be costly. Consider voting for Add a HASH object method which would append a hash object to an existing SAS data set

Would enabling PROC SORT to simultaneously output multiple datasets be useful? Then vote for
Allow PROC SORT to output multiple datasets

--------------------------

hackathon24-white-horiz.png

The 2025 SAS Hackathon has begun!

It's finally time to hack! Remember to visit the SAS Hacker's Hub regularly for news and updates.

Latest Updates

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

SAS Training: Just a Click Away

 Ready to level-up your skills? Choose your own adventure.

Browse our catalog!

Discussion stats
  • 5 replies
  • 1301 views
  • 6 likes
  • 6 in conversation