generate sequence

molla · Posted 11-21-2016 07:21 AM

Hi,

I have the following dataset

student_id
1
2
3
4
1
2
3
1
2

I need the output as

student_id  seq_num
1           1
2           2
3           3
4           4
1           1
2           2
3           3
1           1
2           2
3           3
1           1
2           2

Kurt_Bremser · Posted 11-21-2016 07:29 AM

Ahem,

seq_num = student_id;

?


RW9 · Posted 11-21-2016 07:43 AM

Not sure what your question here is. You have one variable and need to create another variable which is exactly the same as the variable you have? Maybe your required output is not correct, and you need the sequence within each by group? If so:

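data want;
  set have;
  /* BY-group processing; assumes HAVE is sorted by student_id */
  by student_id;
  /* restart the counter on the first row of each group */
  if first.student_id then seq_num=0;
  /* sum statement: seq_num is retained across rows */
  seq_num+1;
run;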

mkeintz · Posted 11-21-2016 08:01 AM

I think what you really want is to track the order in which ids first appear, yes? I.e., if you have

id
AAA
A21
BBB
CCC
A21
BBB
AAA
...

then do you want id=AAA seq=1, id=A21 seq=2, id=BBB seq=3, id=CCC seq=4?

If so then this would work:

data want (drop=rc);
  set have;
  if _n_=1 then do;
    declare hash id_lookup();
      id_lookup.definekey('id');
      id_lookup.definedata('seq');
      id_lookup.definedone();
  end;
  rc=id_lookup.find();
  if rc^=0 then do;
    seq=id_lookup.num_items+1;
    rc=id_lookup.add();
  end;
run;
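
To try the step above, here is a minimal sketch of a test input. The dataset name have and the variable id are the ones the step already uses; the rows are just the sample values listed earlier in this post (the "..." part is left out), and the $ informat assumes ids of at most 8 characters.

```sas
/* Test input only: the sample ids from the post, in the same order */
data have;
  input id $;   /* character id, default length 8 (an assumption) */
  datalines;
AAA
A21
BBB
CCC
A21
BBB
AAA
;
run;
```

With this input the hash step should assign seq=1 to AAA, 2 to A21, 3 to BBB and 4 to CCC, and the repeated ids in rows 5 to 7 pick up the value they were first given (2, 3 and 1).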

--------------------------
The hash OUTPUT method will overwrite a SAS data set, but not append. That can be costly. Consider voting for Add a HASH object method which would append a hash object to an existing SAS data set

Would enabling PROC SORT to simultaneously output multiple datasets be useful? Then vote for
Allow PROC SORT to output multiple datasets

--------------------------

Astounding · Posted 11-21-2016 10:33 AM

Here's an idea:

data want;
  set have;
  seq_num + 1;
  if student_id < lag(student_id) then seq_num=1;
run;

It's not 100% clear when to start the SEQ_NUM values over again, but this might be sufficient.
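
For reference, a small sketch of the original poster's input as a data step, so the step above can be run as is. The dataset name have matches the code above; the nine values are copied from the question.

```sas
/* Test input only: the nine student_id values from the question */
data have;
  input student_id;
  datalines;
1
2
3
4
1
2
3
1
2
;
run;
```

On this input the counter restarts whenever student_id drops below the previous row's value, so seq_num comes out as 1, 2, 3, 4, 1, 2, 3, 1, 2.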

mkeintz · Posted 11-21-2016 10:49 AM

I think the OP wants to set up a lookup table, in which each ID is assigned a unique sequence number, with this additional property: the sequence number will rank the order of first appearance of the ID.
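
If a standalone lookup table is the goal, something along these lines might do it. This is only a sketch: it assumes the input is called have with a variable id, and the output name lookup is made up for illustration. Note that id is listed as a data item as well as the key, because the hash OUTPUT method writes only data items.

```sas
data _null_;
  /* id is both key and data so that OUTPUT writes it to the table */
  declare hash id_lookup();
  id_lookup.definekey('id');
  id_lookup.definedata('id', 'seq');
  id_lookup.definedone();
  /* read all of HAVE within a single data step iteration */
  do until (done);
    set have end=done;
    if id_lookup.find() ne 0 then do;
      seq = id_lookup.num_items + 1;   /* next unused sequence number */
      rc = id_lookup.add();
    end;
  end;
  /* write one row per distinct id; LOOKUP is a hypothetical name */
  rc = id_lookup.output(dataset: 'lookup');
  stop;
run;
```

The row order of the output table is not guaranteed here, so sort it by seq afterwards if the order of first appearance matters when reading it.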

MK

