Enumeration

SASPhile · Posted 09-30-2020 12:18 AM

A subjid can have multiple time points as shown below

Subjid timepoint

1 0

1 0.25

1 1

1 3

So for each time, a refid needs to be assigned

Subjid timepoint Refid

1 0 001

1 0.25 002

1 1 003

1 3 004

But if the timepoint repeats refid needs to be higher value

Subjid timepoint Refid

1 0 001

1 0.25 002

1 1 003

1 3 004

1 0 005

1 1 006

I tried enumeration logic,but did not work.

Reeza · Posted 09-30-2020 12:51 AM

Your last example is still 1 to 6.....

Please show what you’ve tried so far. I would also recommend extending your logic to more than one ID so the solution works for you.

@SASPhile wrote:

A subjid can have multiple time points as shown below

Subjid timepoint

1 0

1 0.25

1 1

1 3

So for each time, a refid needs to be assigned

Subjid timepoint Refid

1 0 001

1 0.25 002

1 1 003

1 3 004

But if the timepoint repeats refid needs to be higher value

Subjid timepoint Refid

1 0 001

1 0.25 002

1 1 003

1 3 004

1 0 005

1 1 006

I tried enumeration logic,but did not work.

SASPhile · Posted 09-30-2020 02:43 PM

timepoints 0,0.25,1,3,24,72,168,336 takes refid's 001 ,002 to 008.

Not all id's can have all timepoints.
Some id's can have same timepoint more than once.

but if a timepoint exists the first instance it should take it's original refid.

As example, For 122005 timepoint 0.25 has 002 as refid , for 122006 there is no timepoint 0 but it has 0.25, it would take 002 as refid for sep1, and for sep4, timepoint 0 will be 006 and for sep6 timepoint 3
will be 007

Id TimePoint Refid Date
122005 0 001 Sep1
122005 0.25 002 Sep2
122005 1 003 Sep3
122005 3 004 Sep4

122006 0.25 002 Sep1
122006 1 003 Sep2
122006 3 004 Sep3
122006 0 005 Sep4
122006 1 006 Sep5
122006 3 007 Sep6

PaigeMiller · Posted 09-30-2020 02:48 PM

I have to admit I am completely befuddled by your explanation.

All I can see in your examples is that Refid is sequential. You have written a lot of words to replace "sequential", and so I still get the feeling that "sequential" is not what you want, but ...

--
Paige Miller

Kurt_Bremser · Posted 09-30-2020 02:53 PM

So you would first create a ranking of timepoints (spanning over the whole "have" dataset across all id's) from lowest to highest, assign refids according to the ranks, and then use those refids for creating the "want" dataset?

Maxims of Maximally Efficient SAS Programmers
How to convert datasets to data steps
The macro for direct download as ZIP
How to post code
Please vote for Provide Sequential Search Capability for Hash Objects
How to deal with locked files on UNIX

Tom · Posted 09-30-2020 03:00 PM

Sounds like you want to give each distinct value of TIMEPOINT a number.

Do you care what order the numbers are assigned? Do you just want to number them as they are seen in the data? Or do you want to number them based on the sort order of the values of TIMEPOINT?

Let's take your example data (even though the values of REFID don't look like they are what you want).

data have;
  input Id $ TimePoint Refid $ Date $ ;
cards;
122005 0 001 Sep1
122005 0.25 002 Sep2
122005 1 003 Sep3
122005 3 004 Sep4
122006 0.25 002 Sep1
122006 1 003 Sep2
122006 3 004 Sep3
122006 0 005 Sep4
122006 1 006 Sep5
122006 3 007 Sep6
;

We can get the distinct values and number them. If we add an INDEX we use the indexed dataset to look up the new REFID2 values for each value in the original.


proc sort data=have(keep=timepoint) out=timepoints nodupkey;
  by timepoint;
run;

data timepoints(index=(timepoint));
  set timepoints;
  refid2 +1;
  format refid2 z3. ;
run;

data want ;
  set have;
  set timepoints key=timepoint;
run;

results

                  Time
Obs      Id      Point    Refid    Date    refid2

  1    122005     0.00     001     Sep1     001
  2    122005     0.25     002     Sep2     002
  3    122005     1.00     003     Sep3     003
  4    122005     3.00     004     Sep4     004
  5    122006     0.25     002     Sep1     002
  6    122006     1.00     003     Sep2     003
  7    122006     3.00     004     Sep3     004
  8    122006     0.00     005     Sep4     001
  9    122006     1.00     006     Sep5     003
 10    122006     3.00     007     Sep6     004

You could also investigate how to do it in one step using a HASH object instead of the indexed dataset.

SASPhile · Posted 09-30-2020 03:35 PM

Thanks Tom:
This is what is required: "do you want to number them based on the sort order of the values of TIMEPOINT?"

So for id 125006 (obs numbers 8,9,10) the refid2 should be 005,006,007 respectively

Obs Id Point Refid Date refid2

1 122005 0.00 001 Sep1 001
2 122005 0.25 002 Sep2 002
3 122005 1.00 003 Sep3 003
4 122005 3.00 004 Sep4 004
5 122006 0.25 002 Sep1 002
6 122006 1.00 003 Sep2 003
7 122006 3.00 004 Sep3 004
8 122006 0.00 005 Sep4 001
9 122006 1.00 006 Sep5 003
10 122006 3.00 007 Sep6 004

Tom · Posted 09-30-2020 05:19 PM

Why is the second occurrence of 3 re-using the refid of 004 that was created the first time it appeared but the third time it appears it gets a new refid?

SASPhile · Posted 09-30-2020 05:43 PM

My bad, I forgot to update the records, The refid2 should be 005, 006 and 007 respectively

1 122005 0.00 001 Sep1 001
2 122005 0.25 002 Sep2 002
3 122005 1.00 003 Sep3 003
4 122005 3.00 004 Sep4 004
5 122006 0.25 002 Sep1 002
6 122006 1.00 003 Sep2 003
7 122006 3.00 004 Sep3 004
8 122006 0.00 005 Sep4 005
9 122006 1.00 006 Sep5 006
10 122006 3.00 007 Sep6 007

Enumeration

Re: Enumeration

Re: Enumeration

Re: Enumeration

Re: Enumeration

Re: Enumeration

Re: Enumeration

Re: Enumeration

Re: Enumeration

SAS Innovate 2025: Call for Content

Classroom Training Available!