Help using Base SAS procedures

Generating ID but don't want to have an id for blank value rows

Accepted Solution Solved
Reply
Super Contributor
Posts: 398
Accepted Solution

Generating ID but don't want to have an id for blank value rows

I have a data table that I have sorted by a specific field.  I want to create an id for this field so I did the following:

data tbl ;

    length id 8 ;

    set tbl_ ;

    by num ;   

    retain id 0 ;

    if first.num then id = id + 1 ;   

run ;

The first few rows of num are blank and I want them to have a . not 0 as the id.

I'm going to use this to create an look up table so by assigning the ids here I can split up the data sets after.  I just don't want id 1 = to a blank value.

Thank you


Accepted Solutions
Solution
‎09-19-2011 03:05 PM
PROC Star
Posts: 7,360

Re: Generating ID but don't want to have an id for blank value rows

I typed too quickly, twice.  It should have read:

data tbl (drop=counter);

  length id 8 ;

  set tbl_ ;

  retain id . ;

  if not missing(num) then do;

    counter + 1;

    id=counter;

  end;

run;

Plus I have a question about num and why you originally used a by statement and looked for first.num.  Is there anything special you want done for different nums?

First.Num is only a boolean that will be equal to 1 for the first instance of the particular value in Num and zero for all other instances of that particular value.

View solution in original post


All Replies
Occasional Contributor
Posts: 16

Generating ID but don't want to have an id for blank value rows

Is this what you're looking for:

data tbl ;

length id 8 ;

set tbl_ ;

by num ;  

retain id 0 ;

      if not missing(num) then do;

            if first.num then id = id + 1;  

   end;

run;

/hobbes

PROC Star
Posts: 7,360

Generating ID but don't want to have an id for blank value rows

I think you are trying to accomplish something like:

data tbl (drop=counter);

  length id 8 ;

  set tbl_ ;

  retain id . ;

  if not missing(num) then do;

    counter + 1;

    num=counter;

end;

run;

Super Contributor
Posts: 398

Re: Generating ID but don't want to have an id for blank value rows

Thanks guys but neither worked.

Art,

shouldn't the last line of code set id=counter not num?

This is how I have it working

data tbl (drop=counter);

  length id 8 ;

  set tbl_ ;

  by num ;

  retain id . ;

  if not missing(num) then do;   

    if first.num then do ;

    counter + 1;

    id = counter;

     end ;

  end;

run;

Thank you

Solution
‎09-19-2011 03:05 PM
PROC Star
Posts: 7,360

Re: Generating ID but don't want to have an id for blank value rows

I typed too quickly, twice.  It should have read:

data tbl (drop=counter);

  length id 8 ;

  set tbl_ ;

  retain id . ;

  if not missing(num) then do;

    counter + 1;

    id=counter;

  end;

run;

Plus I have a question about num and why you originally used a by statement and looked for first.num.  Is there anything special you want done for different nums?

First.Num is only a boolean that will be equal to 1 for the first instance of the particular value in Num and zero for all other instances of that particular value.

Frequent Contributor
Posts: 81

Generating ID but don't want to have an id for blank value rows

Assuming num is a numeric variable. Try this...

data tbl ;

    length id 8 ;

    set tbl_ ;

    by num ;  

    if first.num then do;

      if nmiss(num) then id = .;

      if n(num) then id+1;

          end;

run;

Frequent Contributor
Posts: 104

Re: Generating ID but don't want to have an id for blank value rows

data tbl ;
    length id 8 ;
    set tbl_ ;
    by num ;   
    retain id 0 ;

    if num = . then id = .;
    else if first.num then id + 1 ;   
run ;

Output on first column as input:

.       .

.       .

10     1

10     1

20     2

20     2

20     2

30     3

30     3

30     3

30     3

But is there any difference between using NUM or ID as ID field?  The only difference is that ID is sequential, but if you only need them as keys, then there's really no difference.

Super Contributor
Posts: 398

Re: Generating ID but don't want to have an id for blank value rows

Thank you everyone for your help.

Arts last post helped a lot. 

Valued Guide
Posts: 765

Re: Generating ID but don't want to have an id for blank value rows

hi ... you already got an answer, but here's another suggestion ...

data y;

set x;

by num;

if ^missing(num) then id+first.num;

else id = .;

run;

Valued Guide
Posts: 2,174

Generating ID but don't want to have an id for blank value rows

great code - so simple! :smileyshocked:

(even without the "else ... " )

Valued Guide
Posts: 765

Re: Generating ID but don't want to have an id for blank value rows

hi ... nicer without the "else..." but unfortunately that generates a id=0 (rather than missing) for the missing num rows given how the sum statement works (with an initial value of zero)

without the "else" ... but retain is also an extra statement

data y;

retain id .;

set x;

by num;

if ^missing(num) then id+first.num;

run;

the use of +first.<var> for a group counter comes from posts I saw ... and remembered, that's the hard part ... by Howard Schreier a long time ago

Howard is the elegant code master

Valued Guide
Posts: 2,174

Generating ID but don't want to have an id for blank value rows

no q about Howard's status!

As to the ELSE  - my mistake!

I was surprised to see the default of an accumulator is zero rather than missing!

Cannot believe the ELSE statement would ever use measureably more cpu than the RETAIN id .; !

Super Contributor
Posts: 1,636

Generating ID but don't want to have an id for blank value rows

Besides the variable positions in the output dataset, what is the difference put "retain id ." before and after "set x"?  Thank you!

data y;

retain id .;

set x;

by num;

if ^missing(num) then id+first.num;

run;

/**********************/

data y;

set x;

retain id .;

by num;

if ^missing(num) then id+first.num;

run;

Valued Guide
Posts: 2,174

Generating ID but don't want to have an id for blank value rows

not only position, but also data type (and length for character variables) will be defined when the retain statement comes first.

When it comes after the SET, these would be inherited from the first ID variable in any of the data sets on the SET statement.

When ID is not in any of the data sets on the SET statement, the only difference is position.

Super Contributor
Posts: 1,636

Generating ID but don't want to have an id for blank value rows

Thank you for your detailed explanation Peter!

☑ This topic is SOLVED.

Need further help from the community? Please ask a new question.

Discussion stats
  • 15 replies
  • 314 views
  • 7 likes
  • 9 in conversation