DATA Step, Macro, Functions and more

Keeping counting variable in the SAS table.

Reply
N/A
Posts: 0

Keeping counting variable in the SAS table.

I have a table with the numeric variable ID among several others. The table is sorted by ID.
I want to add a new variable, COUNT, that is incremented for each new ID.

Like this:

ID COUNT
12 1
12 1
14 2
23 3
24 4
24 4
35 5

For the first data step I could compare ID to an impossible ID value of - 1.

I declare %let TEMp=-1, and compare in each data step.

if ID=&temp, COUNT should not be incremented.

if ID~=TEMP, then I increment COUNT and assign the new ID to TEMP.

But it didn't work the expected way.

My table looked like this:

ID COUNT
12 1
12 2
14 3
23 4
24 5
24 6
35 7

I am not sure how to assign TEMP when there was a new ID.

&TEMP=ID

or

TEMP=ID.

For the first step the if expression must be: If ID~=&TEMP , with the ampersand.

Conflict between numeric and character variable or what have I done wrong?
SAS Super FREQ
Posts: 8,743

Re: Keeping counting variable in the SAS table.

Hi:
I'm not sure why you have mixed macro variables into this need to get a COUNT variable in a table.

If you use BY group processing, SAS creates some automatic variables that signal when your program is reading the first of a BY variable group or the last of a BY variable group. This happens automatically in a DATA step program. You might investigate FIRST.byvar and/or LAST.byvar processing.

For example, consider this data:
[pre]
12 alan 100 member
12 anne 200 spouse
14 bob 111 member
23 carla 112 member
24 dana 113 member
24 dave 114 spouse
35 edna 115 member
[/pre]

When a SAS program encounters this data, if you turn BY processing ON for the ID variable, then just looking at ID and name, the values of FIRST.ID and LAST.ID for every observation would be:
[pre]
ID NAME FIRST.ID LAST.ID
12 alan 1 0
12 anne 0 1
14 bob 1 1
23 carla 1 1
24 dana 1 0
24 dave 0 1
35 edna 1 1
[/pre]

The FIRST. and LAST. variables are not written to your final dataset, they are just available for processing and testing. This allows you to do certain things (like increment a COUNT variable, for a certain condition, such as:
[pre]
if first.id = 1 then count = count + 1;
OR
if first.id then count+1;
[/pre]

Those last 2 statements are equivalent statements. When a variable has a value of 0 or 1, you can consider 0= false or 1=true. So the "boolean" logic way to do a test is to write a shortcut condition, such as "if first.id". The only other statement that you need to read about is that RETAIN statement, because normally, variable values are "reset" or reinitialized between each execution of the Data step program. Using the RETAIN statement instructs SAS that there is a variable you want to have it "remember" and not reset between each execution of the DATA step program.

You seem to want to hold the ID value in the TEMP variable so you can test the previous variable value with the current variable value. This could be achieved using the LAG function, but for a simple count, it's not necessary.

Here's a program that reads some fake data and then creates a count variable in a table called "newmember":
[pre]
data info;
infile datalines;
input ID name $ amt typejoin $;
return;
datalines;
12 alan 100 member
12 anne 200 spouse
14 bob 111 member
23 carla 112 member
24 dana 113 member
24 dave 114 spouse
35 edna 115 member
;
run;

proc sort data=info;
by id typejoin;
run;

data newmember;
set info;
by ID;
retain count 0;
if first.ID then count+1;
run;

options nodate nonumber nocenter;

proc print data=newmember;
var id count name typejoin amt;
run;
[/pre]

And the output is shown below.

cynthia
[pre]
Obs ID count name typejoin amt

1 12 1 alan member 100
2 12 1 anne spouse 200
3 14 2 bob member 111
4 23 3 carla member 112
5 24 4 dana member 113
6 24 4 dave spouse 114
7 35 5 edna member 115

[/pre]
Ask a Question
Discussion stats
  • 1 reply
  • 490 views
  • 0 likes
  • 2 in conversation