
Simple loop

Occasional Contributor
Posts: 12

Simple loop

Hi, I have a very simple task but I can't work it out. I have a subject ID column and I want to add a variable (counter) telling me the number of times the ID is repeated or not.
I tried a loop like this:

do i= 1 to 28872;
if ID(i) = ID(i-1) then counter=1;
else counter=0;
run;

but this does not work; I get a message telling me "unknown ID function".

Can anybody help me? Thanks
Super User
Posts: 5,254

Re: Simple loop

It seems that you are using an array here. Have you defined the array properly? Please show us the full code (and LOG).

/Linus
Data never sleeps
Occasional Contributor
Posts: 12

Re: Simple loop

In fact there is only one column (one variable); I think it's just a vector. I only want to go through it line by line, keeping the last value in memory, so I can compare the current line with the previous one and write a 1 if they are equal. The only thing I need is to associate the values with an index (first to last) so I can give the instruction to compare the i value with the i-1 value.
Super Contributor
Posts: 259

Re: Simple loop

... telling me the number of times the ID is repeated or not ...


It seems that you don't know how SAS processes data. Although it is possible to read more than one observation per iteration, I doubt that your problem is best solved that way.

Maybe you should have a look at the documentation provided by SAS at http://support.sas.com/documentation/onlinedoc/base/index3.html; the "SAS 9.1.3 Language Reference: Concepts" in particular could be of interest.
Respected Advisor
Posts: 3,886

Re: Simple loop

With an array/do loop you're looping within EACH iteration of the data step (that is, for each observation, or "line of data"). That's something you might want to do if you have a lot of columns.
What you describe is getting information out of a pass through the data, line by line. That's something SAS is very good at.
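
To illustrate the contrast, here is a minimal sketch of what an array with a do loop is typically used for: walking across several columns within one observation. The dataset scores and the variables q1-q3 are made up for this example and are not from the original question.

```sas
/* hypothetical test data: one row per subject, three score columns */
data scores;
  input id q1 q2 q3;
  datalines;
1 5 . 3
2 . . 4
;
run;

/* for each observation, loop over the columns q1-q3 and
   replace missing values with 0 */
data scores_filled;
  set scores;
  array q{3} q1-q3;
  do j = 1 to dim(q);
    if q{j} = . then q{j} = 0;
  end;
  drop j;
run;
```

The loop runs once per row, over columns; it never reaches back to an earlier row, which is why ID(i-1) cannot work in the way the original attempt intended.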

Find below one possibility to get what you want. I used Proc SQL - but it could also be done with Proc Sort and datasteps.

HTH
Patrick

/* create some test data with some duplicate IDs */
data have;
do i=1 to 1000;
id=ceil(ranuni(1)*500);
output;
end;
run;

/* count per id and write result to variable IDcount */
proc sql;
/* create table want as*/
select l.id, r.IDcount
from have l left join
(select id, count(id) as IDcount from have group by id) r
on l.id=r.id
order by id;
quit;
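
For comparison, the Proc Sort and data step route mentioned above could look roughly like this. It is only a sketch: the names have_sorted, idcounts and want are assumptions for illustration, and it relies on BY-group processing (first.id / last.id) after sorting.

```sas
/* sort so that equal IDs sit next to each other */
proc sort data=have out=have_sorted;
  by id;
run;

/* one row per ID with the number of occurrences */
data idcounts(keep=id IDcount);
  set have_sorted;
  by id;
  if first.id then IDcount = 0;  /* reset at the start of each ID group */
  IDcount + 1;                   /* sum statement: value is retained across rows */
  if last.id then output;        /* write the total once per ID */
run;

/* attach the count back to every original row */
data want;
  merge have_sorted idcounts;
  by id;
run;
```

The result should match the Proc SQL version: every row of have, plus IDcount for its ID.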
SAS Super FREQ
Posts: 8,739

Re: Simple loop

Building on Patrick's idea... if all you want is a report on which IDs have more than one occurrence in the file, you can also get PROC FREQ to do the work for you.

cynthia
[pre]
data makeid;
do i=1 to 100;
id=ceil(ranuni(1)*50);
output;
end;
run;

ods noproctitle;
ods output onewayfreqs=work.cntID(keep=id frequency);
proc freq data=makeid;
tables id / nopercent nocum ;
run;

proc print data=work.cntID;
title 'List of IDs with more than one occurrence in the file';
where Frequency gt 1;
run;
[/pre]
Occasional Contributor
Posts: 12

Re: Simple loop

Ok, Thank you very much
Occasional Contributor
Posts: 5

Re: Simple loop

> Hi, I have a very simple issue to do but I can't work
> it out. I have a subject ID column and I want to add
> a variable (counter) telling me the number of times
> the ID is repeated or not.
> I tried to do a loop like this:
>
> do i= 1 to 28872;
> if ID(i) = ID(i-1) then counter=1;
> else counter=0;
> run;

Maneco -

Ah, I've made this mistake before. You can't think of SAS datasets as being like arrays -- a data step reads one record (row) at a time, so you can't index into a column with something like ID(i). You're thinking like a "normal" programmer rather than like a SAS programmer.

This does what I think you're looking for:

data check;
set have;
checkrepeat=(ID=priorid); * 1 if ID is same as prior value, 0 if not;
PriorID=ID; * creates a new variable to hold the value of ID;
retain PriorID; * keeps PriorID at its _old_ value so you can check it against
the new value of ID;
* if checkrepeat; * optionally restricts the new dataset to repeated values of ID;
run;

The keyword "retain" can be a bit tricky to get used to, but as you can see it does the job.

Oh, there's a simpler way using the lag() function:

data check;
set have;
checkrepeat=(id=lag(id)); * 1 if id is same as prior value, 0 if not;

* if checkrepeat; * optionally restricts the new dataset to repeated values of id;
run;
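
One caveat that applies to both versions: they compare each row only with the row immediately before it, so a repeated ID is flagged only when its duplicates sit on adjacent rows. If the data is not already ordered by ID, a sort first should take care of that. A minimal sketch, assuming the input dataset is called have as above (have_sorted is a made-up name):

```sas
/* put equal IDs on adjacent rows */
proc sort data=have out=have_sorted;
  by id;
run;

data check;
  set have_sorted;
  /* 1 if id equals the previous row's id, 0 otherwise;
     assumes id is never missing, since lag() returns missing on the first row */
  checkrepeat = (id = lag(id));
run;
```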

(I learned to use retain before I found lag(), so it comes more naturally to me.)

Cheers -

Kevin
Occasional Contributor
Posts: 12

Re: Simple loop

Thanks genkiboy, that was what I was looking for.