Maneco
Calcite | Level 5
Hi, I have a very simple task but I can't work it out. I have a subject ID column and I want to add a variable (counter) telling me whether the ID is repeated and how many times.
I tried to do a loop like this:

do i= 1 to 28872;
if ID(i) = ID(i-1) then counter=1;
else counter=0;
run;

but this does not work; I get a message telling me "unknown ID function".

Can anybody help me? Thanks
8 REPLIES
LinusH
Tourmaline | Level 20
It seems that you are using an array here. Have you defined the array properly? Please show us the full code (and LOG).

/Linus
Data never sleeps
Maneco
Calcite | Level 5
In fact there is only one column (one variable); I think it's just a vector. I only want to go through it line by line, keeping the last value in memory, so I can compare the current line with the previous one and write a 1 if they are equal. The only thing I need is to associate the values with an index (first to last) so that I can give the instruction to compare the i-th value with the (i-1)-th value.
andreas_lds
Jade | Level 19
... telling me the number of times the ID is repeated or not ...


It seems that you don't know how SAS processes data. Although it is possible to read more than one observation per iteration, I doubt that your problem is best solved that way.

Maybe you should have a look at the documentation provided by SAS at http://support.sas.com/documentation/onlinedoc/base/index3.html; the "SAS 9.1.3 Language Reference: Concepts" in particular could be of interest.
Patrick
Opal | Level 21
With an array/do loop you're looping for EACH iteration of the datastep (=for each "line of data", observation). That's something you might want to do if you have a lot of columns.
What you describe is getting information out of passing through the data, line by line. That's something SAS is very good at.

Find below one possibility to get what you want. I used Proc SQL - but it could also be done with Proc Sort and datasteps.

HTH
Patrick

/* create some test data with some duplicate IDs */
data have;
  do i=1 to 1000;
    id=ceil(ranuni(1)*500);
    output;
  end;
run;

/* count per id and write result to variable IDcount */
proc sql;
  /* create table want as */
  select l.id, r.IDcount
    from have l left join
         (select id, count(id) as IDcount from have group by id) r
      on l.id=r.id
    order by l.id; /* qualified with the alias so the reference is unambiguous */
quit;
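
For comparison, here is a minimal sketch of the Proc Sort and data step route mentioned above. It assumes the test dataset have created earlier; the names have_sorted and idcounts are placeholders chosen for this example only.

```sas
/* sort so that equal IDs sit next to each other */
proc sort data=have out=have_sorted;
  by id;
run;

/* first pass: one row per id holding its number of occurrences */
data idcounts(keep=id IDcount);
  set have_sorted;
  by id;
  if first.id then IDcount=0;  /* reset at the start of each id group */
  IDcount+1;                   /* sum statement: IDcount is retained automatically */
  if last.id then output;      /* write the total once per id */
run;

/* second pass: attach the total to every original row */
data want;
  merge have_sorted idcounts;
  by id;
run;
```

The result should match the SQL version: every row of have, plus IDcount giving how often that row's id occurs.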
Cynthia_sas
SAS Super FREQ
Building on Patrick's idea... if all you want is the report on which IDs have more than one occurrence in the file, you can also get PROC FREQ to do the work for you.

cynthia
[pre]
data makeid;
do i=1 to 100;
id=ceil(ranuni(1)*50);
output;
end;
run;

ods noproctitle;
ods output onewayfreqs=work.cntID(keep=id frequency);
proc freq data=makeid;
tables id / nopercent nocum ;
run;

proc print data=work.cntID;
title 'List of IDs with more than one occurrence in the file';
where Frequency gt 1;
run;
[/pre]
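
If you would rather not go through ODS OUTPUT, the OUT= option on the TABLES statement is another way to capture the counts. This is only a sketch, assuming the makeid dataset created above; the dataset name multiID is made up for the example. Note that the count variable in the OUT= dataset is named COUNT rather than Frequency.

```sas
/* noprint suppresses the printed table; out= writes id, COUNT and PERCENT */
proc freq data=makeid noprint;
  tables id / out=work.multiID(keep=id count where=(count gt 1));
run;

proc print data=work.multiID;
  title 'List of IDs with more than one occurrence in the file';
run;
```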
Maneco
Calcite | Level 5
Ok, Thank you very much
genkiboy
Calcite | Level 5
> Hi, I have a very simple issue to do but I can't work
> it out. I have a subject ID column and I want to add
> a variable (counter) telling me the number of times
> the ID is repeated or not.
> I tried to do a loop like this:
>
> do i= 1 to 28872;
> if ID(i) = ID(i-1) then counter=1;
> else counter=0;
> run;

Maneco -

Ah, I've made this mistake before. You can't think of SAS datasets as being like arrays -- you can't access each record (row) in a SAS dataset by its observation number. You're thinking like a "normal" programmer rather than like a SAS programmer.

This does what I think you're looking for:

data check;
  set have;
  checkrepeat=(ID=PriorID); * 1 if ID is same as prior value, 0 if not;
  PriorID=ID;               * creates a new variable to hold the value of ID;
  retain PriorID;           * keeps PriorID at its _old_ value so you can check it against the new value of ID;
  * if checkrepeat;         * optionally restricts the new dataset to repeated values of ID;
run;

The keyword "retain" can be a bit tricky to get used to, but as you can see it does the job.

Oh, there's a simpler way using the lag() function:

data check;
set have;
checkrepeat=(id=lag(id)); * 1 if id is same as prior value, 0 if not;

* if checkrepeat; * optionally restricts the new dataset to repeated values of id;
run;

(I learned to use retain before I found lag(), so it comes more naturally to me. :>)
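
One caveat: both versions compare each row only with the row directly before it, so repeats are caught only when equal IDs sit next to each other. A hedged sketch of a variant that sorts first and also keeps a running count per ID, assuming the same dataset have (the names have_sorted and counter are just for illustration):

```sas
/* sort so that repeated IDs become adjacent */
proc sort data=have out=have_sorted;
  by id;
run;

data check;
  set have_sorted;
  by id;
  checkrepeat = not first.id;    /* 1 if id equals the previous row's id, else 0 */
  if first.id then counter = 0;  /* restart the count for each new id */
  counter + 1;                   /* running occurrence number within the id */
run;
```

Here counter is 1 on the first occurrence of an ID, 2 on the second, and so on, so the last row of each ID carries the total.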

Cheers -

Kevin
Maneco
Calcite | Level 5
Thanks genkiboy, that was what I was looking for.

