Re: Check if a row exists in a dataset

Abishekaa · Posted 05-24-2021 04:14 AM

Hi everyone, Thank you in advance!

I have a dataset similar to:

PtID visit

1 4M

1 12M

1 24M

2 8M

2 12M

Each patient has multiple visit data. I want to set a condition if the visit= '24M' observation doesn't exist for a PtID. Eg: In this dataset, PtID 2 does not have a 24M visit. How would I create a flag for this, or set a condition based on the 24M visit not existing for this patient?

nrk1787db1 · Posted 05-24-2021 06:07 AM

DATA A;
	INFILE CARDS DLM=',' TRUNCOVER;
	INPUT @1 PTID :8. VISIT :$4.;
	CARDS;
1,4M
1,8M
1,16M
1,24M
2,24M
3,8M
4,128M
5,48M
5,24M
 ;
RUN;

PROC TRANSPOSE DATA=A OUT=AT;
	BY PTID;
	VAR VISIT;

PROC SQL NOPRINT;
	SELECT MAX(COUNT_PTID) FORMAT=Z8. INTO :NC FROM (SELECT COUNT(PTID) AS 
		COUNT_PTID, PTID FROM A GROUP BY PTID);

DATA C(KEEP=PTID);
	SET AT;
	ARRAY ARAT {&NC} COL1-COL&NC;

	DO I=1 TO &NC;

		IF ARAT[I]='24M' THEN
			OUTPUT;
	END;
RUN;

PROC PRINT DATA=C;

Dataset C contains all PTID values containing 24M

Kurt_Bremser · Posted 05-24-2021 06:14 AM

Where do you want to set the flag?

To simply get one observation per ptid, do this:

data want;
if 0 then set have; /* only to keep variable order */
flag = 0;
do until (last.ptid);
  set have;
  by ptid;
  if visit = '24M' then flag = 1;
end;
keep ptid flag;
run;

Maxims of Maximally Efficient SAS Programmers
How to convert datasets to data steps
The macro for direct download as ZIP
How to post code
Please vote for Provide Sequential Search Capability for Hash Objects
How to deal with locked files on UNIX

Abishekaa · Posted 05-24-2021 09:24 AM

Hi Kurt,

You are correct. There is no place to add a flag. Let me rephrase my question here:

Dataset I have:

PtID visit

1 4M

1 12M

1 24M

2 8M

2 12M

Dataset I would like to create:

PtID visit visit2

1 4M 4M

1 12M 12M

1 24M 24M

2 8M 8M

2 12M 24M

I want to create a new variable 'visit2'. Visit2=visit, except if the 24M visit is missing. When it is missing, the 12M visit will be recoded to '24M' in the visit2 variable. Does this make sense?

Tom · Posted 05-24-2021 09:48 AM

Assuming 24M is the LAST visit you could take advantage of that fact.

data want;
  set have;
  by id;
  visit2=visit;
  if last.id then visit2='24M';
run;

But that assumes the data is sorted by ID and VISIT, but your values of VISIT will not actually sort properly. Do you have another variable that can be used to sort? Perhaps numeric variable? Or a DATE variable?

Abishekaa · Posted 05-24-2021 01:24 PM

Hi Tom, Thanks so much for the reply. However, 24M is not the last visit

Sajid01 · Posted 05-24-2021 03:21 PM

Hello @Abishekaa
In your question , do you expect that every PID will have a 24M visit?

Tom · Posted 05-24-2021 04:02 PM

@Abishekaa wrote:
Hi Tom, Thanks so much for the reply. However, 24M is not the last visit

So I have no idea what your data looks like then. Perhaps you can provide a more complete example.

If you just want to flag whether or not a particular visit exists then you might be able to just do that with dataset options. So if you existing dataset is named HAVE and the two variables of interest are named PTID and VISIT here is a data step that will create a new variable named HAS24M to indicate if the PTID has a VISIT '24M' or not. Note this assumes the data is sorted by PTID.

data want;
  merge have
        have(in=in2 keep=ptid visit rename=(visit=visit24m) where=('24M'=visit24m))
  ;
  by ptid ;
  has24m = in2 ;
  drop visit24m ;
run;

Registration is open

SAS Training: Just a Click Away