Re: how to create a dummy variable based on the number of appearance o...

Viveme789 · Posted 11-01-2017 10:06 PM

Hello!
I wanna ask how to create a dummy variable based on the number of appearance of a certain value ?
For example, I have dataset:

ID . diagnosis hospital
2022 . 1 . A
2022 . 0 B
2023 . 1 A
2023 . 1 . D

Say a certain disease Y is classified positive if it's been diagnosed positive at lease twice. In this class, ID 2023 would be identified having the disease Y. I want to assign another variable called Y as dummy variable:

1 if count (diagnosis = 1) >= 2 by each ID

0 otherwise

How to do it with proc sql? Thanks.

WarrenKuhfeld · Posted 11-01-2017 10:26 PM

I don't know about SQL, but it is easy with MEANS and a DATA step.

data x(drop=miss);
input ID miss diagnosis hospital $;
datalines;
2022 .   1           A
2022 .   0           B
2023 .   1           A
2023 .   1          D
;
proc means; class id; var diagnosis; output out=b sum=sum; run;
proc print; run;

data all(drop=_: sum);
   merge x b(where=(sum >= 2 and n(id)) in=f); by id;
   y = f;
   run;
proc print; run;

Viveme789 · Posted 11-01-2017 10:39 PM

Thanks for the effort! But I still have to do it with proc sql required by the professor.

Reeza · Posted 11-01-2017 11:16 PM

Select *, Sum(diagnosis) as num_diag, case when calculated num_diag>1 then 1 else 0 end as indicator

from have

group by ID;

This may or may not be the approach your instructor was expecting. I would highly suggest reviewing your lecture notes and seeing if a specific example wasn’t provided.

@Viveme789 wrote:

Hello!
I wanna ask how to create a dummy variable based on the number of appearance of a certain value ?
For example, I have dataset:

ID . diagnosis hospital
2022 . 1 . A
2022 . 0 B
2023 . 1 A
2023 . 1 . D

Say a certain disease Y is classified positive if it's been diagnosed positive at lease twice. In this class, ID 2023 would be identified having the disease Y. I want to assign another variable called Y as dummy variable:

1 if count (diagnosis = 1) >= 2 by each ID

0 otherwise

How to do it with proc sql? Thanks.

Viveme789 · Posted 11-01-2017 11:31 PM

Thank you! I have a question regarding the statement:
Sum(diagnosis) as num_diag
is it summing the number of diagnosis for each patient, which is actually the number of rows per patient, or summing (diagnosis = 1)?
And there's no specific example in the lecture notes:/
Thank you.

Miracle · Posted 11-01-2017 11:40 PM

It is an addition of the diagnosis value for each patient not the number of rows for each patient. The latter is done by count(variable).

Reeza · Posted 11-02-2017 10:37 AM

@Viveme789 wrote:
Thank you! I have a question regarding the statement:
Sum(diagnosis) as num_diag
is it summing the number of diagnosis for each patient, which is actually the number of rows per patient, or summing (diagnosis = 1)?
And there's no specific example in the lecture notes:/
Thank you.

If the values are 0/1 its the same thing.....

Ksharp · Posted 11-02-2017 09:18 AM

data x(drop=miss);
input ID miss diagnosis hospital $;
datalines;
2022 .   1           A
2022 .   0           B
2023 .   1           A
2023 .   1          D
;

proc sql;
select *,sum(diagnosis=1)>=2 as flag 
 from x
  group by id;
quit;

how to create a dummy variable based on the number of appearance of a certain value ?

Re: how to create a dummy variable based on the number of appearance of a certain value ?

Re: how to create a dummy variable based on the number of appearance of a certain value ?

Re: how to create a dummy variable based on the number of appearance of a certain value ?

Re: how to create a dummy variable based on the number of appearance of a certain value ?

Re: how to create a dummy variable based on the number of appearance of a certain value ?

Re: how to create a dummy variable based on the number of appearance of a certain value ?

Re: how to create a dummy variable based on the number of appearance of a certain value ?

Registration is open

SAS Training: Just a Click Away