Solved: Re: Selecting the populated values for unique observations

NickS2 · Posted 06-16-2022 12:28 PM

So I have a dataset with many variables and multiple observations for one ID. I want to keep the unique observations but at the same time populate the non-missing value from that observation. Below is the sample of the dataset:

ID CLINIC ADM_DT DIS_DT DIAG_CD

1 ABC 03SEP2005 06SEP2005

1 ABC 03SEP2005 T10.009

1 ABC 19DEC2006 21DEC2006

1 ABC 19DEC2006 21DEC2006 F23.091

2 XYZ 06OCT2006 10OCT2006 J09.01

2 XYZ 06OCT2006 10OCT2006

3 LMN 21JAN2005 25JAN2005

3 LMN 04OCT2006 J09.2

3 LMN 04OCT2006 05OCT2006

ADM_DT is always populated in the dataset the issues are DIS_DT and DIAG_CD which are missing at some places. My aim is to get the DIS_DT and DIAG_DT for each patient for each ADM_DT.

Output expected:

ID CLINIC ADM_DT DIS_DT DIAG_CD

1 ABC 03SEP2005 06SEP2005 T10.009

1 ABC 19DEC2006 21DEC2006 F23.091

2 XYZ 06OCT2006 10OCT2006 J09.01

3 LMN 21JAN2005 25JAN2005

3 LMN 04OCT2006 05OCT2006 J09.2

Reeza · Posted 06-16-2022 12:40 PM

proc sql;
create table want as
select id, clinic, adm_dt, max(dis_dt) as dis_dt, max(diag_cd) as diag_cd
from have
group by id, clinic, adm_dt;
quit;

Use SQL aggregation functions to take the maximum which will keep the values.

@NickS2 wrote:

So I have a dataset with many variables and multiple observations for one ID. I want to keep the unique observations but at the same time populate the non-missing value from that observation. Below is the sample of the dataset:

ID CLINIC ADM_DT DIS_DT DIAG_CD

1 ABC 03SEP2005 06SEP2005

1 ABC 03SEP2005 T10.009

1 ABC 19DEC2006   21DEC2006

1 ABC 19DEC2006   21DEC2006 F23.091

2 XYZ 06OCT2006 10OCT2006 J09.01

2 XYZ 06OCT2006 10OCT2006

3 LMN 21JAN2005 25JAN2005

3 LMN 21JAN2005 25JAN2005

3 LMN 04OCT2006     J09.2

3 LMN 04OCT2006   05OCT2006

ADM_DT is always populated in the dataset the issues are DIS_DT and DIAG_CD which are missing at some places. My aim is to get the DIS_DT and DIAG_DT for each patient for each ADM_DT.

Output expected:

ID CLINIC ADM_DT DIS_DT DIAG_CD

1 ABC 03SEP2005 06SEP2005 T10.009

1 ABC 19DEC2006   21DEC2006 F23.091

2 XYZ 06OCT2006 10OCT2006 J09.01

3 LMN 21JAN2005 25JAN2005

3 LMN 04OCT2006 05OCT2006 J09.2

View solution in original post

tarheel13 · Posted 06-16-2022 12:36 PM

can you post this as datalines?

Reeza · Posted 06-16-2022 12:40 PM

proc sql;
create table want as
select id, clinic, adm_dt, max(dis_dt) as dis_dt, max(diag_cd) as diag_cd
from have
group by id, clinic, adm_dt;
quit;

Use SQL aggregation functions to take the maximum which will keep the values.

@NickS2 wrote:

So I have a dataset with many variables and multiple observations for one ID. I want to keep the unique observations but at the same time populate the non-missing value from that observation. Below is the sample of the dataset:

ID CLINIC ADM_DT DIS_DT DIAG_CD

1 ABC 03SEP2005 06SEP2005

1 ABC 03SEP2005 T10.009

1 ABC 19DEC2006   21DEC2006

1 ABC 19DEC2006   21DEC2006 F23.091

2 XYZ 06OCT2006 10OCT2006 J09.01

2 XYZ 06OCT2006 10OCT2006

3 LMN 21JAN2005 25JAN2005

3 LMN 21JAN2005 25JAN2005

3 LMN 04OCT2006     J09.2

3 LMN 04OCT2006   05OCT2006

ADM_DT is always populated in the dataset the issues are DIS_DT and DIAG_CD which are missing at some places. My aim is to get the DIS_DT and DIAG_DT for each patient for each ADM_DT.

Output expected:

ID CLINIC ADM_DT DIS_DT DIAG_CD

1 ABC 03SEP2005 06SEP2005 T10.009

1 ABC 19DEC2006   21DEC2006 F23.091

2 XYZ 06OCT2006 10OCT2006 J09.01

3 LMN 21JAN2005 25JAN2005

3 LMN 04OCT2006 05OCT2006 J09.2

tarheel13 · Posted 06-16-2022 12:49 PM

you should be able to accomplish this with proc sql.

proc sql number;
	select id, clinic, adm_dt, max(dis_dt) as dis_dt format=date9., max(diag_cd) as diag_cd
		from have
		group by id, clinic, adm_dt;
quit;

Selecting the populated values for unique observations

Re: Selecting the populated values for unique observations

Re: Selecting the populated values for unique observations

Re: Selecting the populated values for unique observations

Re: Selecting the populated values for unique observations

Catch up on SAS Innovate 2026

SAS Training: Just a Click Away