ID | DIAGNOSISCODELIST_ID | ISADMITDIAGNOSIS | DIAGNOSISCODE | PXEXTRACTDATETIME |
CR-814420 | 1 | 1 | J96.21 | 03MAY2019:00:00:22.000000 |
CR-814420 | 1 | 1 | J96.21 | 11MAY2019:00:00:22.000000 |
CR-814420 | 2 | . | J44.1 | 03MAY2019:00:00:22.000000 |
CR-814420 | 2 | . | J44.1 | 11MAY2019:00:00:22.000000 |
CR-814420 | 3 | . | A41.9 | 03MAY2019:00:00:22.000000 |
CR-814420 | 3 | . | A41.9 | 11MAY2019:00:00:22.000000 |
CR-814420 | 4 | . | J18.9 | 03MAY2019:00:00:22.000000 |
CR-814420 | 4 | . | J18.9 | 11MAY2019:00:00:22.000000 |
I have a dataset like this and i need to covert it to 1 row per ID like the below. I need to take the max of the PXEXTRACTDATETIME for each ID/DIAGNOSISCODELIST_ID_1. Please advise the best way and i have also attached the example in an example.
Want | ||||||||||||
ID | DIAGNOSISCODELIST_ID_1 | ISADMITDIAGNOSIS_1 | DIAGNOSISCODE_1 | DIAGNOSISCODELIST_ID_2 | ISADMITDIAGNOSIS_2 | DIAGNOSISCODE_2 | DIAGNOSISCODELIST_ID_3 | ISADMITDIAGNOSIS_3 | DIAGNOSISCODE_3 | DIAGNOSISCODELIST_ID_4 | ISADMITDIAGNOSIS_4 | DIAGNOSISCODE_4 |
CR-814420 | 1 | 1 | J96.21 | 2 | . | J44.1 | 3 | . | A41.9 | 4 | . | J18.9 |
Hello @SDasari2
data have;
input ID :$30. DIAGNOSISCODELIST_ID ISADMITDIAGNOSIS DIAGNOSISCODE :$15. PXEXTRACTDATETIME :datetime40.;
format PXEXTRACTDATETIME datetime40.;
cards;
CR-814420 1 1 J96.21 03MAY2019:00:00:22.000000
CR-814420 1 1 J96.21 11MAY2019:00:00:22.000000
CR-814420 2 . J44.1 03MAY2019:00:00:22.000000
CR-814420 2 . J44.1 11MAY2019:00:00:22.000000
CR-814420 3 . A41.9 03MAY2019:00:00:22.000000
CR-814420 3 . A41.9 11MAY2019:00:00:22.000000
CR-814420 4 . J18.9 03MAY2019:00:00:22.000000
CR-814420 4 . J18.9 11MAY2019:00:00:22.000000
;
proc sql;
create table temp as
select *
from have
group by ID,DIAGNOSISCODELIST_ID
having max(PXEXTRACTDATETIME)=PXEXTRACTDATETIME;
quit;
proc sql noprint;
select max(obs) into :obs
from (select count(*) as obs from temp group by ID ) ;
quit;
%put &=obs;
proc summary nway data=temp missing;
class ID ;
output
out = work.wide(drop=_type_ _freq_)
idgroup(out[&obs](DIAGNOSISCODELIST_ID ISADMITDIAGNOSIS DIAGNOSISCODE)=)
;
run;
@SDasari2 wrote:
I have a dataset like this and i need to covert it to 1 row per ID like the below. I need to take the max of the PXEXTRACTDATETIME for each ID/DIAGNOSISCODELIST_ID_1. Please advise the best way and i have also attached the example in an example.
If this is your goal, there's no need to reformat the data at all. The maximum PXEXTRACTDATETIME can be found using PROC SUMMARY.
proc summary data=have nway;
class id diagnosiscodelist_id;
var pxextractdatetime;
output out=want max=;
run;
For most analyses (not all, but most) reformatting a long data set to wide is an unnecessary and time consuming step.
This give the max of pxextractdatetime with the 4 differnt DIAGNOSISCODELIST_ID one below the other. I need all of them in one row for the given ID.
@SDasari2 wrote:
ID DIAGNOSISCODELIST_ID ISADMITDIAGNOSIS DIAGNOSISCODE PXEXTRACTDATETIME CR-814420 1 1 J96.21 03MAY2019:00:00:22.000000 CR-814420 1 1 J96.21 11MAY2019:00:00:22.000000 CR-814420 2 . J44.1 03MAY2019:00:00:22.000000 CR-814420 2 . J44.1 11MAY2019:00:00:22.000000 CR-814420 3 . A41.9 03MAY2019:00:00:22.000000 CR-814420 3 . A41.9 11MAY2019:00:00:22.000000 CR-814420 4 . J18.9 03MAY2019:00:00:22.000000 CR-814420 4 . J18.9 11MAY2019:00:00:22.000000
I have a dataset like this and i need to covert it to 1 row per ID like the below. I need to take the max of the PXEXTRACTDATETIME for each ID/DIAGNOSISCODELIST_ID_1. Please advise the best way and i have also attached the example in an example.
Want ID DIAGNOSISCODELIST_ID_1 ISADMITDIAGNOSIS_1 DIAGNOSISCODE_1 DIAGNOSISCODELIST_ID_2 ISADMITDIAGNOSIS_2 DIAGNOSISCODE_2 DIAGNOSISCODELIST_ID_3 ISADMITDIAGNOSIS_3 DIAGNOSISCODE_3 DIAGNOSISCODELIST_ID_4 ISADMITDIAGNOSIS_4 DIAGNOSISCODE_4 CR-814420 1 1 J96.21 2 . J44.1 3 . A41.9 4 . J18.9
Do you want the max or are you looking to "dedup" the records keeping the latest PXEXTRACTDATETIME record for each ID Diagnosiscodelist_id?
I want the max(pxextractdatetime) for each ID/DIAGNOSISCODELIST_ID.
Hello @SDasari2
data have;
input ID :$30. DIAGNOSISCODELIST_ID ISADMITDIAGNOSIS DIAGNOSISCODE :$15. PXEXTRACTDATETIME :datetime40.;
format PXEXTRACTDATETIME datetime40.;
cards;
CR-814420 1 1 J96.21 03MAY2019:00:00:22.000000
CR-814420 1 1 J96.21 11MAY2019:00:00:22.000000
CR-814420 2 . J44.1 03MAY2019:00:00:22.000000
CR-814420 2 . J44.1 11MAY2019:00:00:22.000000
CR-814420 3 . A41.9 03MAY2019:00:00:22.000000
CR-814420 3 . A41.9 11MAY2019:00:00:22.000000
CR-814420 4 . J18.9 03MAY2019:00:00:22.000000
CR-814420 4 . J18.9 11MAY2019:00:00:22.000000
;
proc sql;
create table temp as
select *
from have
group by ID,DIAGNOSISCODELIST_ID
having max(PXEXTRACTDATETIME)=PXEXTRACTDATETIME;
quit;
proc sql noprint;
select max(obs) into :obs
from (select count(*) as obs from temp group by ID ) ;
quit;
%put &=obs;
proc summary nway data=temp missing;
class ID ;
output
out = work.wide(drop=_type_ _freq_)
idgroup(out[&obs](DIAGNOSISCODELIST_ID ISADMITDIAGNOSIS DIAGNOSISCODE)=)
;
run;
Thanks a lot. This works!
Registration is now open for SAS Innovate 2025 , our biggest and most exciting global event of the year! Join us in Orlando, FL, May 6-9.
Sign up by Dec. 31 to get the 2024 rate of just $495.
Register now!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.