BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
Chris_LK_87
Quartz | Level 8

Hello!

 

I have a dataset that contains diagnosis for individuals. Each individual can appear on multiple rows depending on the number of diagnosis they have.

 

Data Have;

 

Ind_ID$            DiagCode$      Gender$    Age$
1                              I11                1               55

1                             C12               1               55

1                             C01               1               55

2                             Y11                2              67 

2                            F99                 2              67

3                             X12                1              92

4                             C05                2              81

 

I want to transpose the dataset and count the number of diagnosis ( var. num_diag) each individual have. I also want to place each diagnosis in separate variable.:

 

Data want; 

 

Ind_ID$            Num_diag   DCode1$  DCode2$ DCode2$  Gender$ Age$
1                               3           I11             C12         C01             1           55

2                               2           Y11            F99                             2           67

3                               1           X12                                               1           92

4                               1           C01                                               2           81

 

1 ACCEPTED SOLUTION

Accepted Solutions
ed_sas_member
Meteorite | Level 14

Hi @Chris_LK_87 

 

Please try this, using a PROC TRANSPOSE first, and then a data step to compute Num_Diag.

Best,

data have;
input Ind_ID DiagCode $ Gender Age ;
cards;
1 I11 1 55
1 C12 1 55
1 C01 1 55
2 Y11 2 67
2 F99 2 67
3 X12 1 92
4 C05 2 81
;
run;

proc sort data = have;
  BY Ind_ID;
run;

proc transpose data = have out = have_tr (drop = _:) prefix=DiagCode;
  by Ind_ID Gender Age;
  var DiagCode; 
run;

data want;
	set have_tr;
	Num_diag = countw(catx(" ", of DiagCode:));
run;

proc print;
run;

View solution in original post

6 REPLIES 6
yabwon
Onyx | Level 15

Hi,

Read about proc transpose

and try this:

data have;
input Ind_ID DiagCode $      Gender    Age ;
cards;
1 I11 1 55
1 C12 1 55
1 C01 1 55
2 Y11 2 67
2 F99 2 67
3 X12 1 92
4 C05 2 81
;
run;

proc sort data = have;
  BY Ind_ID;
run;

proc transpose data = have out = want(drop = _name_) prefix=DiagCode;
  by Ind_ID Gender    Age;
  var DiagCode; 
run;
proc print;
run;

Bart

_______________
Polish SAS Users Group: www.polsug.com and communities.sas.com/polsug

"SAS Packages: the way to share" at SGF2020 Proceedings (the latest version), GitHub Repository, and YouTube Video.
Hands-on-Workshop: "Share your code with SAS Packages"
"My First SAS Package: A How-To" at SGF2021 Proceedings

SAS Ballot Ideas: one: SPF in SAS, two, and three
SAS Documentation



ed_sas_member
Meteorite | Level 14

Hi @Chris_LK_87 

 

Please try this, using a PROC TRANSPOSE first, and then a data step to compute Num_Diag.

Best,

data have;
input Ind_ID DiagCode $ Gender Age ;
cards;
1 I11 1 55
1 C12 1 55
1 C01 1 55
2 Y11 2 67
2 F99 2 67
3 X12 1 92
4 C05 2 81
;
run;

proc sort data = have;
  BY Ind_ID;
run;

proc transpose data = have out = have_tr (drop = _:) prefix=DiagCode;
  by Ind_ID Gender Age;
  var DiagCode; 
run;

data want;
	set have_tr;
	Num_diag = countw(catx(" ", of DiagCode:));
run;

proc print;
run;
Chris_LK_87
Quartz | Level 8

Thanks!

 

It seems to work!

 

But I have a question regarding selecting diagnosis (DiagCode). Is it possible to select distinct diagnosis?

ballardw
Super User

@Chris_LK_87 wrote:

Thanks!

 

It seems to work!

 

But I have a question regarding selecting diagnosis (DiagCode). Is it possible to select distinct diagnosis?


Any place you have a Data=somedataset you can use data set options to keep or drop variable or select values for a given variable (or variables).

 

proc transpose data = have (where =(diagcode in (<firstvalue> <secondvalue> <...> ))
    out = have_tr (drop = _:) prefix=DiagCode;
  by Ind_ID Gender Age;
  var DiagCode; 
run;

The example of the Where=()  dataset option above is very generic. You would place the values you want where I have "firstvalue" and such. If the values are character then they should be in quotes. The values are compared exactly to your list so do not expect "Abc" to match "ABC".

The Where=() needs the parentheses to let SAS know where the comparisons end and separate from other options like Keep, drop or rename lists.

Chris_LK_87
Quartz | Level 8
Thank you!
Kurt_Bremser
Super User

The count is achieved easily:

proc sql;
create table want as
  select
    ind_id,
    count(ind_id) as num_diag
  from have
  group by ind_id
;
quit;

Why do you want to transpose your dataset of diagnoses? It will only make your future work harder.

Instead I suppose to normalize your data, reducing redundancy and waste of storage:

data
  patients (keep=ind_id gender age num_diag)
  diagnoses (keep=ind_id diagcode)
;
set have;
by ind_id;
if first.ind_id
then num_diag = 1;
else num_diag + 1;
output diagnoses;
if last.ind_id then output patients;
run;

SAS Innovate 2025: Save the Date

 SAS Innovate 2025 is scheduled for May 6-9 in Orlando, FL. Sign up to be first to learn about the agenda and registration!

Save the date!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

SAS Training: Just a Click Away

 Ready to level-up your skills? Choose your own adventure.

Browse our catalog!

Discussion stats
  • 6 replies
  • 951 views
  • 1 like
  • 5 in conversation