Wylyann
Calcite | Level 5

Hello, can anyone help me?
I have some medico-administrative data and I want to create a binary variable that codes an individual as 1 if the type of diabetes diagnosed is the same at every hospital visit, and as 0 if the diagnosed type varies from one hospital visit to another.
This is what my medico-administrative data looks like:

Observation (numeric)   Id (numeric)   Hospital visit number (alphanumeric)   Type of diabetes diagnosed (numeric)
1                       3              04                                     2
2                       3              05                                     2
3                       3              07                                     2
4                       15             01                                     1
5                       15             02                                     2
6                       18             03                                     1
7                       22             06                                     2
8                       30             01                                     2
9                       30             03                                     1
10                      30             04                                     1
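
For anyone who wants to try the replies against this sample, the rows above can be read into a data set with a short DATA step. This is only a sketch: the data set name `have` and the variable names `id` and `diabetes_type` follow the code posted later in this thread, while `visit` is a hypothetical name for the hospital visit number. The observation column is left out because it is just the row number.

```sas
/* Sample rows from the question; names are assumptions, see the note above */
data have;
  input id visit $ diabetes_type;   /* visit is alphanumeric, so read it as character */
  datalines;
3 04 2
3 05 2
3 07 2
15 01 1
15 02 2
18 03 1
22 06 2
30 01 2
30 03 1
30 04 1
;
run;
```

The rows are already grouped and sorted by `id`, which is what a plain `by id` statement expects.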
8 REPLIES
ballardw
Super User

Can you show what you expect the result to look like?

Are the ONLY values for the diabetes variable 1 and 2?

Wylyann
Calcite | Level 5
Yes, the only values for the diabetes variable are 1 (type 1 diabetes mellitus) and 2 (type 2 diabetes mellitus).

This involves writing a command that automatically creates the variable "Type of diabetes evolution" (TDevolution) that assigns code 1 to individuals (Id) "3", "18" and "22"; then code 0 to Id "15" and "30".
ballardw
Super User

@Wylyann wrote:
Yes, the only values for the diabetes variable are 1 (type 1 diabetes mellitus) and 2 (type 2 diabetes mellitus).

This involves writing a command that automatically creates the variable "Type of diabetes evolution" (TDevolution) that assigns code 1 to individuals (Id) "3", "18" and "22"; then code 0 to Id "15" and "30".

You still have NOT shown what the resulting data set is supposed to look like.

Wylyann
Calcite | Level 5

Maybe I didn't understand you correctly. Can you reformulate your question, please?

Wylyann
Calcite | Level 5
My data contains more than 20,000 observations. What I present here is only a Sample.
mkeintz
PROC Star

Read each ID twice: the first time to set a flag if there is a change in diabetes type, the second time to output the data if the flag was set.

But the original code I sent included the NOTSORTED option in the BY statement, which is not supported when SET has more than one input dataset.  So "by id diabetes_type notsorted" needs to be changed to "by id", and the subsequent code needs to be changed, as here:

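data want (drop=_:);
  /* Interleave two copies of HAVE: each ID is read once per pass */
  set have (in=firstpass)  have (in=secondpass);
  by id;
  retain _keep_this_id 0;
  /* Call LAG on every row so its queue always holds the previous row's value */
  _prev_type=lag(diabetes_type);
  if first.id=1 then _keep_this_id=0;
  /* Flag the ID when the type differs from the previous row of the same ID */
  if first.id=0 and diabetes_type^=_prev_type then _keep_this_id=1;
  /* Keep only second-pass rows of flagged IDs */
  if secondpass=1 and _keep_this_id=1;
run;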
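
The step above keeps only the IDs whose diabetes type changes. The question asked instead for a 0/1 variable on every row, so here is a hedged variation of the same two-pass idea. It is a sketch, not tested on the real data: it assumes `have` is sorted by `id`, uses the variable name `TDevolution` from the question, and `_prev_type` is a hypothetical helper variable.

```sas
/* Sketch: keep every row and add TDevolution (1 = type constant, 0 = type varies) */
data want (drop=_:);
  set have (in=firstpass) have (in=secondpass);
  by id;
  retain TDevolution;
  /* LAG is called on every row so it always returns the previous row's type */
  _prev_type = lag(diabetes_type);
  /* Start each ID as "constant"; a change seen during the first pass resets it to 0 */
  if first.id then TDevolution = 1;
  else if firstpass and diabetes_type ne _prev_type then TDevolution = 0;
  /* Output the second pass only, when the flag for the ID is final */
  if secondpass;
run;
```

On the sample rows in the question this should give 1 for Id 3, 18 and 22, and 0 for Id 15 and 30, which matches the coding described there.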

 

The original code (below) should not be used, because SAS will flag the NOTSORTED option when SET has more than one input dataset:


data want (drop=_:);
  set have (in=firstpass)  have (in=secondpass);
  by id diabetes_type notsorted;
  retain _keep_this_id 0;
  if first.id=1 then _keep_this_id=0;
  if first.id=0 and first.diabetes_type=1 then _keep_this_id=1;
  if secondpass=1 and _keep_this_id=1;
run;

 

The first.id=0 and first.diabetes_type=1 condition tests for a change in diabetes type other than at the beginning of the ID.  Because the diabetes type can change up or down, the BY statement has the NOTSORTED option.
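
The first./NOTSORTED test described above can still be used when SET reads a single dataset. A minimal sketch under that assumption, using the same `have`, `id` and `diabetes_type` names; `changed_ids` and `_changed` are hypothetical names chosen for illustration:

```sas
/* Sketch: one pass over HAVE, listing the IDs whose diabetes type changes */
data changed_ids (keep=id);
  set have;
  /* NOTSORTED: values only need to be grouped, not in ascending order */
  by id diabetes_type notsorted;
  retain _changed;
  if first.id then _changed = 0;
  /* A new diabetes_type group that is not the start of the ID means a change */
  if not first.id and first.diabetes_type then _changed = 1;
  /* Write one row per ID that had at least one change */
  if last.id and _changed = 1 then output;
run;
```

This gives one row per changed ID rather than the full records, so it would need to be merged back onto `have` by `id` to recover the visits.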

Discussion stats
  • 8 replies
  • 4523 views
  • 3 likes
  • 4 in conversation