yul138
Calcite | Level 5

I would like to check for missing values and weird numbers and flag those observations: create 2 new variables set to 1 when the condition is true (a flag for missing systolic and a flag for missing diastolic).

My dataset is bp.

[Image: Screen Shot 2021-07-19 at 14.59.14.png]

I tried this code:

data g;
do until (last.subject_id);
set bp;
by subject_id;
if first.subject_id then do;
check = bp_systolic; flag = 0;
run;

[Image: Screen Shot 2021-07-19 at 14.59.42.png]

But it doesn't work.

Accepted Solution
ballardw
Super User

Define "weird number". How do we know one when we see one?

Your example input data should include at least one example of each case of "problem" value you expect to process, describe the rule(s) for identifying the problem values, and then show what you actually expect for the output of the same values.

 

Best is to provide the example values as data step code as we cannot write code against pictures.

 

Since your example data picture does not show any repeated values of subject_id, it is very hard to tell why you are writing code that appears to expect multiple observations per subject.

 

I'll get you started on a basic data step to show your example data:

data have;
   input subject_id bp_systolic bp_diastolic;
   datalines;
1001 131 81
1002 135 74
1003 135 89
;

Add values as needed.
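As a rough illustration of "at least one example of each case", the starter step could be extended like this. Subjects 1004 through 1008 are invented test rows, not values from the original picture, and which rows count as "weird" still depends on rules the poster has to supply.

```sas
/* Hypothetical test data: rows 1004-1008 are made up so that   */
/* each kind of problem value appears at least once.            */
data have;
   input subject_id bp_systolic bp_diastolic;
   datalines;
1001 131 81
1002 135 74
1003 135 89
1004 . 80
1005 120 .
1006 80 120
1007 0 75
1008 300 90
;
run;
```

Here 1004 and 1005 have a missing reading (a period is read as a numeric missing value), 1006 has diastolic above systolic, 1007 has a zero, and 1008 has an implausibly high systolic.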

 

Some "rules" you didn't mention but likely should have for "weird numbers":

1) bp_diastolic greater than or equal to bp_systolic 

2) either of those two variables with a value of 0 (zero) or negative

3) From the time I have spent in the hospital, I suspect that your organization may have some concern about either of those BP variables exceeding some numeric value. I'm not going to guess the limit, but I know my doctor was concerned when mine was 181 over 160 one time.

 

Rule 1 should be a simple if/then.

Rules 2 and 3 could be implemented with a format/informat if they involve a single observation. If you insist on attempting to examine multiple observations then you need to describe what you are doing in that case.
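A minimal sketch of those three rules, assuming the data set have from above. The format name bpcheck, the upper limit of 250, the output data set flagged, and the flag variable names are all placeholders chosen for illustration; the real limits have to come from your organization.

```sas
/* Rules 2 and 3: classify a single reading with a format.      */
/* The 250 cutoff is an assumed limit, not a clinical standard. */
proc format;
   value bpcheck
      .           = 'Missing'
      low - 0     = 'Zero or negative'
      0 <- 250    = 'Plausible'
      250 <- high = 'Above limit';
run;

data flagged;
   set have;

   /* Rule 1: diastolic >= systolic. SAS treats a missing value as */
   /* smaller than any number, so only compare when both exist.    */
   if nmiss(bp_systolic, bp_diastolic) = 0
      and bp_diastolic >= bp_systolic then flag_dia_ge_sys = 1;
   else flag_dia_ge_sys = 0;

   /* Rules 2 and 3: 1 when the format puts the value outside the  */
   /* plausible range; missing values are handled by their own flag */
   flag_sys_weird = (put(bp_systolic,  bpcheck.) not in ('Plausible', 'Missing'));
   flag_dia_weird = (put(bp_diastolic, bpcheck.) not in ('Plausible', 'Missing'));
run;
```

Keeping the limits in the format means the cutoffs can be changed in one place without touching the data step.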

You appear to be attempting to report any time systolic changes in that second code. Why? When hooked up to monitors it fluctuates constantly. With a standalone device, BP measurements change as fast as the instrument can cycle. And if the measurements are months apart, then even looking for "larger" changes, say 10 points or so, may not have any significance. I've had my BP change by 10 or more points in the course of a 30-minute doctor visit.

 


3 REPLIES
sbxkoenk
SAS Super FREQ

Hello,

 

What's wrong with:

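data g;
 set bp;
 /* BY is not needed for row-level flags and requires bp to be sorted by subject_id */
 by subject_id;
 /* each flag is 1 when the reading is missing, 0 otherwise */
 if bp_systolic = . then flag_bp_systolic_missing =1;
 else flag_bp_systolic_missing =0;
 if bp_diastolic= . then flag_bp_diastolic_missing=1;
 else flag_bp_diastolic_missing=0;
run;
/* end of program */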
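A more compact variant of the same idea, offered as a sketch rather than a replacement: the MISSING function returns 1 or 0 directly, and unlike a comparison with a single period it also catches special missing values (.A through .Z and ._). The data set and variable names are the ones used above.

```sas
data g;
   set bp;
   /* MISSING() returns 1 for any missing value, 0 otherwise */
   flag_bp_systolic_missing  = missing(bp_systolic);
   flag_bp_diastolic_missing = missing(bp_diastolic);
run;
```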

Or maybe you can have multiple lines per subject and you want a missing flag for the subject if at least one of his/her values is missing?
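If that is the case, one possible sketch uses PROC SQL, which remerges the per-subject maximum back onto every row. The output table g_subject and the two subj_ variable names are made up for this example.

```sas
proc sql;
   create table g_subject as
   select *,
          /* 1 if any row for this subject has a missing reading */
          max(missing(bp_systolic))  as subj_sys_missing,
          max(missing(bp_diastolic)) as subj_dia_missing
   from bp
   group by subject_id
   order by subject_id;
quit;
```

The log will show a note about remerging summary statistics with the original data; that is expected here.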

 

Koen

sbxkoenk
SAS Super FREQ

See also today's tip.

It may be relevant to you:

SAS Tip: Quick Check for Missing or Invalid Data Using Formats (Daily tip for 2021-Jul-19)

https://communities.sas.com/t5/SAS-Tips-from-the-Community/SAS-Tip-Quick-Check-for-Missing-or-Invali...
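The linked tip is not reproduced here, so the following is only a commonly used idiom of that kind and not necessarily what the tip shows: group each numeric variable into "Missing" and "Not missing" with a format and let PROC FREQ count them. The format name nmissfmt is an arbitrary choice.

```sas
proc format;
   value nmissfmt
      .     = 'Missing'
      other = 'Not missing';
run;

proc freq data=bp;
   /* MISSING option includes missing values as a table level */
   tables bp_systolic bp_diastolic / missing;
   format bp_systolic bp_diastolic nmissfmt.;
run;
```

This gives a quick count per variable without creating any new flag variables.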

Koen

