I would like to check for missing values and weird numbers and add a flag to those observations. That is, create 2 new variables with a value set to 1 if the condition is true: a flag for missing systolic and a flag for missing diastolic.
My dataset is bp.
I tried this code:
data g;
do until (last.subject_id);
set bp;
by subject_id;
if first.subject_id then do;
check = bp_systolic; flag = 0;
run;
But it doesn't work.
Define "weird number". How do we know one when we see one?
Your example input data should include at least one example of each case of "problem" value you expect to process, describe the rule(s) for identifying the problem values, and then show what you actually expect for the output of the same values.
Best is to provide the example values as data step code as we cannot write code against pictures.
Since your example data picture does not show any repeated values of subject_id, it is very hard to tell why you are writing code that appears to expect multiple observations per subject.
I'll get you started on a basic data step to show your example data:
data have;
   input subject_id bp_systolic bp_diastolic;
datalines;
1001 131 81
1002 135 74
1003 135 89
;
Add values as needed.
Some "rules" you didn't mention but likely should have for "weird numbers":
1) bp_diastolic greater than or equal to bp_systolic
2) either of those two variables with a value of 0 (zero) or negative
3) From the time I have spent in the hospital I suspect that your organization may have some concern about either of those BP variables being over some numeric value. I'm not going to guess, but I know my doctor was concerned when mine was 181 over 160 one time.
Rule 1 should be a simple if/then.
Rules 2 and 3 could be implemented with a format/informat if they involve a single observation. If you insist on attempting to examine multiple observations then you need to describe what you are doing in that case.
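As a rough sketch of how those three rules might look in code: rule 1 is a plain if/then, and rules 2 and 3 go through a format. The format name bpcheck, the output dataset checked, the new variable names, and especially the upper limit of 250 are assumptions made up for illustration. Replace the limit with whatever your organization actually uses.

```sas
/* Hypothetical format: 250 is a placeholder limit, not a clinical threshold */
proc format;
   value bpcheck
      .           = 'Missing'
      low - 0     = 'Zero or negative'
      0 <- 250    = 'In range'
      250 <- high = 'Too high';
run;

data checked;
   set have;

   /* Rule 1: diastolic at or above systolic, only when both are present */
   if nmiss(bp_systolic, bp_diastolic) = 0
      and bp_diastolic >= bp_systolic then flag_dia_ge_sys = 1;
   else flag_dia_ge_sys = 0;

   /* Rules 2 and 3: classify each value through the format */
   length sys_check dia_check $16;
   sys_check = put(bp_systolic, bpcheck.);
   dia_check = put(bp_diastolic, bpcheck.);
run;
```

A PROC FREQ on sys_check and dia_check would then give a quick count of each category. If the two measurements need different limits, two separate formats would be the natural extension.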
You appear to be attempting to report any time systolic changes in that second code. Why? Hooked up to monitors, it fluctuates constantly. With a standalone device, BP measurements can change as fast as the instrument can cycle. And if the measurements are months apart, then even looking for "larger" changes, say 10 points or so, may not have any significance. I've had my BP change by 10 or more points in the course of a 30 minute doctor visit.
Hello,
What's wrong with:
data g;
set bp;
by subject_id;
if bp_systolic = . then flag_bp_systolic_missing =1;
else flag_bp_systolic_missing =0;
if bp_diastolic= . then flag_bp_diastolic_missing=1;
else flag_bp_diastolic_missing=0;
run;
/* end of program */
Or maybe you can have multiple lines per subject and you want a missing flag for the subject if at least one of his/her values is missing?
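If that subject-level case is what you are after, a double DO UNTIL loop (which is what the original attempt resembles) is one possible way to do it. This is only a sketch: it assumes bp is already sorted by subject_id, and the variable names subj_sys_missing and subj_dia_missing are made up for illustration.

```sas
/* Assumes bp is sorted by subject_id */
data g;
   /* Start each subject with both flags off */
   subj_sys_missing = 0;
   subj_dia_missing = 0;

   /* First pass: scan every row of the current subject */
   do until (last.subject_id);
      set bp;
      by subject_id;
      if missing(bp_systolic)  then subj_sys_missing = 1;
      if missing(bp_diastolic) then subj_dia_missing = 1;
   end;

   /* Second pass: re-read the same rows and write them out
      carrying the subject-level flags */
   do until (last.subject_id);
      set bp;
      by subject_id;
      output;
   end;
run;
```

Every row of a subject then carries the same two flags, set to 1 if any of that subject's rows had a missing value for the corresponding measurement.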
Koen
See also today's tip.
It may be relevant to you:
SAS Tip: Quick Check for Missing or Invalid Data Using Formats (Daily tip for 2021-Jul-19)
Koen