Hello,
I created a data set to help explain my question. In my example, researchers are trying to understand the impact of different types of
Diet plans on change in BMI over 3 study visits.
ID Visit Sex Age Height Weight (lbs) Diet (Type)
1 1 0 23 1.75 200 0
2 1 0 18 2.1 300 1
3 1 1 . 1.5 195 2
1 2 0 24 1.75 180 0
2 2 0 19 . 321 1
3 2 1 29 1.5 . 2
1 3 0 25 1.75 170 0
2 3 0 . 2.1 280 1
3 3 1 30 1.5 135 2
My question is:
- How can I exclude people who weigh below a certain cutoff at visit=1 (for example if I want to exclude all people who weigh <200 lbs at visit 1). The resulting data set would only have ID1 & ID2.
then
- How I can remove ID that have any missing data (missing data could be for any of the variables and for any of the study visits). In this scenario, the resulting data set would only have information for 3 study visits of ID1 (since the other ID have missing values)
My assumption is that I can use proc sql to group and delete ID with missing data but I have no idea how to do it. I appreciate any help you can give me.
Perhaps something like this
proc sql;
select name into :varlist separated by ',' from dictionary.columns where libname='WORK' and memname='HAVE' ;
create table want as
select * from have
where ID not in
(select ID from have where Visit=1 and weight < 200
or cmiss(&varlist) > 0);
quit;
Something like this?
data have;
input ID Visit Sex Age Height Weight Diet;
datalines;
1 1 0 23 1.75 200 0
2 1 0 18 2.1 300 1
3 1 1 . 1.5 195 2
1 2 0 24 1.75 180 0
2 2 0 19 . 321 1
3 2 1 29 1.5 . 2
1 3 0 25 1.75 170 0
2 3 0 . 2.1 280 1
3 3 1 30 1.5 135 2
;
proc sql;
create table want as
select * from have
where ID not in
(select ID from have where Visit=1 and weight < 200
or cmiss(Visit, Sex, Age, Height, Weight, Diet) > 0);
quit;
I am trying to apply the concept to a large data set with ~50 variables. Is there a way to do this [for instance using cmiss ()] without listing all of the variables in the code? The variables are all numeric. Thanks for the help
Good idea to provide:
Test data in the form of a datastep.
And required output.
If you have lots of variables, you should really follow my post and convert the data to an industry standard, you will find your life much easier. For this problem the simplest way is to normalise the data, i.e. put it going down the page, then apply my logic (and if necessary transpose up again), consider:
USUBJID PARAM LBSTRESN
XYZ WEIGHT 123.4
XYZ HEIGHT 23
ABC WEIGHT ...
...
You can then apply my logic very simply to all the data and just add PARAM to the merge items.
You could do it in a transpose dataset as well, but it becomes far more complicated, you would need to merge on all the records for each of the variables, then do an array comparison to output a new array of results.
Transpose down
Do merge and calcuation
Transpose up
Far simpler.
Perhaps something like this
proc sql;
select name into :varlist separated by ',' from dictionary.columns where libname='WORK' and memname='HAVE' ;
create table want as
select * from have
where ID not in
(select ID from have where Visit=1 and weight < 200
or cmiss(&varlist) > 0);
quit;
Map the data into industry standard data models = SDTM/ADaM. The reason being if you mapped into ADaM format, one of the sets of variables is baseline and change from baseline. You would follow the logic of:
select baseline record (if this is visit 1 then simple, but you may have repeats, multiples on the same date etc.)
merge this record back to original data based on ID
Then baseline value appears on all records and you can simply filter with where or if.
Not typing in test data - see other posts for how to post test data, so this code is just a guess:
proc sql; create table WANT as select A.*, B.WEIGHT as BASE from HAVE A left join (select * from HAVE where VISIT=1) B on A.ID=B.ID; quit;
Join us for SAS Innovate 2025, our biggest and most exciting global event of the year, in Orlando, FL, from May 6-9. Sign up by March 14 for just $795.
Learn the difference between classical and Bayesian statistical approaches and see a few PROC examples to perform Bayesian analysis in this video.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.