BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
lousam
Obsidian | Level 7

Hello,

 

I created a data set to help explain my question. In my example, researchers are trying to understand the impact of different types of 

Diet plans on change in BMI over 3 study visits.  

 

ID Visit  Sex  Age   Height   Weight (lbs)     Diet (Type)

 

1    1     0      23       1.75       200                  0

2    1     0      18        2.1        300                  1

3    1     1      .           1.5        195                  2

 

1    2     0      24        1.75      180                  0

2    2     0      19          .           321                 1

3    2     1      29        1.5           .                    2

 

1    3     0      25        1.75     170                   0

2    3     0       .          2.1        280                  1

3    3     1       30        1.5      135                   2

 

My question is:

 

- How can I exclude people who weigh below a certain cutoff at visit=1 (for example if I want to exclude all people who weigh <200 lbs at visit 1).  The resulting data set would only have ID1 & ID2. 

 

then 

- How I can remove ID that have any missing data (missing data could be for any of the variables and for any of the study visits).  In this scenario, the resulting data set would only have information for 3 study visits of ID1 (since the other ID have missing values)

 

My assumption is that I can use proc sql to group and delete ID with missing data but I have no idea how to do it. I appreciate any help you can give me. 

1 ACCEPTED SOLUTION

Accepted Solutions
PeterClemmensen
Tourmaline | Level 20

Perhaps something like this

 

proc sql;
   select name into :varlist separated by ',' from dictionary.columns where libname='WORK' and memname='HAVE' ;
   create table want as
   select * from have
   where ID not in 
   (select ID from have where Visit=1 and weight < 200
                           or cmiss(&varlist) > 0);
quit;

View solution in original post

5 REPLIES 5
PeterClemmensen
Tourmaline | Level 20

Something like this?

 

data have;
input ID Visit Sex Age Height Weight Diet;
datalines;
1 1 0 23 1.75 200 0
2 1 0 18 2.1 300 1
3 1 1 .  1.5 195 2
1 2 0 24 1.75 180 0
2 2 0 19 . 321 1
3 2 1 29 1.5 . 2
1 3 0 25 1.75 170 0
2 3 0 . 2.1 280 1
3 3 1 30 1.5 135 2
;

proc sql;
   create table want as
   select * from have
   where ID not in 
   (select ID from have where Visit=1 and weight < 200
                           or cmiss(Visit, Sex, Age, Height, Weight, Diet) > 0);
quit;
lousam
Obsidian | Level 7

I am trying to apply the concept to a large data set with ~50 variables. Is there a way to do this [for instance using cmiss ()] without listing all of the variables in the code? The variables are all numeric. Thanks for the help

RW9
Diamond | Level 26 RW9
Diamond | Level 26

Good idea to provide:

Test data in the form of a datastep.

And required output.  

 

If you have lots of variables, you should really follow my post and convert the data to an industry standard, you will find your life much easier.  For this problem the simplest way is to normalise the data, i.e. put it going down the page, then apply my logic (and if necessary transpose up again), consider:
USUBJID PARAM   LBSTRESN

XYZ          WEIGHT 123.4

XYZ          HEIGHT  23

ABC          WEIGHT ...

...

 

You can then apply my logic very simply to all the data and just add PARAM to the merge items.  

 

You could do it in a transpose dataset as well, but it becomes far more complicated, you would need to merge on all the records for each of the variables, then do an array comparison to output a new array of results.  

Transpose down

Do merge and calcuation

Transpose up

Far simpler.

PeterClemmensen
Tourmaline | Level 20

Perhaps something like this

 

proc sql;
   select name into :varlist separated by ',' from dictionary.columns where libname='WORK' and memname='HAVE' ;
   create table want as
   select * from have
   where ID not in 
   (select ID from have where Visit=1 and weight < 200
                           or cmiss(&varlist) > 0);
quit;
RW9
Diamond | Level 26 RW9
Diamond | Level 26

Map the data into industry standard data models = SDTM/ADaM.  The reason being if you mapped into ADaM format, one of the sets of variables is baseline and change from baseline.  You would follow the logic of:

select baseline record (if this is visit 1 then simple, but you may have repeats, multiples on the same date etc.)

merge this record back to original data based on ID

Then baseline value appears on all records and you can simply filter with where or if.

Not typing in test data - see other posts for how to post test data, so this code is just a guess:

proc sql;
  create table WANT as
  select  A.*,
          B.WEIGHT as BASE
  from    HAVE A
  left join (select * from HAVE where VISIT=1) B
  on      A.ID=B.ID;
quit;

sas-innovate-2024.png

Don't miss out on SAS Innovate - Register now for the FREE Livestream!

Can't make it to Vegas? No problem! Watch our general sessions LIVE or on-demand starting April 17th. Hear from SAS execs, best-selling author Adam Grant, Hot Ones host Sean Evans, top tech journalist Kara Swisher, AI expert Cassie Kozyrkov, and the mind-blowing dance crew iLuminate! Plus, get access to over 20 breakout sessions.

 

Register now!

What is Bayesian Analysis?

Learn the difference between classical and Bayesian statistical approaches and see a few PROC examples to perform Bayesian analysis in this video.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 5 replies
  • 3130 views
  • 4 likes
  • 3 in conversation