Output all observations within a certain group

Accepted Solution Solved
Reply
Contributor
Posts: 41
Accepted Solution

Output all observations within a certain group

Hello. I'm struggling with code to do the following:

 

data have:

visit_id       study_id      delivery_visit   pregnancynum_rank   gestational_diabetes

1                      1                     1                        1                                 1

2                      1                     0                        1                                 1

3                      1                     1                        2                                 1

4                      2                     0                        5                                 1

5                      2                     1                        5                                 1

6                      3                     0                        2                                 1

7                      3                     1                        2                                 1

8                      3                     1                        3                                 1

9                      4                     0                        3                                  1

10                    4                     1                        3                                 1

11                    4                     0                        6                                  1

12                    4                     1                        6                                 1

 

 

data want:

visit_id       study_id      delivery_visit   pregnancynum_rank   gestational_diabetes

1                      1                     1                        1                                 1

2                      1                     0                        1                                 1

3                      1                     1                        2                                 1

4                      2                     0                        5                                 1

5                      2                     1                        5                                 1

6                      3                     0                        2                                 1

7                      3                     1                        2                                 1

8                      3                     1                        3                                 1

9                      4                     0                        3                                  1

10                    4                     1                        3                                 1

11                    4                     0                        6                                  1

12                    4                     1                        6                                 1

 

For moms with gestational diabetes, I need to output the first pregnancy in the dataset for which they had gest. diabetes. For example, study_id had 2 pregnancies with gest. diabetes but I want to output all observations associated with her pregnancynum=1. Mother study_id 4 has 2 pregnanices with gestatational diabetes (pregnancynum 3 and 6), but I only want to keep all obervations associated with pregnancynum=3 (because its her first preg in the dataset with GDM).

 

I've been trying some PROC SQL codes and can't make it work. Suggestions? Thank you!


Accepted Solutions
Solution
‎09-18-2017 01:18 PM
Super User
Posts: 21,530

Re: Output all observations within a certain group

Slice it twice - first get the ID and Preganancy Rank you need and then either use a data step or SQL merge to keep it if you need the first 'group' and not just the first record.

 

If you've tried it, post your code that you have so far.

 

View solution in original post


All Replies
Super Contributor
Posts: 510

Re: Output all observations within a certain group

I'm a little confused by your example - surely study_id 2 only has 1 pregnancy?

Contributor
Posts: 41

Re: Output all observations within a certain group

Posted in reply to ChrisBrooks

The dataset has moms with between 1-7 pregnancies. A woman with multiple pregnancies does not necessarily have gestatational diabetes during all her pregnancies.


So the dataset I have, contains the pregnancies (pregnancynum 1,2,3,4,5,6,and/or 7) for which the mom had gest. daibetes.

Mom study_id 2, for example, only has 1 pregnancy in the dataset when she had gest. diabetes. Her pregnancy number 1, she did not have the condition. So I need all her observations associated with the pregnancy #2.

Mom study_id 4 has had 6 babies in the dataset, but only had gest. diabetes for pregnancy 3 and 6. I don't want to keep the observations for more than 1 pregnancy with GDM. Therefore, I want her observations associated with her first gest. diabetes pregnancy (or number 3 only).

Does that make sense? I want to output all the data associated with the first gest. diabetes pregnancy in the dataset, and get rid of the 2nd, 3rd.....nth gest.diabetes prenancies.

 

Super User
Posts: 21,530

Re: Output all observations within a certain group

Posted in reply to ChrisBrooks

Try using a DATA step and using the FIRST instead of SQL.

 

A data step will be much easier than SQL>

Contributor
Posts: 41

Re: Output all observations within a certain group

I tried that too.   But it only output the first observation, not the entire group of observations, for the pregnancy of interest. 

 Do you have sample code you could provide to show me what you mean ?

thank you

Solution
‎09-18-2017 01:18 PM
Super User
Posts: 21,530

Re: Output all observations within a certain group

Slice it twice - first get the ID and Preganancy Rank you need and then either use a data step or SQL merge to keep it if you need the first 'group' and not just the first record.

 

If you've tried it, post your code that you have so far.

 

Contributor
Posts: 41

Re: Output all observations within a certain group

Right got it. Get first.study_id then join back to original dataset by study_id and pregnancynum. Thanks!

Regular Contributor
Regular Contributor
Posts: 218

Re: Output all observations within a certain group

I followed Reeza's suggestion and did it this way:

 

data pregnancy_data;
   input visit_id study_id delivery_visit pregnancynum_rank gestational_diabetes;
datalines;
1 1 1 1 1
2 1 0 1 1
3 1 1 2 1
4 2 0 5 1
5 2 1 5 1
6 3 0 2 1
7 3 1 2 1
8 3 1 3 1
9 4 0 3 1
10 4 1 3 1
11 4 0 6 1
12 4 1 6 1
;
run;

proc sort data = pregnancy_data;
	by study_id pregnancynum_rank;
run;

*grab the first pregnancy with gestational diabetes for each mom in the dataset;
data first_with_gest_diab;
	set pregnancy_data;
	by study_id pregnancynum_rank;
	if first.study_id and first.pregnancynum_rank and gestational_diabetes = 1;
run;

*go back and get all the records associated with those pregnancies;
data first_with_gest_diab_all;
proc sql;
	create table first_with_gest_diab_all as
	select pregnancy_data.* 
	from pregnancy_data 
	inner join first_with_gest_diab on pregnancy_data.study_id = first_with_gest_diab.study_id
	and pregnancy_data.pregnancynum_rank = first_with_gest_diab.pregnancynum_rank 
	order by visit_id, study_id, pregnancynum_rank;
quit;

That gave me

The SAS System				
				
visit_id	study_id	delivery_visit	pregnancynum_rank	gestational_diabetes
1	1	1	1	1
2	1	0	1	1
4	2	0	5	1
5	2	1	5	1
6	3	0	2	1
7	3	1	2	1
9	4	0	3	1
10	4	1	3	1

Which is the desired result i think.

Contributor
Posts: 41

Re: Output all observations within a certain group

[ Edited ]

@HB, thanks! This line "if first.study_id and first.pregnancynum_rank and gestational_diabetes = 1;"  is what I was looking for. Didn't realize you could do 2 first. in one if statement. Thanks again!

☑ This topic is solved.

Need further help from the community? Please ask a new question.

Discussion stats
  • 8 replies
  • 243 views
  • 3 likes
  • 4 in conversation