CynthiaWei
Obsidian | Level 7

Hi SAS Experts,

 

I have a data set like this:

Family  ID   female  Has fiber every day  age  Mother  Kids
4       4    1       1                    25   1       0
4       4.1  0       1                    29   0       1
4       4.2  1       0                    34   0       1
4       4.3  1       1                    45   0       1
5       5    1       1                    42   1       0
5       5.1  1       1                    53   0       1
6       6    1       0                    23   1       0
6       6.1  0       0                    29   0       1
6       6.2  0       1                    36   0       1
7       7    1       1                    48   1       0
7       7.1  1       0                    23   0       1
8       8    1       0                    53   1       0
8       8.1  0       0                    54   0       1
8       8.2  1       1                    37   0       1
8       8.3  1       1                    41   0       1
8       8.4  0       1                    26   0       1

 

I want to create variables that carry only the information from the first observation of every group (the first observation holds the mother's information, and the grouping variable is Family).

 

What I want is as follows: 

The new variables are: whether the mother has fiber every day, and the mother's age.

Basically, the values of these two variables are the mother's values for each family, repeated once for each of her kids who participated in the study. This way I can use a paired t-test to compare dependent measurements, such as the mother's age and her kids' ages, or the mother's physical activity hours and her kids' physical activity hours (not listed here). To compare having fiber every day between mothers and their kids, I should use McNemar's test, right?

Family  ID   female  Mother has fiber every day  Has fiber every day  Mother's age  age  Mother  Kids
4       4    1       1                           1                    25            25   1       0
4       4.1  0       1                           1                    25            9    0       1
4       4.2  1       1                           0                    25            4    0       1
4       4.3  1       1                           1                    25            6    0       1
5       5    1       1                           1                    42            42   1       0
5       5.1  1       1                           1                    42            13   0       1
6       6    1       0                           0                    23            23   1       0
6       6.1  0       0                           0                    23            2    0       1
6       6.2  0       0                           1                    23            4    0       1
7       7    1       1                           1                    48            48   1       0
7       7.1  1       1                           0                    48            23   0       1
8       8    1       0                           0                    53            53   1       0
8       8.1  0       0                           0                    53            27   0       1
8       8.2  1       0                           1                    53            24   0       1
8       8.3  1       0                           1                    53            20   0       1
8       8.4  0       0                           1                    53            16   0       1

 

Thank you very much!

 

Kind regards,

 

C

Accepted Solution
ed_sas_member
Meteorite | Level 14

Hi @CynthiaWei,

 

Here is some code to create the two variables:

data have;
	/* List input: values are separated by blanks, so exact column positions do not matter */
	input Family ID female fiber age Mother Kids;
	datalines;
4 4   1 1 25 1 0
4 4.1 0 1 29 0 1
4 4.2 1 0 34 0 1
4 4.3 1 1 45 0 1
5 5   1 1 42 1 0
5 5.1 1 1 53 0 1
6 6   1 0 23 1 0
6 6.1 0 0 29 0 1
6 6.2 0 1 36 0 1
7 7   1 1 48 1 0
7 7.1 1 0 23 0 1
8 8   1 0 53 1 0
8 8.1 0 0 54 0 1
8 8.2 1 1 37 0 1
8 8.3 1 1 41 0 1
8 8.4 0 1 26 0 1
;
run;

proc sort data=have;
	by family ID; /* assuming the first ID is for the mother */
run;

data want;
	set have;
	by family id;
	
	/* Carry the mother's age to every row of her family */
	retain mother_age;
	if first.family then mother_age = age;

	/* Carry the mother's daily-fiber flag to every row of her family */
	retain fiber_YN;
	if first.family then fiber_YN = fiber;

run;
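
The DATA step above relies on the mother being the first row of each family after sorting. As a minimal sketch of an alternative, assuming each family has exactly one row with Mother = 1, the mother's values can be looked up by that flag instead of by row order. The output name want_sql is only illustrative.

```sas
/* Sketch: join each row to the mother's row of the same family. */
/* Assumes exactly one row per family has Mother = 1.            */
proc sql;
	create table want_sql as
	select k.*,
	       m.age   as mother_age,
	       m.fiber as fiber_YN
	from have as k
	left join have(where=(Mother = 1)) as m
	  on k.Family = m.Family
	order by k.Family, k.ID;
quit;
```

If a family had no mother row, mother_age and fiber_YN would be missing for that family rather than silently taken from the first kid.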

I think you can run McNemar's test with PROC FREQ.
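
As a rough sketch of how the two comparisons could be run on the want data set created above: the kids' rows are kept so that each row is one mother-kid pair, then PROC TTEST with a PAIRED statement compares the ages and PROC FREQ with the AGREE option reports McNemar's test for the 2x2 fiber table. The data set name pairs is only illustrative. Note that kids from the same family share one mother, so the pairs are not fully independent.

```sas
/* Keep only the kids' rows: each row is one mother-kid pair. */
data pairs;
	set want;
	where Kids = 1;
run;

/* Paired t-test: mother's age versus kid's age. */
proc ttest data=pairs;
	paired mother_age*age;
run;

/* McNemar's test: mother's daily fiber versus kid's daily fiber. */
/* AGREE requests McNemar's test for a 2x2 table.                 */
proc freq data=pairs;
	tables fiber_YN*fiber / agree;
run;
```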


3 REPLIES 3
CynthiaWei
Obsidian | Level 7

Hi,

 

Thank you so much for the code. It worked very well and solved my problem. I really appreciate it!

 

I apologize for not responding sooner; I was busy with business meetings and travel.

 

Kind regards,

 

Cynthia

mkeintz
PROC Star

@CynthiaWei 

 

I suggest you mark the response by @ed_sas_member as a solution. This not only recognizes the help provided by respondents but also recategorizes your topic as solved, which can help others browsing the forum.

 

regards,

Mark
