ahblackwell
Calcite | Level 5

We have baseline data from male and female partners of the same couple (394 unique observations, 197 dyadic couples).  They are linked through one variable called famcode which is unique to each couple. I want to analyze the data by couple for some measures but cannot figure out how to link them for all analyses.  For example, both the male and female data have a family functioning score and I need to calculate the average difference in score for each couple (not difference in overall mean score but specifically average difference between female and male within each couple).  Similarly, the female data includes variables on intimate partner violence that are not in the male dataset, but I would like to analyze some of the male data by the female partner's reporting of IPV (e.g. if the man's female partner reported IPV, what is the probability he perpetrated child abuse).  Therefore, my questions are:

 

1) What is the best way to merge/append the data so that the couple data is linked by the family code (var=famcode)?  I currently use the code below to append the female and male datasets, but it may not be the best way to ensure my data are linked:


          proc sort data=sah.drcbw_final OUT=sah.drcbw_sort;
               by famcode;
          run;

 

          proc sort data=sah.drcbm_final OUT=sah.drcbm_sort;
               by famcode;
          run;


          data sah.drcb;
               set sah.drcbw_sort;
          run;

 

          proc append base=sah.drcb data=sah.drcbm_sort force;

          run;

 

2) Is there a way to copy or link variables from the female dataset onto the male dataset, matching by couple?  For example, so I can analyze male factors by female reporting of IPV?

 

Any advice on how to approach this without having to manually copy data from the female dataset onto the male dataset would be welcome, as that is my current approach.  I use Base SAS Software (which I know is due to expire in March, but is the version that my university provided for free).

Reeza
Super User
PROC APPEND has stacked the two datasets, one on top of the other. A merge is what you need if you want to do comparisons between partners. You could also use PROC TRANSPOSE; it depends a bit on your actual data, which you haven't shown.

Here's a tutorial on merging data in SAS.
https://stats.idre.ucla.edu/sas/modules/match-merging-data-files-in-sas/

If you need more specific help you'll need to provide example data.
ballardw
Super User

Your appended data would be appropriate for statistics like the mean of a given variable, assuming it has the same name in both datasets. Use the family identifier as either a CLASS or BY variable in PROC MEANS or PROC SUMMARY. Or you could use PROC REPORT with the family ID as a group variable, requesting the mean of the analysis variables.
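
As a rough illustration of the CLASS approach on stacked data, here is a minimal sketch. The dataset work.stacked and the variables sex and famfunc are made-up names and values for illustration only, not the real variables in sah.drcb.

```sas
/* Toy stacked data, one row per partner.               */
/* sex and famfunc are hypothetical variable names.     */
data work.stacked;
   input famcode sex $ famfunc;
datalines;
1 F 20
1 M 17
2 F 15
2 M 16
;

/* Per-couple N and mean of the score, grouped by the family code */
proc means data=work.stacked n mean;
   class famcode;
   var famfunc;
run;
```

This reports each couple's mean score. A within-couple difference needs both partners' scores on the same row, which the stacked layout does not provide.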

 

To bring other variables into the same record, use a MERGE:

data merged;
   merge sah.drcbw_sort sah.drcbm_sort;
   by famcode;
run; 

However, you need to be careful about which of the commonly named variables you want to keep, because values from the second dataset overwrite same-named variables from the first. Here is a brief example:

data work.ex1;
   input famcode var1 var2;
datalines;
1  1  1
2  2  2
;
data work.ex2;
   input famcode var1 var3;
datalines;
1  4  15
2  5  16
;

data work.merged;
   merge work.ex1 
         work.ex2
   ;
   by famcode;
run;

proc print data=work.merged;
run;

Note that the VAR1 values in the merged data (4 and 5) are from work.ex2.

So you may need to add a DROP= option if you want to keep the VAR1 from work.ex1:

data work.merged;
   merge work.ex1 
         work.ex2 (drop=var1)
   ;
   by famcode;
run;

or rename if you need both variables.
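
Applied to the two questions in the original post, a minimal sketch might look like the following. The variable names famfunc, ipv, and child_abuse and all of the toy values are assumptions made up for illustration; substitute the real names from sah.drcbw_sort and sah.drcbm_sort, which must both be sorted by famcode.

```sas
/* Toy data; famfunc, ipv and child_abuse are hypothetical variable names */
data work.women;
   input famcode famfunc ipv;
datalines;
1 20 1
2 15 0
3 18 1
;
data work.men;
   input famcode famfunc child_abuse;
datalines;
1 17 1
2 16 0
3 18 0
;

/* One row per couple: rename the shared score so neither copy is overwritten. */
/* KEEP= is applied before RENAME=, so it lists the original variable names.   */
data work.couple;
   merge work.women (keep=famcode famfunc rename=(famfunc=famfunc_f))
         work.men   (keep=famcode famfunc rename=(famfunc=famfunc_m));
   by famcode;
   diff = famfunc_f - famfunc_m;   /* female minus male, within each couple */
run;

/* Average within-couple difference */
proc means data=work.couple n mean std;
   var diff;
run;

/* Copy the woman's IPV report onto her partner's record */
data work.men_ipv;
   merge work.men   (in=in_m)
         work.women (keep=famcode ipv);
   by famcode;
   if in_m;   /* keep every male record, whether or not a female record matched */
run;

/* Male outcome cross-tabulated by partner-reported IPV */
proc freq data=work.men_ipv;
   tables ipv*child_abuse;
run;
```

Restricting the female dataset with KEEP= to famcode and the variables you actually want to bring over avoids accidentally overwriting same-named male variables.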
