beleeve
Calcite | Level 5

I'm trying to run a bivariate logistic regression on two variables, but they come from two datasets that have the same variable names, just from different timepoints. I've listed the code I have so far. Essentially, var2 needs to come from the first dataset only, and var1 and var3 need to come from a combination of both datasets. I've tried renaming the variable that needs data from only one dataset, but when I add the other dataset, there are fewer values for some reason.

 

data work.asdf;
set 'dataset1'(rename=(var2=var2a));

set 'dataset2';

...

 

This is what I was trying to run:

proc sort;
by var1;

proc freq;
by var1;
table var2a*var3 / expected cellchi2 chisq;

run;

 

 

Quentin
Super User

Hi,

 

It's hard to understand your goal.  Can you please show five records of dataset1, five records of dataset2, and what you want to create when you combine them into work.asdf?

 

If you are combining variables from dataset1 and dataset2, typically you would do that with a MERGE statement.

 

If you are combining rows from dataset1 and dataset2, typically you would do that with a single SET statement which lists both datasets.

 

Your current code, with two SET statements, is almost certainly not doing what you want.  But I'm not sure what you're trying to do, so not sure how to help.
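
To illustrate the difference between the two approaches, here is a minimal sketch on toy data. The key variable id and all of the values are hypothetical (the original post does not say whether the two datasets share an identifier), so treat this as an illustration under that assumption rather than a solution.

```sas
/* Toy data: id is an ASSUMED key variable, not from the original post */
data dataset1;
   input id var1 var2 var3;
   datalines;
1 0 1 0
2 1 0 1
3 0 0 1
;
run;

data dataset2;
   input id var1 var2 var3;
   datalines;
1 1 1 0
2 0 1 1
;
run;

/* Combining rows: a single SET statement listing both data sets. */
/* The result has 3 + 2 = 5 observations.                         */
data work.stacked;
   set dataset1 dataset2;
run;

/* Combining variables: MERGE by the assumed key. The variables   */
/* are renamed so the second data set does not overwrite the      */
/* same-named variables from the first. Both inputs must be       */
/* sorted by id (they already are here).                          */
data work.wide;
   merge dataset1(rename=(var1=var1_t1 var2=var2_t1 var3=var3_t1))
         dataset2(rename=(var1=var1_t2 var2=var2_t2 var3=var3_t2));
   by id;
run;
```

In work.wide, id 3 appears only in dataset1, so its _t2 variables are missing.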

ballardw
Super User

You might also indicate exactly how you need to use the variable with the same name.

You can add a variable that indicates which data set a specific record comes from, such as:

data work.combined;
   set dataset1  (in=in1)
         dataset2  (in=in2)
   ;
   if in1 then Source='Dataset1';
   else if in2 then Source='Dataset2';
run;

You could then use the Source variable in analysis to differentiate between the original data sets, such as:

 

proc freq data=combined;
   tables source*var2;
run;

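Note that multiple SET statements, while allowed by syntax, will get you into some pretty complex behaviors. Each SET statement reads one observation per iteration of the DATA step, and the step stops as soon as any SET statement tries to read past the end of its data set. I suspect that you only have as many records from the larger data set as appeared in the smaller one, especially if dataset1 is the smaller. In addition, any variables with the same name in both data sets keep only the values read by the second SET statement.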
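
A small sketch of that behavior, using two made-up data sets (big and small are hypothetical names, as are the variables x and y):

```sas
/* Hypothetical data sets: big has 5 observations, small has 3 */
data big;
   do x = 1 to 5;
      output;
   end;
run;

data small;
   do y = 1 to 3;
      output;
   end;
run;

/* Two SET statements: one observation is read from each data  */
/* set per iteration. On the 4th iteration the second SET      */
/* finds no more observations in small and the step stops, so  */
/* work.two_sets ends up with 3 observations, not 5.           */
data work.two_sets;
   set big;
   set small;
run;

proc print data=work.two_sets;
run;
```

The row count is the same if the two SET statements are swapped, because the step still ends when small runs out.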

 

 
