Paul_NYS
Obsidian | Level 7

Hi

I have two data sets that come from the same code, run at two different points in time. One of the variables, Jur2006, has 2000 more observations marked as 'true' in the prior run than in the current run. I would expect a few hundred to differ, but 2000 is too many.

To see which observations account for the difference, I would like to compare the two data sets, identify the observations marked as 'true' in the prior run but no longer marked as 'true' in the current run, and output those observations to a third data set. Is there a way to do this with PROC COMPARE or with a simple merge (which I am trying)?

Paul

1 ACCEPTED SOLUTION
ballardw
Super User

Sounds like a job for Proc SQL.

proc sql;
     /* ID is a placeholder: use the variable(s) that identify an observation in both runs */
     create table mismatch as
     select a.*
     from (select * from FirstDataSet where Jur2006='True') as a
          inner join
          (select * from SecondDataSet where Jur2006='False') as b
          on a.ID = b.ID;
quit;

You'll need to change the names of the data sets, the key variable in the ON clause, and the values that indicate true or false. Barring ill fortune, the output data set should contain the records from the first run whose value changed.

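WARNING: if the identification variables do not uniquely identify a record, so that several records share the same combination of values, the join pairs every such record in one data set with every matching record in the other, and you're going to have some fun sorting out the duplicated rows.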
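
For the merge approach mentioned in the question, here is a minimal sketch. It assumes both runs share a key variable that uniquely identifies an observation, called ID here (a hypothetical name, not from the thread), and that Jur2006 is a character variable holding 'True' or 'False'. The working data set names prior and current are also placeholders.

```sas
/* Assumption: ID is a hypothetical key, unique within each run. */
proc sort data=FirstDataSet  out=prior;   by ID; run;
proc sort data=SecondDataSet out=current; by ID; run;

data mismatch;
    /* Rename Jur2006 on each side so both values survive the merge. */
    merge prior   (in=inPrior rename=(Jur2006=Jur2006_prior))
          current (keep=ID Jur2006 rename=(Jur2006=Jur2006_current));
    by ID;
    /* Keep rows that were 'True' in the prior run and are not 'True' now. */
    /* Rows absent from the current run have a blank Jur2006_current,     */
    /* so they are kept as well.                                          */
    if inPrior and Jur2006_prior = 'True' and Jur2006_current ne 'True';
run;
```

Keeping both Jur2006_prior and Jur2006_current in the output makes it easy to tell records whose flag flipped from records that dropped out of the current run entirely.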
