BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
GokulK
Calcite | Level 5

Hello All,

I have two datasets—Dataset Test_Sample and Full_Sample where the dataset Test_Sample is a subset of Full_Sample. I want to delete observations that I have in Test_Sample from Full_Sample. For instance, if my Test_Sample and Full_Sample look as follows:

 

Test_Sample

Firm       Year     Size

A            2011      8

A            2012      8.1

B            2011      7

 

Full_sample

Firm        Year    Size

A            2011      8

A            2012      8.1

B            2011      7

B            2012      7.2

C            2011      7.9

C            2012      8.1

 

The desired output is

Firm        Year      Size

B            2012      7.2

C            2011      7.9

C            2012      8.1

 

Can someone please share the code with me so that I can get the desired output?

Best regards,

Gokul

1 ACCEPTED SOLUTION

Accepted Solutions
Ksharp
Super User
data test;
input Firm   $    Year     Size;
cards;
A            2011      8
A            2012      8.1
B            2011      7
;
 

data Full;
input Firm   $   Year    Size;
cards;
A            2011      8
A            2012      8.1
B            2011      7
B            2012      7.2
C            2011      7.9
C            2012      8.1
;

proc sql;
create table want as
select * from full
except
select * from test
;
quit;

View solution in original post

5 REPLIES 5
Astounding
PROC Star

Here's the basic idea:

data desired;
   merge test_sample (in=delete_these) full_sample;
   by firm year;
   if delete_these then delete;
run;

There is some question about what should happen if the FIRM and YEAR match, but the SIZE is different.  This program gets rid of those observations.

GokulK
Calcite | Level 5

Thank you for the prompt reply. I don't want to match by SIZE. As long as the FIRM and YEAR of both the samples match, then I want to delete such observations from full_sample.

 

Best regards,

Gokul

Reeza
Super User

If the values differ does it matter which one is kept?

GokulK
Calcite | Level 5

Hello Reeza,

Full_sample observations = Test_sample observations + non test firm-year observations. Since Full_sample has more observations, I want to delete firm-year observations from full_sample that are similar to Test_sample. So I will have three unique samples--Full_sample, Test_sample, and NonTest_sample.

 

Best regards,

Gokul

Ksharp
Super User
data test;
input Firm   $    Year     Size;
cards;
A            2011      8
A            2012      8.1
B            2011      7
;
 

data Full;
input Firm   $   Year    Size;
cards;
A            2011      8
A            2012      8.1
B            2011      7
B            2012      7.2
C            2011      7.9
C            2012      8.1
;

proc sql;
create table want as
select * from full
except
select * from test
;
quit;

SAS Innovate 2025: Save the Date

 SAS Innovate 2025 is scheduled for May 6-9 in Orlando, FL. Sign up to be first to learn about the agenda and registration!

Save the date!

What is Bayesian Analysis?

Learn the difference between classical and Bayesian statistical approaches and see a few PROC examples to perform Bayesian analysis in this video.

Find more tutorials on the SAS Users YouTube channel.

SAS Training: Just a Click Away

 Ready to level-up your skills? Choose your own adventure.

Browse our catalog!

Discussion stats
  • 5 replies
  • 1052 views
  • 0 likes
  • 4 in conversation