BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
GokulK
Calcite | Level 5

Hello All,

I have two datasets—Dataset Test_Sample and Full_Sample where the dataset Test_Sample is a subset of Full_Sample. I want to delete observations that I have in Test_Sample from Full_Sample. For instance, if my Test_Sample and Full_Sample look as follows:

 

Test_Sample

Firm       Year     Size

A            2011      8

A            2012      8.1

B            2011      7

 

Full_sample

Firm        Year    Size

A            2011      8

A            2012      8.1

B            2011      7

B            2012      7.2

C            2011      7.9

C            2012      8.1

 

The desired output is

Firm        Year      Size

B            2012      7.2

C            2011      7.9

C            2012      8.1

 

Can someone please share the code with me so that I can get the desired output?

Best regards,

Gokul

1 ACCEPTED SOLUTION

Accepted Solutions
Ksharp
Super User
data test;
input Firm   $    Year     Size;
cards;
A            2011      8
A            2012      8.1
B            2011      7
;
 

data Full;
input Firm   $   Year    Size;
cards;
A            2011      8
A            2012      8.1
B            2011      7
B            2012      7.2
C            2011      7.9
C            2012      8.1
;

proc sql;
create table want as
select * from full
except
select * from test
;
quit;

View solution in original post

5 REPLIES 5
Astounding
PROC Star

Here's the basic idea:

data desired;
   merge test_sample (in=delete_these) full_sample;
   by firm year;
   if delete_these then delete;
run;

There is some question about what should happen if the FIRM and YEAR match, but the SIZE is different.  This program gets rid of those observations.

GokulK
Calcite | Level 5

Thank you for the prompt reply. I don't want to match by SIZE. As long as the FIRM and YEAR of both the samples match, then I want to delete such observations from full_sample.

 

Best regards,

Gokul

Reeza
Super User

If the values differ does it matter which one is kept?

GokulK
Calcite | Level 5

Hello Reeza,

Full_sample observations = Test_sample observations + non test firm-year observations. Since Full_sample has more observations, I want to delete firm-year observations from full_sample that are similar to Test_sample. So I will have three unique samples--Full_sample, Test_sample, and NonTest_sample.

 

Best regards,

Gokul

Ksharp
Super User
data test;
input Firm   $    Year     Size;
cards;
A            2011      8
A            2012      8.1
B            2011      7
;
 

data Full;
input Firm   $   Year    Size;
cards;
A            2011      8
A            2012      8.1
B            2011      7
B            2012      7.2
C            2011      7.9
C            2012      8.1
;

proc sql;
create table want as
select * from full
except
select * from test
;
quit;

sas-innovate-2024.png

Available on demand!

Missed SAS Innovate Las Vegas? Watch all the action for free! View the keynotes, general sessions and 22 breakouts on demand.

 

Register now!

What is Bayesian Analysis?

Learn the difference between classical and Bayesian statistical approaches and see a few PROC examples to perform Bayesian analysis in this video.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 5 replies
  • 695 views
  • 0 likes
  • 4 in conversation