BookmarkSubscribeRSS Feed
Sandeep77
Lapis Lazuli | Level 10

Hi all,

I have imported a csv file in SAS and have a dataset. I want to compare both of them and find the common postcodes. What is the best way to do it? The imported file is elnz_lowell_12sep and the dataset name is Trace_results. The postcode header is TA_PCODE. I have tried the proc compare step but I don't want the report. I want the table with common variable.

Proc compare base=work.elnz_lowell_12sep
Compare=Trace_result;
var TA_PCODE;
run;
8 REPLIES 8
PaigeMiller
Diamond | Level 26

Your title says "common values" but then your text asks for "common variable". Which is it? Could you explain further? Can you provide example data and example output?

--
Paige Miller
Sandeep77
Lapis Lazuli | Level 10

Hi,

As I said that I want to find out from TA_PCODE variable that how many of them are same.

Sample data:

Data elnz_lowell_12sep;
input Reference_number	Add_line_1	Add_line_2	Add_line_	Add_line_4	TA_PCODE;
datalines;
24567456	23 Park Lane	High street	Bradford		BD7 1RE
24468265	81 Moudsley Streed		Leeds		LS12
56452464	49 laistridge lane		Manchester		M14 6PA
;
run;

Data Trace_result;
input Reference_number	Add_line_1	Add_line_2	Add_line_	Add_line_4	TA_PCODE;
datalines;
24567428	43 Roysdale road		London		L24 8PN
54468547	32 Merton Road		Leeds		LS110 1AB
56452464	49 laistridge lane		Manchester		M14 6PA
;
run;
Tom
Super User Tom
Super User

What exactly do you need to test?  You mentioned POSTCODE but your data seems to be based on REFERENCE_NUMBER.

 

If REFERENCE_NUMBER is a unique key in both datasets then here is a way to split the data into four datasets.

 

data left right left_only right_only;
  set elnz_lowell_12sep(in=inleft) Trace_result(in=inright);
  by Reference_number ;
  if first.Reference_number	and last.Reference_numbe then do;
     if inleft then output left_only;
     else output right_only;
  end;
  else if inleft then output left;
  else output right;
run;

You can look a the LEFT_ONLY and RIGHT_ONLY datasets to see the reference numbers that show up in only one place.

 

You can use PROC COMPARE on the left and right datasets to see if any of the values of the other variables differ for the matching observations.

proc compare data=left compare=right;
  id Reference_number;
run;

 

Kurt_Bremser
Super User

@Sandeep77 wrote:

Hi,

As I said that I want to find out from TA_PCODE variable that how many of them are same.

Sample data:

Data elnz_lowell_12sep;
input Reference_number	Add_line_1	Add_line_2	Add_line_	Add_line_4	TA_PCODE;
datalines;
24567456	23 Park Lane	High street	Bradford		BD7 1RE
24468265	81 Moudsley Streed		Leeds		LS12
56452464	49 laistridge lane		Manchester		M14 6PA
;
run;

Data Trace_result;
input Reference_number	Add_line_1	Add_line_2	Add_line_	Add_line_4	TA_PCODE;
datalines;
24567428	43 Roysdale road		London		L24 8PN
54468547	32 Merton Road		Leeds		LS110 1AB
56452464	49 laistridge lane		Manchester		M14 6PA
;
run;

These steps will only result in a lot of ERRORs and unusable datasets with missing numeric values. Please test and fix your codes before posting. It's not rocket science, you CAN do that.

Kurt_Bremser
Super User

If you want to compare the ta_pcode values for given reference_numbers, you need to do this:

proc sort data=elnz_lowell_12sep;
by reference_number;
run;

proc sort data=trace_result;
by reference_number;
run;

proc compare
  base=elnz_lowell_12sep
  compare=trace_result
;
var ta_pcode;
id reference_number;
run;

 

 

Reeza
Super User

You really do need to clean up that data input. For simplicity I've dropped everything except the codes of interest.

 

Data elnz_lowell_12sep;
infile cards dsd truncover;
input Reference_number TA_PCODE $;
cards;
24567456, BD7 1RE
24468265, LS12
56452464, M14 6PA
;
run;

Data Trace_result;
infile cards dsd truncover;
input Reference_number TA_PCODE $;
datalines;
24567428, L24 8PN
54468547, LS110 1AB
56452464, M14 6PA
;
run;

proc sort data=elnz_lowell_12sep(keep=ta_pcode) out=elnz nodupkey;
by ta_pcode;
run;

proc sort data=Trace_result(keep=ta_pcode) out=trace nodupkey;
by ta_pcode;
run;

data want;
merge elnz (in=_e) trace (in=_t);
by ta_pcode;

length status $15.;

if _e & _t then Status='Both';
else if _e then Status='ELNZ Only';
else if _t then Status = 'Trace Only';
 

run;

ballardw
Super User

@Sandeep77 wrote:

Hi,

As I said that I want to find out from TA_PCODE variable that how many of them are same.

Do you want to know "how many are the same", a count, or the actual values that are the same, a data set with the codes in both?

This does the latter:

data set1;
  input code $;
datalines;
a
a
a
b
c
d
b
c
c
;

data set2;
  input code $;
datalines;
a
c
q
q
q
c
c
c
;

proc sql;
   create table commoncode as
   select distinct a.code
   from set1 as a 
        natural join
        set2 as b
   ;
quit;

To find codes in Set1 but not Set2

proc sql;
   create table in1_not2 a
   select distinct code from set1
   except
   select distinct code from set2
   ;
quit;

You don't explicitly show what the expected result should look like.


@Sandeep77 wrote:

Hi,

As I said that I want to find out from TA_PCODE variable that how many of them are same.

Sample data:

Data elnz_lowell_12sep;
input Reference_number	Add_line_1	Add_line_2	Add_line_	Add_line_4	TA_PCODE;
datalines;
24567456	23 Park Lane	High street	Bradford		BD7 1RE
24468265	81 Moudsley Streed		Leeds		LS12
56452464	49 laistridge lane		Manchester		M14 6PA
;
run;

Data Trace_result;
input Reference_number	Add_line_1	Add_line_2	Add_line_	Add_line_4	TA_PCODE;
datalines;
24567428	43 Roysdale road		London		L24 8PN
54468547	32 Merton Road		Leeds		LS110 1AB
56452464	49 laistridge lane		Manchester		M14 6PA
;
run;

 

 

ballardw
Super User

Example input data and expected results.

 

 

sas-innovate-white.png

Special offer for SAS Communities members

Save $250 on SAS Innovate and get a free advance copy of the new SAS For Dummies book! Use the code "SASforDummies" to register. Don't miss out, May 6-9, in Orlando, Florida.

 

View the full agenda.

Register now!

Mastering the WHERE Clause in PROC SQL

SAS' Charu Shankar shares her PROC SQL expertise by showing you how to master the WHERE clause using real winter weather data.

Find more tutorials on the SAS Users YouTube channel.

Discussion stats
  • 8 replies
  • 3247 views
  • 0 likes
  • 6 in conversation