Hello team!
I have brainteaser… I need to clean up a file with extra entries. I have the following data:
data work.have;
infile datalines delimiter=',';
length Field_A $8 Field_B $8;
input Field_A $ Field_B $;
datalines;
13976907,12555848
12555848,13976907
13589994,11280580
11280580,13589994
10947277,10808526
10808526,10947277
12334829,92335000
92335000,12334829
;
run;
What I need to do is the following:
When Field_A from row 1 is equal to Field_B from row 2
Then I need to keep only row 1.
The data comes in pairs of two rows and it is like this,
row1.Field_a = row2.Filed_B AND row1.Field_B = Row2.Field_A
my output shall be:
13976907,12555848
13589994,11280580
10947277,10808526
12334829,92335000
I haven’t done anything like that in the past… can this bee accomplished?
Any help is welcomed!
Thank you in advance!
It works just fine!
Does this code meet your expectations:
data want (drop=Field_B_2);
merge have have (keep=Field_B rename=(Field_B = Field_B_2) firstobs=2);
if Field_A = Field_B_2 then output;
run;
By using the lag() function, and modulo of _n_:
data work.have;
infile datalines delimiter=',';
length Field_A $8 Field_B $8;
input Field_A $ Field_B $;
datalines;
13976907,12555848
12555848,13976907
13589994,11280580
11280580,13589994
10947277,10808526
10808526,10947277
12334829,92335000
92335000,12334829
;
data want;
set have;
if mod(_n_,2) = 0 and lag(field_a) = field_b then delete;
run;
proc print data=want noobs;
run;
Result:
Field_A Field_B 13976907 12555848 13589994 11280580 10947277 10808526 12334829 92335000
Please use the "little running man" for posting code (as I did), so that the code is kept as-is. Posting code into the main posting window will cause unwanted changes (extra lines, smileys, ...)
Thank you so much!
Iτ ςορκσ ξθστ φινε!
It works just fine!
The accepted answer is displayed just after the question in the thread so that interested readers can find it quickly.
You should thus select the answer that solved your problem, here , i think @ed_sas_member 's or @Kurt_Bremser 's.
The data comes in pairs of two rows and it is like this,
row1.Field_a = row2.Filed_B AND row1.Field_B = Row2.Field_A
So you just need to drop even rows :
data work.have;
infile datalines delimiter=',';
length Field_A $8 Field_B $8;
input Field_A $ Field_B $;
if mod(_N_,2);
datalines;
13976907,12555848
12555848,13976907
13589994,11280580
11280580,13589994
10947277,10808526
10808526,10947277
12334829,92335000
92335000,12334829
;
run;
data work.have;
infile datalines delimiter=',';
length Field_A $8 Field_B $8;
input Field_A $ Field_B $;
datalines;
13976907,12555848
12555848,13976907
13589994,11280580
11280580,13589994
10947277,10808526
10808526,10947277
12334829,92335000
92335000,12334829
;
data temp;
set have;
call sortc(field_a,field_b);
run;
proc sort data=temp out=want nodupkey;
by field_a field_b;
run;
Available on demand!
Missed SAS Innovate Las Vegas? Watch all the action for free! View the keynotes, general sessions and 22 breakouts on demand.
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.