Reading two files simultaneously

Accepted Solution Solved
Reply
Contributor
Posts: 38
Accepted Solution

Reading two files simultaneously

Hi,

I have two datasets as depicted in the first example below ("have1" and "have2"). The "Obj" column is identical for both datasets. I just want to keep the cases where their remaining columns are not identical ("want"):

/* STEP A */

Data have1;

    Input Obj :$2. Obj_value1 :best4. Obj_value2 :best4.;

Datalines;

    A 10 15

    B 20 25

    C 30 35

Run;

/* STEP B */

Data have2;

    Input Obj :$2. Obj_value1 :best4. Obj_value2 :best4.;

Datalines;

    A 10 15

    B 20 99

    C 30 35

Run;

/* STEP C */

Data want;

    Set have1;

    Set have2(DROP=Obj Rename=(Obj_value1=Obj_value1b Obj_value2=Obj_value2b));

    If

        Obj_value1 ne Obj_value1b OR

        Obj_value2 ne Obj_value2b

    Then Do;

        Output;

    End;

Run;

The problem is that "have1" and "have2" are actually large csv-files consisting of millions of rows and hundreds of columns. Therefore, reading both files sequentially (STEP A and B) and then comparing them (STEP C) is extremly ressource intensive and slow.

My current best solution reads the first dataset "have1" (STEP A), and then - in one step (STEP B) - reads dataset "have2" row for row, compares them with the according rows of the first one, and only outputs data if necessary:

/* STEP A */

Data have1;

    Infile "D:\have1.csv" Dsd Missover;

    Input Obj :$2. Obj_value1 :best4. Obj_value2 :best4.;

Run;

/* STEP B */

Data want;

    Set have1;

    Infile "D:\have2.csv" Dsd Missover;

    Input Obj_drop :$2. Obj_value1b :best4. Obj_value2b :best4.;

    Drop Obj_drop;

    IF

        Obj_value1 ne Obj_value1b OR

        Obj_value2 ne Obj_value2b

    Then Do;

        Output;

    End;

Run;

Question: Is there a way to skip STEP A and simultaneously read BOTH files row for row (not just "have2" like in the example above)? That would be an optimal solution!

(small) Question 2: Is there a way to skip the first variable ("Obj") when reading "have2"? Reading and droping it right afterwards doesn't really seem elegant...

Thanks for any help!


Accepted Solutions
Solution
‎10-11-2013 09:50 AM
Respected Advisor
Posts: 3,799

Re: Reading two files simultaneously

Posted in reply to Georg_UPB

A view might be useful, two actually.

/* STEP A */
options parmcards=FT15F001;
filename FT15F001 temp;
Data hv1 / view=hv1;
   infile FT15F001;
   Input Obj :$2. Obj_value1 :best4. Obj_value2 :best4.;
  
parmcards;
   
A 10 15
    B
20 25
    C
30 35
;;;;
   Run;



/* STEP B */
options parmcards=FT16F001;
filename FT16F001 temp;
Data hv2 / view=hv2;
   infile FT16F001;
   Input Obj :$2. Obj_value1 :best4. Obj_value2 :best4.;
  
parmcards;
   
A 10 15
    B
20 99
    C
30 35
;;;;
   Run;



/* STEP C */

Data want;
   merge hv1 hv2(Rename=(Obj_value1=Obj_value1b Obj_value2=Obj_value2b));
   by OBJ;
   If Obj_value1 ne Obj_value1b OR Obj_value2 ne Obj_value2b;
   Run;
proc print;
  
run;

View solution in original post


All Replies
Solution
‎10-11-2013 09:50 AM
Respected Advisor
Posts: 3,799

Re: Reading two files simultaneously

Posted in reply to Georg_UPB

A view might be useful, two actually.

/* STEP A */
options parmcards=FT15F001;
filename FT15F001 temp;
Data hv1 / view=hv1;
   infile FT15F001;
   Input Obj :$2. Obj_value1 :best4. Obj_value2 :best4.;
  
parmcards;
   
A 10 15
    B
20 25
    C
30 35
;;;;
   Run;



/* STEP B */
options parmcards=FT16F001;
filename FT16F001 temp;
Data hv2 / view=hv2;
   infile FT16F001;
   Input Obj :$2. Obj_value1 :best4. Obj_value2 :best4.;
  
parmcards;
   
A 10 15
    B
20 99
    C
30 35
;;;;
   Run;



/* STEP C */

Data want;
   merge hv1 hv2(Rename=(Obj_value1=Obj_value1b Obj_value2=Obj_value2b));
   by OBJ;
   If Obj_value1 ne Obj_value1b OR Obj_value2 ne Obj_value2b;
   Run;
proc print;
  
run;
Contributor
Posts: 38

Re: Reading two files simultaneously

Posted in reply to data_null__

They actually are extremely useful! Thank you very much!

🔒 This topic is solved and locked.

Need further help from the community? Please ask a new question.

Discussion stats
  • 2 replies
  • 194 views
  • 0 likes
  • 2 in conversation