DATA Step, Macro, Functions and more

Read two datasets in a DATA step

Reply
N/A
Posts: 0

Read two datasets in a DATA step

Hi,

I have a dataset (file1), in that I have some 10 to 15 strings. And in another dataset (file2) there are thousands of lines. First, I need to read the first string from the file1 and check for it in all the lines of file2. If the string is present in any line (of file2) then write that line into a new file. Then read the second string from file1 and do the same. Do this for all the strings in file1.

In SAS, how to accomplish this?

Thanks,
Shravan
SAS Super FREQ
Posts: 8,743

Re: Read two datasets in a DATA step

Hi:
You could investigate a PROC SQL join or a SAS data step program MERGE. If you look in the documentation, you will find examples of each of these techniques. Also, there have numerous papers published on these and other similar techniques. You might perform a Google search on performing TABLE LOOKUPs with SAS -- there have been a number of papers that explain how to use the above and other techniques.

cynthia
Frequent Contributor
Posts: 120

Re: Read two datasets in a DATA step

proc sql;
create table new_table as select b.* from
(select string from file1) as a
inner join
(select * from file2) as b
on a.string = b.string
;
quit;

Should do the work!
N/A
Posts: 0

Re: Read two datasets in a DATA step

Hi Fred,

As per your suggestion, I have written the below piece of code to accomplish the task.

Thank you for your help.

DATA DCOL;
KEEP DSN VOLSER;
INFILE DCOLLECT;
INPUT @5 RECTYPE $1.
@25 DSN $44.
@79 VOLSER $6. ;
IF RECTYPE = 'D';
********************************************************************
*** CREATE THE INPUT DATASET MISCLST FROM DCOLLECT MISCDSN MEMBER***
********************************************************************;
DATA MISCLST;
KEEP MISCDSN LEN;
INFILE LST;
INPUT @1 MISCDSN $44. ;
LEN = LENGTH(MISCDSN);
********************************************************************
*** JOIN DCOLLECT REPORT AND MISCDSN MEMBER ***
********************************************************************;
PROC SQL;
CREATE TABLE NEW_TABLE AS SELECT * FROM
(SELECT A.DSN, A.VOLSER
FROM DCOL A, MISCLST B
WHERE INDEX(A.DSN,(SUBSTR(B.MISCDSN,1,B.LEN))) > 0)
ORDER BY VOLSER
;
QUIT;
PROC PRINT DATA=NEW_TABLE;
DATA OUT;
SET NEW_TABLE;
FILE OUTDD;
PUT @1 DSN @46 VOLSER;
Super Contributor
Super Contributor
Posts: 3,174

Re: Read two datasets in a DATA step

For what its worth, a somewhat simpler WHERE coding technique supported by PROC SQL is shown below:

WHERE A.DSN CONTAINS TRIM(B.MISCDSN)

Scott Barry
SBBWorks, Inc.
Ask a Question
Discussion stats
  • 4 replies
  • 135 views
  • 0 likes
  • 4 in conversation