DATA Step, Macro, Functions and more

Read two datasets in a DATA step

Reply
N/A
Posts: 0

Read two datasets in a DATA step

Hi,

I have a dataset (file1), in that I have some 10 to 15 strings. And in another dataset (file2) there are thousands of lines. First, I need to read the first string from the file1 and check for it in all the lines of file2. If the string is present in any line (of file2) then write that line into a new file. Then read the second string from file1 and do the same. Do this for all the strings in file1.

In SAS, how to accomplish this?

Thanks,
Shravan
SAS Super FREQ
Posts: 8,864

Re: Read two datasets in a DATA step

Posted in reply to deleted_user
Hi:
You could investigate a PROC SQL join or a SAS data step program MERGE. If you look in the documentation, you will find examples of each of these techniques. Also, there have numerous papers published on these and other similar techniques. You might perform a Google search on performing TABLE LOOKUPs with SAS -- there have been a number of papers that explain how to use the above and other techniques.

cynthia
Regular Contributor
Posts: 191

Re: Read two datasets in a DATA step

Posted in reply to deleted_user
proc sql;
create table new_table as select b.* from
(select string from file1) as a
inner join
(select * from file2) as b
on a.string = b.string
;
quit;

Should do the work!
N/A
Posts: 0

Re: Read two datasets in a DATA step

Hi Fred,

As per your suggestion, I have written the below piece of code to accomplish the task.

Thank you for your help.

DATA DCOL;
KEEP DSN VOLSER;
INFILE DCOLLECT;
INPUT @5 RECTYPE $1.
@25 DSN $44.
@79 VOLSER $6. ;
IF RECTYPE = 'D';
********************************************************************
*** CREATE THE INPUT DATASET MISCLST FROM DCOLLECT MISCDSN MEMBER***
********************************************************************;
DATA MISCLST;
KEEP MISCDSN LEN;
INFILE LST;
INPUT @1 MISCDSN $44. ;
LEN = LENGTH(MISCDSN);
********************************************************************
*** JOIN DCOLLECT REPORT AND MISCDSN MEMBER ***
********************************************************************;
PROC SQL;
CREATE TABLE NEW_TABLE AS SELECT * FROM
(SELECT A.DSN, A.VOLSER
FROM DCOL A, MISCLST B
WHERE INDEX(A.DSN,(SUBSTR(B.MISCDSN,1,B.LEN))) > 0)
ORDER BY VOLSER
;
QUIT;
PROC PRINT DATA=NEW_TABLE;
DATA OUT;
SET NEW_TABLE;
FILE OUTDD;
PUT @1 DSN @46 VOLSER;
Super Contributor
Super Contributor
Posts: 3,174

Re: Read two datasets in a DATA step

Posted in reply to deleted_user
For what its worth, a somewhat simpler WHERE coding technique supported by PROC SQL is shown below:

WHERE A.DSN CONTAINS TRIM(B.MISCDSN)

Scott Barry
SBBWorks, Inc.
Ask a Question
Discussion stats
  • 4 replies
  • 140 views
  • 0 likes
  • 4 in conversation