The SAS was written years ago by a previous developer. Right now I am getting a file 77 MB in size from my SQL statement, when before the file was only 1.68 MB in size from SAS. What I need is an explanation as to what ends up in CLS_CDUP_ENRL.xls CDUP, CLASS_TBL, and ENRLMNTS are large data extracts from the enterprise system. After every KEEP statement there is an OUTPUT statement. Example KEEP OWNER STRM SESSION_CODE CLASS_SECTION SUBJECT CATALOG_NBR CLASS_NBR EMPLID DUP_CODE; RUN; PROC EXPORT DATA=CDUP_NO_ENRLMNT OUTFILE= "C:\Processes\C_DUP\for_1058\CDUP_NO_ENRLMNT.xls" DBMS=EXCEL2000 REPLACE; The KEEP statement causes a DATA step to write only the variables that you specify to one or more SAS data sets. My problem is I don't see a specific KEEP statement before CLS_CDUP_ENRL is exported. I don't know what is retained in CLS_CDUP_ENRL. Maybe I'm getting confused where the previous developer who made this SAS program used CLS_CDUP_ENRL in both a DATA statement and a SET statement. Here is the SAS code. DATA CLASS_NOT_FOUND; MERGE CLASS_TBL (IN=IS_A) CDUP (IN=IS_B); BY CLS_KEY; IF IS_B AND NOT IS_A; KEEP OWNER STRM SESSION_CODE CLASS_SECTION SUBJECT CATALOG_NBR EMPLID DUP_CODE; RUN; PROC EXPORT DATA=CLASS_NOT_FOUND OUTFILE= "C:\Processes\C_DUP\for_1058\CLASS_NOT_FOUND_CDUP_ER1.xls" DBMS=EXCEL2000 REPLACE; RUN; DATA CLS_CDUP_VLD; MERGE CLASS_TBL (IN=IS_A) CDUP (IN=IS_B); BY CLS_KEY; IF IS_B AND IS_A; STDNT_KEY = EMPLID || STRM || CLASS_NBR; STDNT_KEY2 = EMPLID || CRSE_ID; RUN; PROC SORT DATA=CLS_CDUP_VLD; BY STDNT_KEY; RUN; DATA CLS_CDUP_ENRL; MERGE CLS_CDUP_VLD (IN=IS_A) ENRLMNTS (IN=IS_B); BY STDNT_KEY; IF IS_B AND IS_A; RUN; DATA CDUP_NO_ENRLMNT; MERGE CLS_CDUP_VLD (IN=IS_A) ENRLMNTS (IN=IS_B); BY STDNT_KEY; IF IS_A AND NOT IS_B; KEEP OWNER STRM SESSION_CODE CLASS_SECTION SUBJECT CATALOG_NBR CLASS_NBR EMPLID DUP_CODE; RUN; PROC EXPORT DATA=CDUP_NO_ENRLMNT OUTFILE= "C:\Processes\C_DUP\for_1058\CDUP_NO_ENRLMNT.xls" DBMS=EXCEL2000 REPLACE; RUN; DATA DUPS_CDUP_ENRL; SET CLS_CDUP_ENRL; BY STDNT_KEY; IF (FIRST.STDNT_KEY = 1 & LAST.STDNT_KEY = 0) OR (FIRST.STDNT_KEY = 0 & LAST.STDNT_KEY = 0) OR (FIRST.STDNT_KEY = 0 & LAST.STDNT_KEY = 1); KEEP OWNER EMPLID STRM SESSION_CODE SUBJECT CATALOG_NBR CLASS_SECTION DUP_CODE; RUN; PROC EXPORT DATA=DUPS_CDUP_ENRL OUTFILE= "C:\Processes\C_DUP\for_1058\DUPLICATE_CDUP_ENTRIES.xls" DBMS=EXCEL2000 REPLACE; RUN; PROC SORT DATA=CLS_CDUP_ENRL; BY STDNT_KEY2; RUN; DATA UNMATCH_CDUP_ENRL; SET CLS_CDUP_ENRL; BY STDNT_KEY2; IF (FIRST.STDNT_KEY2 = 1 & LAST.STDNT_KEY2 = 1); KEEP OWNER EMPLID STRM SESSION_CODE SUBJECT CATALOG_NBR CLASS_SECTION DUP_CODE STATUS; RUN; PROC EXPORT DATA=UNMATCH_CDUP_ENRL OUTFILE= "C:\Processes\C_DUP\for_1058\UNMATCHED_CDUP_ENTRIES.xls" DBMS=EXCEL2000 REPLACE; RUN; PROC EXPORT DATA=CLS_CDUP_ENRL OUTFILE= "C:\Processes\C_DUP\for_1058\CLS_CDUP_ENRL.xls" DBMS=EXCEL2000 REPLACE; RUN;
... View more