BookmarkSubscribeRSS Feed
collegestudent2
Fluorite | Level 6

How do I modify the Data step below to decrease the processing time by specifying the records of a specific branch instead of all branches? 

 

DATA transaction; INFILE ’C:/MyRawData/BankTrans.csv’ DLM = ’,’ LRECL = 1500;
INPUT Branch Name Branch ID Trans ID Account Data MMDDYY8. Start Time Time8.
End Time Time8. Amount Balance;
RUN;

1 REPLY 1
Reeza
Super User
  • How big is the data set? 10-20 million of rows is very different than processing something that's a few hundred million rows. 
  • I don't think the code you've posted is correct or works at all, does it? I suspect things like "Branch ID" are a single field, which should then be BranchID or Branch_ID depending on your preference. You can have it named as "Branch ID" if you really want but then need to list it as 'Branch ID'n in the INPUT statement. I suspect BRANCH NAME should be a character field as well? Should it be Data or Date?
  • Depending on the answer to my first question you can either just add a WHERE statement to filter it out, but it may be better to read it all once, split it by BranchID if necessary into smaller sets and go from there. 
  • If you need to eventually process all the data, instead I would recommend learning how to use the OBS= option to limit your data when testing and then removing it for actually running your process. You can also add an OBS= observation on the INFILE statement to process only a few records while you sort out the data step to read the data and then remove it once you're set. 

 

 

DATA transaction; 
INFILE 'C:/MyRawData/BankTrans.csv' DLM = ',' DSD LRECL = 1500 OBS=50; INPUT BranchName $ BranchID $ TransID $ Account $ Date MMDDYY8. StartTime Time8. EndTime Time8. AmountBalance; IF BRANCHID = '12345'; RUN;

@collegestudent2 wrote:

How do I modify the Data step below to decrease the processing time by specifying the records of a specific branch instead of all branches? 

 

DATA transaction; INFILE ’C:/MyRawData/BankTrans.csv’ DLM = ’,’ LRECL = 1500;
INPUT Branch Name Branch ID Trans ID Account Data MMDDYY8. Start Time Time8.
End Time Time8. Amount Balance;
RUN;


EDIT: I changed the input code to read in fields like BRANCHID as characters which is the usual standard. 

 

 

hackathon24-white-horiz.png

The 2025 SAS Hackathon Kicks Off on June 11!

Watch the live Hackathon Kickoff to get all the essential information about the SAS Hackathon—including how to join, how to participate, and expert tips for success.

YouTube LinkedIn

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

SAS Training: Just a Click Away

 Ready to level-up your skills? Choose your own adventure.

Browse our catalog!

Discussion stats
  • 1 reply
  • 477 views
  • 0 likes
  • 2 in conversation