Watch this Ask the Expert session to learn the critical step of uploading your data into SAS.
You will learn:
The questions from the Q&A segment held at the end of the webinar are listed below and the slides from the webinar are attached.
What if you're missing the leading zero for 01/01/11?
It will still work without the leading zero.
Can you use infile/input with a csv/xlsx file?
You can only use infile/input to read a text file. NOT XLSX. If it is a CSV file then yes you can, it is considered a delimited file with a delimiter of a comma.
You do have a couple of options: 1) you can save a .xlsx file as a text file, .csv, thus allowing you to read using INFILE/INPUT. But the only reason I'd do that is if i did not license the "SAS ACCESS to PC FILE FORMATS" module in SAS; so I think of this options as a work-around. Option 2) is better: Assuming you have the above license, write a libname statement pointing to a particular Excel workbook, like this: libname myxls xlss "s:\workshop\data\excel_workbook.xlsx;
With this method, you may then access any worksheet in the above workbook (excel_workbook.xls) as if it were an existing SAS table, via a SET statement! so, like this:
some logic here....;
a bit more logic....;
What do you do to determine the correct positions/how many positions there are for each variable? Like what if row 5000 has 8 characters for var1 but row 1 has 3 characters? Do we just eyeball it and guesstimate?
You can use Notepad to help determine field positions. If you open your raw data file in notepad, on the bottom right it shows field positions. Some companies provide a data dictionary with field positions.
Can forward slash (/) stand in its own line? Or does there need to follow something?
Do you mean (/) used for reading a record? It can stand on its own, which means read a record. But without any details following, it will not transfer to PDV.
Follow-up question: is there a similar way to specify what informat with PROC import?
No, you cannot with PROC import. It is not as flexible as the infile/input in the Data step. Proc import is out of the box and does it's best to interpret. But if it doesn't do what you want, that's when you use infile/input.
You need a trailing @ sign to hold the input.
Could you please point me to code examples (SAS Documentation or SAS conference proceedings paper) of reading unstructured HTTP response (HTML) data meeting certain conditions for the rows?
Do these input examples work the same when using a direct connection to Teradata?
Teradata is a database; SAS can connect to it directly using the SAS/ACCESS engine to Teradata. Teradata tables are NOT text files, so this technique is not applicable. Hope that answers your question.
If job title in second line ends in 60 instead of 55, will it read the salary correctly?
Great question. What will happen is your job title will look odd and be something like this: Sales Manager108,25. Because the column point for salary is at 56, the point will in fact move back and successfully read salary.
Is a subsetting WHERE more efficient than a subsetting IF in a data step that reads in a SAS data set, for instance, from a SET statement?
Yes, you are correct. When reading a SAS dataset using the SET statement, it can "access" the columns and their values directly before it gets to the PDV. This allows the use of the WHERE statement for those existing columns, and, as you mentioned, is more efficient.
Great content, thanks. I agree it's a lost art. I would appreciate hearing your opinion why this is still relevant today when many data analysts access data using some version of GUI and data import wizards.
This topic is relevant today for at least three reasons: 1) Data Import wizards don’t - and often can’t - deal with complex hierarchical files. Ultimately reading raw data requires an understanding of the data structure and the subtle nuances involved necessary to read accurately; 2) When auditing, confirming, or doing a peer-review, the reviewer must be able to verify that the data coming off the data collection system was read correctly. It’s NOT safe or advisable to ignore this critical first step, just assuming that the newly-created SAS file is spot on correct – because is often is not correct; and 3) Proc IMPORT writes only the most basic data step, omitting all the necessary logic, variable creation, data manipulation, and validation that ultimately has to be done. From an efficiency perspective, it makes incredible sense for the analyst to build ONE data step that both imports and performs necessary computations in the same step.
Please see additional resources in the attached slide deck.
Want more tips? Be sure to subscribe to the Ask the Expert board to receive follow up Q&A, slides and recordings from other SAS Ask the Expert webinars.
Registration is open! SAS is returning to Vegas for an AI and analytics experience like no other! Whether you're an executive, manager, end user or SAS partner, SAS Innovate is designed for everyone on your team. Register for just $495 by 12/31/2023.
If you are interested in speaking, there is still time to submit a session idea. More details are posted on the website.