Hi fellow SAS users and @kevin12 ,
I am working on a task that is similar to web crawling (Webcrawling website for certain webpage that has t... - SAS Support Communities).
I want to search a bunch of Excel files for a word and return the names of the files (ideally file and folder names) that contain it.
Any suggestions are welcome.
Many thanks!
While SAS could do it, to me it doesn't feel like a SAS task. If you're on Windows, I would try Windows file search first. From a quick Google search, it looks like it can handle it if you set the appropriate options. Or, since you mention dumping the data to CSV, if you did that, most text editors (Notepad++, UltraEdit) could do the search.
Before you automate a process you should start with a single file.
Do you know how to do this for a single file?
Is using OS commands an option or do you need SAS entirely?
AFAIK you would likely have to import each Excel file, which could have multiple sheets, and some of the data may not be visible to SAS, i.e. anything in a text box. No idea if that's actually an issue in your situation, but it's something to think about :).
Thank you, @Reeza! I also found a publication that points in the right direction (please see attached). The paper contains a macro that searches text files in folders for a word and returns their names if the word is there. I could save the Excel files in .csv format, which is similar to text, and see what the %strsrch macro does. I would rather automate the file conversion as well. I would also like to know whether this macro has already been made into a SAS procedure.
Web crawling is quite easy, as HTML is simple text. Similarly, text files are very easy to search and locate.
But since Excel files (xlsx) are zip-compressed archives of XML files, each file first needs to be uncompressed and decoded before you can search it. The correct XML files also need to be picked for the search, as some of them only contain metadata.
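To illustrate the point about xlsx being a zip archive of XML files, here is a minimal sketch in Python (not SAS, and not anything from the attached paper). It assumes cell text sits in `xl/sharedStrings.xml` or as inline strings in `xl/worksheets/*.xml`, which is the usual layout; the function names `xlsx_contains` and `find_workbooks` are made up for this example. It does not look at text boxes, formula results, or old-style .xls files.

```python
import zipfile
import xml.etree.ElementTree as ET
from pathlib import Path


def xlsx_contains(path, word):
    """Return True if `word` (case-insensitive) occurs in the cell text of one .xlsx file."""
    needle = word.lower()
    with zipfile.ZipFile(path) as archive:
        for name in archive.namelist():
            # Only the shared-strings part and the worksheet parts carry cell text;
            # the remaining parts (styles, relationships, ...) are metadata.
            is_strings = name == "xl/sharedStrings.xml"
            is_sheet = name.startswith("xl/worksheets/") and name.endswith(".xml")
            if not (is_strings or is_sheet):
                continue
            root = ET.fromstring(archive.read(name))
            for elem in root.iter():
                # Strip the XML namespace: "{...}si" -> "si".
                local = elem.tag.rsplit("}", 1)[-1]
                # <si> = shared string item, <is> = inline string.
                if local in ("si", "is"):
                    # Join rich-text runs so a word split across runs still matches.
                    text = "".join(elem.itertext()).lower()
                    if needle in text:
                        return True
    return False


def find_workbooks(folder, word):
    """Walk `folder` recursively and return (folder name, file name) pairs that match."""
    hits = []
    for path in sorted(Path(folder).rglob("*.xlsx")):
        try:
            if xlsx_contains(path, word):
                hits.append((str(path.parent), path.name))
        except (zipfile.BadZipFile, ET.ParseError):
            # Skip files that are not real workbooks, e.g. Excel "~$" lock files.
            continue
    return hits
```

Calling `find_workbooks(some_folder, "budget")` would return a list of (folder, file name) pairs, which could be written to a CSV and read back into SAS if needed.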
In the case of text files, grep on a UNIX system can search all files in a directory tree in one call.
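Where grep is not at hand (e.g. on Windows), the same kind of directory-tree search over text or CSV files can be sketched in a few lines of Python. This is only an illustration of the idea, roughly what `grep -ril word folder` does; the function name `find_text_files` and the default `*.csv` pattern are assumptions made for this example.

```python
from pathlib import Path


def find_text_files(folder, word, pattern="*.csv"):
    """Return (folder name, file name) pairs for files under `folder` containing `word`."""
    needle = word.lower()
    hits = []
    for path in sorted(Path(folder).rglob(pattern)):
        if not path.is_file():
            continue
        # errors="ignore" keeps the scan going if a file is not valid UTF-8.
        with open(path, "r", encoding="utf-8", errors="ignore") as handle:
            # Read line by line so large files are not loaded into memory at once.
            if any(needle in line.lower() for line in handle):
                hits.append((str(path.parent), path.name))
    return hits
```

The search is case-insensitive and matches substrings, so searching for "sale" would also hit "sales".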
In SAS it's a one liner...
if you write an appropriate PowerShell script that returns the results to SAS formatted as a table.
The first step is to get your PowerShell script to run.
Start here
- Cheers -
> you write an appropriate PowerShell script that returns the results to SAS formatted as a table
Including reading compressed (zip, xlsx, egp, rcv etc) files and subfolders?