BookmarkSubscribeRSS Feed
PaigeMiller
Diamond | Level 26

My company has an internal Google Drive installation where we can store information and then share it throughout the company as desired. I received an e-mail with a hyperlink to an Excel file. Now I could click on the link, open the file in Excel, then save it to my local computer or some other server where I know the file name, but really I want to just use the hyperlink in PROC IMPORT to access the file on Google Drive without creating a copy locally.


Here is the hyperlink:

 

https://mandtbank.sharepoint.com/:x:/r/sites/IndirectSecuritizationandCapitalMarkets/Shared Documents/General/Loan Tapes (Non-Deal Specific)/M%26T Auto Tape 2022-Sep 2024 Originations - Internal - Barclays Initial Selections.xlsx?d=wbb3a4a2834e2467eb075a3aa6c103012&csf=1&web=1&e=HkXncD

 

If I try to use SAS code like this including the hyperlink (and eliminating the ? and subsequent text), it doesn't work:

 

libname the_file "https://mandtbank.sharepoint.com/:x:/r/sites/Proprietary Folder Name/Shared Documents/General/Loan Tapes (Non-Deal Specific)/M%26T Auto Tape 2022-Sep 2024 Originations - Internal - Other Proprietary Info.xlsx";

proc import datafile=the_file dbms=excel out=a replace;
run;

 

I get errors

376   libname the_file "https://mandtbank.sharepoint.com/:x:/r/sites/Proprietary Folder Name/Shared Documents/General/Loan Tapes
376 ! (Non-Deal Specific)/M%26T Auto Tape 2022-Sep 2024 Originations - Internal - Other Proprietary Info.xlsx";
ERROR: Invalid physical name for library THE_FILE.
ERROR: Error in the LIBNAME statement.

 

How can I do this?

--
Paige Miller
10 REPLIES 10
ChrisHemedinger
Community Manager

Hi Paige,

 

The link you shared is not Google Drive -- it's SharePoint Online (Microsoft 365).

 

The way to accomplish this is via APIs. First, you need to use an API to download the file of interest to your local SAS session. Then you can PROC IMPORT it. You cannot use LIBNAME or FILENAME directly to point to the file in the cloud. Authentication and single-signon and other factors make this too complicated.

 

I do have a well documented method for this that lots of companies use:

https://blogs.sas.com/content/sasdummy/2024/07/29/sas-programming-office-365-onedrive/

 

It requires some setup and probably some collaboration with your IT folks (whoever manages your Microsoft 365 tenant). But once set, you will be able to use the macros I provide to do something like this (snippet here):

 

/* DOWNLOAD A FILE FROM ONEDRIVE TO SAS SESSION */
/*
 With a list of the items in this folder, we can download
 any file of interest
*/
%downloadFile(driveId=&driveId., 
  folderId=&folder., 
  sourceFilename=ScoreCard2022.xlsx, 
  destinationPath=%sysfunc(getoption(WORK)));

/* Downloaded an Excel file into SAS? Now we can PROC IMPORT if we want */
proc import file="%sysfunc(getoption(WORK))/ScoreCard2022.xlsx" 
 out=xldata
 dbms=xlsx replace;
run;

 

 

SAS For Dummies 3rd Edition! Check out the new edition, covering SAS 9.4, SAS Viya, and all of the modern ways to use SAS!
Quentin
Super User

Hi Paige,

 

If you don't want to go the full API route, another approach is to use OneDrive (assuming you have it, I think if you have SharePoint / Teams you probably have OneDrive also) to synch the file from SharePoint to your OneDrive folder.  OneDrive gets a folder on your PC, and syncs to the SharePoint file.  So you have a local copy (mirror) on your PC, but at least you don't have to click a link to download it manually whenever you want to update it. 

 

I'm trying this now for one project, and while I'm generally not a fan of storing files in Sharepoint/Teams/OneDrive, it's working ok.  The source is an Excel file stored in Teams (which is using SharePoint in the background).  OneDrive syncs it to a folder on my PC, C:\UserData\Quentin\OneDrive\Foo\bar.xlsx.  And that happens through OneDrive magic.  Then on my PC, I can use PC SAS to import the file, or I can have a windows job push the file to the SAS server I use.

 

The other option I looked into was PowerAutomate, which is Microsoft's product for automating things in the office 365 cloud.  But it turns out at my company, it was difficult to set up a 'gateway' so that PowerAutomate could see our Windows file shares etc.

 

With companies storing more and more files in Office 365 cloud stuff (SharePoint/Teams/OneDrive/PowerBI...) I feel like MS is in effect building a mote around those files, making it easy to work on them from within their ecosystem of tools, but less easy to get them out (without using API's).

 

That said, grateful to Chris for developing and sharing the API approach.  I'm sure before too long I'll need to give up on my ad-hoc approach and will be happy to have the Power of SAS to generate API calls to manage Office 365 files.

ballardw
Super User

Suggestion if investigating the One-Drive approach suggested by @Quentin : Check with your IT about file size limits and such.

My organization required a move for all of our "working data" to One-drive.

I found out that one of the simulation models I ran exceeded the file size they allowed.

Then they investigated how much data I had and said "That's way more than was supposed to be on One-drive". Apparently they set limits without actually investigating what individuals might actually need...

 

Also ran into an issue that may have been the result of other activities by our IT department where they managed to delete the contents of a couple of folders where I stored infrequently used SAS programs. Which I found out when updating the  SAS Zip code data set and the program I have to create custom formats from Zip codes couldn't be found...

Quentin
Super User

I'm wary of OneDrive as well.  The idea of magically syncing to the cloud and only storing locally what is needed sounds nice in theory, but I find it scary.  Can I really trust that OneDrive is smart enough to give me the current file every time I read it, rather than let me read an old cached version? I'm viewing my OneDrive use case for this Excel file as a test case... The short-term project I did this for turned into a medium-term project, so there is ongoing work to get the data out of Excel and into a database, which will of course make everything work better.

 

I can't imagine storing a big SAS dataset in one drive and having it download from the cloud in the background when I try to read it.   We also had a "store everything in OneDrive" idea at work.  And they moved our "My Documents" folders to One Drive.  But I think many people are ignoring it.  Might make sense for Word docs and little stuff where there is need for collaboration, but I don't need it for code or data.  And don't really want it in the background, messing with backups etc.

PaigeMiller
Diamond | Level 26

@Quentin wrote:

 

Might make sense for Word docs and little stuff where there is need for collaboration, but I don't need it for code or data.  And don't really want it in the background, messing with backups etc.


 

After reading the discussion, I am inclined to agree with Quentin. A humongous Excel file on OneDrive isn't a good place for me to access such a large file.

 

But I also think that this far into the computer age, such a complicated procedure to determine the right way to access files seems completely unnecessary and counter-productive.

--
Paige Miller
ChrisHemedinger
Community Manager

We can argue the merits of putting data and code in systems like OneDrive or SharePoint, but the fact is that cloud storage is here to stay. And many customers have their SAS environment in the cloud too, so connectivity between these systems is a necessity. It's increasingly rare for people to have local SAS on their desktop to analyze/ingest local data (including Excel files) from their organization. For documents like these, right now the approach is to use APIs.

 

With macros we can simplify the operation...a bit...for SAS programmers. I began documenting these methods in 2019 and have evolved them quite a bit since. They can be tricky to set up but once in place, it works great. I have batch jobs that read and write content to SharePoint (or Teams folders) every day.

 

We have the same challenges here at SAS that others have discussed for large files, code files, and policies such as document retention and sensitivity labels. For code, I do not recommend OneDrive or SharePoint. Git is the standard for keeping and versioning code, and for many companies that's GitLab or GitHub (enterprise/cloud versions that are gated for just them, not the public versions). All of my important SAS code is in GitLab repositories for backup, versioning, and business continuity.

 

SAS data sets also don't belong in SharePoint or OneDrive, especially big files. That eats into the quota limits that seem large to start with, but when you start throwing in everything that people want to save, it dwindles quickly.

 

SAS For Dummies 3rd Edition! Check out the new edition, covering SAS 9.4, SAS Viya, and all of the modern ways to use SAS!
PaigeMiller
Diamond | Level 26

Chris

You have re-framed the problem to "cloud storage is here to stay" (which I agree with), but this is not my problem and I don't accept your re-framing.

 

The problem is that to access a file in the cloud that I have permission to access, I have to jump through hoops to access it. I realize SAS did not create these hoops that I have to jump through, but I do not like it, and I will simply skip jumping through hoops and save myself some time and make a local copy. I'm 100% certain simpler access methods could have been devised, but were not.

--
Paige Miller
Tom
Super User Tom
Super User

Let's get back to the actual problems with the code you posted in your original question.

 

PROC IMPORT is not very good at dealing with reading from something that is not a file.  So first converting the link into an actual file is going to be needed for now.

 

Your code also is mistaken in mixing two different ways to access EXCEL files.  You can either use a LIBNAME with the XLSX engine, which will allow you to treat the XLSX file as if it was a library with multiple dataset members (one per spreadsheet in the XLSX workbook).

 

Or you can point PROC IMPORT at the file and ask it to read one of the sheets (or a range of cells in one of the sheets) into a dataset.

 

So I would first try pointing a FILEREF at the file using the URL filename engine.

filename myfile url "https://.....myfile.xlxs";

I believe you cannot then access that directly with PROC IMPORT (I know it does not work for a plain text file accessed that way so I would extremely surprised if it worked for a zip file (which is what an XLSX file is).  So you will first need to make a local copy of the file.  Try something like:

filename myfile url "https://.....myfile.xlxs"
  recfm=f lrecl=512
;
filename local temp recfm=f lrecl=512;
%let rc=%sysfunc(fcopy(myfile,local));

Now you can try using PROC IMPORT.  For example this code would pull the first worksheet out of the file.

proc import dbms=xlsx file=local out=want replace;
run;

Or use the XLSX libref engine and copy all of the worksheets.

libname myfile xlsx "%sysfunc(pathname(local))";
proc copy inlib=myfile outlib=work;
run;

If the FCOPY() function did not copy the XLSX file from the sharepoint site then you will want to look at Chris' sharepoint access tools to help you get the file copied.

Kurt_Bremser
Super User

@Quentin wrote:

.... I feel like MS is in effect building a mote around those files, making it easy to work on them from within their ecosystem of tools, but less easy to get them out (without using API's).


Embrace - Extend - Extinguish, the modus operandi of the Redmond crooks.

mkeintz
PROC Star

@Quentin wrote:

 

With companies storing more and more files in Office 365 cloud stuff (SharePoint/Teams/OneDrive/PowerBI...) I feel like MS is in effect building a mote around those files, making it easy to work on them from within their ecosystem of tools, but less easy to get them out (without using API's).

 

mote ("A tiny particle, speck, or piece of dust").  I think you meant "moat": ("A broad deep ditch that is dug around a castle or other fortress as a defense against attach").

 

I can see justification for either.

 

Enough already - I'll go back to my day job now.

--------------------------
The hash OUTPUT method will overwrite a SAS data set, but not append. That can be costly. Consider voting for Add a HASH object method which would append a hash object to an existing SAS data set

Would enabling PROC SORT to simultaneously output multiple datasets be useful? Then vote for
Allow PROC SORT to output multiple datasets

--------------------------

hackathon24-white-horiz.png

The 2025 SAS Hackathon has begun!

It's finally time to hack! Remember to visit the SAS Hacker's Hub regularly for news and updates.

Latest Updates

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

SAS Training: Just a Click Away

 Ready to level-up your skills? Choose your own adventure.

Browse our catalog!

Discussion stats
  • 10 replies
  • 2719 views
  • 13 likes
  • 7 in conversation