BookmarkSubscribeRSS Feed
☑ This topic is solved. Need further help from the community? Please sign in and ask a new question.
stf5018
SAS Employee

I am trying to loop over all the files in an Azure Data Lake and want to read specific .json files so that I may do something with the data in those files. I have access to the data lake and proved I can see the files stored there from within SAS. I only want to read files with a specific name (in this example, “process.json”).

 

I can access the ADLS location and see the files. I can filter out the files I don’t want. I cannot successfully use the “filename” statement to read the files. I am unsure how to get the actual name of the file into the filename statement within the loop. How do I accomplish this? I can't seem to get the filename statement to incorporate the actual value of "name". Code and log output is below. 

 

Note that I have to read json files. I cannot convert to any other file type prior to reading into SAS.

 

 

options azuretenantid = "###-###-###-";

%let appId="###-###-###-";
%let acctName="###";
%let filesys="###";


filename d adls "/"
   applicationId=&appId 
   accountname=&acctName 
   filesystem=&filesys;

data _null_;

	folder_id = dopen("d");
	put folder_id=;

	num_files = dnum(folder_id);
	put num_files=;

	do i = 1 to num_files;
		name=dread(folder_id,i);

		if prxmatch('/(process)\.(json)/',name) then do; 
			put name=; 

			filename myjson adls "&name"
				applicationId=&appId
				accountname=&acctName 
				filesystem=&filesys;
			
			libname temp json fileref=myjson;
			*do something with data in temp;

		end;
	end;

	closerc = dclose(folder_id);
	put closerc=;
run;

 

 

 

Log output: 

data _null_;
80
81 folder_id = dopen("d");
82 put folder_id=;
83
84 num_files = dnum(folder_id);
85 put num_files=;
86
87 do i = 1 to num_files;
88 name=dread(folder_id,i);
89
90 if prxmatch('/(process)\.(json)/',name) then do;
91 put name=;
92
93 filename myjson adls "&name"
WARNING: Apparent symbolic reference NAME not resolved.
94 applicationId=&appId
95 accountname=&acctName
96 filesystem=&filesys;
97
98 libname temp json fileref=myjson;
NOTE: JSON data is only read once. To read the JSON again, reassign the JSON LIBNAME.
ERROR: The AZURE file &name was not found.
ERROR: The endpoint within the Credential Service, azureDeviceCode, could not be found
ERROR: The endpoint within the Credential Service, azureDeviceCode, could not be found
ERROR: The specified path does not exist.
ERROR: Error in the LIBNAME statement.
99 *do something with data in temp;
100
101 end;
102 end;
103
104 closerc = dclose(folder_id);
105 put closerc=;
106 run;
folder_id=1
num_files=6
name=process.json
closerc=0
NOTE: DATA statement used (Total process time):
real time 1.78 seconds
cpu time 0.12 seconds

107
108 %studio_hide_wrapper;
118
119

1 ACCEPTED SOLUTION

Accepted Solutions
Tom
Super User Tom
Super User

You appear to have two issues.

First it does not look like your attempt to use the DREAD() function to get the list of names worked.  You have PUT statements in the code, but no names where written to the log.  I would recommend skipping the filtering (at least until you know if you can get the names) so you can see what the names retrieved look like.

 

Second you have logic issue in the code.  You have embedded the global FILENAME and LIBNAME statements into the middle of your DATA step.  That will not really work right as the global statements will run while the data step compiler is figuring out how to run the code and before it actually starts running the code that is going to try to retrieve the names.  Plus you are referencing a macro variable NAME that you never assigned any value to.

 

I recommend solving the first problem first and see if you can actually get the list of files.

data files;
  folder_id = dopen("d");
  put folder_id=;
  num_files = dnum(folder_id);
  put num_files=;
  do i = 1 to num_files;
    length name $256 ;
    name=dread(folder_id,i);
    output;
  end;
  closerc = dclose(folder_id);
  put closerc=;
run;

Then figure out whether what you retrieved can be used to generate valid FILENAME and LIBNAME statements.  Then figure out if you can actually retrieve data from the JSON file using the libref created by the LIBNAME statement.  

Once you have it working for one file convert it into code you call with different input file names.  For example as a macro.

%readone(name);
filename myjson adls "&name"
  applicationId=&appId
  accountname=&acctName 
  filesystem=&filesys
;

libname myjson json ;
proc append base=all data=myjson.mydata force;
run;
%mend readone;

Then use the list of files to generate calls to the macro.

data _null_;
  set files;
  if lowcase(scan(name,-1,'.'))='json' then 
     call execute(cats('%nrstr(%mymacro)(',name,')'))
  ;
run;

View solution in original post

2 REPLIES 2
Tom
Super User Tom
Super User

You appear to have two issues.

First it does not look like your attempt to use the DREAD() function to get the list of names worked.  You have PUT statements in the code, but no names where written to the log.  I would recommend skipping the filtering (at least until you know if you can get the names) so you can see what the names retrieved look like.

 

Second you have logic issue in the code.  You have embedded the global FILENAME and LIBNAME statements into the middle of your DATA step.  That will not really work right as the global statements will run while the data step compiler is figuring out how to run the code and before it actually starts running the code that is going to try to retrieve the names.  Plus you are referencing a macro variable NAME that you never assigned any value to.

 

I recommend solving the first problem first and see if you can actually get the list of files.

data files;
  folder_id = dopen("d");
  put folder_id=;
  num_files = dnum(folder_id);
  put num_files=;
  do i = 1 to num_files;
    length name $256 ;
    name=dread(folder_id,i);
    output;
  end;
  closerc = dclose(folder_id);
  put closerc=;
run;

Then figure out whether what you retrieved can be used to generate valid FILENAME and LIBNAME statements.  Then figure out if you can actually retrieve data from the JSON file using the libref created by the LIBNAME statement.  

Once you have it working for one file convert it into code you call with different input file names.  For example as a macro.

%readone(name);
filename myjson adls "&name"
  applicationId=&appId
  accountname=&acctName 
  filesystem=&filesys
;

libname myjson json ;
proc append base=all data=myjson.mydata force;
run;
%mend readone;

Then use the list of files to generate calls to the macro.

data _null_;
  set files;
  if lowcase(scan(name,-1,'.'))='json' then 
     call execute(cats('%nrstr(%mymacro)(',name,')'))
  ;
run;
stf5018
SAS Employee

You gave me the answer I needed! I am getting the filtering correct (see after line 106 - one file name gets printed "name=process.json"). 

 

What I wasn't taking into account that FILENAME and LIBNAME are global. The code below is what I am using and it works. Thanks for the help!

 

filename d adls "/"
	applicationId=&appId
   accountname=&acctname
   filesystem=&filesys;  

data files;

	folder_id = dopen("d");
	put folder_id=;

	num_files = dnum(folder_id);
	put num_files=;

	do i = 1 to num_files;
		length name $256 ;
		name=dread(folder_id,i);

		if prxmatch('/(process)\.(json)/',name) then do; 
			output;

		end;
	end;

	closerc = dclose(folder_id);
	put closerc=;
run;

%macro readone(name);
filename myjson adls "&name"
  applicationId=&appId
  accountname=&acctName 
  filesystem=&filesys
;

libname myjson json ;
proc append base=all data=myjson.mydata force;
run;
%mend;

data _null_;
  set files;
  call execute(cats('%nrstr(%readone)(',name,')'))
  ;
run;

 

 

 

 

sas-innovate-2024.png

Available on demand!

Missed SAS Innovate Las Vegas? Watch all the action for free! View the keynotes, general sessions and 22 breakouts on demand.

 

Register now!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 2 replies
  • 1219 views
  • 1 like
  • 2 in conversation