Hello all,
I have a flat file consisting of Series ID, Data, Year, Period, Period Name and Value. It is of the form:
[{"seriesID":"CES1000000001","data":[{"year":"2014","period":"M06","periodName":"June","value":"907","footnotes":[{"code":"P","text":"preliminary"}]},{"year":"2014","period":"M05","periodName":"May","value":"903","footnotes":[{"code":"P","text":"preliminary"}]},{"year":"2014","period":"M04","periodName":"April","value":"901","footnotes":[{}]},
and so forth,
This file contains 6 possible variables
I'd like to read this file into a data set, such that each variable gets its own entry (ideally without the quotation mark or other delimters). Please help as I seem unable to properly read it into sas.
Kind regards
I know that this may look more ugly than necessary, but it follows "if in doubt, use brute force":
data want (keep=seriesID year period periodName value code text);
infile '$HOME/sascommunity/SeriesDataOut.txt' lrecl=4096 truncover;
length
data_line $ 4096
seriesID $ 20
year 4
period $ 3
periodName $ 20
value 8
code $ 1
text $ 30
;
input;
data_line = _infile_;
if substr(data_line,1,12) = '[{"seriesID"';
data_line = substr(data_line,15);
i = indexc(data_line,'"');
seriesID = substr(data_line,1,i-1);
i = index(data_line,'":[');
data_line = substr(data_line,i+4); * removes first { from remaining string;
i = index(data_line,'},{');
do while (i ne 0);
sub_line = substr(data_line,1,i-1);
data_line = substr(data_line,i+3);
%macro process_subline;
i = index(sub_line,'year":"');
year = input(substr(sub_line,i+7,4),4.);
i = index(sub_line,'period":"');
period = substr(sub_line,i+9,3);
i = index(sub_line,'periodName":"');
sub_line = substr(sub_line,i+13);
i = indexc(sub_line,'"');
periodName = substr(sub_line,1,i-1);
i = index(sub_line,'value":"');
sub_line = substr(sub_line,i+8);
i = indexc(sub_line,'"');
value = input(substr(sub_line,1,i-1),best5.);
i = index(sub_line,'footnotes":[{');
sub_line = substr(sub_line,i+13,length(sub_line)-i-14);
i = index(sub_line,'code":"');
sub_line = substr(sub_line,i+7);
i = indexc(sub_line,'"');
code = substr(sub_line,1,i-1);
i = index(sub_line,'text":"');
sub_line = substr(sub_line,i+7);
text = substr(sub_line,1,length(sub_line)-1);
output;
%mend;
%process_subline;
i = index(data_line,'},{');
end;
sub_line = data_line;
%process_subline;
run;
It is json, does this help? http://support.sas.com/resources/papers/proceedings13/296-2013.pdf (there are more of those)
Using the @variable to go to a position and seeing the "data" as table separator the record elsewhere are some thoughts. input statement: SAS(R) 9.3 Statements: Reference
I know that this may look more ugly than necessary, but it follows "if in doubt, use brute force":
data want (keep=seriesID year period periodName value code text);
infile '$HOME/sascommunity/SeriesDataOut.txt' lrecl=4096 truncover;
length
data_line $ 4096
seriesID $ 20
year 4
period $ 3
periodName $ 20
value 8
code $ 1
text $ 30
;
input;
data_line = _infile_;
if substr(data_line,1,12) = '[{"seriesID"';
data_line = substr(data_line,15);
i = indexc(data_line,'"');
seriesID = substr(data_line,1,i-1);
i = index(data_line,'":[');
data_line = substr(data_line,i+4); * removes first { from remaining string;
i = index(data_line,'},{');
do while (i ne 0);
sub_line = substr(data_line,1,i-1);
data_line = substr(data_line,i+3);
%macro process_subline;
i = index(sub_line,'year":"');
year = input(substr(sub_line,i+7,4),4.);
i = index(sub_line,'period":"');
period = substr(sub_line,i+9,3);
i = index(sub_line,'periodName":"');
sub_line = substr(sub_line,i+13);
i = indexc(sub_line,'"');
periodName = substr(sub_line,1,i-1);
i = index(sub_line,'value":"');
sub_line = substr(sub_line,i+8);
i = indexc(sub_line,'"');
value = input(substr(sub_line,1,i-1),best5.);
i = index(sub_line,'footnotes":[{');
sub_line = substr(sub_line,i+13,length(sub_line)-i-14);
i = index(sub_line,'code":"');
sub_line = substr(sub_line,i+7);
i = indexc(sub_line,'"');
code = substr(sub_line,1,i-1);
i = index(sub_line,'text":"');
sub_line = substr(sub_line,i+7);
text = substr(sub_line,1,length(sub_line)-1);
output;
%mend;
%process_subline;
i = index(data_line,'},{');
end;
sub_line = data_line;
%process_subline;
run;
Thanks for Kurt,
This seems to work great when pulling a single data series ID. Unfortunately, I'm required to pull at least 25 Series ID from BLS on a single run. Would you guys know of a more efficient way to extract this data and successfully read it into SAS with multiple SeriesID's (up to 25)?
I'm using the following code to extract data from BLS:
%let url=http://api.bls.gov/publicAPI/v1/timeseries/data/;
filename in "C:\SAS BLS\SeriesIn.txt";
filename out "C:\SAS BLS\SeriesDataOut.txt"
recfm=v lrecl=32000;proc http in=in out=out url="&url" method="post" ct="application/json";run;
_______________________________________________________________________________________________________
Sample SeriesIn.txt ( A text file that indicates the Seriesid's to be extracted)
{"seriesid":["CEU0800000003","CFU0000008000","LEU0254555900","APU0000701111"],
"startyear":"2002",
"endyear":"2012"}
run;
Try using this to parse your downloaded file. You might need to add a LRECL if the series are longer.
data parse;
infile out dlm=',:' dsd truncover ;
length series $20 name $32 value $200 ;
input series @;
put _n_= series=;
series = compress(series,'[{"}]');
if series='seriesID' then do;
n+1;
row=0;
* Read name of the series and throw away the "data" constant string that follows it;
input series name @;
series = compress(series,'[{"}]');
do col=1 by 1 until (name = ' ');
input name value @;
name = compress(name,'[{"}]');
value = compress(value,'[{"}]');
row + (name='year');
if name ne ' ' then output;
end;
end;
run;
proc transpose data=parse out=want (drop=_name_) ;
by n series row ;
var value;
id name ;
run;
proc print;
run;
I wrote that code under the assumption that every seriesID would have its own line. If more than one come in one line, then my method needs to be adapted to that.
You could do a first step that slices the line into the parts that represent single seriesID's and the apply my code to the result.
You should check proc groove .
See the last part of this paper.
Xia Keshan
That is a nice one Xia.
Does remember proc groovy is requiring the XCMD being open and the mentioned classpath's must be installed somewhere.
Two questions not being touched in an open local situation. In a server-based approach this might be different.
Well you posted at least now that it is a JSON request and you are using a (most) probably local installed SAS foundation.
In that case processing JSON files is the most appropriate way to go. Xia posted a blog how to do that.
Don't miss out on SAS Innovate - Register now for the FREE Livestream!
Can't make it to Vegas? No problem! Watch our general sessions LIVE or on-demand starting April 17th. Hear from SAS execs, best-selling author Adam Grant, Hot Ones host Sean Evans, top tech journalist Kara Swisher, AI expert Cassie Kozyrkov, and the mind-blowing dance crew iLuminate! Plus, get access to over 20 breakout sessions.
Learn the difference between classical and Bayesian statistical approaches and see a few PROC examples to perform Bayesian analysis in this video.
Find more tutorials on the SAS Users YouTube channel.