BookmarkSubscribeRSS Feed
☑ This topic is solved. Need further help from the community? Please sign in and ask a new question.
texasmfp
Lapis Lazuli | Level 10

I am trying to develop a short SAS program to download data from an API and place it into a csv file.  The following code gets the values, plus a lot of unwanted information:

 

%LET reporter=IN;
data _null_;
start="2019";
end="2024";
reporter="&reporter";

length urltest $5000 ;
 urltest = cats
('http://dataservices.imf.org/REST/SDMX_xml.svc/CompactData/IFS/M.'
,reporter
,'.PCPI_IX.?startPeriod='
,start
,'&endPeriod='
,end
,'');

call symputx('urltest',quote(trim(urltest),"'"));
put urltest $char.; 
run;


filename out5 "e:\sas data\cpi.csv";
proc http
 url=&urltest
 method="get" out=out5;
run;

proc import datafile=out5
            out=class2
            dbms=dlm
            replace;
	   datarow=18;
     delimiter=' ';
run;

I'd like the final data to look like this:

2019-01	166.454645739264
2019-02	166.812356296011
2019-03	167.408540557254
2019-04	168.362435375245
2019-05	169.316330193235
2019-06	170.389461863473
2019-07	171.939540942707
2019-08	172.893435760697
2019-09	173.847330578687
2019-10	175.516646510170
2019-11	177.185962441652
2019-12	179.332225782130
2020-01	179.093752077633
2020-02	177.782146702896
2020-03	177.185962441652
2020-04	180.524594304618
2020-05	179.928410043374
2020-06	181.001541713613
2020-07	183.505515610837
2020-08	184.459410428827
2020-09	186.486436917055
2020-10	188.871173962031
2020-11	189.467358223274
2020-12	187.559568587294
2021-01	186.367200064807
2021-02	186.724910621553
2021-03	186.963384326050
2021-04	188.155752848538
2021-05	191.255911007006
2021-06	192.329042677245
2021-07	193.759884904230
2021-08	194.236832313225
2021-09	194.594542869971
2021-10	197.336990471692
2021-11	198.767832698677
2021-12	198.052411585185
2022-01	197.575464176190
2022-02	198.052411585185
2022-03	199.960201221165
2022-04	202.821885675135
2022-05	204.729675311115
2022-06	205.802806981354
2022-07	206.756701799344
2022-08	207.829833469583
2022-09	209.022201992071
2022-10	210.691517923553
2022-11	210.453044219056
2022-12	209.499149401065
2023-01	210.453044219056
2023-02	210.810754775802
2023-03	211.287702184797
2023-04	212.360833855036
2023-05	213.433965525274
2023-06	215.818702570250
2023-07	222.138255739434
2023-08	222.019018887185
2023-09	219.515044989961
2023-10	220.945887216946
2023-11	222.138255739434
2023-12	221.422834625941
2024-01	221.184360921444

But what I get is this:

<?xml version="1.0" encoding="utf-8"?>
<CompactData xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xmlns:xsd="http://www.w3.org/2001/XMLSchema" xsi:schemaLocation="http://www.SDMX.org/resources/SDMXML/schemas/v2_0/message https://registry.sdmx.org/schemas/v2_0/SDMXMessage.xsd http://dataservices.imf.org/compact/IFS http://dataservices.imf.org/compact/IFS.xsd" xmlns="http://www.SDMX.org/resources/SDMXML/schemas/v2_0/message">
  <Header>
    <ID>1c600b42-a065-4ee1-b54d-bd4bde37c287</ID>
    <Test>false</Test>
    <Prepared>2024-04-17T01:40:35</Prepared>
    <Sender id="1C0">
      <Name xml:lang="en">IMF</Name>
      <Contact>
        <URI>http://www.imf.org</URI>
        <Telephone>+ 1 (202) 623-6220</Telephone>
      </Contact>
    </Sender>
    <Receiver id="ZZZ" />
    <DataSetID>IFS</DataSetID>
  </Header>
  <DataSet xmlns="http://dataservices.imf.org/compact/IFS">
    <Series FREQ="M" REF_AREA="IN" INDICATOR="PCPI_IX" UNIT_MULT="0" BASE_YEAR="2010=100" TIME_FORMAT="P1M">
      <Obs TIME_PERIOD="2019-01" OBS_VALUE="166.454645739264" />
      <Obs TIME_PERIOD="2019-02" OBS_VALUE="166.812356296011" />
      <Obs TIME_PERIOD="2019-03" OBS_VALUE="167.408540557254" />
      <Obs TIME_PERIOD="2019-04" OBS_VALUE="168.362435375245" />
      <Obs TIME_PERIOD="2019-05" OBS_VALUE="169.316330193235" />
      <Obs TIME_PERIOD="2019-06" OBS_VALUE="170.389461863473" />
      <Obs TIME_PERIOD="2019-07" OBS_VALUE="171.939540942707" />
      <Obs TIME_PERIOD="2019-08" OBS_VALUE="172.893435760697" />
      <Obs TIME_PERIOD="2019-09" OBS_VALUE="173.847330578687" />
      <Obs TIME_PERIOD="2019-10" OBS_VALUE="175.51664651017" />
      <Obs TIME_PERIOD="2019-11" OBS_VALUE="177.185962441652" />
      <Obs TIME_PERIOD="2019-12" OBS_VALUE="179.33222578213" />
      <Obs TIME_PERIOD="2020-01" OBS_VALUE="179.093752077633" />
      <Obs TIME_PERIOD="2020-02" OBS_VALUE="177.782146702896" />
      <Obs TIME_PERIOD="2020-03" OBS_VALUE="177.185962441652" />
      <Obs TIME_PERIOD="2020-04" OBS_VALUE="180.524594304618" />
      <Obs TIME_PERIOD="2020-05" OBS_VALUE="179.928410043374" />
      <Obs TIME_PERIOD="2020-06" OBS_VALUE="181.001541713613" />
      <Obs TIME_PERIOD="2020-07" OBS_VALUE="183.505515610837" />
      <Obs TIME_PERIOD="2020-08" OBS_VALUE="184.459410428827" />
      <Obs TIME_PERIOD="2020-09" OBS_VALUE="186.486436917055" />
      <Obs TIME_PERIOD="2020-10" OBS_VALUE="188.871173962031" />
      <Obs TIME_PERIOD="2020-11" OBS_VALUE="189.467358223274" />
      <Obs TIME_PERIOD="2020-12" OBS_VALUE="187.559568587294" />
      <Obs TIME_PERIOD="2021-01" OBS_VALUE="186.367200064807" />
      <Obs TIME_PERIOD="2021-02" OBS_VALUE="186.724910621553" />
      <Obs TIME_PERIOD="2021-03" OBS_VALUE="186.96338432605" />
      <Obs TIME_PERIOD="2021-04" OBS_VALUE="188.155752848538" />
      <Obs TIME_PERIOD="2021-05" OBS_VALUE="191.255911007006" />
      <Obs TIME_PERIOD="2021-06" OBS_VALUE="192.329042677245" />
      <Obs TIME_PERIOD="2021-07" OBS_VALUE="193.75988490423" />
      <Obs TIME_PERIOD="2021-08" OBS_VALUE="194.236832313225" />
      <Obs TIME_PERIOD="2021-09" OBS_VALUE="194.594542869971" />
      <Obs TIME_PERIOD="2021-10" OBS_VALUE="197.336990471692" />
      <Obs TIME_PERIOD="2021-11" OBS_VALUE="198.767832698677" />
      <Obs TIME_PERIOD="2021-12" OBS_VALUE="198.052411585185" />
      <Obs TIME_PERIOD="2022-01" OBS_VALUE="197.57546417619" />
      <Obs TIME_PERIOD="2022-02" OBS_VALUE="198.052411585185" />
      <Obs TIME_PERIOD="2022-03" OBS_VALUE="199.960201221165" />
      <Obs TIME_PERIOD="2022-04" OBS_VALUE="202.821885675135" />
      <Obs TIME_PERIOD="2022-05" OBS_VALUE="204.729675311115" />
      <Obs TIME_PERIOD="2022-06" OBS_VALUE="205.802806981354" />
      <Obs TIME_PERIOD="2022-07" OBS_VALUE="206.756701799344" />
      <Obs TIME_PERIOD="2022-08" OBS_VALUE="207.829833469583" />
      <Obs TIME_PERIOD="2022-09" OBS_VALUE="209.022201992071" />
      <Obs TIME_PERIOD="2022-10" OBS_VALUE="210.691517923553" />
      <Obs TIME_PERIOD="2022-11" OBS_VALUE="210.453044219056" />
      <Obs TIME_PERIOD="2022-12" OBS_VALUE="209.499149401065" />
      <Obs TIME_PERIOD="2023-01" OBS_VALUE="210.453044219056" />
      <Obs TIME_PERIOD="2023-02" OBS_VALUE="210.810754775802" />
      <Obs TIME_PERIOD="2023-03" OBS_VALUE="211.287702184797" />
      <Obs TIME_PERIOD="2023-04" OBS_VALUE="212.360833855036" />
      <Obs TIME_PERIOD="2023-05" OBS_VALUE="213.433965525274" />
      <Obs TIME_PERIOD="2023-06" OBS_VALUE="215.81870257025" />
      <Obs TIME_PERIOD="2023-07" OBS_VALUE="222.138255739434" />
      <Obs TIME_PERIOD="2023-08" OBS_VALUE="222.019018887185" />
      <Obs TIME_PERIOD="2023-09" OBS_VALUE="219.515044989961" />
      <Obs TIME_PERIOD="2023-10" OBS_VALUE="220.945887216946" />
      <Obs TIME_PERIOD="2023-11" OBS_VALUE="222.138255739434" />
      <Obs TIME_PERIOD="2023-12" OBS_VALUE="221.422834625941" />
      <Obs TIME_PERIOD="2024-01" OBS_VALUE="221.184360921444" />
    </Series>
  </DataSet>
</CompactData>

How does one strip out all of the unwanted information at the start and after the desired data, as well as parse the desired data using multiple delimiters, including double quotation marks as a delimiter?

 

Thanks

 

 

1 ACCEPTED SOLUTION

Accepted Solutions
andreas_lds
Jade | Level 19

Have you tried using libname xmlv2, you will have to define a xmlmap, but you won't have to implement a parser.

Another way:

data bob;
   length 
      time_period $ 7
      value 8
   ;
   
   format value 16.12;
   
   infile "source.xml";
   input;
   
   retain rx;
   drop rx;
   
   if _n_ = 1 then do;
      rx = prxparse('/TIME_PERIOD="(.*)" OBS_VALUE="(.*)"/');
   end;
   
   if prxmatch(rx, trim(_infile_)) then do;
      time_period = prxposn(rx, 1, trim(_infile_));
      value = input(prxposn(rx, 2, trim(_infile_)), ?? best32.);
      output;
   end;
run;

View solution in original post

3 REPLIES 3
andreas_lds
Jade | Level 19

Have you tried using libname xmlv2, you will have to define a xmlmap, but you won't have to implement a parser.

Another way:

data bob;
   length 
      time_period $ 7
      value 8
   ;
   
   format value 16.12;
   
   infile "source.xml";
   input;
   
   retain rx;
   drop rx;
   
   if _n_ = 1 then do;
      rx = prxparse('/TIME_PERIOD="(.*)" OBS_VALUE="(.*)"/');
   end;
   
   if prxmatch(rx, trim(_infile_)) then do;
      time_period = prxposn(rx, 1, trim(_infile_));
      value = input(prxposn(rx, 2, trim(_infile_)), ?? best32.);
      output;
   end;
run;
Ksharp
Super User

Yes. It is a XML file ,not CSV file. Using XMLV2 engine to import it.

 

%LET reporter=IN;
data _null_;
start="2019";
end="2024";
reporter="&reporter";

length urltest $5000 ;
 urltest = cats
('http://dataservices.imf.org/REST/SDMX_xml.svc/CompactData/IFS/M.'
,reporter
,'.PCPI_IX.?startPeriod='
,start
,'&endPeriod='
,end
,'');

call symputx('urltest',quote(trim(urltest),"'"));
put urltest $char.; 
run;


filename out5 "c:\temp\cpi.xml";
proc http
 url=&urltest
 method="get" out=out5;
run;

filename x temp;
libname x xmlv2 "c:\temp\cpi.xml" xmlmap=x automap=replace;
proc copy in=x out=work;run;

Ksharp_0-1713343886491.png

 

texasmfp
Lapis Lazuli | Level 10

Thanks Ksharp.  Your suggestion works as well.

SAS Innovate 2025: Save the Date

 SAS Innovate 2025 is scheduled for May 6-9 in Orlando, FL. Sign up to be first to learn about the agenda and registration!

Save the date!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

SAS Training: Just a Click Away

 Ready to level-up your skills? Choose your own adventure.

Browse our catalog!

Discussion stats
  • 3 replies
  • 648 views
  • 0 likes
  • 3 in conversation