I have the code below which uses the attached .csv file as an input; my problem comes at the _null_ datastep.
I am trying to create a .xml file which holds the dataseries from all three states, but I can only get the last one. Ideally, my output would look smth like this. What am I doing wrong? Note value1 is different across state series.
Any thoughts would be extremely appreciated; thanks in advance.
<path> AL value1 value2................................................................................valueN </path>
<path> AR value1 value2...............................................................................valueN </path>
<path> AK value1 value2...............................................................................valueN </path>
data hpi;
infile "FilePath\DataRaw.csv" dsd firstobs=2;
length Date $6;
input Date $ AL AR AK;
run;
data newdata;
set hpi;
x=indexc(Date,'-');
if x=3 then do;
year=compress(substr(Date,1,2));
month=compress(substr(Date,4,6));
end;
if x=4 then do;
year=compress(substr(Date,5,6));
month=compress(substr(Date,1,3));
end;
year='20'||year;
newDate=compress(cat(month,year));
newDate2=input(newDate,monyy7.);
format newDate2 monyy7.;
keep AK AL AR newDate2;
rename newDate2=Date;
run;
proc transpose data=newdata
out=newdata1;
var AL AR AK;
id Date;
run;
data newdata2;
set newdata1;
rename _name_=State;
run;
/*Convert dataset into .xml file*/
data _null_;
file "FilePath\HPA.xml";
set newdata2 end=last;
/*PUT @001 '<?xml version="1.0"?>' */
do _n_=1 to last;
PUT @001 '<path>' /
state DEC2016 JAN2017 FEB2017 MAR2017 APR2017 MAY2017 JUN2017 JUL2017 AUG2017 SEP2017 OCT2017 NOV2017 DEC2017
JAN2018 FEB2018 MAR2018 APR2018 MAY2018 JUN2018 JUL2018 AUG2018 SEP2018 OCT2018 NOV2018 DEC2018
JAN2019 FEB2019 MAR2019 APR2019 MAY2019 JUN2019 JUL2019 AUG2019 SEP2019 OCT2019 NOV2019 DEC2019
JAN2020 FEB2020 MAR2020 APR2020 MAY2020 JUN2020 JUL2020 AUG2020 SEP2020 OCT2020 NOV2020 DEC2020
JAN2021 FEB2021 MAR2021 APR2021 MAY2021 JUN2021 JUL2021 AUG2021 SEP2021 OCT2021 NOV2021 DEC2021
JAN2022 FEB2022 MAR2022 APR2022 MAY2022 JUN2022 JUL2022 AUG2022 SEP2022 OCT2022 NOV2022 DEC2022
JAN2023 FEB2023 MAR2023 APR2023 MAY2023 JUN2023 JUL2023 AUG2023 SEP2023 OCT2023 NOV2023 DEC2023
JAN2024 FEB2024 MAR2024 APR2024 MAY2024 JUN2024 JUL2024 AUG2024 SEP2024 OCT2024 NOV2024 DEC2024
JAN2025 FEB2025 MAR2025 APR2025 MAY2025 JUN2025 JUL2025 AUG2025 SEP2025 OCT2025 NOV2025 DEC2025
JAN2026 FEB2026 MAR2026 APR2026 MAY2026 JUN2026 JUL2026 AUG2026 SEP2026 OCT2026 NOV2026 DEC2026
JAN2027 FEB2027 MAR2027 APR2027 MAY2027 JUN2027 JUL2027 AUG2027 SEP2027 OCT2027 NOV2027 DEC2027
JAN2028 FEB2028 MAR2028 APR2028 MAY2028 JUN2028 JUL2028 AUG2028 SEP2028 OCT2028 NOV2028 DEC2028
JAN2029 FEB2029 MAR2029 APR2029 MAY2029 JUN2029 JUL2029 AUG2029 SEP2029 OCT2029 NOV2029 DEC2029
JAN2030 FEB2030 MAR2030 APR2030 MAY2030 JUN2030 JUL2030 AUG2030 SEP2030 OCT2030 NOV2030 DEC2030
JAN2031 FEB2031 MAR2031 APR2031 MAY2031 JUN2031 JUL2031 AUG2031 SEP2031 OCT2031 NOV2031 DEC2031
JAN2032 FEB2032 MAR2032 APR2032 MAY2032 JUN2032 JUL2032 AUG2032 SEP2032 OCT2032 NOV2032 DEC2032
JAN2033 FEB2033 MAR2033 APR2033 MAY2033 JUN2033 JUL2033 AUG2033 SEP2033 OCT2033 NOV2033 DEC2033
JAN2034 FEB2034 MAR2034 APR2034 MAY2034 JUN2034 JUL2034 AUG2034 SEP2034 OCT2034 NOV2034 DEC2034
JAN2035 FEB2035 MAR2035 APR2035 MAY2035 JUN2035 JUL2035 AUG2035 SEP2035 OCT2035 NOV2035 DEC2035
JAN2036 FEB2036 MAR2036 APR2036 MAY2036 JUN2036 JUL2036 AUG2036 SEP2036 OCT2036 NOV2036 DEC2036
JAN2037 FEB2037 MAR2037 APR2037 MAY2037 JUN2037 JUL2037 AUG2037 SEP2037 OCT2037 NOV2037 DEC2037
JAN2038 FEB2038 MAR2038 APR2038 MAY2038 JUN2038 JUL2038 AUG2038 SEP2038 OCT2038 NOV2038 DEC2038
JAN2039 FEB2039 MAR2039 APR2039 MAY2039 JUN2039 JUL2039 AUG2039 SEP2039 OCT2039 NOV2039 DEC2039
JAN2040 FEB2040 MAR2040 APR2040 MAY2040 JUN2040 JUL2040 AUG2040 SEP2040 OCT2040 NOV2040 DEC2040
JAN2041 FEB2041 MAR2041 APR2041 MAY2041 JUN2041 JUL2041 AUG2041 SEP2041 OCT2041 NOV2041 DEC2041
JAN2042 FEB2042 MAR2042 APR2042 MAY2042 JUN2042 JUL2042 AUG2042 SEP2042 OCT2042 NOV2042 DEC2042
JAN2043 FEB2043 MAR2043 APR2043 MAY2043 JUN2043 JUL2043 AUG2043 SEP2043 OCT2043 NOV2043 DEC2043
JAN2044 FEB2044 MAR2044 APR2044 MAY2044 JUN2044 JUL2044 AUG2044 SEP2044 OCT2044 NOV2044 DEC2044
JAN2045 FEB2045 MAR2045 APR2045 MAY2045 JUN2045 JUL2045 AUG2045 SEP2045 OCT2045 NOV2045 DEC2045
JAN2046 FEB2046 MAR2046 APR2046 MAY2046 JUN2046 JUL2046 AUG2046 SEP2046 OCT2046 NOV2046 DEC2046
' </path> ';
end;
output;
run;
I have simplified last step using array and a do loop:
data _null_;
file "&FilePath.HPA.xml";
set newdata1 end=last;
array dat _numeric_;
if _N_=1 then PUT @001 '<?xml version="1.0"?>';
PUT @1 '<path>' state @;
do i=1 to dim(dat);
put dat(i) @;
end;
PUT '</path>';
run;
I have tried several changes to the xml line format but unsuccefully.
With that last code I get the next message:
"error on line 3 at column 1: extra content at the end of the document".
The error I get is on my linux system.
I opened the file in win10 system and no errors - all 3 states exist.
Attached is the HPA_xml.txt file. Rename it to HPA.xml and try to open it on your system.
I'm looking at your code and it is not clear:
data _null_; file "FilePath\HPA.xml"; set newdata2 end=last; /*PUT @001 '<?xml version="1.0"?>' */ do _n_=1 to last; PUT @001 '<path>' / state DEC2016 JAN2017 FEB2017 MAR2017 ... .............
.............' </path> '; end; output; run;
- what did you mean by do
_n_=1 to last;
...end ? You can cancel the do - end statement,
as the SET statement makes this loop implicitly.
- there is no meaning to the output statement when you start with
data _null_
;
You write to the file using
put
statement.
Understood; your point is very well taken. I have deleted the loop and I am now using PGStats' code below; the problem now is that I am only getting the AL record; I am missing the AR and AK records. Thoughts?
Could be simplified to:
data hpi;
infile "&sasforum\datasets\DataRaw.csv" dsd firstobs=2;
length DateStr $6;
input DateStr AL AR AK;
yr = compress(DateStr,,"kdo");
mt = compress(DateStr,"-","do");
date = input(cats("01",mt,"20",yr),date9.);
format date yymm.;
keep date al ar ak;
run;
proc transpose data=hpi out=newData name=State prefix=d_;
var al ar ak;
id date;
run;
data _null_;
file "&sasforum\datasets\HPA.xml";
set newdata;
if _n_=1 then PUT '<?xml version="1.0"?>';
PUT '<path>' / State (d_:) (best8.2) / '</path>';
run;
Thanks for getting back to me; I used your code. The problem now, however, is that I am getting only the AL record; I am missing the AK and AR records in my output .xml file.Thoughts?
I ran your code adapted to my SAS UE environment with the changes I popsted before, that is:
data _null_;
file "FilePath\HPA.xml";
set newdata2 end=last;
if _N_=1 then PUT @001 '<?xml version="1.0"?>';
do _n_=1 to last;
PUT @001 '<path>' /
state DEC2016 JAN2017 FEB2017 MAR2017 ...
............. .............
/'</path>';
end;
run;
When I opened the HPA.xml created file I see only AL BUT
1) There is an error message: "error on line 5 at column 1: Extra content at the end of the document"
2) When I asked to view the source there ar all 3 states.
So the error is a XML format error and not SAS programming error.
BTW - in your first step, better change next lines to avoid warning messages in the log:
if x=3 then do;
year=compress(substr(Date,1,2));
month=compress(substr(Date,4,3));
end;
if x=4 then do;
year=compress(substr(Date,5,2));
month=compress(substr(Date,1,3));
end;
Thanks. Any ideas on how to deal with this .xml formatting error?
I have simplified last step using array and a do loop:
data _null_;
file "&FilePath.HPA.xml";
set newdata1 end=last;
array dat _numeric_;
if _N_=1 then PUT @001 '<?xml version="1.0"?>';
PUT @1 '<path>' state @;
do i=1 to dim(dat);
put dat(i) @;
end;
PUT '</path>';
run;
I have tried several changes to the xml line format but unsuccefully.
With that last code I get the next message:
"error on line 3 at column 1: extra content at the end of the document".
The error I get is on my linux system.
I opened the file in win10 system and no errors - all 3 states exist.
Attached is the HPA_xml.txt file. Rename it to HPA.xml and try to open it on your system.
SAS Innovate 2025 is scheduled for May 6-9 in Orlando, FL. Sign up to be first to learn about the agenda and registration!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.