Hello everybody,
I have a handful of JSON files that I can now transform automatically into datasets, in part thanks to this link and in part thanks to some documentation on macros and loops. The thing is that I'm transposing the files, groped by the first data column (p1) from the ALLDATA table and, in a couple of files, some of the variables in p1 are being truncated as they have more than 32 characters, making them duplicates.
Just to make it clear, they are NOT variable names. They are observations and my guess is that either while reading the JSON files or while transposing them they are treated as variables. Otherwise I understand that they would be as long as the largest one... right?
The user that helped me with that in that post, also gave me solution to this but I can only use it when I know the variables the JSON has. In my case, there will be hundreds of JSON files each with its own variables. They will all have something in common and that is their tree structure so I will call the ALLDATA p1 variable the same way every time. Is there any way of using a data step to format the length of just the first column and leave the rest as they are? Here's the code he suggested to do that when knowing the variables beforehand:
data want;
length
category $32
location $100
speed $20
idCategory $10
speedMBps 8
timeFixedPrice $10
includedDiscount 8
;
set json.category: indsname=indsname;
category=scan(indsname,-1,'.');
drop ordinal_: ;
run;
Thanks in advance!!!
So you are saying the values in the JSON text are too long to use as variable names?
If so then use them as the variable LABEL instead. Create a different variable to use as the variable NAME.
If nothing else you can just number them like this:
data alldata;
set json.alldata end=eof;
where v>0;
by p1 notsorted;
length variable $32 ;
if _n_=1 then do;
declare hash h();
rc=h.definekey('p2');
rc=h.definedata('variable','varnum','p2');
rc=h.definedone();
end;
if h.find() then do;
varnum+1;
variable=cats('var',varnum);
rc=h.add();
end;
if eof then rc=h.output(dataset:'allvars');
run;
proc print data=allvars;
run;
proc transpose data=alldata out=want(drop=_name_);
by p1 notsorted ;
id variable;
idlabel p2 ;
var value ;
run;
proc print;
run;
proc print label;
run;
Result:
So you are saying the values in the JSON text are too long to use as variable names?
If so then use them as the variable LABEL instead. Create a different variable to use as the variable NAME.
If nothing else you can just number them like this:
data alldata;
set json.alldata end=eof;
where v>0;
by p1 notsorted;
length variable $32 ;
if _n_=1 then do;
declare hash h();
rc=h.definekey('p2');
rc=h.definedata('variable','varnum','p2');
rc=h.definedone();
end;
if h.find() then do;
varnum+1;
variable=cats('var',varnum);
rc=h.add();
end;
if eof then rc=h.output(dataset:'allvars');
run;
proc print data=allvars;
run;
proc transpose data=alldata out=want(drop=_name_);
by p1 notsorted ;
id variable;
idlabel p2 ;
var value ;
run;
proc print;
run;
proc print label;
run;
Result:
@Tom thanks again. You are a genius and a life saver. I owe you two beers now 😉
SAS Innovate 2025 is scheduled for May 6-9 in Orlando, FL. Sign up to be first to learn about the agenda and registration!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.