BookmarkSubscribeRSS Feed
☑ This topic is solved. Need further help from the community? Please sign in and ask a new question.
bayzid
Obsidian | Level 7

How can i transpose the following long data to the wide format?

data long;
input var $ value source;
cards;
aid 45 1
bid 46 1
aid 47 1
bid 48 1
aid 45 2
bid 46 2
aid 47 2
bid 48 2
; run;

data wide;
input aid bid source;
cards;
45 46 1
47 48 1
45 46 2
47 48 2
; run;
1 ACCEPTED SOLUTION

Accepted Solutions
Tom
Super User Tom
Super User

Works fine for me:

libname json json "C:\downloads\MyDrug_76de7937-695d-4981-84df-9cf34a029451.json" ;
data test;
 row+1;
 set json.alldata;
run;

proc print data=test(obs=10);
run;
OBS    row    P    P1              V    Value

  1      1    1    DrugId          1    e88c72aa-f2b1-4e87-b17a-cdc069a4825c
  2      2    1    BrandName       1    Pembrolizumab infusion
  3      3    1    GenericName     1    null
  4      4    1    Strength        1    null
  5      5    1    Form            1    null
  6      6    1    Route           1    null
  7      7    1    Schedule        1    null
  8      8    1    DoseUnit        1    null
  9      9    1    IsCytotoxic     1    false
 10     10    1    IsCytoStatic    1    false

But I don't think that is what you want to do if you are using the ALLDATA view since it already has multiple observations per JSON object.

 

If the JSON is well formed then try using something like this instead.

data test;
  retain row;
  set json.alldata;
  if _n_=1 then key=p1;
  retain key;
  row + key=p1;
run;
proc print data=test(obs=5);
 where p1=key;
run;
OBS    row    P      P1      V                   Value                     key

  1     1     1    DrugId    1    e88c72aa-f2b1-4e87-b17a-cdc069a4825c    DrugId
 28     2     1    DrugId    1    9e19974c-a373-4488-ade0-ea8e8cb171cf    DrugId
 55     3     1    DrugId    1    9e865e2f-ee0a-4616-bdfc-796cb0a835f4    DrugId
 82     4     1    DrugId    1    4affd24c-a1c7-4c69-893f-ba1774f05bb3    DrugId
109     5     1    DrugId    1    9c02b467-5165-4fdf-baef-11309c5aa92b    DrugId

View solution in original post

24 REPLIES 24
bayzid
Obsidian | Level 7

The aid and bid may change sequence or we can have more members to add, such as cid. Yes- all the id variables will have identical source.

Kurt_Bremser
Super User

So you must first define a clear rule for detecting the moment where you write a wide observation and reset the values of RETAINed variables to missing. Without an identifier for corresponding input observations, you can't use PROC TRANSPOSE, you'll need some tricky DATA step programming.

bayzid
Obsidian | Level 7

The data came from appending multiple JSON files. For each source there are same number of aid and bid.

Kurt_Bremser
Super User

But the issue here is that the combinations of source and var are not unique, so you need another rule which identifies the "groups" that should constitute a single wide observation.

BayzidurRahman
Obsidian | Level 7

Yes- that's the challenge. The id variables are not unique within the group of by variables. Can something be done using array?

bayzid
Obsidian | Level 7

We need to have a row identifier as a combination of var set and source as shown below.

data long;
input var $ value source row;
cards;
aid 45 1 1
bid 46 1 1
aid 47 1 2
bid 48 1 2
aid 45 2 3
bid 46 2 3
aid 47 2 4
bid 48 2 4
; run;
bayzid
Obsidian | Level 7

No.

bayzid
Obsidian | Level 7

As one can see in my last dataset, unique number should be assigned  for each new repeat of the var and source.

Kurt_Bremser
Super User

This needs some tricky code:

data long;
input var $ value source;
cards;
aid 45 1
bid 46 1
aid 47 1
bid 48 1
aid 45 2
bid 46 2
aid 47 2
bid 48 2
;

proc sql noprint;
select distinct var into :columns separated by " " from long;
quit;
%put &columns;
%macro wide;

data wide;
set long;
by source;
array c {*} &columns;
retain
%do i = 1 %to %sysfunc(countw(&columns));
  %scan(&columns,&i)
%end;
;
if first.source then call missing(of c{*});
select (var);
%do i = 1 %to %sysfunc(countw(&columns));
  when ("%scan(&columns,&i)") do;
    if %scan(&columns,&i) ne .
    then do;
      output;
      call missing(of c{*});
    end;
    %scan(&columns,&i) = value;
  end;
%end;
end;
if last.source and (
%do i = 1 %to %sysfunc(countw(&columns));
  %if &i ne 1 %then or;
  %scan(&columns,&i) ne .
%end;
)
then output;
drop var value;
run;

%mend;

%wide
bayzid
Obsidian | Level 7

Thanks for your time to generate the code. It works for numeric data with short length. But my actual data is a mixture of character and long integer and it doesn't work well.

My data:


Obs var value source row123456789101112
aid11a3fc66-1f44-4421-bad5-7b0f053a8a1911
bid149414100016810511
cid149413100016810111
aid11a3fc66-1f44-4421-bad5-7b0f053a8a1912
bid149413100016865912
cid149413100016833312
aid11a3fc66-1f44-4421-bad5-7b0f053a8a1923
bid215413100016810123
cid8879413100016810123
aid11a3fc66-1f44-4421-bad5-7b0f053a8a1924
bid159873100016810124
cid148975100016810124

when i use your code the resulting wide table looks like:


Obs source row aid bid cid1234
12.1.4941E151.4941E15
12.1.4941E151.4941E15
24.2.1541E158.8794E16
24.1.5987E151.4898E15

If i use proc transpose with source and row variables in the by statement it works fine. Thus I just need help to generate the row variable.


Obs source row aid bid cid1234
1111a3fc66-1f44-4421-bad5-7b0f053a8a1914941410001681051494131000168101
1211a3fc66-1f44-4421-bad5-7b0f053a8a1914941310001686591494131000168333
2111a3fc66-1f44-4421-bad5-7b0f053a8a19215413100016810188794131000168101
2211a3fc66-1f44-4421-bad5-7b0f053a8a1915987310001681011489751000168101

Ready to join fellow brilliant minds for the SAS Hackathon?

Build your skills. Make connections. Enjoy creative freedom. Maybe change the world. Registration is now open through August 30th. Visit the SAS Hackathon homepage.

Register today!
How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 24 replies
  • 1196 views
  • 2 likes
  • 5 in conversation