BookmarkSubscribeRSS Feed
☑ This topic is solved. Need further help from the community? Please sign in and ask a new question.
Tom
Super User Tom
Super User

What is the RULE that says when to wrap them?

Do you just want to take pairs?

data long;
  input var $ value source;
cards;
aid 45 1
bid 46 1
aid 47 1
bid 48 1
aid 45 2
bid 46 2
aid 47 2
bid 48 2
; 

data long_pairs;
  set long;
  group + (1=mod(_n_,2));
run;

proc transpose data=long_pairs out=wide(drop=_name_);
  by group source;
  id var;
  var value;
run;

proc print;
run;
OBS    group    source    aid    bid

 1       1         1       45     46
 2       2         1       47     48
 3       3         2       45     46
 4       4         2       47     48
bayzid
Obsidian | Level 7

The rule is each appearance of all the distinct values in var by source. If we have 3 distinct values in var then the last digit needs to be 3 to convert into long_pair. Thus, we need to find out the number of distinct values in "var" for each dataset.

data long;
input var $ value $36. source row;
cards;
aid 11a3fc66-1f44-4421-bad5-7b0f053a8a19 1 1
bid 1494141000168105                     1 1
cid 1494131000168101                     1 1
aid 11a3fc66-1f44-4421-bad5-7b0f053a8a19 1 2
bid 1494131000168659                     1 2
cid 1494131000168333                     1 2
aid 11a3fc66-1f44-4421-bad5-7b0f053a8a19 2 3
bid 2154131000168101                     2 3
cid 88794131000168101                    2 3
aid 11a3fc66-1f44-4421-bad5-7b0f053a8a19 2 4
bid 1598731000168101                     2 4
cid 1489751000168101                     2 4
; run;

data long_pairs;
  set long;
  group + (1=mod(_n_,3));
run;

proc transpose data=long_pairs out=wide(drop=_name_);
  by group source;
  id var;
  var value;
run;

I am trying to avoid to find out the number of distinct values.

Reeza
Super User
Go back and check the JSON files. Maybe you shouldn't be working with the all data files but different files. Or there should be something that identifies each group uniquely. Or if it's from different files you can add in each file name.
bayzid
Obsidian | Level 7

The file name is being saved in the Master_&group datasets but that is not enough to uniquely identify the rows to reshape into wide. The JSON files do not have any column as row identifier but it appears on the left of each file as shown below (1-6).

bayzid_0-1682540736295.png

We need to find some way to import that information from each file as a new column in the Master_&group dataset.

 

Reeza
Super User
In JSON the brackets denote the series of information and group it, which is why I'm suggesting you look at the json mapped data and see if a different file has the data in the format you need.

Those numbers are just line numbers in the text editor. And if you changed the view you could change the line numbers. It's not something you can rely on.
bayzid
Obsidian | Level 7

The row numbers that appear in the text editor would be fine for the purpose of transposing the data to wide format.

Tom
Super User Tom
Super User

If you read the JSON file using the JSON engine then is should have made an ORDINAL variable you can use.

If you read the JSON file with your own data step then just make your own counter variables.

data want;
  infile 'file.json';
  row+1;
  ....
run;
bayzid
Obsidian | Level 7

I am using the row+; in the following code but it is not generating the row variable as the JSON engine would make the ordinal variable.

filename dirtree url 
    'https://raw.githubusercontent.com/sasutils/macros/master/dirtree.sas';
%include dirtree /source2;
%dirtree(C:\Users\mq10004085\Macquarie University\MRFF Aged care Data - Documents\json_test    /* Pipe delimited directory list (default=.) */
        , out=filelist  /* Output dataset name */
        , maxdepth=1 /* Maximum tree depth */);
%macro read_json(myfile=, group = );
    filename mydata &myfile.;
    libname myjson JSON fileref=mydata nrm;;
    proc datasets lib=myjson; 
    quit;
    *how to parse the file has not been specified - looks like you need to specify the lengths;
    data file2save;
        length source $200. p1 $256. Value $256.;
        set myjson.alldata;
		        source=&myfile.; 
           	row+1;  run;
    proc append base=master_&group data=file2save force;  run;
    proc sql;
        drop table file2save;
    quit;
    filename mydata;
    libname myjson;
%mend read_json;
proc sql;
    drop table master;
quit;
data run_list;
    set filelist;
    where type='F';
   *add other logic to add group variable, this is just an example;
    group=substr(filename, 1, find(filename,'_'));
*test this is created correctly;
    str=catt('%read_json(myfile="C:\Users\mq10004085\Macquarie University\MRFF Aged care Data - Documents\json_test\', filename, '", group=', group,  ');');
  *once tests are complete uncomment the execution to test it;  
 call execute(str);
run;

This is how it looks like.

bayzid_0-1682551782289.png

 

Tom
Super User Tom
Super User

Works fine for me:

libname json json "C:\downloads\MyDrug_76de7937-695d-4981-84df-9cf34a029451.json" ;
data test;
 row+1;
 set json.alldata;
run;

proc print data=test(obs=10);
run;
OBS    row    P    P1              V    Value

  1      1    1    DrugId          1    e88c72aa-f2b1-4e87-b17a-cdc069a4825c
  2      2    1    BrandName       1    Pembrolizumab infusion
  3      3    1    GenericName     1    null
  4      4    1    Strength        1    null
  5      5    1    Form            1    null
  6      6    1    Route           1    null
  7      7    1    Schedule        1    null
  8      8    1    DoseUnit        1    null
  9      9    1    IsCytotoxic     1    false
 10     10    1    IsCytoStatic    1    false

But I don't think that is what you want to do if you are using the ALLDATA view since it already has multiple observations per JSON object.

 

If the JSON is well formed then try using something like this instead.

data test;
  retain row;
  set json.alldata;
  if _n_=1 then key=p1;
  retain key;
  row + key=p1;
run;
proc print data=test(obs=5);
 where p1=key;
run;
OBS    row    P      P1      V                   Value                     key

  1     1     1    DrugId    1    e88c72aa-f2b1-4e87-b17a-cdc069a4825c    DrugId
 28     2     1    DrugId    1    9e19974c-a373-4488-ade0-ea8e8cb171cf    DrugId
 55     3     1    DrugId    1    9e865e2f-ee0a-4616-bdfc-796cb0a835f4    DrugId
 82     4     1    DrugId    1    4affd24c-a1c7-4c69-893f-ba1774f05bb3    DrugId
109     5     1    DrugId    1    9c02b467-5165-4fdf-baef-11309c5aa92b    DrugId

BayzidurRahman
Obsidian | Level 7

Thanks Tom.

That worked as expected.

Ready to join fellow brilliant minds for the SAS Hackathon?

Build your skills. Make connections. Enjoy creative freedom. Maybe change the world. Registration is now open through August 30th. Visit the SAS Hackathon homepage.

Register today!
How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 24 replies
  • 1197 views
  • 2 likes
  • 5 in conversation