BookmarkSubscribeRSS Feed
Pandu2
Obsidian | Level 7

Hi All, I reckon everyone is doing great. I require your assistance on something like. I've a CSV file and I imported into sas so, the data in that CSV file looks like the below one. And it is completely a json data so, from this I would like to create a table whose fields are Batch no, Goodsname, Quantity, Infotype and please closely observe this data as you see there's a value named  NEWLY, this value doesn't come often but when it comes the values under this has to be the fields like D_Infotype, D_Typeid,  D_Quantity. I'll explain you how this to be analysed, here BatchNo is an end point for every record so, until a new BatchNo comes the values should be treated as one records data.  Any help much appreciated. Thankyou.

Required output;
BatchNo  GoodsName   Quantity  Infotype      D_Infotype  D_Typeid D_Quantity
001       Nikeshoes      3      Sneakers                               
001       Pants          5      Trackpants      RAQ       109          0
009       Nikeshoes      3      Sneakers
009       Pants          5      Trackpants
011       Jackets        3      Topwears       RAQ       109           0
              
"""BatchNo"":""001""""
"""Goodsname"":""Nikeshoes""""
"""Quantity"":""3""""
"""Infotype"":""Sneakers""""
"""Goodsname"":""pants""""
"""Quantity"":""5""""
"""Infotype"":""Trackpants""""
"""Newly"":""""""
"""Infotype"":""RAQ""""
"""TypeId"":""109""""
"""Cause"":""Broken""""
"""Quantity"":""0""""
"""BatchNo"":""009""""
"""Goodsname"":""Nikeshoes""""
"""Quantity"":""3""""
"""Infotype"":""Sneakers""""
"""Goodsname"":""pants""""
"""Quantity"":""5""""
"""Infotype"":""Trackpants""""
"""BatchNo"":""011""""
"""Goodsname"":""Jackets""""
"""Quantity"":""3""""
"""Infotype"":""topwears""""
"""Newly"":""""""
"""Infotype"":""RAQ""""
"""TypeId"":""100""""
"""Cause"":""Broken""""
"""Quantity"":""0""""

 

23 REPLIES 23
andreas_lds
Jade | Level 19

Interesting problem. First thought was to use two input statements, the first reads the whole line, then some tweaking of _infile_ and the second input statement could use named input to fill the variables. On a second glance this route is blocked by reusing names (InfoType => InfoType + D_InfoType).

So everything has to be done manually. At this point "interesting" changed to "annoying".

Here's a first attempt:

data work.have;
   infile "PATH_TO_YOUR_FILE";
   input; /* read whole line */
  
   _infile_ = compress(_infile_, '"'); /* remove annoying quotes */
   
   length info value $ 40; /* maybe to short/long */
   
   info = upcase(scan(_infile_, 1, ':'));
   value = scan(_infile_, 2, ':');
run;

data work.want;
   set work.have end=jobDone;
   
   length 
      BatchNo $ 3
      GoodsName $ 40
      Quantity 8
      Infotype $ 20

      D_Infotype $ 3
      D_Typeid $ 3
      D_Quantity 8

     is_info 8
   ;
   
   retain BatchNo GoodsName Quantity Infotype D_Infotype D_Typeid D_Quantity is_info;
     
   if _n_ = 1 then do;
      is_info = 0;
   end;
   
   select (info);
      when ('BATCHNO') do;
         if not missing(BatchNo) then do;
            output;
            call missing(GoodsName, Quantity, Infotype, of D_:);
         end;
         
         BatchNo = value;
         is_info = 0;
      end;
      when ('GOODSNAME') do;
         if not missing(GoodsName) then do;
            output;
            call missing(Quantity, Infotype, of D_:);          
         end;
         
         GoodsName = value;
         is_info = 0;         
      end;
      when ('QUANTITY') do;
         if not is_info then do;
            Quantity = input(value, ?? best.);
         end;
         else do;
            D_Quantity = input(value, ?? best.);
         end;
      end;
      when ('INFOTYPE') do;
         if not is_info then do;
            Infotype = value;
         end;
         else do;
            D_Infotype = value;
         end;
      end;
      when ('TYPEID') do;
         D_TypeID = value;
      end;
      when ('NEWLY', 'CAUSE') do;
         is_info = 1;
      end;
      otherwise;
   end; 
   
   if jobDone then do;
      output;
   end;
   
   drop info value;
run;
Sajid01
Meteorite | Level 14

Hello @Pandu2 
The data in the post does not look like the content of a .json file. Can you please share a sample of the original json file.
SAS oes have a procedure for reading json. Have a look here files https://documentation.sas.com/doc/en/pgmsascdc/9.4_3.5/proc/p0ie4bw6967jg6n1iu629d40f0by.htm

Pandu2
Obsidian | Level 7

Hey Sajid,

 

Yes you're right, it's not the original json format I previously got rid of the nested brace delimiters except quotes.

ballardw
Super User

@Pandu2 wrote:

Hey Sajid,

 

Yes you're right, it's not the original json format I previously got rid of the nested brace delimiters except quotes.


Perhaps that was not the thing to do as there are specific methods to deal with at least some JSON files. What you have created is not JSON and is a pretty poor file to do anything with. The way the quotes are left do not match any typical character separated text file either.

 

I might suggest going back to your original file and then look at the SAS documentation for the LIBNAME JSON statement. If your original file is reasonably nice you should be able to see a data set with some similar to what you want.

Pandu2
Obsidian | Level 7

Hi Andreas,

 

Thankyou for providing the code for me, however after running it there are no values in it. Gave empty dataset.

andreas_lds
Jade | Level 19

@Pandu2 wrote:

Hi Andreas,

 

Thankyou for providing the code for me, however after running it there are no values in it. Gave empty dataset.


If the first dataset is empty, than the file you have doesn't match what you have posted in the second box.

Pandu2
Obsidian | Level 7

The first dataset has values in a key value pair but the 2nd dataset has no values.

Pandu2
Obsidian | Level 7

Please I request you to help me on this.

Reeza
Super User

It would really help to provide a sample of the raw source file, this isn't structured well as there's no real indication of when a group ends/starts easily. The parenthesis do help with that and you've removed them and their logic so now we have to guess...

 

Pandu2
Obsidian | Level 7

The reason why I didn't post the exact json data with parenthesis is bcuz of even if I post the complete raw data there's no use of it cuz that's how the data looks. So, for your point where the group ends/ starts. It starts with BatchNo and ends with a new batchNo. Please have a look at my Required Output and the raw data which I posted that would makes sense to you. Thankyou!.

Sajid01
Meteorite | Level 14

Hello @Pandu2 
Please see the following posts. following the method you should be able import a json file to a sas dataset

Solved: Return a input string as a JSON table - SAS Support Communities 

Solved: Re: Check the input of date is a correct input for... - SAS Support Communities

In case you still have issues do post the log and provide data as done in the post.

Reeza
Super User

@Pandu2 wrote:

The reason why I didn't post the exact json data with parenthesis is bcuz of even if I post the complete raw data there's no use of it cuz that's how the data looks.

No it doesn't, it has the additional parenthesis which help define rows/groups. Your data does not. 

 


So, for your point where the group ends/ starts. It starts with BatchNo and ends with a new batchNo. Please have a look at my Required Output and the raw data which I posted that would makes sense to you. Thankyou!.

However, BatchNo is retained between groups, so what identifies each new line? GoodsName comes after the BatchNo so that makes it annoying to process.

Pandu2
Obsidian | Level 7

Actually when a new batchNo comes that's the sign of previous BatchNo group ends. That's how I figured out after doing some analysis how the group ends, otherwise our business and I doesn't know how it ends cuz the complexity of data and not following any specific pattern.

Reeza
Super User

This works for your example data but will not scale if you add other variables or the pattern breaks. 

 

data have;
input String : $200.;
cards;
"""BatchNo"":""001""""
"""Goodsname"":""Nikeshoes""""
"""Quantity"":""3""""
"""Infotype"":""Sneakers""""
"""Goodsname"":""pants""""
"""Quantity"":""5""""
"""Infotype"":""Trackpants""""
"""Newly"":""""""
"""Infotype"":""RAQ""""
"""TypeId"":""109""""
"""Cause"":""Broken""""
"""Quantity"":""0""""
"""BatchNo"":""009""""
"""Goodsname"":""Nikeshoes""""
"""Quantity"":""3""""
"""Infotype"":""Sneakers""""
"""Goodsname"":""pants""""
"""Quantity"":""5""""
"""Infotype"":""Trackpants""""
"""BatchNo"":""011""""
"""Goodsname"":""Jackets""""
"""Quantity"":""3""""
"""Infotype"":""topwears""""
"""Newly"":""""""
"""Infotype"":""RAQ""""
"""TypeId"":""100""""
"""Cause"":""Broken""""
"""Quantity"":""0""""
;;;;
run;



data step1;
set have;
retain BatchNo;

fieldName = compress(scan(string, 1, ":"), '"');
fieldValue = compress(scan(string, 2, ":"), '"');

if fieldName = 'BatchNo' then BatchNo = FieldValue;
else output;
run;


data step2;
set step1;
by BatchNo;
retain rowNum Prefix;
length prefix $2.;

if fieldName = 'Goodsname' then do;
rowNum+1;
call missing(prefix);
output;
end;
else if fieldName = 'Newly' then 
prefix = 'D_';
else do;
fieldName = catt(prefix, fieldName);
output;
end;

run;

proc transpose data=step2 out=want;
by batchno rownum;
id fieldName;
var fieldValue;
run;

 

Results:

Obs BatchNo rowCount _NAME_ Goodsname Quantity Infotype Newly D_Infotype D_TypeId D_Cause D_Quantity
1 001 1 fieldValue Nikeshoes 3 Sneakers          
2 001 2 fieldValue pants 5 Trackpants   RAQ 109 Broken 0
3 009 3 fieldValue Nikeshoes 3 Sneakers          
4 009 4 fieldValue pants 5 Trackpants          
5 011 5 fieldValue Jackets 3 topwears   RAQ 100 Broken 0

 

Ready to join fellow brilliant minds for the SAS Hackathon?

Build your skills. Make connections. Enjoy creative freedom. Maybe change the world. Registration is now open through August 30th. Visit the SAS Hackathon homepage.

Register today!
How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 23 replies
  • 1212 views
  • 8 likes
  • 5 in conversation