BookmarkSubscribeRSS Feed
brophymj
Quartz | Level 8

 

I have a dataset with 100000 rows and each row contains an extremely long string with lots of data. I'm want to create a new field with the "total payout value". This is the first number after the string "total_payout_value". The problem is the length of the string between the "total_payout_value" and the number can vary and also the number length can vary. Here is a sample from my dataset:

 

"total_payout_value" : { "asset" : "SBD", "amount" : 37.13 }, "pending_payout_value" : { "asset" : 

"total_payout_value" : {"amount" : 37.13, "asset" : "SBD"}, "pending_payout_value" : { "asset" : 

"total_payout_value" : {"amount" : 0, "asset" : "SBD"}, "pending_payout_value" : { "asset" : 

 

The actual sting is 1000s of characters long but each string will contain the string "total_payout_value" only once and the first number after this string is what I'm trying to extract. You will also see that sometimes the order of words between "total_payout_value" and the number can change so unfortuntately this length is not fixed. 

 

I tried this code but I don't know how to allow for the varying lengths. Really, what I need is is something that find the first number after the string "total_payout_value" 

 

data test1(keep = start end payout );
set data;
format payout $8.;
start = find(dataset,' "total_payout_value" : { "asset" : "SBD", "amount" : ')+54;
end = find(dataset,', "asset" : "SBD" }, "author"');
payout = substr(dataset,start,end-start));
run;

 

 

4 REPLIES 4
RW9
Diamond | Level 26 RW9
Diamond | Level 26

What you have there is a JSon file, its a particular type of file structure.  Rather than trying to parse it yourself, use one of the methods given in your previous post on the subject:

https://communities.sas.com/t5/ODS-and-Base-Reporting/importing-JSON-data/m-p/346679/highlight/true#...

 

Or one of the many other posts on the subject:

https://communities.sas.com/t5/Base-SAS-Programming/Confusion-errors-in-libname-Json/m-p/319716/high...

https://communities.sas.com/t5/Base-SAS-Programming/Need-to-convert-a-JSON-file-to-SAS-dataset/m-p/3...

 

There are also several free tools out there, for example:

http://csvkit.readthedocs.io/en/1.0.2/

 

Kurt_Bremser
Super User

How to retrieve data using name-value pairs, manually:

%let id1=total_payout_value;
%let id2=amount;

data test;
infile cards truncover;
input o_string $100.;
string = o_string;
i = index(string,"&id1");
string = substr(string,i);
string = scan(string,2,'{');
string = scan(string,1,'}');
i = index(string,"&id2");
string = substr(string,i+length("&id2"));
string = scan(string,2,':');
string = scan(string,1,',');
&id2 = input(string,best.);
drop i string;
cards;
"total_payout_value" : { "asset" : "SBD", "amount" : 37.13 }, "pending_payout_value" : { "asset" : 
"total_payout_value" : {"amount" : 37.13, "asset" : "SBD"}, "pending_payout_value" : { "asset" : 
"total_payout_value" : {"amount" : 0, "asset" : "SBD"}, "pending_payout_value" : { "asset" : 
;
run;

 

Oligolas
Barite | Level 11

Hi,

 

what about regex?

data test;
infile cards truncover;
input string $100.;
cards;
"total_payout_value" : { "asset" : "SBD", "amount" : 37.13 }, "pending_payout_value" : { "asset" : 
"total_payout_value" : {"amount" : 37.13, "asset" : "SBD"}, "pending_payout_value" : { "asset" : 
"total_payout_value" : {"amount" : 0, "asset" : "SBD"}, "pending_payout_value" : { "asset" : 
;
run;

data test1;
   length var $200;
   set test;
   var=prxchange('s/.*total_payout_value\D*(\d*\.?\d*).*/$1/',-1,string);
run;

 

 

________________________

- Cheers -

Astounding
PROC Star

I think you are making this overly complex by searching for the wrong string.  The payout amount seems to follow the first instance of "amount" : so search for that instead.

 

This would create a character variable:

 

data want;

set have;

payout = scan( substr(dataset, index(dataset, '"amount" :' ) + 10), 1);

run;

 

sas-innovate-2024.png

Join us for SAS Innovate April 16-19 at the Aria in Las Vegas. Bring the team and save big with our group pricing for a limited time only.

Pre-conference courses and tutorials are filling up fast and are always a sellout. Register today to reserve your seat.

 

Register now!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 4 replies
  • 770 views
  • 2 likes
  • 5 in conversation