I am trying to use the JSON libname engine to parse some largish files (examples of the files are located here: https://transparency-in-coverage.uhc.com/undefined). The specific file I'm looking at is named 2022-08-01_UNITED-HEALTHCARE-SERVICES_Third-Party-Administrator_PS1-50_C2_in-network-rates.json.gz, and I try to parse it as below.

filename gen3 ZIP '/data/2022-08-01_UNITED-HEALTHCARE-SERVICES_Third-Party-Administrator_PS1-50_C2_in-network-rates.json.gz' GZIP;
libname gen3 json NOALLDATA ;

These are large files, so I don't expect anyone to download them and try this out, but I wanted to provide context. After this, I try to save the different files parsed from the JSON to disk using something like the following code.

data ps1.IN_NETWORK;
set gen3.IN_NETWORK;
run;
data ps1.NEGOTIATED_PRICES_SERVICE_CODE;
set gen3.NEGOTIATED_PRICES_SERVICE_CODE;
run;
data ps1.PROVIDER_GROUPS_NPI;
set gen3.PROVIDER_GROUPS_NPI;
run;

The data steps run fine for all of the files parsed from the JSON except one (Provider_Groups_NPI). This is not the largest of the files, although it is about 10 GB. For comparison, the data step completes in about 10-15 minutes for one of the non-problem files that is about 25 GB. For Provider_Groups_NPI, however, it gets 99% of the file written to the desired location, then freezes for about 10-12 hours before finally saving the file. It does this every time, so it's not some sort of one-off processing error or slowdown. Does anyone have thoughts on what could be going on here, or ways I could attempt to troubleshoot?