- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
Hello,
I recently used the following program to access and import consumer complaints data from the CFPB within Enterprise Guide for monthly risk intelligence mainly focusing on benchmarking against our peers. I would run the following simple Rest API Get Method to obtain the consumer complaints data (token is included but i'm not sure if its still needed):
METHOD="GET"
OUT=CFPB;
RUN;
CREATE TABLE WORK._01_EXT_COMPLAINTS AS
SELECT T1.*
FROM POSTS.ROOT T1;
QUIT;
The CFPB recently updated their API protocols and much of their site and now list the following link:
https://www.consumerfinance.gov/data-research/consumer-complaints/search/api/v1/
The CFPB also provides a method to test searches (GET statements) from their GitHub page:
https://cfpb.github.io/api/ccdb/api/index.html#/Complaints/get_
I ran a couple of successful tests to see if a response occurred and it produced the following curl statement on a couple of tests of displaying non-aggregating data as well as ascending by the creation date:
curl -X GET "https://www.consumerfinance.gov/data-research/consumer-complaints/search/api/v1/?sort=created_date_d..." -H "accept: application/json"
When I try to download the file using the previous PROC HTTP statement but replacing it with the following URL path:
filename cfpb temp;
/* Neat service from Open Notify project */
proc http
url="https://www.consumerfinance.gov/data-research/consumer-complaints/search/api/v1/"
method= "GET"
out=cfpb;
run;
libname space JSON fileref=cfpb;
PROC SQL;
CREATE TABLE WORK._01_EXT_COMPLAINTS_S AS
SELECT T1.*
FROM POSTS.ROOT T1;
QUIT;
I get the following timeout errors:
ERROR: Error connecting to 23.5.233.75:443. (The connection has timed out.)
ERROR: Unable to establish connection to www.consumerfinance.gov.
ERROR: Unable to connect to Web server.
I feel like i am missing a few things but i'm definitely a rookie in this space so any assistance/guidance would be most beneficial.
Thank You,
Elvis Kanlic
Accepted Solutions
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
I just tried this, it worked
filename req url 'https://www.consumerfinance.gov/data-research/consumer-complaints/search/api/v1/?date_received_max=2020-01-31&date_received_min=2020-01-01&field=all&format=json&no_aggs=true&queryString=%3Fdate_received_max%3D2020-01-31%26date_received_min%3D2020-01-01%26field%3Dall%26size%3D25%26sort%3Dcreated_date_desc&size=23914&sort=created_date_desc';
LIBNAME resp JSON fileref=req;
The trick is to make sure the url is fully spelled out, rather than truncated with ...
Hope this helps,
Ahmed
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
Try this,
filename req url 'https://www.consumerfinance.gov/data-research/consumer-complaints/search/api/v1/?sort=created_date_desc&no_aggs=true' debug lrecl=8192;
LIBNAME resp JSON fileref=req;
From the looks of it, the json response is being put one single stream of characters, without any new line separators
But the JSON Library engine can parse the JSON response into separate JSON objects within.
Hope this helps,
Ahmed
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
Thank you Ahmed for the quick response. Unfortunately I am getting the following error message:
ERROR: The connection has timed out..
ERROR: Error in the LIBNAME statement.
I thought maybe it had to due to the size of the file so I tried to create a subset by leveraging the query from the CFPB to gather only defects from 01/01/2020 to 01/31/2020 and still get the same error.
Here is the original link that I created from the CFPB's query for January 2020 complaints:
And I modified your code to include January only defects:
filename req url 'https://www.consumerfinance.gov/data-research/consumer-complaints/search/api/v1/?date_received_max=2...' debug lrecl=8192;
LIBNAME resp JSON fileref=req;
Still the same error message:
ERROR: The connection has timed out..
ERROR: Error in the LIBNAME statement.
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
The code as posted by @AhmedAl_Attar works for me.
If you get a timeout executing this code 1:1 then eventually you've got a firewall blocking the new URL.
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
Thank you for the heads up - I will contact my admin to see what is the issue.Could either one of you please run the following code on your machine to see if it works? I would really appreciate it!
%let url2=2020-01-01%str(&)field=all%str(&)format=json%str(&)no_aggs=true;
%let fullurl=&url1.&url2.;
%put &url2;
%put &fullurl;
filename CFPB temp;
proc http
url= "&fullurl."
method="GET"
out=CFPB;
run;
/* Let the JSON engine do its thing */
libname posts JSON fileref=CFPB;
title "Automap of JSON data";
/* examine resulting tables/structure */
proc datasets lib=posts;
quit;
proc print data=posts.alldata(obs=20);
run;
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
I've executed the code you've posted but never got a result back (=job was hanging).
I then also used the generated URL with the syntax @AhmedAl_Attar posted.
filename req url 'https://www.consumerfinance.gov/data-research/consumer-complaints/search/api/v1/?date_received_min2020-01-01&field=all&format=json&no_aggs=true' debug lrecl=8192;
LIBNAME resp JSON fileref=req;
This resulted in the following error which is likely due to my session running SBSC as LATIN1 and I can't change it to UTF8.
ERROR: Invalid JSON in input near line 1 column 2903276: Some code points did not transcode.
If your session runs with a session encoding of UTF8 then it might be worth to try and execute above code.
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
I just tried this, it worked
filename req url 'https://www.consumerfinance.gov/data-research/consumer-complaints/search/api/v1/?date_received_max=2020-01-31&date_received_min=2020-01-01&field=all&format=json&no_aggs=true&queryString=%3Fdate_received_max%3D2020-01-31%26date_received_min%3D2020-01-01%26field%3Dall%26size%3D25%26sort%3Dcreated_date_desc&size=23914&sort=created_date_desc';
LIBNAME resp JSON fileref=req;
The trick is to make sure the url is fully spelled out, rather than truncated with ...
Hope this helps,
Ahmed
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
I sent a ticket over to our Admin requesting them to release/remove the firewall to the url - Hey Ahmed, do you run into an issue when you try to create a sas dataset with the data?
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
Not from the latest sample I sent you. I get three data sets from the JSON Response
_SOURCE
ALLDATA
ROOT
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
Thank you - I am waiting for the admins to remove the firewall and will let you know if everything works on my end.