I am running into a situation where proc import dbms=csv is failing to identify numbers. Every column comes back as type 'CHAR' despite the CSV passing validation with flying colors.
filename datain url 'https://data.cms.gov/resource/csi5-w7vr.csv';
proc import datafile=datain out=dataout dbms=csv replace;
run;
Alphabetic List of Variables and Attributes
# Variable Type Len Format Informat
The file's specification defines several numeric columns. The documentation for the file is here: https://dev.socrata.com/foundry/data.cms.gov/csi5-w7vr
I suggest downloading that file and examining or reading from a local version.
A number of things may be going on with the HTTPS connection, and we can't see any of them, or the actual data, because of security.
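One way to follow that suggestion is to download the file to a local fileref first and look at the raw records in the log. This is only a sketch: it assumes the SAS session has internet access and that PROC HTTP is available (SAS 9.4), and the fileref name `local` is made up for illustration.

```sas
/* Temporary local file to hold the download (fileref name is arbitrary) */
filename local temp;

/* Fetch the CSV once instead of streaming it through PROC IMPORT */
proc http
  url="https://data.cms.gov/resource/csi5-w7vr.csv"
  method="GET"
  out=local;
run;

/* Write the first five raw records to the log to check quoting and missing-value markers */
data _null_;
  infile local obs=5 lrecl=32767;
  input;
  put _infile_;
run;
```

Once the raw records are visible, it is easier to tell whether the problem is in the transfer or in the way the values are written in the file.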
What does the generated data step code look like?
I would suggest, especially if you are going to read other files with the same layout, that you copy the generated code from the log to the editor, change the character informats for the variables that should be numeric or dates to an appropriate informat, save the code and rerun the data step instead of Proc Import.
Also look in the log for any messages related to "transcoding" or "encoding" issues.
I took a look, and they are putting double quotes (" ") around every value.
You can modify the data step as mentioned then add something like:
numvar = input(charvar,6.);
The 6 would come from the length of the informat associated with the character version.
If you have currency values then use a COMMA informat.
If the variable should be a date then use an appropriate date informat instead.
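A minimal sketch of those three conversions follows. The variable names and sample values are hypothetical, not taken from the CMS file, and the date layout (yyyy-mm-dd) is an assumption; pick the informat that matches what the file actually contains.

```sas
data converted;
  /* Hypothetical character values, standing in for what PROC IMPORT read */
  length charvar amount_c date_c $12;
  charvar  = '123456';
  amount_c = '$1,234.50';
  date_c   = '2023-12-31';

  numvar = input(charvar, 6.);         /* plain number, width taken from the character informat */
  amount = input(amount_c, comma12.);  /* COMMA informat ignores the dollar sign and commas */
  dt     = input(date_c, yymmdd10.);   /* date informat for a yyyy-mm-dd value (assumed layout) */
  format dt yymmdd10.;
run;
```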
It seems asterisks are being used to denote missing values. proc import can't deal with that.
Write your own data step, according to the file specification, and you may have to do extra logic to handle the asterisks in a graceful manner.
The quotes are handled by using the dsd option in the infile statement. Replacing the single asterisks with nothing can be done by applying the tranwrd() function to _infile_ (the input buffer), and then doing the actual read.
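The approach described above could look roughly like this. The column names and the two sample records are invented for illustration, and the step reads from in-stream data so it can be run on its own; the real variable list and types have to come from the file specification.

```sas
data want;
  infile datalines dsd truncover;
  /* Hypothetical columns; replace with the ones in the file specification */
  length provider $20 discharges 8 avg_payment 8;
  input @;                                    /* load the record into the buffer and hold it */
  _infile_ = tranwrd(_infile_, '"*"', '""');  /* a field holding only an asterisk becomes empty */
  input @1 provider discharges avg_payment;   /* now do the actual read */
datalines;
"ALPHA HOSPITAL","25","1034.50"
"BETA CLINIC","*","*"
;
run;
```

Replacing the quoted asterisk rather than every asterisk is a deliberate choice in this sketch, so that an asterisk inside a longer text value is left alone. When reading the real file, point the INFILE statement at its fileref and add firstobs=2 to skip the header row.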