Dear All,
I have delimited txt file with 3 million records. I would like to convert into SAS Dataset. Please help me how can I do that ?
sampele Text Records are below.
"0000012121","ROSE","N","N","N","N","N","N","N"
"0000002103","ROBT JENKINS","N","N","N","N","N","N","N"
"0000010111","ANA MAY NAG","N","N","N","N","N","N","N"
"0000010222","DAAA AA","N","N","N","N","N","N","N"
"0000011333","THAA C BLUUU HEEEEEEE","N","N","N","N","N","N","N"
"0000012444","MICCAAA SAAAAAAA","N","N","N","Y","Y","Y","Y"
"0000013555","PATTTTTT SMITH","Y","Y","Y","Y","Y","N","Y"
"0000013666","ELLMM BRAAAAA","N","N","N","N","N","N","N"
"1001113777","DOK A BAAA, JR.","N","N","N","N","N","N","N"
Note: 2nd field for some records, even in middle of the name, commas are there. Please suggest me how can I overcome this scenario while converting into SAS Dataset ?
Regards,
S Ravuri.
Should be simple to achieve (DSD option implies comma separated fields and takes care of unpacking quoted strings) :
data want;
infile ""c:\yourPath\yourFileName.txt" dsd;
length code $10 name $40 v1-v7 $1;
input code name v1-v7;
run;
PG
Have you tried proc import, via the point and click interface? If the comma's are within the quotation marks SAS will read it properly.
ie File > Import Data
Specify comma delimited
Should be simple to achieve (DSD option implies comma separated fields and takes care of unpacking quoted strings) :
data want;
infile ""c:\yourPath\yourFileName.txt" dsd;
length code $10 name $40 v1-v7 $1;
input code name v1-v7;
run;
PG
How beautiful is the perfection of simplcity - especially when it is so aesthetically structured. Kudos, sir.
None of these answers are necessarily wrong - there are, indeed, correct. However, regardless of the number of records, the KISS principle still reigns supreme. The simplest answer is usally the best AND the most efficient. PROC IMPORT has one advantage, perhaps, in that it includes the INFORMAT and FORMAT statements but in this case, given the simplicity of the records and the blessedness of the DSD option, it is an unnecessary complication to an otherwise very simple act. Please remember, fellow SASsers - this is exactly what good old version 5.13 was built around - bullding SAS data sets from flat files. It did the job just fine then and can do so now in v9. Cheers!
You can also use proc import directly. The critical thing you have to tell the code, in either case, is whether the file contains variable names. Here is an example of the proc import code:
PROC IMPORT OUT= WORK.want
DATAFILE= "C:\have.csv"
DBMS=CSV REPLACE;
GETNAMES=YES;
DATAROW=2;
RUN;
If you use the Proc Import interface for comma delimited data as Reeza suggests, I would recommend going to the options and setting the guessing rows value to the maximum of 32767. The number of records is such that your second field might have a considerable number of values longer than might be found in the default first 20 rows.
I would also recommend saving the code generated by proc import that will appear in the log as you may want to modify something and it is easier if you have a starting point. For instance, you don't mention if you have variable names in your file. If the are not on the first row of the file you are likely to end up with variables named Var1, Var2, Var3 ... You could change that in the code from the log to be meaningful such as Id, Name or what ever makes sense.
SAS Innovate 2025 is scheduled for May 6-9 in Orlando, FL. Sign up to be first to learn about the agenda and registration!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.