PS I wrote this data step code in about 15 minutes, including editing the test data. Once you get used to it, you'll be quite as quick.
For future reference, this is a list of what went wrong:
- the format of my dates in Windows didn't match the format of my dates in SAS
- imported Excel files, which caused me to ahve sas7bdat files of 20-30 GB
- also used CSV files, but I get better results if you just change the .csv extention to .txt
- the code Kurt provided had length in it, which my original code didn't
I have found a possible solution fore your dates that does not necessitate manual conversion (as I did with the macro):
options dflang=dutch;
data _null_;
input date eurdfde9.;
format date yymmdd10.;
put date=;
cards;
01/mei/16
;
run;
will produce this log:
16 options dflang=dutch;
17
18 data _null_;
19 input date eurdfde9.;
20 format date yymmdd10.;
21 put date=;
22 cards;
date=2016-05-01
In your original script, I leave out:
%macro convert_date(invar,outvar);
format &outvar yymmdd10.;
&invar = tranwrd(&invar,'mrt','mar');
&invar = tranwrd(&invar,'mei','may');
&invar = tranwrd(&invar,'okt','oct');
&outvar = input(&invar,date9.);
drop &invar;
%mend;
AND
%convert_date(date_treated_c,date_treated);
%convert_date(r_creation_date_c,r_creation_date);
run;
Next up I insert the above script right before the data or is it during data but before the infile?
First, you need to set the dflang system option.
Then, in the data step, you assign the eurdfde9. informat to the date variables. Omit the _c variables I used for the input, instead use the final date variables in the input statement.
So the final code looks like this:
options dflang=dutch;
data boonen;
infile '$HOME/sascommunity/boonen.txt' dlm=';' truncover firstobs=2 lrecl=500;
length
r_object_id $16
object_name $100
document_direction $3
company $10
type_of_dossier_flow $20
kind_of_document $10
document_handler $20
dossier_number $20
detail_kind_of_document $100
date_treated 5
r_creation_date 5
document_state $10
;
informat
date_treated
r_creation_date
eurdfde9.
;
format
date_treated
r_creation_date
yymmdd10.
;
input
r_object_id
object_name
document_direction
company
type_of_dossier_flow
kind_of_document
document_handler
dossier_number
detail_kind_of_document
date_treated
r_creation_date
document_state
;
run;
Edit, your datafile is NOT a CSV file. It is a delimited file, you need to specify the delimiter, and also question with the person sending the file why they have used a semicolon in a Comma Seperated Variable file.
It would help if you provided the CSV with a couple of lines so we can run it on our end. Also, formatting your code will help readbility (remember, most code is looked at by other people, anything you can do to make that code simple and easy to read will make everyones life easier) - and in the below I have corrected several typos - inputr_object_id is not valid syntax for example. The input line is the key line here, this is the one that actually reads the data in, the informat statement tells the reader what format the data comes in as, and format tesll the reader what to write to the dataset as. The input here just states if the incoming is a string or not - you will see character ones have a $, numeric don't. I would also suggest you put a lenght statement in:
data tables.test; infile 'C:\temp\SAS_Data\Tables\document.csv' dlm=";" firstobs=2; length r_object_id $16 object_name $25 document_direction $5 company $3 type_of_dossier_flow $15 kind_of_document $8 document_handler $25 dossier_number $6 details_kind_of_document $10 date_treated 8 r_creation_date 8 document_state $10; informat r_object_id $16. object_name $25. document_direction $5. company $3. type_of_dossier_flow $15. kind_of_document $8. document_handler $25. dossier_number $6. details_kind_of_document $10. date_treated date9. r_creation_date date9. document_state $10.; format r_object_id $16. object_name $25. document_direction $5. company $3. type_of_dossier_flow $15. kind_of_document $8. document_handler $25. dossier_number $6. details_kind_of_document $10. date_treated date9. r_creation_date date9. document_state $10.; input r_object_id $ object_name $ document_direction $ company $ type_of_dossier_flow $ kind_of_document $ document_handler $ dossier_number $ details_kind_of_document $ date_treated r_creation_date document_state $; keep kind_ofdocument document_handler date_treated r_creation_date document_state; run;
Here are the 10 first rows of the data sheet.
The format input problem is due to copy pasting stuff.
Hi,
Sorry, as I said, it would be useful if you posted the file you were trying to import (the one called csv in your example). That below you have posted is an Excel file - which doesn't match what you were trying to import (and I don't download Excel files as they are a security risk).
I copied the data from the CSV file and pasted it into Excel, but forgot to save it appropriatly. It seems you can't attach CSV files either. Just rename the .txt to .csv and you should be ready to go.
SAS Innovate 2025 is scheduled for May 6-9 in Orlando, FL. Sign up to be first to learn about the agenda and registration!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.