I am creating a SAS dataset that I need in a specific format to be written out using proc export to create a tab delimited text file. The output has the first row missing the variable values and only contains the hard coded values and the last row is empty so the output is missing the last row of the dataset. I do not know why this is happening. Is there somthing I am missing is there an option I need to add or is my dataset creation incorrect?
Any assistance you can offer will be greatly appreciated.
Thanks,
Elliott
here is my code:
data my_data
(keep= Task_ID
Process_ID
Team_ID
Received_Date
Received_Time
Case_Notes
Task_Due_date
Process_Due_date
Allocation_Staff_ID
PendDate
CompletedTime
Reference_1
Reference_2
Reference_3
Reference_4
Reference_5
Reference_6
Reference_7
Reference_8
Additional_Field_1
Additional_Field_2
Additional_Field_3
Additional_Field_4
Additional_Field_5
Additional_Field_6
Additional_Field_7
Additional_Field_8
Additional_Field_9
Additional_Field_10
Additional_Field_11
Additional_Field_12
Additional_Field_13
Additional_Field_14
Additional_Field_15
Additional_Field_16
Additional_Field_17
Additional_Field_18
Additional_Field_19
Additional_Field_20
Additional_Field_21
Additional_Field_22
Additional_Field_23
Additional_Field_24
Additional_Field_25
Additional_Field_26
Additional_Field_27
Additional_Field_28
Additional_Field_29
Additional_Field_30
Additional_Field_31
Additional_Field_32);
retain
Task_ID
Process_ID
Team_ID
Received_Date
Received_Time
Case_Notes
Task_Due_date
Process_Due_date
Allocation_Staff_ID
PendDate
CompletedTime
Reference_1
Reference_2
Reference_3
Reference_4
Reference_5
Reference_6
Reference_7
Reference_8
Additional_Field_1
Additional_Field_2
Additional_Field_3
Additional_Field_4
Additional_Field_5
Additional_Field_6
Additional_Field_7
Additional_Field_8
Additional_Field_9
Additional_Field_10
Additional_Field_11
Additional_Field_12
Additional_Field_13
Additional_Field_14
Additional_Field_15
Additional_Field_16
Additional_Field_17
Additional_Field_18
Additional_Field_19
Additional_Field_20
Additional_Field_21
Additional_Field_22
Additional_Field_23
Additional_Field_24
Additional_Field_25
Additional_Field_26
Additional_Field_27
Additional_Field_28
Additional_Field_29
Additional_Field_30
Additional_Field_31
Additional_Field_32
;
format
Task_ID $16.
Process_ID $16.
Team_ID
Received_Date
Received_Time
Case_Notes
Task_Due_date
Process_Due_date
Allocation_Staff_ID
PendDate
CompletedTime
Additional_Field_1
Additional_Field_2
Additional_Field_3
Additional_Field_4
Additional_Field_5
Additional_Field_6
Additional_Field_7
Additional_Field_8
Additional_Field_9
Additional_Field_10
Additional_Field_11
Additional_Field_12
Additional_Field_13
Additional_Field_14
Additional_Field_15
Additional_Field_16
Additional_Field_17
Additional_Field_18
Additional_Field_19
Additional_Field_20
Additional_Field_21
Additional_Field_22
Additional_Field_23
Additional_Field_24
Additional_Field_25
Additional_Field_26
Additional_Field_27
Additional_Field_28
Additional_Field_29
Additional_Field_30
Additional_Field_31
Additional_Field_32 $10.
Reference_1 $18.
Reference_2 $10.
Reference_3 $50.
Reference_4 $10.
Reference_5 $10.
Reference_6 $10.
Reference_7 $10.
Reference_8 $16.
;
Task_ID ='Error_Detail';
Process_ID ='Error_Detail';
Team_ID ='Team 372 ' ;
Received_Date =' ';
Received_Time =' ';
Case_Notes =' ';
Task_Due_date =' ';
Process_Due_date =' ';
Allocation_Staff_ID =' ';
PendDate ='';
CompletedTime ='';
Reference_1 = ACCT_NBR;
Reference_2 = ' ' ;
Reference_3 = sub_client;
Reference_4 = LANG_CD_W;
Reference_5 = LANG_CD_V;
Reference_6 = language_ind;
Reference_7 = ERR_LANG_IND;
Reference_8 = 'Error_Detail';
Additional_Field_1 =' ';
Additional_Field_2 =' ';
Additional_Field_3 =' ';
Additional_Field_4 =' ';
Additional_Field_5 =' ';
Additional_Field_6 =' ';
Additional_Field_7 =' ';
Additional_Field_8 =' ';
Additional_Field_9 =' ';
Additional_Field_10 =' ';
Additional_Field_11 =' ';
Additional_Field_12 =' ';
Additional_Field_13 =' ';
Additional_Field_14 =' ';
Additional_Field_15 =' ';
Additional_Field_16 =' ';
Additional_Field_17 =' ';
Additional_Field_18 =' ';
Additional_Field_19 =' ';
Additional_Field_20 =' ';
Additional_Field_21 =' ';
Additional_Field_22 =' ';
Additional_Field_23 =' ';
Additional_Field_24 =' ';
Additional_Field_25 =' ';
Additional_Field_26 =' ';
Additional_Field_27 =' ';
Additional_Field_28 =' ';
Additional_Field_29 =' ';
Additional_Field_30 =' ';
Additional_Field_31 =' ';
Additional_Field_32 =' ';
set data_pop;
run;
proc export data=my_data
outfile="/my_path/my_directory/my_file.txt"
dbms=tab
replace;
run;
PROC EXPORT is not the problem. We know that works.
But I cannot figure out what you are trying to do with the data step you have before the PROC EXPORT. And I cannot read the wall of code you posted.
Why can't you just write out the original dataset, DATA_POP?
Why do you have the data step? What is it trying to do?
The only thing it things it could possible be doing are
1) Changing the order of the variables.
2) Changing the LENGTH of the variables (which is a side effect of your FORMAT statement appearing before the SET statement).
3) Changing the formats attached to some variables.
4) Adding some variables that do not already appear in DATA_POP and setting them to a constant value that will be retained onto every observation.
It will not add any observations that do not already appear in DATA_POP.
PROC EXPORT is not the problem. We know that works.
But I cannot figure out what you are trying to do with the data step you have before the PROC EXPORT. And I cannot read the wall of code you posted.
Why can't you just write out the original dataset, DATA_POP?
Why do you have the data step? What is it trying to do?
The only thing it things it could possible be doing are
1) Changing the order of the variables.
2) Changing the LENGTH of the variables (which is a side effect of your FORMAT statement appearing before the SET statement).
3) Changing the formats attached to some variables.
4) Adding some variables that do not already appear in DATA_POP and setting them to a constant value that will be retained onto every observation.
It will not add any observations that do not already appear in DATA_POP.
Hello Tom,
I need to create the output text file in a specific format for uploading to a workflow queueing system. It needs those exact variables in that order. There are only a few fields that populate with data from data pop. The rest are hardcoded or blank.
If I have written my data step incorrectly, what have I done wrong? This is the first time I have had to create a file like this and am trying to learn the best way to achieve the output needed.
Thank you for your help.
Elliott
If the goal is just to set the variable order and add extra constant variables then you might be on the right track. You just made the code way too complicated.
But it should have worked.
So please explain what is wrong with the output. For example you mentioned it is excluding the last line, but your code does not exclude any lines.
Hello Tom,
I was able to resolve my issues by moving the format below the set stmt in the datastep.
What could I have done to streamline the code if I have made it too complicated?
Thanks,
Elliott
In terms of the data step you don't need the FORMAT step at all, unless you want to just add one to REMOVE and formats that might have been attached in the source dataset.
Using a RETAIN statement to force the variable order is a normal work around. You could reduce the typing by using variable lists for the two large series of variables with numeric suffixes. Once you have the order defined the KEEP statement can be reduced to just using a positional variable list so you only have to type the first and last variable name.
Since you are setting some variable to constants there is no need to set a LENGTH for any of the variables.
So perhaps something like this:
data my_data ;
retain
Task_ID Process_ID Team_ID Received_Date Received_Time
Case_Notes Task_Due_date Process_Due_date Allocation_Staff_ID
PendDate CompletedTime Reference_1-Reference_8
Additional_Field_1-Additional_Field_32
;
keep Task_ID -- Additional_Field_32;
format _all_ ;
set data_pop;
Task_ID ='Error_Detail';
Process_ID ='Error_Detail';
Team_ID ='Team 372 ' ;
Reference_1 = ACCT_NBR;
Reference_3 = sub_client;
Reference_4 = LANG_CD_W;
Reference_5 = LANG_CD_V;
Reference_6 = language_ind;
Reference_7 = ERR_LANG_IND;
Reference_8 = 'Error_Detail';
run;
Thank you
SAS Innovate 2025 is scheduled for May 6-9 in Orlando, FL. Sign up to be first to learn about the agenda and registration!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.