08-20-2014 09:30 PM
I am working on positioning data elements in SAS to exact record postions and then creating a text file? But for some reason I am not able to get the data show in the correct columns.
I need to submit some patient data to National Death Index registry to obtain their date and cause of death. So for each data elements, certain numbers of spaces are allowed: For instance Last name (1-20) First name (21-35)
SSN (37-45) and so on... I am attaching that document; chapter 2 and page 14-18 provide the guidance on how the data should be prepared.
The data is originally in excel file. I am attaching a portion of (unreal) data in the excel file. The first sheet of the excel file has some (unreal) data, that I tried to copy paste in the following SAS code in order to position the each data element to the positons as instructed in the PDF doc. The second sheet in the attached excel file shows the SAS output data. Please take a look and you'll know how the data is misplaced, not under the respective columns headings.
I needed to assign the spaces/positions to each element in SAS and then convert it to a text file without any delimiters. Below is the SAS code that I am using and the SAS output file is on sheet 2, which shows that the data is not in the correct columns. I am not sure if there's a specific way to enter the data in the SAS code. I just copy pasted it from the excel file into the SAS code.
input Last_Name $ 1-20 @21 First_Name $ 21-35 @36 Middle_Initial $ 36 @37 Social_Security_Number 37-45 @46Month_of_Birth 46-47 @48 Day_of_Birth 48-49 @50 Year_of_Birth 50-53 @54 Father_Surname $ 54-71 @72 Age_Unit_at_death_ 72 @73 Number_of_Age_Units_death 73-74 @75 Sex $ 75 @76 Race $ 76 @77 Marital_Status $ 77 @78 State_Residence $ 78-79 @80 State_Birth $ 80-81 @82 Control_ID_number 82-91 @92 Age_in_1991 92-97;
asdad dasdad 121331313 06 18 1928 M CA 10001 62
adad adad 121212121 02 23 1934 M CA 10002 56
adad adad 021212121 02 23 1934 M CA 10002 56
bnvn vnbvbn 021111212 10 27 1917 F CA 10003 72
rty lfghdh 326598542 07 31 1919 M CA 10008 71
sada bnmb 326512522 06 26 1933 M CA 10010 57
sdfsf ftgh 010201020 07 12 1920 M CA 10011 70
Please take a look and let me know if you could offer any suggestions/guidance on this.
Thanks so much!
08-20-2014 09:35 PM
The other two PDFs are: original EXcel data and the SAS output data obtained after running the code.
08-21-2014 01:40 AM
Your SAS input specification is according the NDI Specs.
Your test data does NOT look the NDIS Specs, you have to put those in the same positions as described
(Looks like a Cobol FD, C Struct, Java fixed length class)
Use a classic program editor with mono-type font and place your testdata in the described columns.
Do this very precise as described conform the instructions.
More easy is to use real NDI-data.
oops not a very good advice the content is sensitive privacy data.
08-21-2014 03:19 PM
Yes, tha's the reason. In an attempt to de-identify the patient data, I used the unreal names and SSNs, because I was going to post it here on the forum.
08-21-2014 09:35 AM
Perhaps I am misunderstanding the question, but it seems really simple to me. To write the data OUT from SAS you would just need to create a data step with a PUT statement that looks a lot like the INPUT statement that you posted. You would NOT be creating a PDF file from this, but instead be creating a text file.
08-21-2014 03:29 PM
Thanks Tom for the respone and sorry for the confusion. The forum doesnot allow any other file formats but PDF so I had to convert the excel into PDF format.
Secondly, the challenge I am facing is not getting the data to show in the correct columns. For instance, if you take a look at the 2nd PDF (Portion of data_SAS output) , you'll see that the first column which is for last name, includes the first name and part of SSN as well.. and same thing happens with other columns as well. I am not sure if my SAS code needs any revision or if I am required to use any more options to make sure that each column in the output SAS data file containts respective data. I have never inputted data in SAS using certain legnth for each variable data so I don't know if my code is correct or if it is missing anything.
Converting this SAS output file to text file will be the next/final step.
Hope I am making sense here and question is understood.
08-21-2014 03:41 PM
I still don't understand what your issue is.
If you have the data in a SAS dataset and you want to create a text file with the data in particular positions then just run a data step and create the file. There will be no need to "convert" the file you have written since it is already being created as a text file.
So if you have successfully created a dataset named TEST with the information as specified in the data step you posted then to write a text file you need to run a data step like this:
file 'mytextfile.txt' lrecl=97 ;
put @1 Last_Name
08-21-2014 04:05 PM
From your response what I understand is that first I should import the Excel data into SAS and then run the SAS code as above to get the text file with data arranged in specified positions.
I have almost 1500 records in an Excel file with all these variables. So I'll import it in SAS first and then run the code.Hope that works..
08-21-2014 04:10 PM
Should work. If you use PROC IMPORT then SAS might not make the data with the proper types. Also if the data is too long for the specified fields then you might have issues with the code above. If so you can add formats to force SAS to truncate the data to fit the number of columns reserved for it in the layout. For example:
put @1 Last_Name $20.
@21 First_Name $15.
@36 Middle_Initial $1.
08-21-2014 04:31 PM
So Tom, after importing when I ran the code, my log shows me following errors. I am attaching the PDF too to the original post, just to make ot easy to read the log...
2 OPTIONS PAGENO=MIN;
3 %LET _CLIENTTASKLABEL='sbhw2';
4 %LET _CLIENTPROJECTPATH='C:\Users\aerande\Documents\Project4.egp';
5 %LET _CLIENTPROJECTNAME='Project4.egp';
6 %LET _SASPROGRAMFILE=;
8 ODS _ALL_ CLOSE;
9 OPTIONS DEV=SASEMF;
10 GOPTIONS XPIXELS=0 YPIXELS=0;
11 ODS LISTING GPATH=&sasworklocation;
12 FILENAME EGHTML TEMP;
13 ODS HTML(ID=EGHTML) FILE=EGHTML ENCODING='utf-8' STYLE=Analysis
13 ! STYLESHEET=(URL="file:///C:/Program%20Files/SASHome/x86/SASEnterpriseGuide/4.3/Styles
13 ! /Analysis.css") GPATH=&sasworklocation;
NOTE: Writing HTML(EGHTML) Body file: EGHTML
14 FILENAME EGPDF TEMP;
15 ODS PDF(ID=EGPDF) FILE=EGPDF STYLE=printer SAS;
NOTE: Writing ODS PDF(EGPDF) output to DISK destination "EGPDF", printer "PDF".
16 FILENAME EGSR TEMP;
17 ODS tagsets.sasreport12(ID=EGSR) FILE=EGSR STYLE=Analysis
17 ! STYLESHEET=(URL="file:///C:/Program%20Files/SASHome/x86/SASEnterpriseGuide/4.3/Styles
17 ! /Analysis.css") GPATH=&sasworklocation ENCODING=UTF8 options(rolap="on");
NOTE: Writing TAGSETS.SASREPORT12(EGSR) Body file: EGSR
19 data _NULL_;
20 set sbhw.SBHW_SAS;
21 file 'SBHW_text.txt' lrecl=97 ;
22 put @1 Last_Name
23 @21 First_Name
24 @36 Middle_Initial
25 @37 Social_Security_Number
26 @46 Month_of_Birth
27 @48 Day_of_Birth
28 @50 Year_of_Birth
29 @54 Father_Surname
30 @72 Age_Unit_at_death_
31 @73 Number_of_Age_Units_death
32 @75 Sex
33 @76 Race
34 @77 Marital_Status
35 @78 State_Residence
36 @80 State_Birth
37 @82 Control_ID_number
38 @92 Age_in_1991
NOTE: Variable Last_Name is uninitialized.
NOTE: Variable First_Name is uninitialized.
NOTE: Variable Middle_Initial is uninitialized.
NOTE: Variable Social_Security_Number is uninitialized.
2 The SAS System 15:31 Tuesday, August 19, 2014
NOTE: Variable Month_of_Birth is uninitialized.
NOTE: Variable Day_of_Birth is uninitialized.
NOTE: Variable Year_of_Birth is uninitialized.
NOTE: Variable Father_Surname is uninitialized.
NOTE: Variable Age_Unit_at_death_ is uninitialized.
NOTE: Variable Number_of_Age_Units_death is uninitialized.
NOTE: Variable Marital_Status is uninitialized.
NOTE: Variable State_Residence is uninitialized.
NOTE: Variable State_Birth is uninitialized.
NOTE: Variable Control_ID_number is uninitialized.
NOTE: Variable Age_in_1991 is uninitialized.
ERROR: Insufficient authorization to access C:\Windows\system32\SBHW_text.txt.
NOTE: The SAS System stopped processing this step because of errors.
NOTE: DATA statement used (Total process time):
real time 0.05 seconds
cpu time 0.00 seconds
42 %LET _CLIENTTASKLABEL=;
43 %LET _CLIENTPROJECTPATH=;
44 %LET _CLIENTPROJECTNAME=;
45 %LET _SASPROGRAMFILE=;
48 ODS _ALL_ CLOSE;
NOTE: ODS PDF(EGPDF) printed no output.
(This sometimes results from failing to place a RUN statement before the ODS PDF(EGPDF)
51 QUIT; RUN;
08-21-2014 05:01 PM
So the error messages are saying the the variable names used in the PUT statement are not in the SAS dataset that you are trying to output.
Check the names of the variables that you actually have.
For example run
PROC CONTENTS data=sbhw.SBHW_SAS; run;
It does not matter to the text file what names the variables have in your SAS data set, but it does matter to the SAS code that is trying to write it.
08-21-2014 07:11 PM
Yes, i checked and fixed those.
But now one error still shows up; which is-
Not sure what this means..
ERROR: Insufficient authorization to access C:\Windows\system32\SBHW_text.txt.
08-21-2014 09:17 PM
You might try writing the file to another folder.
From your logs it looks like you are using Enterprise Guide.
You should search for how to generate a text file in Enterprise Guide and tell it to return the file from the SAS server back to the machine that is running Enterprise Guide.
Here is an example. You basically just need to replace the PROC EXPORT in the example with the DATA step that you now have working.
08-26-2014 07:47 PM
So I used the following code and the text file was created. Now the last issue (hopefully) I am facing here is how to make sure that there are 100 positions per record.
Where can I mention that in the code about the number of positions? I counted manually the spaces in the records; they are 99. Any suggestions?
filename NDItext 'C:\Users\SBHW\SBHWtext_1.txt';
put @1 Last_Name
08-26-2014 08:20 PM
you could either force the variable to use three spaces (columns 98,99 and 100)
@98 Blank_field $3.
Or if the last variable, BLANK_FIELD, is literally three blank spaces then you do not even need that variable and instead your could just write the spaces.
@98 ' '