Hi all,
I am trying to correctly reference the path to the file so that I can use the code below, however, the INFILE path appears to be a problem (for some reason invalid - the script does not return an error but file with nearly all blank values). Any suggestions? Thanks
%Macro m(country);
DATA WORK.&country._newsp2;
LENGTH
PK_COUNTRY_ID $ 2
CAMPAIGN_CD $ 8
CELL_PACKAGE_SK $ 16
BK_CONSUMER_ID 8
CONTACT_DT $ 11 ;
FORMAT
PK_COUNTRY_ID $CHAR2.
CAMPAIGN_CD $CHAR8.
CELL_PACKAGE_SK $CHAR16.
BK_CONSUMER_ID BEST9.
CONTACT_DT $CHAR11. ;
INFORMAT
PK_COUNTRY_ID $CHAR2.
CAMPAIGN_CD $CHAR8.
CELL_PACKAGE_SK $CHAR16.
BK_CONSUMER_ID BEST9.
CONTACT_DT $CHAR11. ;
INFILE "F:\Data\Baskin\Analytics\Data_Discovery\External_Files\Archive\JCH\Newsletter\Newsletter_custfiles\&COUNTRY._NEWSP2.txt"
LRECL=50
ENCODING="UTF-8"
TERMSTR=CRLF
DLM='7F'x
MISSOVER
DSD ;
INPUT
PK_COUNTRY_ID : $CHAR2.
CAMPAIGN_CD : $CHAR8.
CELL_PACKAGE_SK : $CHAR16.
BK_CONSUMER_ID : ?? BEST9.
CONTACT_DT : $CHAR11. ;
RUN;
%mend m;
%m(SK);
I uploaded that file to my SAS server, and used
data test;
infile '$HOME/sascommunity/txt_example.txt' dlm=';' dsd truncover firstobs=2;
input
PK_COUNTRY_ID : $CHAR2.
CAMPAIGN_CD : $CHAR8.
CELL_PACKAGE_SK : $CHAR16.
BK_CONSUMER_ID : ?? BEST9.
CONTACT_DT : $CHAR11.
;
run;
proc print data=test noobs;
run;
The result looks like this:
PK_COUNTRY_ CAMPAIGN_ BK_CONSUMER_ ID CD CELL_PACKAGE_SK ID CONTACT_DT BG NEWS816 BG0816ENCCS01WG 557681 2016/08/16/
The first culprit in your code was the use of hex 7F as the delimiter.
If the infile name was the problem, you'd have an ERROR in the log.
Please post the log (if you get many repeating messages, just a few of them are sufficent), and some example lines from the .txt file, so we can run the code against it for testing.
You can attach a .txt file to your post, so the orignal contents will be preserved.
I added the log and the example of manually imported data (with wizard).
From your log:
NOTE: 514553 records were read from the infile "F:\Data\Baskin\Analytics\Data_Discovery\External_Files\Archive\JCH\Newsletter\Newsletter_custfiles\SK_NEWSP2.txt". The minimum record length was 50. The maximum record length was 50. One or more lines were truncated. NOTE: The data set WORK.SK_NEWSP2 has 514553 observations and 5 variables.
it means that your code, including infile path worked fine.
If you see blanks in the result dataset, it means that your encoding is not 'UTF-8'.
Check again your delimiter, is it realy '7F'x ? or maybe it was damaged while defining 'UTF-8' ?!
Beyond try running with option truncover instead missover.
Post screenshot of your result dataset.
I have post the result data set with empty records (picture2). The script was created by wizard, I simply put a macro variable in it so that I could run it for all countries I have available.
Without the source file "as is", that is useless. It just shows symptoms, but gives us NO means to determine what goes wrong.
@Uknown_user wrote:
I have post the result data set with empty records (picture2).
I have uploaded the sample txt file.
That text file is a semicolon delimited file format text file. The datastep you have used to read it in is fixed width file format, hence why you get nothing. The simplest method would be to run proc import on the file, then look at the code generated in your log and copy that and modify to your needs. It will be something like:
data want; infile "<file>,txt" dlm=";"; format ...; informat...; input var1 $ var2 var3 $; run;
I uploaded that file to my SAS server, and used
data test;
infile '$HOME/sascommunity/txt_example.txt' dlm=';' dsd truncover firstobs=2;
input
PK_COUNTRY_ID : $CHAR2.
CAMPAIGN_CD : $CHAR8.
CELL_PACKAGE_SK : $CHAR16.
BK_CONSUMER_ID : ?? BEST9.
CONTACT_DT : $CHAR11.
;
run;
proc print data=test noobs;
run;
The result looks like this:
PK_COUNTRY_ CAMPAIGN_ BK_CONSUMER_ ID CD CELL_PACKAGE_SK ID CONTACT_DT BG NEWS816 BG0816ENCCS01WG 557681 2016/08/16/
The first culprit in your code was the use of hex 7F as the delimiter.
Thanks, it helped.
You can even run next code to check the encoding:
filename txt "F:\Data\Baskin\Analytics\Data_Discovery\External_Files\Archive\JCH\Newsletter\Newsletter_custfiles\SK_NEWSP2.txt";
data test;
infile txt LRECL=50 ENCODING="UTF-8" TERMSTR=CRLF TRUNCOVER DSD OBS=5;
input a_line $50. ;
put a_line;
put a_line $hex100.;
run;
Check - can you see the text ? can you locate the delimiters ? are they realy '7F'x ?
Try to drop the encoding, what do you get ?
Well, both the log and the example in the png look OK to me.
To illustrate where the data in the dataset differs from the data in the txt file, you would have to post the excerpt from the txt file (not as screenshot, as attachment!), and a proc print result from the dataset after import.
Sorry, that code is a total mess and virtually impossible to read. Try using lower case programming so it doesn't sound like your shouting, use indentations, etc. and post code and logs into your post by using the code window either {i} or the one next to it. That log shows there is no errors, hence your datastep import is not reading the data correctly, however without seeing the text from the file (post a couple of lines from it - again in a code window - so we can see.
Also, why do you need a macro for this? Is there any value in wrapping it in a macro just to get country as dataset name - far better to read all files into one dataset with a variable for country - then you can use built in by group processing for faster and simpler coding.
SAS Innovate 2025 is scheduled for May 6-9 in Orlando, FL. Sign up to be first to learn about the agenda and registration!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.