BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
Uknown_user
Quartz | Level 8

Hi all,

 

I am trying to correctly reference the path to the file so that I can use the code below, however, the INFILE path appears to be a problem (for some reason invalid - the script does not return an error but file with nearly all blank values). Any suggestions? Thanks

 

%Macro m(country);

DATA WORK.&country._newsp2;

LENGTH

PK_COUNTRY_ID $ 2

CAMPAIGN_CD $ 8

CELL_PACKAGE_SK $ 16

BK_CONSUMER_ID 8

CONTACT_DT $ 11 ;

FORMAT

PK_COUNTRY_ID $CHAR2.

CAMPAIGN_CD $CHAR8.

CELL_PACKAGE_SK $CHAR16.

BK_CONSUMER_ID BEST9.

CONTACT_DT $CHAR11. ;

INFORMAT

PK_COUNTRY_ID $CHAR2.

CAMPAIGN_CD $CHAR8.

CELL_PACKAGE_SK $CHAR16.

BK_CONSUMER_ID BEST9.

CONTACT_DT $CHAR11. ;

INFILE "F:\Data\Baskin\Analytics\Data_Discovery\External_Files\Archive\JCH\Newsletter\Newsletter_custfiles\&COUNTRY._NEWSP2.txt"

LRECL=50

ENCODING="UTF-8"

TERMSTR=CRLF

DLM='7F'x

MISSOVER

DSD ;

INPUT

PK_COUNTRY_ID : $CHAR2.

CAMPAIGN_CD : $CHAR8.

CELL_PACKAGE_SK : $CHAR16.

BK_CONSUMER_ID : ?? BEST9.

CONTACT_DT : $CHAR11. ;

RUN;

%mend m;

%m(SK);

 


Picture.pngPicture2.png
1 ACCEPTED SOLUTION

Accepted Solutions
Kurt_Bremser
Super User

I uploaded that file to my SAS server, and used

data test;
infile '$HOME/sascommunity/txt_example.txt' dlm=';' dsd truncover firstobs=2;
input
  PK_COUNTRY_ID : $CHAR2.
  CAMPAIGN_CD : $CHAR8.
  CELL_PACKAGE_SK : $CHAR16.
  BK_CONSUMER_ID : ?? BEST9.
  CONTACT_DT : $CHAR11.
;
run;

proc print data=test noobs;
run;

The result looks like this:

PK_COUNTRY_    CAMPAIGN_                       BK_CONSUMER_
    ID            CD        CELL_PACKAGE_SK         ID         CONTACT_DT

    BG          NEWS816     BG0816ENCCS01WG       557681       2016/08/16/

The first culprit in your code was the use of hex 7F as the delimiter.

View solution in original post

12 REPLIES 12
Kurt_Bremser
Super User

If the infile name was the problem, you'd have an ERROR in the log.

Please post the log (if you get many repeating messages, just a few of them are sufficent), and some example lines from the .txt file, so we can run the code against it for testing.

You can attach a .txt file to your post, so the orignal contents will be preserved.

Uknown_user
Quartz | Level 8

I added the log and the example of manually imported data (with wizard).

Shmuel
Garnet | Level 18

From your log:

NOTE: 514553 records were read from the infile 
      "F:\Data\Baskin\Analytics\Data_Discovery\External_Files\Archive\JCH\Newsletter\Newsletter_custfiles\SK_NEWSP2.txt".
      The minimum record length was 50.
      The maximum record length was 50.
      One or more lines were truncated.
NOTE: The data set WORK.SK_NEWSP2 has 514553 observations and 5 variables.

it means that your code, including infile path worked fine.

 

If you see blanks in the result dataset, it means that your encoding is not 'UTF-8'.

Check again your delimiter, is it realy '7F'x ? or maybe it was damaged while defining 'UTF-8' ?!

Beyond try running with option truncover instead missover.

 

Post screenshot of your result dataset.

Uknown_user
Quartz | Level 8

I have post the result data set with empty records (picture2). The script was created by wizard, I simply put a macro variable in it so that I could run it for all countries I have available.

Kurt_Bremser
Super User

Without the source file "as is", that is useless. It just shows symptoms, but gives us NO means to determine what goes wrong.


@Uknown_user wrote:

I have post the result data set with empty records (picture2).


 

Uknown_user
Quartz | Level 8

I have uploaded the sample txt file.

RW9
Diamond | Level 26 RW9
Diamond | Level 26

That text file is a semicolon delimited file format text file.  The datastep you have used to read it in is fixed width file format, hence why you get nothing.  The simplest method would be to run proc import on the file, then look at the code generated in your log and copy that and modify to your needs.  It will be something like:

data want;
  infile "<file>,txt" dlm=";";
  format ...;
  informat...;
  input  var1 $
         var2 
         var3 $;
run;
Kurt_Bremser
Super User

I uploaded that file to my SAS server, and used

data test;
infile '$HOME/sascommunity/txt_example.txt' dlm=';' dsd truncover firstobs=2;
input
  PK_COUNTRY_ID : $CHAR2.
  CAMPAIGN_CD : $CHAR8.
  CELL_PACKAGE_SK : $CHAR16.
  BK_CONSUMER_ID : ?? BEST9.
  CONTACT_DT : $CHAR11.
;
run;

proc print data=test noobs;
run;

The result looks like this:

PK_COUNTRY_    CAMPAIGN_                       BK_CONSUMER_
    ID            CD        CELL_PACKAGE_SK         ID         CONTACT_DT

    BG          NEWS816     BG0816ENCCS01WG       557681       2016/08/16/

The first culprit in your code was the use of hex 7F as the delimiter.

Uknown_user
Quartz | Level 8

Thanks, it helped.

Shmuel
Garnet | Level 18

You can even run next code to check the encoding:

 

filename txt  "F:\Data\Baskin\Analytics\Data_Discovery\External_Files\Archive\JCH\Newsletter\Newsletter_custfiles\SK_NEWSP2.txt";

data test;

         infile txt   LRECL=50  ENCODING="UTF-8" TERMSTR=CRLF  TRUNCOVER DSD  OBS=5;

         input a_line $50. ;

         put a_line;

         put a_line  $hex100.;

run;

 

Check - can you see the text ? can you locate the delimiters ? are they realy '7F'x ?

Try to drop the encoding, what do you get ?

 

 

Kurt_Bremser
Super User

Well, both the log and the example in the png look OK to me.

To illustrate where the data in the dataset differs from the data in the txt file, you would have to post the excerpt from the txt file (not as screenshot, as attachment!), and a proc print result from the dataset after import.

RW9
Diamond | Level 26 RW9
Diamond | Level 26

Sorry, that code is a total mess and virtually impossible to read.  Try using lower case programming so it doesn't sound like your shouting, use indentations, etc. and post code and logs into your post by using the code window either {i} or the one next to it.  That log shows there is no errors, hence your datastep import is not reading the data correctly, however without seeing the text from the file (post a couple of lines from it - again in a code window - so we can see.

Also, why do you need a macro for this?  Is there any value in wrapping it in a macro just to get country as dataset name - far better to read all files into one dataset with a variable for country - then you can use built in by group processing for faster and simpler coding.

sas-innovate-2024.png

Join us for SAS Innovate April 16-19 at the Aria in Las Vegas. Bring the team and save big with our group pricing for a limited time only.

Pre-conference courses and tutorials are filling up fast and are always a sellout. Register today to reserve your seat.

 

Register now!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 12 replies
  • 1290 views
  • 1 like
  • 4 in conversation