BookmarkSubscribeRSS Feed
☑ This topic is solved. Need further help from the community? Please sign in and ask a new question.
river1
Obsidian | Level 7

Hello

I am using SAS Enterprise Guide (Ver 8.3.8.206)  and I have imported a delimited file, which imported with no problems. However, in the column with free text there are sometimes tilde characters that are followed by text. If the tilde character was alone in the cell then it was read - but rows that had a tilde character preceding text have been ignored by SAS. Is this because SAS considers the character to be a logical not? Can I tell SAS to read these rows or do I need to data clean using different software to detect any special characters before I can use SAS? The dataset exceeds Excel's limits.

 

The data looked something like this:

EVENT_ID|FREE_TEXT|SECOND TEXT|CODE 

35434554654|~Description of free text|Second text box|23

 

Thanks for your help

1 ACCEPTED SOLUTION

Accepted Solutions
river1
Obsidian | Level 7

Hello

 

Thanks for your help, that worked perfectly. I had a date variable so I had to add a formatted informat to the code. It took a while to generate enough code to view the rows with the tilde issue. I still have a few problems with the dataset but I’ve started a new question thread.

 

data want  ;
infile "D:\file_location.txt" dlm="|" dsd firstobs=2 truncover;
length
  EVENT_ID $20. /* adapt if necessary */
  Num_1 4.
  Num_2 5.
  CHAR_1 $5.
  CHAR_2 $32767.
  CHAR_3 $32767.
  CHAR_4 $32767.
  CHAR_5 $32767.
  DATE_1 8.
  UNSURE $3.
  NUM_3 5.;
     informat DATE_1 mmddyy10.;
     format DATE_1 yymmddn8. ;

input
  EVENT_ID
  Num_1
  Num_2
  Num_1
  CHAR_1
  CHAR_2
  CHAR_3
  CHAR_4
  CHAR_5
  DATE_1
  UNSURE
  NUM_3
;
run;

View solution in original post

7 REPLIES 7
Kurt_Bremser
Super User

Try this code:

data want (compress=yes);
infile "path_to_your_file" dlm="|" dsd firstobs=2 truncover;
length
  event_id $11 /* adapt if necessary */
  free_text $32767
  second_text $32767
  code $2 /* adapt if necessary */
;
input
  event_id
  free_text
  second_text
  code
;
run;

If this does not work as intended, post the log by copy/pasting it into a window opened with this button:

Bildschirmfoto 2020-04-07 um 08.32.59.jpg

river1
Obsidian | Level 7

Hello

 

Thanks for your help, that worked perfectly. I had a date variable so I had to add a formatted informat to the code. It took a while to generate enough code to view the rows with the tilde issue. I still have a few problems with the dataset but I’ve started a new question thread.

 

data want  ;
infile "D:\file_location.txt" dlm="|" dsd firstobs=2 truncover;
length
  EVENT_ID $20. /* adapt if necessary */
  Num_1 4.
  Num_2 5.
  CHAR_1 $5.
  CHAR_2 $32767.
  CHAR_3 $32767.
  CHAR_4 $32767.
  CHAR_5 $32767.
  DATE_1 8.
  UNSURE $3.
  NUM_3 5.;
     informat DATE_1 mmddyy10.;
     format DATE_1 yymmddn8. ;

input
  EVENT_ID
  Num_1
  Num_2
  Num_1
  CHAR_1
  CHAR_2
  CHAR_3
  CHAR_4
  CHAR_5
  DATE_1
  UNSURE
  NUM_3
;
run;

river1
Obsidian | Level 7
I've added in: (compress=char) this solved the problem of the long processing time.

(compress=yes) only gave 74 rows of data
Kurt_Bremser
Super User

@river1 wrote:
I've added in: (compress=char) this solved the problem of the long processing time.

(compress=yes) only gave 74 rows of data

No difference there, CHAR and YES are the same, see COMPRESS= Data Set Option 

The COMPRESS= data set option only influences the space requirement (and, indirectly, the time needed to write it) of the resulting dataset. It does NOT have an impact on the number of observations read from the external file; this can be influenced by the TERMSTR= option of the INFILE statement.

 

If two different versions of code give results that puzzle you, post the complete logs as already mentioned.

river1
Obsidian | Level 7

My mistake compress=yes ran faster than compress=char. I think it was a problem with my computer when I ran the initial code

 

 

 

compress=yes

Log:
NOTE: DATA statement used (Total process time):
      real time           4:33.32
      cpu time            2:53.39

compress=char

log: NOTE: DATA statement used (Total process time): real time 4:49.08 cpu time 3:33.51

 

 

Tom
Super User Tom
Super User

What do you mean by "imported"? 

Did you use some GUI tool to import a file?  What code did it generate?

Did you write some code yourself?

What did the LOG look like?

 

The string you posted does not cause any trouble for SAS to read.

data test;
  infile datalines dsd dlm='|' truncover firstobs=2;
  input EVENT_ID :$11. FREE_TEXT :$40. SECOND_TEXT :$30. CODE ;
datalines;
EVENT_ID|FREE_TEXT|SECOND TEXT|CODE 
35434554654|~Description of free text|Second text box|23
;

Result

Obs     EVENT_ID              FREE_TEXT              SECOND_TEXT      CODE

 1     35434554654    ~Description of free text    Second text box     23

SAS Innovate 2025: Save the Date

 SAS Innovate 2025 is scheduled for May 6-9 in Orlando, FL. Sign up to be first to learn about the agenda and registration!

Save the date!

SAS Enterprise Guide vs. SAS Studio

What’s the difference between SAS Enterprise Guide and SAS Studio? How are they similar? Just ask SAS’ Danny Modlin.

Find more tutorials on the SAS Users YouTube channel.

SAS Training: Just a Click Away

 Ready to level-up your skills? Choose your own adventure.

Browse our catalog!

Discussion stats
  • 7 replies
  • 1526 views
  • 2 likes
  • 3 in conversation