sassavy
Calcite | Level 5

I tried to read a file containing accented French words using the Unicode version of SAS. Some of the values are not read correctly; in particular, the data row for August is not read correctly. Please help.

****** Data in my text  file (CA_French_Month.txt) ************

janv|janvier|January

févr|février|February

déc |décembre |December

août | août | August

mars|mars|March

avril|avril|April

mai|mai|May

juin|juin|June

juil|juillet|July

sept|septembre|September

oct|octobre|October

nov|novembre|November

déc |décembre |December

************* My sas code ************

data frch_desc;
  length frc_desc $20 frc_abbr $30 mth_desc $20;
  infile "c:\CA_French_Month.txt" dlm="|" truncover;
  input
    frc_abbr $
    frc_desc $
    mth_desc $
  ;
run;

proc print;
run;

**************output  ****************

 


                                     The SAS System           18:04 Monday, April 15, 2013   1

                       Obs    frc_desc     frc_abbr                mth_desc

                         1    janvier      janv                    January
                         2    février      févr                    February
                         3    décembre     déc                     December
                         4                 août | août | August
                         5    mars         mars                    March
                         6    avril        avril                   April
                         7    mai          mai                     May
                         8    juin         juin                    June
                         9    juillet      juil                    July
                        10    septembre    sept                    September
                        11    octobre      oct                     October
                        12    novembre     nov                     November
                        13    décembre     déc                     Decembe

3 REPLIES 3
PGStats
Opal | Level 21

If I cut and paste from your posting, everything reads perfectly (I'm using version 9.3 on Windows 7):

                                   The SAS System   21:10 Monday, April 15, 2013   1

                     Obs    frc_desc     frc_abbr    mth_desc

                       1    janvier       janv       January
                       2    février       févr       February
                       3    décembre      déc        December
                       4    août          août       August
                       5    mars          mars       March
                       6    avril         avril      April
                       7    mai           mai        May
                       8    juin          juin       June
                       9    juillet       juil       July
                      10    septembre     sept       September
                      11    octobre       oct        October
                      12    novembre      nov        November
                      13    décembre      déc        December

If I convert the file from ANSI to UTF-8 with Notepad++ and read it again, I get the message

NOTE: A byte order mark in the file "*********************
      ****************\CA_French_Month.txt" (for fileref "#LN00011") indicates
      that the data is encoded in "utf-8".  This encoding will be used to process
      the file.

and the file reads flawlessly.
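
If converting the file is not convenient, a possible alternative is to declare the file's encoding on the INFILE statement so that SAS transcodes each record itself. The following is only a minimal sketch, not something tested in this thread. It assumes the file is Windows Latin-1 (what Notepad++ calls ANSI on a Western European system) and that the SAS session runs in UTF-8; the path and variable names are taken from the original post. Without such a hint, a UTF-8 session may treat the single byte of an accented letter as the start of a multi-byte character, which might be how a neighbouring delimiter gets swallowed.

```sas
/* Sketch only: assumes a Windows Latin-1 ("ANSI") file and a UTF-8 session. */
data frch_desc;
  /* Lengths are in bytes; in UTF-8 each accented letter here takes 2 bytes. */
  length frc_desc $20 frc_abbr $30 mth_desc $20;
  /* ENCODING= states how the external file is encoded, so SAS can          */
  /* transcode it to the session encoding while reading.                    */
  infile "c:\CA_French_Month.txt" dlm="|" truncover encoding="wlatin1";
  input frc_abbr $ frc_desc $ mth_desc $;
run;

proc print data=frch_desc;
run;
```

If the file turns out to use a different code page, the ENCODING= value would need to change accordingly.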

Try posting your txt file.

PG
Andre
Obsidian | Level 7

Sassavy,

I tried this too, of course, using a French SAS 9.3.2 session started in UTF-8 mode on 32-bit Windows 7 with a French locale,

reading a file saved from a text editor (with your French source written correctly in the local encoding).

With

infile "d:\CA_French_Month.txt" dlm= "|" truncover  encoding="pcoem850";

2013-04-16 12_06_40-SAS.png

Everything inside SAS seems to work, but are the accented characters really UTF-8?

After copying from the SAS output window, UltraEdit shows me the equivalent of what is visible there:

                            1     janvier       janv       January
                            2     fÚvrier      fÚvr      February
                            3     dÚcembre     dÚc       December
                            4     ao¹t         ao¹t      August
                            5     mars          mars       March
                            6     avril         avril      April
                            7     mai           mai        May
                            8     juin          juin       June
                            9     juillet       juil       July
                           10     septembre     sept       September
                           11     octobre       oct        October
                           12     novembre      nov        November
                           13     dÚcembre     dÚc       December

But I run into a problem similar to yours with August

if I use your code or this one:

infile "d:\CA_French_Month.txt" dlm= "|" truncover  encoding="ansi";

2013-04-16 12_09_56-SAS.png

In UltraEdit I then see, after a copy:

                       1     janvier      janv                January
                        2     février      févr                February
                        3     décembre     déc                 December
                        4                  août|août|August
                        5     mars         mars                March
                        6     avril        avril               April
                        7     mai          mai                 May
                        8     juin         juin                June
                        9     juillet      juil                July
                       10     septembre    sept                September
                       11     octobre      oct                 October
                       12     novembre     nov                 November
                       13     December     déc|décembre

The original data had no blanks, like this:

2013-04-16 12_14_34-D__CA_French_Month.png

sassavy
Calcite | Level 5

PGStats,

     I am using SAS 9.2, and here is the text file I used.

Some more details: this file is in the default (ANSI) encoding. I read this file and eventually load the French words into an Oracle database (11g). I am still curious why all the other rows are read correctly and only August is not.

Any suggestions would help me tremendously.

Thanks
