BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
jen123
Fluorite | Level 6

I have a .xlsx file that I have to convert to .csv to use in SAS EG.  It is my understanding that SASEG 4.3 does not allow import of .xlsx files.

There is one column in the file with data of varying lengths.  Here is a sample of the column (with is alpha-numeric).  As you can see, there are 7-digit, 8-digit and 9-digit numbers.  The cells with 4-digits are missing the leading 6 zeros.

GOC_Number

2051
2098
1554630
1994104
2510107
10351204
10351235
100026820
100026823

I tried Text-to-Column in the .xlsx file then save it as .csv.  That did not work - it did not save the column as text.

I also tried this code change the variable length to z10.  That added zeros to the cells that are 7,8 or 9 digits lone.

Any thoughts/advice?

1 ACCEPTED SOLUTION

Accepted Solutions
AncaTilea
Pyrite | Level 9

Smiley Happy

data test;

input goc_number $;

cards;

2051

2098

1554630

1994104

2510107

10351204

10351235

100026820

100026823

;

data temp;

     set test;

     length_goc=length(goc_number);

     if length_goc=4 then do;

        temp_goc = compress('000000' || goc_number);

            format temp_goc $10.;

     end;

     if temp_goc ne "" then final_goc = temp_goc;

        else final_goc = goc_number;

                    format final_goc $10.;

     run;

Is this what you'd like?

View solution in original post

13 REPLIES 13
AncaTilea
Pyrite | Level 9


Hi.

I am not familair with EG, but if possible, you could calculate the length of your variable:

var_length = length(your_var);

if var_length = 4 then do;

     format your_var z10.;

end;

So this snippet should only add the leading zeroes to the entries that are 4 digits.

Good luck!

Anca.

jen123
Fluorite | Level 6

I tried this code but got an error.  Any advice please.

data test;

     set jen1.emp_current

     primary_goc=length(primary_goc);

     if primary_goc=4 then do;

     format primary_goc z10.;

     run;

15         data test;

16              set jen1.emp_current

17              primary_goc=length('primary_goc');

                       _

                       22

                       76

ERROR 22-322: Syntax error, expecting one of the following: a name, a quoted string, (, -, :, ;, END, INDSNAME, KEY, KEYS, NOBS,

              OPEN, POINT, _DATA_, _LAST_, _NULL_. 

ERROR 76-322: Syntax error, statement will be ignored.

SASKiwi
PROC Star

Missing semicolon in line 16:

data test;

set jen1.emp_current;

primary_goc=length('primary_goc');

AncaTilea
Pyrite | Level 9

Hi Jen,

In this line of code:

primary_goc = length('primary_goc');

you are replacing the value of primary_goc with its length.

I think you should define a new variable

l_primary_goc = length(primary_goc); if l_primary_goc = 4 then....


Just a thought.

Regards,

Anca.

jen123
Fluorite | Level 6

I fixed the code and added the semi-colon.

data test;

     set jen1.emp_current;

     primary_goc=length(primary_goc);

     if primary_goc=4 then do;

     format primary_goc z10.;

     run;

I got the following error:

24              if primary_goc=4 then do;

25              format primary_goc z10.;

                               ____

                               484

NOTE 484-185: Format $Z was not found or could not be loaded.

26              run;

26              run;

               _

               117

ERROR 117-185: There was 1 unclosed DO block.

I changed z10. to $10. and that seemed to fix the issue.  However, I still got the same error on line 26.

           

AncaTilea
Pyrite | Level 9

After your format statement you need an end;

data test;

     set jen1.emp_current;

     primary_goc=length(primary_goc);

     if primary_goc=4 then do;

     format primary_goc z10.;

     end;

     run;

 

Thanks.

Anca.

jen123
Fluorite | Level 6

Hi Anca,

I ran this code:

data test;

     set jen1.emp_current;

     a_primary_goc=length(primary_goc);

     if a_primary_goc=4 then do;

     format primary_goc $10.;

     end;

     run;

Thoughts?

The results:

a_primary_goc column is = to 4 for all rows

primary_goc still in 4-digits.

AncaTilea
Pyrite | Level 9

Smiley Happy

data test;

input goc_number $;

cards;

2051

2098

1554630

1994104

2510107

10351204

10351235

100026820

100026823

;

data temp;

     set test;

     length_goc=length(goc_number);

     if length_goc=4 then do;

        temp_goc = compress('000000' || goc_number);

            format temp_goc $10.;

     end;

     if temp_goc ne "" then final_goc = temp_goc;

        else final_goc = goc_number;

                    format final_goc $10.;

     run;

Is this what you'd like?

Tom
Super User Tom
Super User

Looking at the various error messages you have posted your variable PRIMARY_GOC is a character variable and not a numeric variable.

Also you seem to be trying to treat the FORMAT statement as if it was an executable statement.  The FORMAT statements are compiled so putting them inside of an IF/THEN conditional will have no effect.

So do you want the character strings in PRIMARY_GOC that look like they might be numbers to be left padded with zeros?

primary_goc = put( input(primary_goc,10.),z10.);

jen123
Fluorite | Level 6

Tom - You are correct.  It's a character variable since it's alpha-numeric.

As I stated in the original post, the issue is that in this column there are cells of different lengths.  Some are 7, 8, 9 and 10 digits long.  The ones that are 4-digits long should be 10 digits.

If I used as you suggested, it would zeros in front of the 7,8 and 9 digit cells, which I do not want.

That data is what it is...

Tom
Super User Tom
Super User

If the issue is that you want all values that are four characters long to be prefixed with six zeros then that answer was posted above.

if length(primary_goc) = 4 then primary_goc='000000'||primary_goc;

twocanbazza
Quartz | Level 8

Why you trying to determine the length, wouldn't yuo just read the csv in delimited by ',' and define the variable in SAS as numeric 8.? and then apply the z10. format or put as char using z10.

Also if you just save teh file as CSV it should work, or try saving as 2003-2007 xls.

jen123
Fluorite | Level 6

I would prefer to work in 2003-2007 .xls, but the file has more than 65K rows. 

sas-innovate-2024.png

Join us for SAS Innovate April 16-19 at the Aria in Las Vegas. Bring the team and save big with our group pricing for a limited time only.

Pre-conference courses and tutorials are filling up fast and are always a sellout. Register today to reserve your seat.

 

Register now!

SAS Enterprise Guide vs. SAS Studio

What’s the difference between SAS Enterprise Guide and SAS Studio? How are they similar? Just ask SAS’ Danny Modlin.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 13 replies
  • 1648 views
  • 0 likes
  • 5 in conversation