BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
Dingdang
Fluorite | Level 6

Hi,

i finanlly read an excel table with 5000 columns and 100 rows into SAS Smiley Happy now the task is to correct the variable types, informats and formats, since SAS probably didnt get them all right. To this purpose, i have another excel table, say informationtable, documenting the right types and formates etc. (you can image a table with 2 colums and 5000 rows, the first column lists the variables names and the second column contains statements like 'long text', 'number without decimal places', 'procent with two decimal places', 'dates' and so on).

now i would like to correct the data with the values from the informationtable. I dont know how to do it with such a big number of variables that i am having. I am thinking about joining the tables together and using 'where'. can somehow help me with the code?

thanks very much.

BR  Dingdang

1 ACCEPTED SOLUTION

Accepted Solutions
Tom
Super User Tom
Super User

Now that you have the list of variables you can use them to standardize your data. The problem with your data step is that you never referenced the data you wanted it to use, hence the uninitialized error message.  Also FORMAT is for how you want the data displayed and is really NOT what you want to use for these types of long character variables.  Normally for character variables you will get better results in SAS if you do NOT attach any permanent format to them. You can use the FORMAT statement to remove any formats that PROC IMPORT might have mistakenly attached.

To recreate the data with the new variable definitions use the same name in the DATA and SET statements.

data libl.have_kopie;

  length &textvar $300 ;

  set libl.have_kopie;

  INFORMAT &textvar ;

  FORMAT &textvar ;

RUN;

View solution in original post

14 REPLIES 14
Dingdang
Fluorite | Level 6

out of my naive SAS knowledge I did the following: I saved the variables with property 'long text' into a macro variable :textvar and tried to use it in a data step:

DATA libl.have_kopie;  /*this is the dataset, whose variable formats i want to correct.

INFORMAT &textvar $300.;

FORMAT &textvar $300.;

RUN;

this didnt work. I guess i am not supposed to use a macro variable this way.

Can someone help me correct the code? or maybe is this way of solving the problem a nonsence?

BR  Dingdang

PaigeMiller
Diamond | Level 26

Without more details about what the error was, and how you created &textvar, we really can't help.

There is no general problem using &textvar as you have done.

Do you really want 5000 variables to each have informat and format identically equal to $300. ?? That would take up a huge amount of storage space.

--
Paige Miller
Dingdang
Fluorite | Level 6

this is how i created &textvar:

PROC SQL;

SELECT Field    / *this is the column that contains all the variablenames from Inforamtiontable*/

INTO :textvar SEPARATED BY ''

FROM libl.Informationtable

WHERE Formats_SAS="Text";    /*Formats_SAS is the second column in informationtable, it contains the property like 'Text' 'Dates' etc. */

QUIT;

I ran my code before and in the log stands:

DATA libl.have_kopie;

INFORMAT &textvar $300.;

ERROR: The Variable        RTF00_0080_040RTF00_0120_040RTF00_0240_040RTF00_0260_040RTF00_0280_040RTF01_0100_030RTF01_0        110_030RTF01_0120_030RTF01_0130_030RTF01_0140_030RTF01_0150_030RTF01_0160_030RTF01_0170_030        RTF01_0180_030RTF01_0190_030RTF01_0100_050RTF01_0110_050RTF01_0120_050R has more than 32 characters.

So I am guessing I cannot use a macro variable like this.

I have about 1000 variables with the property 'Text' and unfortunately there are some with really long texts in them. I will later identify those ones and leave them more space and the others less so I dont waste so much storage space. But thanks for the hint.

BR  Dingdang

Tom
Super User Tom
Super User

You need to put spaces between the individual variable names.  It looks like perhaps you put underscores instead and so your 500 variable names became one variable name that was way too long.

Dingdang
Fluorite | Level 6

hi Tom,

that was a good point. I now put a space into the statement :SEPARATED BY ' '.

Then I ran the code again:

DATA libl.have_kopie;  /*this is the dataset, whose variable formats i want to correct.

INFORMAT &textvar $300.;

FORMAT &textvar $300.;

RUN;

and I got the note: the variable... is not initialised.

and the result is that libl.have_kopie turned into a dataset with only one (empty) observation (so all my original 100 observations are gone...)  and the variales from &textvar.

BR  Dingdang

Tom
Super User Tom
Super User

Now that you have the list of variables you can use them to standardize your data. The problem with your data step is that you never referenced the data you wanted it to use, hence the uninitialized error message.  Also FORMAT is for how you want the data displayed and is really NOT what you want to use for these types of long character variables.  Normally for character variables you will get better results in SAS if you do NOT attach any permanent format to them. You can use the FORMAT statement to remove any formats that PROC IMPORT might have mistakenly attached.

To recreate the data with the new variable definitions use the same name in the DATA and SET statements.

data libl.have_kopie;

  length &textvar $300 ;

  set libl.have_kopie;

  INFORMAT &textvar ;

  FORMAT &textvar ;

RUN;

Dingdang
Fluorite | Level 6

hi Tom,

your code solves the problem of this post. Thanks very much Smiley Happy

I will try other suggestions as well with regards to importing data.

BR  Dingdang

data_null__
Jade | Level 19

I would not want PROC IMPORT doing that much guessing fore me.  If you know the attributes of the fields it would seem easier to me to have XL create a flat file "CSV" perhaps and write your own data step to read the file.  You can define the attributes and get them right with no need to fix it the mess make by PROC IMPORT.

Dingdang
Fluorite | Level 6

hi data_null_,

because the number of variables I have (5000) I was hoping there is another way around the 'typing in all the informats and formating myself'. But if there is no better way, guess i will have to do that.

BR  Dingdang

PaigeMiller
Diamond | Level 26

because the number of variables I have (5000) I was hoping there is another way around the 'typing in all the informats and formating myself'. But if there is no better way, guess i will have to do that.

But of course there are better ways.

First, as Tom points out, you didn't separate the variable names. Your coding error is here:

PROC SQL;

SELECT Field    / *this is the column that contains all the variablenames from Inforamtiontable*/

INTO :textvar SEPARATED BY ''    <------ ERROR ERROR ERROR

FROM libl.Informationtable

WHERE Formats_SAS="Text";    /*Formats_SAS is the second column in informationtable, it contains the property like 'Text' 'Dates' etc. */

QUIT;

It should say SEPARATED BY ' ' — note should be a space between the two single quotes

As data _null_ has pointed out, there may still be even smarter ways to handle this, maybe even using a CSV file.

As I point out, there may still be even smarter ways to handle this, do you really want to have 5000 columns with the same INFORMAT and FORMAT? Your initial problem statement seemed to indicate otherwise.

--
Paige Miller
Dingdang
Fluorite | Level 6

hi PaigeMiller, data _null_

I think I didnt quite understand what you meant by using a CSV file.

Actually I dont need all the variables with attribute 'long text' to have the same Format and Informat. I just want SAS to read my data correctly. I saved my original excel table into a .txt file with tab and imported into SAS with Proc import. But because SAS only reads the first 20 observations and guesses the format there must be some variables that got the wrong format. so I thought i could correct this with my informationtable to avoid typing in the formats and informats.

maybe this was not a very smart way to solve my problem. Any idea how i can make this easier and better?

thank you so much!

BR  Dingdang


data_null__
Jade | Level 19

Maybe all you need to do is adjust the GUESSINGROWS parameter.

Dingdang
Fluorite | Level 6

hi data_null_

in my case i will change the guessingrows to 100. I dont really have much experience with SAS. If I do this, SAS will make the right guessing?

data_null__
Jade | Level 19

I don't know if it will guess right but there will be more data. RTM.

sas-innovate-2024.png

Join us for SAS Innovate April 16-19 at the Aria in Las Vegas. Bring the team and save big with our group pricing for a limited time only.

Pre-conference courses and tutorials are filling up fast and are always a sellout. Register today to reserve your seat.

 

Register now!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 14 replies
  • 3210 views
  • 7 likes
  • 4 in conversation