BookmarkSubscribeRSS Feed
SASSOSmak
Calcite | Level 5

Hey Tom, I revised your solution to fit my needs as seen below. It seems to create the output, but I still get an error message. Any guidance would be appreciated. Note the "+(date)" line in the log on line 100. Confused as to why that is populating.

 
%macro help(date);
data test;
set &date.;

	%do i=1 %to 78;
	
	call symputx('typ',vtype(var&i.));
	
	%if N=&typ %then %do;
		new_var&i.=put(var&i.,$100.);
	%end;
	%else %do;
		new_var&i.=var&i.;
	%end;
	%end;

run;
%mend;
%help(data_20200727);

 

 

Log:

1 %studio_hide_wrapper;
82 %macro help(date);
83 data test;
84 set &date.;
85
86 %do i=1 %to 78;
87
88 call symputx('typ',vtype(var&i.));
89
90 %if N=&typ %then %do;
91 new_var&i.=put(var&i.,$100.);
92 %end;
93 %else %do;
94 new_var&i.=var&i.;
95 %end;
96 %end;
97
98 run;
99 %mend;
100 %help(data_20200727);
100 +(date);
-
22
76
ERROR 22-322: Syntax error, expecting one of the following: a name, a quoted string, /, ;, _DATA_, _LAST_, _NULL_.
ERROR 76-322: Syntax error, statement will be ignored.
NOTE: The SAS System stopped processing this step because of errors.
NOTE: DATA statement used (Total process time):
real time 0.00 seconds
cpu time 0.00 seconds
11 REPLIES 11
Tom
Super User Tom
Super User

 

The macro processor runs and passes the generated text onto SAS to interpret as actual SAS code (as opposed to macro instructions). 

So the %IF statement is trying to reference a macro variable, TYP, that you have not set a value to yet.  And if it did have a value it would be the same for every iteration of the %DO loop.

 

What you you trying to do?

SASSOSmak
Calcite | Level 5

General Purpose: I need all the variable types in my dataset to be character.

 

Therefore, I'm trying to creating a new character variable for every variable in my dataset, and then delete the old ones.

 

Original variables: var1, var, var3...varn (in this specific case, varn=var78);

New Variables: new_var1, new_var2, new_var3... new_varn

 

i.e. If var1 is numeric then new_var1=put(var1,$100.);

i.e. If var2 is character then new_var2=var2;

 

Afterwards all variables with prefix "var" would be deleted and only the "new_" variables would be retained. Hope this made sense.

Reeza
Super User
data want;
set have;

array _num_vars(*)  var1-_numeric_-var78;
array _char_vars(*) new_var1-new-var78;

do i=1 to dim(_num_vars);
_char_vars =  putn(_num_vars(i), vformat(_num_vars(i)));
end;

drop var1-var78;
run;

Do you really need macro logic here?

Something like above will work better it's just a matter of dynamically generating the names for the array, if needed.

ballardw
Super User

@SASSOSmak wrote:

General Purpose: I need all the variable types in my dataset to be character.

 

Quite often this points to an improper approach to reading data into SAS in the first place. Read the data as needed is quite often much easier than "fixing" it later.

 

So, how did you read the data into SAS?

SASSOSmak
Calcite | Level 5
proc import out = data_&date.
datafile = "/location/&date..txt"
dbms = DLM replace;
delimiter = "|";
getnames = NO;
guessingrows=max;
run;

%mend;
%imp(20200727);

I am using a macro because later I will be looping the macro so it imports multiple datasets. I think the "reading"/importing of that data is fine. I cannot use infile to specify variable type or length because it is dynamic and every dataset is different.

Tom
Super User Tom
Super User

I don't understand.  Why did you use PROC IMPORT if you want to make character variables and the files don't even have header rows to help with understanding what names to use?

If you know the files will never have more than 100 variables just use something like:

data data_&date ;
  infile  "/location/&date..txt" dsd dlm='|' truncover ;
  input (var1-var100) (:$100.);
run;

You can then analyze the file your self and figure out how many variables it actually had.

 

Of just read it into a "tall" structure.

data data_&date ;
  infile  "/location/&date..txt" dsd dlm='|' truncover length=ll column=cc;
  row+1;
  do col=1 by 1 until (cc>ll);
     input value :$100. @;
     output;
  end;
  input;
run;
SASSOSmak
Calcite | Level 5

That's a very helpful solution, thank you. The only issue is that there technically isn't an upper limit on the amount of variables it could be but I suppose I can make the amount very large to the point it hopefully would never be reached.

ballardw
Super User

@SASSOSmak wrote:

That's a very helpful solution, thank you. The only issue is that there technically isn't an upper limit on the amount of variables it could be but I suppose I can make the amount very large to the point it hopefully would never be reached.


This is at a point where I would question why am I processing data that files are so different and not documented.

 

If someone is getting paid to make these files you read then they are overpaid, IMHO, as they a passing their work off to someone else. If it is programmed process then the process likely has a number of other questionable behaviors that you haven't discovered yet.

 

If the use of these files are going to require combining them in the future I see many headaches coming. Such as same named variables attached to different sorts of values which is going to make this 'conversion' a trivial exercise. Or finding what the other variables are that should have the same names for combining.

 

If there is any date or time related data you will next be wanting that to be converted to SAS numeric date, time or datetime values to work with them.

Reeza
Super User
IME the work of understanding the file formats and structures is something you have to do at some point. You can choose to try and code your way out of it, which is more fun as a programmer but for time efficiency its definitely faster and more efficient to do the leg work up front to understand and document the data. Then this makes everything else after faster which is why it's worth the work up front.

Tom
Super User Tom
Super User

@SASSOSmak wrote:

That's a very helpful solution, thank you. The only issue is that there technically isn't an upper limit on the amount of variables it could be but I suppose I can make the amount very large to the point it hopefully would never be reached.


See update to read into a row/col/value tall format instead. 

If you have a LOT of variables then make sure to include a large LRECL setting on the INFILE statement.

Tom
Super User Tom
Super User

Here are a couple of ways to convert all variables to character.

 

One is use the VVALUE() function. Example:

data want;
  set have;
  length new1 new2 $100;
  new1=vvalue(old1);
  new2=vvalue(old2);  
run;

To generate that type of code you just need the list of variable names. Easy to get from PROC CONTENTS, DICTIONARY.COLUMNS in SQL or SASHELP.VCOLUMN view to the same informat, or even just PROC TRANSPOSE with obs=0 dataset option.

 

Another method is to dump the data to a text file and read it back into your new structure.

You will still probably need to know the names you want to use for variables.

filename dump temp;
data _null_;
  set have ;
  file dump dsd lrecl=2000000 ;
  put (_all_) (+0);
run;

data want;
  length var1 var2 ..... $100 ;
  infile dump dsd truncover lrecl=2000000;
  input (_all_) (+0);
run;

sas-innovate-2024.png

Available on demand!

Missed SAS Innovate Las Vegas? Watch all the action for free! View the keynotes, general sessions and 22 breakouts on demand.

 

Register now!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 11 replies
  • 1093 views
  • 11 likes
  • 4 in conversation