Hi,
I get warning messages in log when I run the following codes with the data:
dataset- have, can be simplified as below:
CompanyName | Date | TableName | Year | var | data |
ALABAMA & VICKSBURG RY | 6026 | Assets- | 1914 | Road and equipment | 5367994 |
ALABAMA & VICKSBURG RY | 6026 | Assets- | 1914 | Invest, in affil. cos | 34700 |
ALABAMA & VICKSBURG RY | 6026 | Assets- | 1914 | Misc. physical property | 1580 |
ALABAMA & VICKSBURG RY | 6026 | Assets- | 1914 | Other Investments | 50 |
ALABAMA & VICKSBURG RY | 6026 | Assets- | 1914 | Cash | 754253 |
ALABAMA & VICKSBURG RY | 6026 | Assets- | 1914 | Marketable securities | 4247 |
proc sort data=have;
by CompanyName TableName Date Year;
run;
proc transpose data=have out=want (drop=_NAME_);
by CompanyName TableName Date Year;
var data;
id var;
run;
The warning message is as follow:
WARNING: Variable Cash already exists on file WORK.WANT.
WARNING: Variable Total_Income already exists on file WORK.WANT
WARNING: Variable Net_earnings already exists on file WORK.WANT
WARNING: Variable Other_income already exists on file WORK.WANT.
.
.
.
.
.
.
.
.
From one of SAS support document, I have read that there should be no duplicate values in id variable within a by group.
and I could take care of the the following error message:
ERROR: The ID value "PassengerEarnings" occurs twice in the same BY group.
However, I couldn't figure it out why SAS shows me the warning message above.
Any comment or suggestion will be greatly appreciated.
Thanks,
Minsoo
You need to standardize your variable names. The easiest is to use UPCASE or lowcase function.
You also need to check if you have multiple records for the same variable.
The extra data step below will do both.
proc sort data=have1;
by companyname tablename date year;
run;
data have2 ;
set have1;
by companyname tablename date year;
if first.year then recno=0;
recno + 1;
var = lowcase(var);
run;
proc transpose data=have2 out=want (drop=_name_);
by companyname tablename date year recno;
var data;
id var;
run;
Have you carefully looked at your have dataset? I ran the following without any error:
data have;
informat CompanyName TableName var $30.;
input CompanyName & Date TableName Year var & data;
cards;
ALABAMA & VICKSBURG RY 6026 Assets- 1914 Road and equipment 5367994
ALABAMA & VICKSBURG RY 6026 Assets- 1914 Invest, in affil. cos 34700
ALABAMA & VICKSBURG RY 6026 Assets- 1914 Misc. physical property 1580
ALABAMA & VICKSBURG RY 6026 Assets- 1914 Other Investments 50
ALABAMA & VICKSBURG RY 6026 Assets- 1914 Cash 754253
ALABAMA & VICKSBURG RY 6026 Assets- 1914 Marketable securities 4247
;
proc sort data=have;
by CompanyName TableName Date Year;
run;
proc transpose data=have out=want (drop=_NAME_);
by CompanyName TableName Date Year;
var data;
id var;
run;
Hi Art,
The have dataset I have posted is the first few lines of the dataset, and yes, it works fine with it, however it keeps making the warning message if I run it with the entire dataset.
The following link shows you the dataset.
And here is the code that I have used for this dataset:
PROC IMPORT OUT= WORK.have
DATAFILE= "C:\Users\.......\1916_b_Macro.xls"
DBMS=EXCEL REPLACE;
SHEET="Sheet1$";
GETNAMES=NO;
MIXED=YES;
SCANTEXT=YES;
USEDATE=YES;
SCANTIME=YES;
RUN;
data have1;
set have;
array vars(100) $200.;
array vars_in(100) $200. f1-f100;
retain vars;
if compress(f1,,'kfd') eq "CompanyName" then do;
call missing(of vars(*));
do i=1 to 100;
if compress(vars_in(i),,'kfd') ne "" then
vars(i)=vars_in(i);
end;
end;
else if compress(f1,,'kf') ne "" then do;
CompanyName=vars_in(1);
Date=vars_in(2);
TableName=vars_in(3);
Year=vars_in(4);
do i=5 to 100;
if compress(vars_in(i),,'kd') ne "" then do;
var=vars(i);
data=vars_in(i);
output;
end;
end;
end;
run;
proc sort data=have1;
by CompanyName TableName Date Year;
run;
proc transpose data=have1 out=want (drop=_NAME_);
by CompanyName TableName Date Year;
var data;
id var;
run;
Thanks,
Minsoo
How do you want file want to appear (just one or two records would suffice)?
I need want file to have CompnayName, TalbeName, Date, Year on the first 4 columns and then have various variables such as Roda_and_equipemnt, etc. on the following columns as below:
CompanyName TableName Date Year Road_and_equipment Invest__in_affil__cos ... ..... ....
The previous code works fine except for the variables shown in the warning messages.
Thanks,
Minsoo
Looks like you have two values of VAR that SAS wants to convert to the same variable name. Such as 'CASH' and 'cash', 'Net_earnings' and 'Net_Earnings'.
Do a proc freq on the variable VAR and see what you have.
Hi Tom,
Yes, your comment seems to be the reason why it shows me the warning message.
It also makes sense that SAS doesn't make warning message when the case sentitive variable names are exactly the same, however, it gives an warning message when the variable has the same spell but different case for the variable names.
Is there any way I can go around this warning message except chaning the variable names into another one?
Thanks,
Minsoo
You can correct it with on line changed to your datastep, namely (about 5 rows up from the bottom of the datastep)):
var=upcase(vars(i));
You need to standardize your variable names. The easiest is to use UPCASE or lowcase function.
You also need to check if you have multiple records for the same variable.
The extra data step below will do both.
proc sort data=have1;
by companyname tablename date year;
run;
data have2 ;
set have1;
by companyname tablename date year;
if first.year then recno=0;
recno + 1;
var = lowcase(var);
run;
proc transpose data=have2 out=want (drop=_name_);
by companyname tablename date year recno;
var data;
id var;
run;
Tom and I seem to agree on the solution. One thing I wanted to mention though. If you also get rid of the unnecessary data in your datastep, the sort and all will run much, much faster. I ran it with the following code:
PROC IMPORT OUT= WORK.have
DATAFILE= "C:\art\1916_b_Macro.xls"
DBMS=EXCEL REPLACE;
SHEET="Sheet1$";
GETNAMES=NO;
MIXED=YES;
SCANTEXT=YES;
USEDATE=YES;
SCANTIME=YES;
RUN;
data have1 (keep=CompanyName TableName Date Year var data);
set have;
array vars(100) $200.;
array vars_in(100) $200. f1-f100;
retain vars;
if compress(f1,,'kfd') eq "CompanyName" then do;
call missing(of vars(*));
do i=1 to 100;
if compress(vars_in(i),,'kfd') ne "" then
vars(i)=vars_in(i);
end;
end;
else if compress(f1,,'kf') ne "" then do;
CompanyName=vars_in(1);
Date=vars_in(2);
TableName=vars_in(3);
Year=vars_in(4);
do i=5 to 100;
if compress(vars_in(i),,'kd') ne "" then do;
var=upcase(vars(i));
data=vars_in(i);
output;
end;
end;
end;
run;
proc sort data=have1;
by CompanyName TableName Date Year;
run;
proc transpose data=have1 out=want (drop=_NAME_);
by CompanyName TableName Date Year;
var var data;
id var;
run;
Great!
You make it happen!
Thank you very much for your solutions, Tom and Art!
Minsoo
Just as what you found:
ERROR: The ID value "PassengerEarnings" occurs twice in the same BY group.
Why not add a prefix for your id variable in proc transpose to avoid this error?
proc transpose data=have out=want (drop=_NAME_) prefix=_;
by CompanyName TableName Date Year;
var data;
id var;
run;
Ksharp
Are you ready for the spotlight? We're accepting content ideas for SAS Innovate 2025 to be held May 6-9 in Orlando, FL. The call is open until September 25. Read more here about why you should contribute and what is in it for you!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.