I have the following two codes:
proc transpose data=rank_score out=rank_score_tr
(drop=_name_ _suffix_);
by url;
id _term_;
var N;
run;
option VALIDVARNAME=ANY;
proc transpose data=rank_score out=rank_score_tr2
(drop=_name_ _suffix_) prefix=rank_ ;
by url;
id _term_;
var rankcount;
run;The dataset include in the _term_ column combinations of maximum two letters, with the chance of having both "a" and " a" or "a ", including spaces.
The first transpose works like a charm.
The second, adding prefix, seems compressing the blanks and provide error of duplicate at BY level
I checked the documentation but no track of having this kind of compression using the prefix statement.
Anyone knowing why it is happening and how to stop the space compression?
Actually I could solve adding the prefix generating a new variable and then transposing it as ID, but I was curios to understand why the transpose via prefix was generating this kind of behavior
Can't do that as we can also have "_a" or " a" and both of them must count as separate variables
Actually I could solve adding the prefix generating a new variable and then transposing it as ID, but I was curios to understand why the transpose via prefix was generating this kind of behavior
Your second statement says you want to create one row per url and use the content of _term_ as new variables (named rank_A_A and rank_A) and the value of these variables has to be rankcount.
So,
first, since "rank_A A" is not a valid variable name, SAS inserts an underscores to match the restrictions and creates 'rank_A_A'
second, if for a same url you have more than one time the same _term_ SAS will not know which of the rankcount values it has to put as new value of the rank_ variables. That's why you get the errors in your log. Your combination of by group and id group needs to be unique.
Consider posting a simple example of the data you have and what you'd expect
- Cheers -
oh yeah I see, you use VALIDVARNAME=ANY and get the results you want.
I acknowledge your trick works.
As far as I know, this means you have to work with SAS named literals if you further want to proceed the variables.
I'd get chills just thinking about it but I haven't the big picture of what you're trying to do and it's your code in the end.
data test1;
length url $30 _term_ $10;
url='kjahskdhfkh';
_term_=' A A';
rankcount=1;
output;
url='kjahskdhfkh';
_term_='A A';
rankcount=2;
output;
url='kjahskdhfkh';
_term_='A';
rankcount=3;
output;
url='kjahskdhfkh';
_term_=' A';
rankcount=1;
output;
url='kjahskdhfkh';
_term_='_A';
rankcount=1;
output;
run;
data test1;
   set test1;
   _term_=cat('_',_term_);
run;
option VALIDVARNAME=ANY;
proc transpose data=test1 out=test2
(drop=_name_ _suffix_) prefix=rank_ ;
by url;
id _term_;
var rankcount;
run;
data test3;
   set test2;
   'rank__ A A'n='rank__ A A'n+100;
run;
- Cheers -
It's finally time to hack! Remember to visit the SAS Hacker's Hub regularly for news and updates.
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.
