Solved: Rename a numeric column after transpose

CathyVI · Posted 04-16-2024 08:50 AM

Hi,

I transposed a data and i got a numeric column data that looks like this below:

For my next data step, I tried to keep them in a data step but it didn't work. Here is my idea

data test2 (keep=_2023 _2024);
set test1;
run;

Apparently SAS did not recognize them as _2023. I want to rename them as Y2023 or rename them as _2023 so I can use them for % calculation but again SAS is not recognizing them as a variable i think because they are just like a number.

Sorry I don't have a sample code because the one i created also did not work except I put Y infront of it. e.g., Y2023.

Please any help will be appreciated.

Reeza · Posted 04-16-2024 11:01 AM

Assuming you used PROC TRANSPOSE, you can add the PREFIX option to have it added to the variable name.

Ensure that validvarname is also set to V7 to help avoid name literals.

options validvarname=v7;

proc transpose data=have out=want prefix=Y;
where ...;
var count;
id year;
run;

@CathyVI wrote:

@PaigeMiller @maguiremq Thanks for your response. This is the table before I transpose it

After I transposed I got this: @PaigeMiller How do I calculate each year which is my next step if i dont transpose?

So I want to have the count of each year 2023, 2024, find the total so I can calculate percentage using this next data set:
data test4 (drop= _NAME_  T_2023 T_2024);
set test3;
percent23 = (_2023/T_2021)*100;
percent24 = (_2024/T_2022)*100;
run;
@maguiremq I don't think I can follow your step since I did not just create the data. There are multiple steps before I got here. Except I did not understand your logic. Please help out if i am wrong.

View solution in original post

PaigeMiller · Posted 04-16-2024 09:12 AM

Actually, I think the mistake was using TRANSPOSE in the first place. Maxim 19

Why do you need to do this transpose? What will the next analysis or programming done on this data? Normally transposing just makes the next steps harder.

--
Paige Miller

maguiremq · Posted 04-16-2024 09:29 AM

All of it can be done in PROC TRANSPOSE.

You can also do it in a DATA step.

But best practice would be to get familiar with PROC DATASETS as it doesn't have to read in the entire file.

For this case, it is probably ok to not use PROC DATASETS, but if your files start to get much larger, then PROC DATASETS will be much faster. It doesn't have to read in the entire data set from your SET statement. It will modify the variable names on the data set directly.

data have;
input _name_ $ "2023"n "2024"n;
datalines;
count 1 2
;
run;

proc datasets library = work memtype = data nolist;
	modify have;
	rename "2023"n = _2023
		   "2024"n = _2024
	;
run;
quit;

data want;
	set have (rename = ("2023"n = _2023 "2024"n = _2024));
run;

CathyVI · Posted 04-16-2024 10:18 AM

@PaigeMiller @maguiremq Thanks for your response. This is the table before I transpose it

After I transposed I got this: @PaigeMiller How do I calculate each year which is my next step if i dont transpose?

So I want to have the count of each year 2023, 2024, find the total so I can calculate percentage using this next data set:

data test4 (drop= _NAME_  T_2023 T_2024);
set test3;
percent23 = (_2023/T_2021)*100;
percent24 = (_2024/T_2022)*100;
run;

@maguiremq I don't think I can follow your step since I did not just create the data. There are multiple steps before I got here. Except I did not understand your logic. Please help out if i am wrong.

PaigeMiller · Posted 04-16-2024 10:26 AM

This looks like PROC MEANS or PROC SUMMARY output. Is it? If yes, can you show us the code? If yes, can you show us the columns to the left of YEAR? If yes, can you show us a portion of the data that was used by PROC MEANS or PROC SUMMARY?

Asking again as I did in your previous threads: What is the numerator and what is the denominator? Okay, the numerator seems very clear. But we don't know what the denominator is, we don't know what T_2021 or T_2022 is, there is no such thing in this data.

I suspect, but can't say for sure, that there is no need for TRANSPOSE here, and in fact TRANSPOSE is actually a mistake here. Once you provide the information requested, I may be able to say with certainty.

--
Paige Miller

CathyVI · Posted 04-16-2024 11:13 AM

@PaigeMiller This is a portion of the data before the proc summary

Yes it is proc summary and this is the code:

proc summary data=test1;
class sex age race service component year;
var count;
output out=test2 sum= count;

The denominator was calculated using this code after the transpose:

data test3 (keep= _NAME_  _2023 _2024 rename=( _2023 = T_2023 _2024= T_2024));
set test2;
if varname='total';
run;

This code was extracted and the person was able to make their 2023 in to _2023 but I wasn't after following all the steps. Thanks

PaigeMiller · Posted 04-16-2024 11:24 AM

All the pieces are not fitting together. Now you show

rename=( _2023 = T_2023 _2024= T_2024)

whereas in your previous post you were trying to use different years T_2021 and T_2022. So which years are you wanting to use for the denominator? And where do we find these T_2021 or T_2022 or T_2023 or T_2024 in the data you have shown??? You have not told us.

In addition, the output from that PROC SUMMARY contains counts by sex, age, race, service, component and year; and it contains counts of all two-way combinations of those variables; and it contains counts of all three-way combinations of those variables; and so on, up to all six way combinations of those variables. All of that is in there, most of which doesn't seem relevant to the question you asked originally, as far as I can tell, which was just about YEAR.

So, we are back to not knowing what the numerator should be (which rows of the PROC SUMMARY output data set) and we do not know what the denominator is (which rows of the PROC SUMMARY output data set).

Can you describe in words (and without using SAS code and without showing us screen captures) the entire problem, start to finish: what is the question you want to answer?

By the way, you should be describing the problem in words, start to finish, in every post, regardless. It is now obvious to me that this is not a problem about renaming columns after a transpose, and we have spent a lot of time trying to get to the bottom of it that we would not have spent if we had this problem description in words start to finish. After providing this description, then show us (a portion of) the original data and your proposed code.

--
Paige Miller

CathyVI · Posted 04-16-2024 11:37 AM

@PaigeMiller @Reeza @ballardw @maguiremq

Sorry I copied the data and was changing the year to fit my year 2023 and 2024.

In words: I have a data set which i want a count of the variables e.g. age group, sex, by year '2023, 2024). I have create the counts and the year. I transpose it so I can have each variable by year count and calculate the percentage for each year.

I encountered an issue when the year (I transposed) was not recognized by SAS as a numeric variable. The code I was coping did the same thing. They got _2023, _2024 (numerator) when they transposed and they used the Total to find T_2023 which was the denominator. Then they calculated the percentage with a data step. This is what am hoping to do. Thanks

ballardw · Posted 04-16-2024 11:44 AM

@CathyVI wrote:

@PaigeMiller @Reeza @ballardw @maguiremq

Sorry I copied the data and was changing the year to fit my year 2023 and 2024.

In words: I have a data set which i want a count of the variables e.g. age group, sex, by year '2023, 2024). I have create the counts and the year. I transpose it so I can have each variable by year count and calculate the percentage for each year.

I encountered an issue when the year (I transposed) was not recognized by SAS as a numeric variable. The code I was coping did the same thing. They got _2023, _2024 (numerator) when they transposed and they used the Total to find T_2023 which was the denominator. Then they calculated the percentage with a data step. This is what am hoping to do. Thanks

Reiterate: provide example data of before you did the "transpose" and provide what you want.

Obviously it is highly unlikely to be "a count of the variables e.g. age group, sex". That would be 2. I suspect you mean count of levels of these variables by year. And transpose is not needed in any way. Plus this still sounds like an intermediate step. Provide a description of the final result, not one step.

And again, actual working data that behaves like your data goes a long way to avoid repeated question/partial answer, question/incomplete answer, question/not relevant answer cycles.

PaigeMiller · Posted 04-16-2024 11:46 AM

I transpose it so I can have each variable by year count and calculate the percentage for each year.

This is simply not necessary, and makes your programming more difficult. Please let's stop asking about transposing and renaming. But, one more time, you have many rows in the data set output from PROC SUMMARY, which are the numerators, and which are the denominators? Please be specific. In the data you showed,

please give us an example: for one such desired percent, which row contains the numerator and which row contains the denominator?

Until this is cleared up, this is an absolute showstopper. I can not even try to write code for this, and this has nothing to do with SAS, I could not tell you how to do this pencil and paper either.

--
Paige Miller

Reeza · Posted 04-16-2024 12:37 PM

Did you try the solutions I posted? That should solve your problem if you're sticking with this methodology.

ballardw · Posted 04-16-2024 10:52 AM

@CathyVI wrote:

@PaigeMiller @maguiremq Thanks for your response. This is the table before I transpose it

After I transposed I got this: @PaigeMiller How do I calculate each year which is my next step if i dont transpose?

So I want to have the count of each year 2023, 2024, find the total so I can calculate percentage using this next data set:
data test4 (drop= _NAME_  T_2023 T_2024);
set test3;
percent23 = (_2023/T_2021)*100;
percent24 = (_2024/T_2022)*100;
run;
@maguiremq I don't think I can follow your step since I did not just create the data. There are multiple steps before I got here. Except I did not understand your logic. Please help out if i am wrong.

To get _type_ with Proc Means/Summary there have to be at least 3 CLASS variables (and would be several more types involved)

Suggestion: Instead of showing intermediate data sets, that you seem not quite sure what you want to do yet, provide examples of the RAW data and describe the desired output. If you can't provide some example data as a data step use on of the SAS supplied data sets like SASHELP.Class (is small enough to do manual calculations for examples) or SASHELP.Cars that has quite a few observations and multiple class and numeric variables.

You data step manipulating this likely needs some things set conditionally based on the _type_. Do you understand what that _type_ variable tells you? _Type_=0 is the summary over the entire data set. So that would have the "count" of everything. _type_=1 is going to be the summary of ONE level of one of the class variables. _Type_=2 is going to be summary of the second Class variable (which your picture shows as removed, which is likely going to be hard to use) . _type_=3 if there are exactly 2 Class variables will have the combinations of the values of the two class variables.

Reeza · Posted 04-16-2024 11:01 AM

Assuming you used PROC TRANSPOSE, you can add the PREFIX option to have it added to the variable name.

Ensure that validvarname is also set to V7 to help avoid name literals.

options validvarname=v7;

proc transpose data=have out=want prefix=Y;
where ...;
var count;
id year;
run;

@CathyVI wrote:

@PaigeMiller @maguiremq Thanks for your response. This is the table before I transpose it

After I transposed I got this: @PaigeMiller How do I calculate each year which is my next step if i dont transpose?

So I want to have the count of each year 2023, 2024, find the total so I can calculate percentage using this next data set:
data test4 (drop= _NAME_  T_2023 T_2024);
set test3;
percent23 = (_2023/T_2021)*100;
percent24 = (_2024/T_2022)*100;
run;
@maguiremq I don't think I can follow your step since I did not just create the data. There are multiple steps before I got here. Except I did not understand your logic. Please help out if i am wrong.

Rename a numeric column after transpose

Re: Rename a numeric column after transpose

Re: Rename a numeric column after transpose

Re: Rename a numeric column after transpose

Re: Rename a numeric column after transpose

Re: Rename a numeric column after transpose

Re: Rename a numeric column after transpose

Re: Rename a numeric column after transpose

Re: Rename a numeric column after transpose

Re: Rename a numeric column after transpose

Re: Rename a numeric column after transpose

Re: Rename a numeric column after transpose

Re: Rename a numeric column after transpose

Re: Rename a numeric column after transpose

Catch up on SAS Innovate 2026

Catch up on SAS Innovate 2026

SAS Training: Just a Click Away