BookmarkSubscribeRSS Feed
☑ This topic is solved. Need further help from the community? Please sign in and ask a new question.
CathyVI
Pyrite | Level 9

Hi,

I transposed a data and i got a numeric column data that looks like this below:

CathyVI_0-1713271188141.png

 

For my next data step, I tried to keep them in a data step but it didn't work. Here is my idea

data test2 (keep=_2023 _2024);
set test1;
run;

Apparently SAS did not recognize them as _2023. I want to rename them as Y2023 or rename them as _2023 so I can use them for % calculation but again SAS is not recognizing them as a variable i think because they are just like a number.

Sorry I don't have a sample code because the one i created also did not work except I put Y infront of it. e.g., Y2023.

Please any help will be appreciated.

1 ACCEPTED SOLUTION

Accepted Solutions
Reeza
Super User

Assuming you used PROC TRANSPOSE, you can add the PREFIX option to have it added to the variable name. 

Ensure that validvarname is also set to V7 to help avoid name literals.

 

options validvarname=v7;

proc transpose data=have out=want prefix=Y;
where ...;
var count;
id year;
run;

@CathyVI wrote:

@PaigeMiller @maguiremq  Thanks for your response. This is the table before I transpose it

CathyVI_0-1713276480986.png

 

After I transposed I got this: @PaigeMiller  How do I calculate each year which is my next step if i dont transpose?

CathyVI_1-1713276802860.png

 

So I want to have the count of each year 2023, 2024, find the total so I can calculate percentage using this next data set:

data test4 (drop= _NAME_  T_2023 T_2024);
set test3;
percent23 = (_2023/T_2021)*100;
percent24 = (_2024/T_2022)*100;
run;

@maguiremq  I don't think I can follow your step since I did not just create the data. There are multiple steps before I got here.  Except I did not understand your logic. Please help out if i am wrong.




View solution in original post

12 REPLIES 12
PaigeMiller
Diamond | Level 26

Actually, I think the mistake was using TRANSPOSE in the first place. Maxim 19

 

Why do you need to do this transpose? What will the next analysis or programming done on this data? Normally transposing just makes the next steps harder.

--
Paige Miller
maguiremq
SAS Super FREQ

All of it can be done in PROC TRANSPOSE.

 

You can also do it in a DATA step.

 

But best practice would be to get familiar with PROC DATASETS as it doesn't have to read in the entire file.

 

For this case, it is probably ok to not use PROC DATASETS, but if your files start to get much larger, then PROC DATASETS will be much faster. It doesn't have to read in the entire data set from your SET statement. It will modify the variable names on the data set directly.

 

data have;
input _name_ $ "2023"n "2024"n;
datalines;
count 1 2
;
run;

proc datasets library = work memtype = data nolist;
	modify have;
	rename "2023"n = _2023
		   "2024"n = _2024
	;
run;
quit;

data want;
	set have (rename = ("2023"n = _2023 "2024"n = _2024));
run;
CathyVI
Pyrite | Level 9

@PaigeMiller @maguiremq  Thanks for your response. This is the table before I transpose it

CathyVI_0-1713276480986.png

 

After I transposed I got this: @PaigeMiller  How do I calculate each year which is my next step if i dont transpose?

CathyVI_1-1713276802860.png

 

So I want to have the count of each year 2023, 2024, find the total so I can calculate percentage using this next data set:

data test4 (drop= _NAME_  T_2023 T_2024);
set test3;
percent23 = (_2023/T_2021)*100;
percent24 = (_2024/T_2022)*100;
run;

@maguiremq  I don't think I can follow your step since I did not just create the data. There are multiple steps before I got here.  Except I did not understand your logic. Please help out if i am wrong.

PaigeMiller
Diamond | Level 26

This looks like PROC MEANS or PROC SUMMARY output. Is it? If yes, can you show us the code? If yes, can you show us the columns to the left of YEAR? If yes, can you show us a portion of the data that was used by PROC MEANS or PROC SUMMARY?

 

Asking again as I did in your previous threads: What is the numerator and what is the denominator? Okay, the numerator seems very clear. But we don't know what the denominator is, we don't know what T_2021 or T_2022 is, there is no such thing in this data.

 

I suspect, but can't say for sure, that there is no need for TRANSPOSE here, and in fact TRANSPOSE is actually a mistake here. Once you provide the information requested, I may be able to say with certainty.

--
Paige Miller
CathyVI
Pyrite | Level 9

@PaigeMiller This is a portion of the data before the proc summary

CathyVI_0-1713280006724.png

 

Yes it is proc summary and this is the code:

proc summary data=test1;
class sex age race service component year;
var count;
output out=test2 sum= count;

 The denominator  was calculated using this code after the transpose:

data test3 (keep= _NAME_  _2023 _2024 rename=( _2023 = T_2023 _2024= T_2024));
set test2;
if varname='total';
run;

This code was extracted and the person was able to make their 2023 in to _2023 but I wasn't after following all the steps. Thanks

 

PaigeMiller
Diamond | Level 26

All the pieces are not fitting together. Now you show

 

 

rename=( _2023 = T_2023 _2024= T_2024)

 

 

whereas in your previous post you were trying to use different years T_2021 and T_2022. So which years are you wanting to use for the denominator? And where do we find these T_2021 or T_2022 or T_2023 or T_2024 in the data you have shown??? You have not told us.

 

In addition, the output from that PROC SUMMARY contains counts by sex, age, race, service, component and year; and it contains counts of all two-way combinations of those variables; and it contains counts of all three-way combinations of those variables; and so on, up to all six way combinations of those variables. All of that is in there, most of which doesn't seem relevant to the question you asked originally, as far as I can tell, which was just about YEAR.

 

So, we are back to not knowing what the numerator should be (which rows of the PROC SUMMARY output data set) and we do not know what the denominator is (which rows of the PROC SUMMARY output data set).

 

Can you describe in words (and without using SAS code and without showing us screen captures) the entire problem, start to finish: what is the question you want to answer?

 

By the way, you should be describing the problem in words, start to finish, in every post, regardless. It is now obvious to me that this is not a problem about renaming columns after a transpose, and we have spent a lot of time trying to get to the bottom of it that we would not have spent if we had this problem description in words start to finish. After providing this description, then show us (a portion of) the original data and your proposed code.

--
Paige Miller
CathyVI
Pyrite | Level 9

@PaigeMiller @Reeza @ballardw @maguiremq 

Sorry I copied the data and was changing the year to fit my year 2023 and 2024.

In words: I have a data set which i want a count of the variables e.g. age group, sex,  by year '2023, 2024). I have create the counts and the year. I transpose it so I can have each variable by year count and calculate the percentage for each year. 

I encountered an issue when the year (I transposed) was not recognized by SAS as a numeric variable. The code I was coping did the same thing. They got _2023, _2024 (numerator) when they transposed and they used the Total to find T_2023 which was the denominator. Then they calculated the percentage with a data step. This is what am hoping to do. Thanks

 

ballardw
Super User

@CathyVI wrote:

@PaigeMiller @Reeza @ballardw @maguiremq 

Sorry I copied the data and was changing the year to fit my year 2023 and 2024.

In words: I have a data set which i want a count of the variables e.g. age group, sex,  by year '2023, 2024). I have create the counts and the year. I transpose it so I can have each variable by year count and calculate the percentage for each year. 

I encountered an issue when the year (I transposed) was not recognized by SAS as a numeric variable. The code I was coping did the same thing. They got _2023, _2024 (numerator) when they transposed and they used the Total to find T_2023 which was the denominator. Then they calculated the percentage with a data step. This is what am hoping to do. Thanks

 


Reiterate: provide example data of before you did the "transpose" and provide what you want.

Obviously it is highly unlikely to be "a count of the variables e.g. age group, sex". That would be 2. I suspect you mean count of levels of these variables by year. And transpose is not needed in any way. Plus this still sounds like an intermediate step. Provide a description of the final result, not one step.

And again, actual working data that behaves like your data goes a long way to avoid repeated question/partial answer, question/incomplete answer, question/not relevant answer cycles.

PaigeMiller
Diamond | Level 26

 I transpose it so I can have each variable by year count and calculate the percentage for each year. 

 

This is simply not necessary, and makes your programming more difficult. Please let's stop asking about transposing and renaming. But, one more time, you have many rows in the data set output from PROC SUMMARY, which are the numerators, and which are the denominators? Please be specific. In the data you showed, 

 

CathyVI_0-1713276480986.png

 

please give us an example: for one such desired percent, which row contains the numerator and which row contains the denominator?

 

Until this is cleared up, this is an absolute showstopper. I can not even try to write code for this, and this has nothing to do with SAS, I could not tell you how to do this pencil and paper either.

--
Paige Miller
Reeza
Super User
Did you try the solutions I posted? That should solve your problem if you're sticking with this methodology.
ballardw
Super User

@CathyVI wrote:

@PaigeMiller @maguiremq  Thanks for your response. This is the table before I transpose it

CathyVI_0-1713276480986.png

 

After I transposed I got this: @PaigeMiller  How do I calculate each year which is my next step if i dont transpose?

CathyVI_1-1713276802860.png

 

So I want to have the count of each year 2023, 2024, find the total so I can calculate percentage using this next data set:

data test4 (drop= _NAME_  T_2023 T_2024);
set test3;
percent23 = (_2023/T_2021)*100;
percent24 = (_2024/T_2022)*100;
run;

@maguiremq  I don't think I can follow your step since I did not just create the data. There are multiple steps before I got here.  Except I did not understand your logic. Please help out if i am wrong.


To get _type_ with Proc Means/Summary there have to be at least 3 CLASS variables (and would be several more types involved)

 

Suggestion: Instead of showing intermediate data sets, that you seem not quite sure what you want to do yet, provide examples of the RAW data and describe the desired output. If you can't provide some example data as a data step use on of the SAS supplied data sets like SASHELP.Class (is small enough to do manual calculations for examples) or SASHELP.Cars that has quite a few observations and multiple class and numeric variables.

 

You data step manipulating this likely needs some things set conditionally based on the _type_. Do you understand what that _type_ variable tells you? _Type_=0 is the summary over the entire data set. So that would have the "count" of everything. _type_=1 is going to be the summary of ONE level of one of the class variables. _Type_=2 is going to be summary of the second Class variable (which your picture shows as removed, which is likely going to be hard to use) . _type_=3 if there are exactly 2 Class variables will have the combinations of the values of the two class variables.

 

 

Reeza
Super User

Assuming you used PROC TRANSPOSE, you can add the PREFIX option to have it added to the variable name. 

Ensure that validvarname is also set to V7 to help avoid name literals.

 

options validvarname=v7;

proc transpose data=have out=want prefix=Y;
where ...;
var count;
id year;
run;

@CathyVI wrote:

@PaigeMiller @maguiremq  Thanks for your response. This is the table before I transpose it

CathyVI_0-1713276480986.png

 

After I transposed I got this: @PaigeMiller  How do I calculate each year which is my next step if i dont transpose?

CathyVI_1-1713276802860.png

 

So I want to have the count of each year 2023, 2024, find the total so I can calculate percentage using this next data set:

data test4 (drop= _NAME_  T_2023 T_2024);
set test3;
percent23 = (_2023/T_2021)*100;
percent24 = (_2024/T_2022)*100;
run;

@maguiremq  I don't think I can follow your step since I did not just create the data. There are multiple steps before I got here.  Except I did not understand your logic. Please help out if i am wrong.




sas-innovate-2024.png

Available on demand!

Missed SAS Innovate Las Vegas? Watch all the action for free! View the keynotes, general sessions and 22 breakouts on demand.

 

Register now!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 12 replies
  • 535 views
  • 1 like
  • 5 in conversation