Hi,
I have a dataset one with company code and their counts for each month for last 5 months.
I want to get change the dataset one to have structure like dataset two. The year month in dataset one is a character column.
what is the best way to achieve this?
dataset one
Year_Mnth comp_code code_type count
201910 001 Medium 15
201911 001 Medium 9
201912 001 Medium 6
202001 001 Medium 18
202002 001 Medium 3
201910 002 High 11
201911 002 High 8
201912 002 High 12
202001 002 High 16
202002 002 High 15
dataset two
comp_code code_type mnth1 mnth2 mnth3 mnth4 mnth5
001 Medium 15 9 6 18 3
002 High 11 8 12 16 15
Thanks,
Nick
Generally, this is a poor structure for most types of data, making the subsequent coding much harder; unless for some reason, you are trying to create a presentation or display. What is the goal of having such a structure?
Hi @nickspencer
data have;
input (Year_Mnth comp_code code_type) ($) count;
cards;
201910 001 Medium 15
201911 001 Medium 9
201912 001 Medium 6
202001 001 Medium 18
202002 001 Medium 3
201910 002 High 11
201911 002 High 8
201912 002 High 12
202001 002 High 16
202002 002 High 15
;
proc transpose data=have out=want(drop=_:) prefix=mnth;
by comp_code code_type;
var count;
run;
hi @novinosrin
How can I sort that date values in ascending or descending order? It is in character datatype in the source.
which one is in chartype?
year_mnth is character type. And it is not sorted in the original dataset to have mnth1, mnth2...
Hi @nickspencer How about?
data temp;
set have;
_temp_yr_mnth=input(Year_Mnth,yymmn6.);
run;
proc sort data=temp ;
by comp_code code_type _temp_yr_mnth;
run;
/*And then the transpose using TEMP sorted as source*/
Plus, the numeric variable is better as you can comfortably use for calculating any time series related problems
ONE example of why you may want to consider have date values stored as SAS dates: Note that by changing the Format for the date valued Year_month you can create different summary reports. This is also a reason NOT to create a wide data set. To create a summary report for calendar quarter or year total will require reprocessing your data. And if you keep the data in a wide format the code to reprocess that will change every time a month is added to your data. The reports I have below will adjust with no additional coding if additional rows are added. As well as taking into account when someone decides to add a "Low" code type, or new Comp_code values.
data have; input Year_Mnth :yymmn6. comp_code $3. code_type $ count; cards; 201910 001 Medium 15 201911 001 Medium 9 201912 001 Medium 6 202001 001 Medium 18 202002 001 Medium 3 201910 002 High 11 201911 002 High 8 201912 002 High 12 202001 002 High 16 202002 002 High 15 ; proc tabulate data=have; class year_mnth comp_code code_type; format Year_Mnth yymmn6.; var count; table comp_code*code_type, year_mnth='Month' * count /misstext=' ' ; run; proc tabulate data=have; class year_mnth comp_code code_type; format Year_Mnth yyQ6.; var count; table comp_code*code_type, year_mnth='Quarter' * count /misstext=' ' ; run; proc tabulate data=have; class year_mnth comp_code code_type; format Year_Mnth year4.; var count; table comp_code*code_type, year_mnth='Year' * count /misstext=' ' ; run;
If the purpose of the wide data is to create a report that people read, don't do it. Use the report procedure tools to create the columns based on the values of the date variable.
Hi @nickspencer If you intend to use ID variable, the values of ID variable would become your transposed variable names. So expecting the transposed variables as Month1----MonthN is not viable. This is something you have to think about and decide what is desired as final. Please let us know.
SAS Innovate 2025 is scheduled for May 6-9 in Orlando, FL. Sign up to be first to learn about the agenda and registration!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.