BookmarkSubscribeRSS Feed
nickspencer
Obsidian | Level 7

Hi,

I have a dataset one with company code and their counts for each month for last 5 months.

I want to get change the dataset one to have structure like dataset two. The year month in dataset one is a character column.

what is the best way to achieve this?

 


dataset one
Year_Mnth  comp_code  code_type  count
201910        001                Medium   15
201911        001                Medium    9
201912        001                Medium    6
202001        001                Medium   18
202002        001                Medium    3
201910        002                 High        11
201911        002       High      8
201912        002       High     12
202001        002       High     16
202002        002       High     15

 

 

dataset two

comp_code code_type mnth1 mnth2 mnth3 mnth4 mnth5
001                 Medium      15     9          6      18     3
002                    High      11       8          12      16    15


Thanks,
Nick

12 REPLIES 12
PaigeMiller
Diamond | Level 26

Generally, this is a poor structure for most types of data, making the subsequent coding much harder; unless for some reason, you are trying to create a presentation or display. What is the goal of having such a structure?

--
Paige Miller
novinosrin
Tourmaline | Level 20

Hi @nickspencer 

 

data have;
input (Year_Mnth  comp_code  code_type) ($)  count;
cards;
201910        001                Medium   15
201911        001                Medium    9
201912        001                Medium    6
202001        001                Medium   18
202002        001                Medium    3
201910        002                 High        11
201911        002       High      8
201912        002       High     12
202001        002       High     16
202002        002       High     15
;

proc transpose data=have out=want(drop=_:) prefix=mnth;
by comp_code code_type;
var count;
run;
nickspencer
Obsidian | Level 7

hi @novinosrin 

 How can I sort that date values in ascending or descending order? It is in character datatype in the source.

novinosrin
Tourmaline | Level 20

which one is in chartype?

nickspencer
Obsidian | Level 7

@novinosrin 

year_mnth is character type. And it is not sorted in the original dataset to have mnth1, mnth2...

novinosrin
Tourmaline | Level 20

Hi @nickspencer  How about?

 

data temp;
 set have;
 _temp_yr_mnth=input(Year_Mnth,yymmn6.);
run;

proc sort data=temp ;
by comp_code  code_type _temp_yr_mnth;
run;
/*And then the transpose using TEMP sorted as source*/

 Plus, the numeric variable is better as you can comfortably use for calculating any time series related problems

Reeza
Super User
Doesn't matter, it'll sort correctly with a standard PROC SORT.
ballardw
Super User

ONE example of why you may want to consider have date values stored as SAS dates: Note that by changing the Format for the date valued Year_month you can create different summary reports. This is also a reason NOT to create a wide data set. To create a summary report for calendar quarter or year total will require reprocessing your data. And if you keep the data in a wide format the code to reprocess that will change every time a month is added to your data. The reports I have below will adjust with no additional coding if additional rows are added. As well as taking into account when someone decides to add a "Low" code type, or new Comp_code values.

data have;
input Year_Mnth :yymmn6. comp_code $3.   code_type $  count;
cards;
201910 001 Medium 15
201911 001 Medium 9
201912 001 Medium  6
202001 001 Medium 18
202002 001 Medium  3
201910 002 High   11
201911 002 High   8
201912 002 High   12
202001 002 High   16
202002 002 High   15
;

proc tabulate data=have;
   class year_mnth comp_code code_type;
   format Year_Mnth yymmn6.;
   var count;
   table comp_code*code_type,
         year_mnth='Month' * count
         /misstext=' '
   ;

run;

proc tabulate data=have;
   class year_mnth comp_code code_type;
   format Year_Mnth yyQ6.;
   var count;
   table comp_code*code_type,
         year_mnth='Quarter' * count
         /misstext=' '
   ;
run;

proc tabulate data=have;
   class year_mnth comp_code code_type;
   format Year_Mnth year4.;
   var count;
   table comp_code*code_type,
         year_mnth='Year' * count
         /misstext=' '
   ;
run;

If the purpose of the wide data is to create a report that people read, don't do it. Use the report procedure tools to create the columns based on the values of the date variable.

 

 

nickspencer
Obsidian | Level 7
@ballardw This is helpful. What if I have dataset with full date and want to calculate the count for each month year?
nickspencer
Obsidian | Level 7
@novinosrin proc transpose works perfectly for me. I tried to use the statement ID year_mnth. This works great for me since it will work if any dates are missing. But when I use the ID and prefix it gets column name with prefix + ID .
Is there a way to still name them mnth1, mnth2 , mnth3...using ID statement?
Reeza
Super User
Have you tried adding Yrmonth or a month variable to the ID statement?
novinosrin
Tourmaline | Level 20

Hi @nickspencer  If you intend to use ID variable, the values of ID variable would become your transposed variable names. So expecting the transposed variables as Month1----MonthN is not viable. This is something you have to think about and decide what is desired as final. Please let us know. 

 

 

 

 

 

 

 

 

SAS Innovate 2025: Save the Date

 SAS Innovate 2025 is scheduled for May 6-9 in Orlando, FL. Sign up to be first to learn about the agenda and registration!

Save the date!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

SAS Training: Just a Click Away

 Ready to level-up your skills? Choose your own adventure.

Browse our catalog!

Discussion stats
  • 12 replies
  • 1419 views
  • 6 likes
  • 5 in conversation