BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
brianathrift
Fluorite | Level 6

Hello! 

 

Thank you all in advance for your help. I am working from SAS 9.4, and am having trouble transposing data. My dataset is currently in a "wide" format and I would like to make it a "long format". Below is an example of my imported data that I am trying to transpose. 

 

Traffic_Fatal_Image.JPG

 

I have successfully transposed the data from wide to long using the code below:

*Transpose Data from wide to long, Needed to sort first;
Proc sort data=Traffic_Ranks_2;
by Counties;
Run;
Proc Transpose data=Traffic_Ranks_2 out=long_Ranks1 prefix=Pop;
   by Counties;
   var Pop_2014-Pop_2018;
run;
Proc Transpose data=Traffic_Ranks_2 out=long_Ranks2 prefix=Traffic_Count;
   by Counties;
   var Traffic_Count_2014-Traffic_Count_2018;
run;
Proc Transpose data=Traffic_Ranks_2 out=long_Ranks3 prefix=Traffic_FR;
   by Counties;
   var Traffic_FR_2014-Traffic_FR_2018;
run;
proc print data=long_Ranks2 (obs=2);
run;

*Merge Transposed Datasets;
Data Traffic_Ranks_Long;
   merge long_Ranks1 (rename=(Pop1=Pop) drop=_name_) long_Ranks2(rename=(Traffic_Count1=Traffic_Count) drop=_name_)
		 long_Ranks3(rename=(Traffic_FR1=Traffic_FR) drop=_name_);
   by Counties;
   Year=input(substr(_name_, 15), 4.);
   drop _name_;
   drop Pop1 Traffic_Count1 Traffic_FR1;
run;

Proc Print data=Traffic_Ranks_Long (obs=1);
Run;

The printed version came out as such:

 

Traffic_Fatal_Image2.JPG

 

I am needing help extracting the year correctly, after tranposing the data. I have researched substr syntax and found teh following: the <variable=>SUBSTR(stringposition <, length>), however no numbers that I use are correctly extracting the year. I would like this output dataset to reflect as the above, but have the full year (4 digits).

 

Thank you again for your help and I look forward to hearing from you!

 

Best,

Briana Thrift

1 ACCEPTED SOLUTION

Accepted Solutions
art297
Opal | Level 21

It's not 'subract\ but, rather, substring.

You're currently using:

Data Traffic_Ranks_Long;
   merge long_Ranks1 (rename=(Pop1=Pop)) long_Ranks2(rename=(Traffic_Count1=Traffic_Count))
		 long_Ranks3(rename=(Traffic_FR1=Traffic_FR));
   by Counties;
   Year=input(substr(_name_, 11), 4.);
   drop Pop1 Traffic_Count1 Traffic_FR1;
run;

No need to drop Pop1 Count1 and Traffic_FR1, as they no longer exist since you renamed them in the merge statement.

 

My suggestion would be to (1)add a drop=_name_ option to two of your merge files (i.e.:

Data Traffic_Ranks_Long;
   merge long_Ranks1 (drop=_name_ rename=(Pop1=Pop)) long_Ranks2(drop=_name_ rename=(Traffic_Count1=Traffic_Count))
		 long_Ranks3(rename=(Traffic_FR1=Traffic_FR));
   by Counties;
   Year=input(substr(_name_, 12), 4.);
   drop _name_;
run;

That way you would always be taking the year from 'Traffic_FR_XXXX', beginning with the 12th character.

 

Art, CEO, AnalystFinder.com

 

View solution in original post

9 REPLIES 9
PaigeMiller
Diamond | Level 26

Show us a few typical values of variable _NAME_ from the last data set.

 
--
Paige Miller
brianathrift
Fluorite | Level 6

Thank you for your response.

 

This is the output of _name_ for the first transposed dataset:

Name_1.JPG

 

The output of _name_ for the second transposed dataset:

name_2.JPG

And finally, the output of _name_ for the last transposed dataset:

name_3.JPG

 

In the merged dataset of all three separate transposed datasets, I dropped the _name_ variable and renamed for each variable to have their own column (Pop, Traffic_Count, Traffic_FR).

 

Thank you!

PaigeMiller
Diamond | Level 26

If you are dropping _NAME_ in the MERGE statement, you have nothing for SUBSTR to work with. So you always get missing as the result.

 
 
--
Paige Miller
art297
Opal | Level 21

Here is a method that accomplishes the whole task without requiring any sort or multiple steps:

filename ut url 'http://tiny.cc/untranspose_macro';
%include ut ;
%untranspose(data=Traffic_Ranks_2, out=want, by=County,
  id=year, var=Pop Traffic_Count Traffic_FR, delimiter=_)

Conversely, yes, given your current method you can easily extract year, using the substr function, as long as you keep the _name_ variable.

 

Art, CEO, AnalystFinder.com

 

brianathrift
Fluorite | Level 6

Thank you for all your response. My issue is knowing what numbers to use in the "subtract" syntax to correctly extract the year variables. The current code I have, after dropping _name_, is the following:

 

 

*Merge Transposed Datasets;
Data Traffic_Ranks_Long;
   merge long_Ranks1 (rename=(Pop1=Pop)) long_Ranks2(rename=(Traffic_Count1=Traffic_Count))
		 long_Ranks3(rename=(Traffic_FR1=Traffic_FR));
   by Counties;
   Year=input(substr(_name_, 11), 4.);
   drop Pop1 Traffic_Count1 Traffic_FR1;
run;

Proc Print data=Traffic_Ranks_Long (obs=5);
Run;

My output is still missing year, as the following: 

 

pic3.JPG

 

I think my main issue in the "subtract" syntax is the number "11 and 4". I know 4 is correct, because I was the year to be formatted to 4 digits (i.e.: year 2020, instead of year 20). However, I am having a hard time figuring out what the first number should be. I thought the first number represented how many characters/numbers to subtract from an original variable name to yield the variable name of interest. For example, subtract 5 numbers from "Pop_2018" (subtracting letters P,O,P,_) to start on "2" of "2018" and yield 2018 as my final variable name. However, no number seems to yield the year variable correctly.

 

I hope this explanation makes sense. Please let me know if you have any questions, and I look forward to your answer. Prefer non-macro answers please, as I do not understand SAS macro coding yet.

 

Thank you!

art297
Opal | Level 21

It's not 'subract\ but, rather, substring.

You're currently using:

Data Traffic_Ranks_Long;
   merge long_Ranks1 (rename=(Pop1=Pop)) long_Ranks2(rename=(Traffic_Count1=Traffic_Count))
		 long_Ranks3(rename=(Traffic_FR1=Traffic_FR));
   by Counties;
   Year=input(substr(_name_, 11), 4.);
   drop Pop1 Traffic_Count1 Traffic_FR1;
run;

No need to drop Pop1 Count1 and Traffic_FR1, as they no longer exist since you renamed them in the merge statement.

 

My suggestion would be to (1)add a drop=_name_ option to two of your merge files (i.e.:

Data Traffic_Ranks_Long;
   merge long_Ranks1 (drop=_name_ rename=(Pop1=Pop)) long_Ranks2(drop=_name_ rename=(Traffic_Count1=Traffic_Count))
		 long_Ranks3(rename=(Traffic_FR1=Traffic_FR));
   by Counties;
   Year=input(substr(_name_, 12), 4.);
   drop _name_;
run;

That way you would always be taking the year from 'Traffic_FR_XXXX', beginning with the 12th character.

 

Art, CEO, AnalystFinder.com

 

brianathrift
Fluorite | Level 6

Thank you so much! That worked perfectly. My last question is, how do I know which variable to refer to for the substring syntax?

 

For instance, do I use the following syntax to take year from 'Traffic_FR_XXXX', beginning with the 12th character?:

 

   Year=input(substr(_name_, 12), 4.);

Or if I should use the following syntax to take year from 'Traffic_Count_XXXX', beginning with the 15th character?: 

 

   Year=input(substr(_name_, 15), 4.);

 Or if I should use the following syntax take year from 'Pop_XXXX', beginning with the 5th character? :

   Year=input(substr(_name_, 5), 4.);

Thank you again for your help!

Best,

Briana Thrift

art297
Opal | Level 21

You are merging three files that each have an _name_ variable. However, as suggested, dropping the _name_ variable from two of the three files results in your only having one name variable (namely the one that wasn't dropped). Given the code you posted, the third file will always have an _name_ equal to Traffic_FR_XXXX, where XXXX represents the year).

 

Art, CEO, AnalystFinder.com

brianathrift
Fluorite | Level 6
Thank you! I really appreciate your help!

sas-innovate-2024.png

Available on demand!

Missed SAS Innovate Las Vegas? Watch all the action for free! View the keynotes, general sessions and 22 breakouts on demand.

 

Register now!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 9 replies
  • 945 views
  • 3 likes
  • 3 in conversation