Hi Team,
I was wondering if I could get help to convert Rows to Columns. Below is the input table:
| Location | Type Combined | Population | 
| Collin County, TX | Uninsured_Pop_Num | 500 | 
| Collin County, TX | Uninsured_Pop_Den | 15000 | 
| Plano, TX | Hispanic_Male_Num | 200 | 
| Plano, TX | Hispanic_Male_Den | 10000 | 
I would need to convert it to as follows:
| Location | Type | Numerator | Denominator | 
| Collin County, TX | Uninsured_Pop | 500 | 15000 | 
| Plano, TX | Hispanic_Male | 200 | 10000 | 
Thanks in advance!
Hi ... if your data are always in pairs of locations, you can one-to-one merge with no by.
Using Ksharp's data ...
data temp;
input Location & $200. Type : $40. Population ;
cards;
Collin County, TX Uninsured_Pop_Num 500
Collin County, TX Uninsured_Pop_Den 15000
Plano, TX Hispanic_Male_Num 200
Plano, TX Hispanic_Male_Den 10000
;
run;
data x;
merge
temp (where=(type like '%Num') rename=(population=num))
temp (where=(type like '%Den') rename=(population=den))
;
type = tranwrd(type,'_Den','');
run;
First you need to split the column Type_combined. It seems from your sample data it would be a great job for the scan function (used in SQL or in a data step).
Done with that, you could use proc transpose to convert your columns, see online help for syntax and samples.
/Linus
Sure . Of course.
data temp;
input Location & $200. Type : $40. Population ;
cards;
Collin County, TX   Uninsured_Pop_Num 500
Collin County, TX   Uninsured_Pop_Den 15000
Plano, TX       Hispanic_Male_Num    200
Plano, TX       Hispanic_Male_Den   10000
;
run;
proc sort data=temp; by location;run;
data _null_;
 set temp end=last;
 by location;
 if _n_ eq 1 then call execute('data want;length location type $ 200;');
 if first.location then 
  call execute('location="'||location||'"; type="'||substr(type,1,findc(type,'_','b')-1)||'";');
  call execute(scan(type,-1,'_')||'='||population||';');
 if last.location then call execute('output;call missing(of _all_);');
 if last then call execute('run;');
run;
Ksharp
This should do it in one step:
data want(drop = Type_Combined Population) ;
do until (last.Location) ;
set have ;
by Location ;
Type = substr(Type_Combined,1,length(Type_Combined)-4) ;
select ( scan(Type_Combined,-1,'_') ) ;
when ('Num') Numerator = Population ;
when ('Den') Denominator = Population ;
otherwise ;
end ;
end;
run ;
Hi. Howles.
Your code can work because you have already know that it will generate two variables and these two variable 's name is Num or Den. But if You do not know how many variables will be generated and you also do not know what variable's name it is. Your way can't walk.
Ksharp
Hi ... if your data are always in pairs of locations, you can one-to-one merge with no by.
Using Ksharp's data ...
data temp;
input Location & $200. Type : $40. Population ;
cards;
Collin County, TX Uninsured_Pop_Num 500
Collin County, TX Uninsured_Pop_Den 15000
Plano, TX Hispanic_Male_Num 200
Plano, TX Hispanic_Male_Den 10000
;
run;
data x;
merge
temp (where=(type like '%Num') rename=(population=num))
temp (where=(type like '%Den') rename=(population=den))
;
type = tranwrd(type,'_Den','');
run;
Hi Team,
Thanks everyone for your time. It was my fault thatI didn’t provide the details about the data source.
The provided code only shows one location if the locations are repeated. My data source has some locations that are repeated several times as below:
data new;
input Location: $200. Type : $40.Population ;
cards;
Collin,TX Uninsured_Pop_Num 500
Collin,TX Uninsured_Pop_Den 15000
Collin,TX Insured_Pop_Num 14500
Collin,TX Insured_Pop_Den 15000
Collin,TX White_Female_Num 500
Collin,TX White_Female_Den 15000
Plano,TX Hispanic_Female_Num 200
Plano,TX Hispanic_Femlale_Den 10000
Plano,TX Hispanic_Male_Num 300
Plano,TX Hispanic_Male_Den 10000
Richardson,TX Flu_Patient_Num 145
Richardson,TX Flu_Patient_Den 12000
;
run;
Is there a way to get all the locations with your code?
Below is the output that I’m looking for:
| Location | Type | Numerator | Denominator | 
| Collin, TX | Uninsured_Pop | 500 | 15000 | 
| Collin, TX | Insured_Pop | 200 | 10000 | 
| Collin, TX | White_Female | 500 | 15000 | 
| Collin, TX | White_Male | 12000 | 15000 | 
| Plano, TX | Hispanic_Female | 200 | 10000 | 
| Plano, TX | Hispanic_Male | 300 | 10000 | 
| Richardson, TX | Flu_Patient | 145 | 12000 | 
Thanks again.
Roger
Roger,
Is there any reason why you couldn't have tested the suggestions yourself?
Regardless, I didn't test all of the suggestions, but Mike suggested code appears to do what you want. Although, to test it, I had to correct one of your data entries and your desired results aren't 'totally' there, as one category doesn't exist in you sample data. Below is the code I ran:
data new;
input Location: $200. Type : $40.Population ;
cards;
Collin,TX Uninsured_Pop_Num 500
Collin,TX Uninsured_Pop_Den 15000
Collin,TX Insured_Pop_Num 14500
Collin,TX Insured_Pop_Den 15000
Collin,TX White_Female_Num 500
Collin,TX White_Female_Den 15000
Plano,TX Hispanic_Female_Num 200
Plano,TX Hispanic_Female_Den 10000
Plano,TX Hispanic_Male_Num 300
Plano,TX Hispanic_Male_Den 10000
Richardson,TX Flu_Patient_Num 145
Richardson,TX Flu_Patient_Den 12000
;
run;
data want;
merge
new (where=(type like '%Num')
rename=(population=numerator))
new (where=(type like '%Den')
rename=(population=denominator))
;
type = tranwrd(type,'_Den','');
run;
You guys are awesome!
I didn't realize that "%Num' & '%Den' were case sensitive. My original data set had _num & _den. It's finally working. Thanks to Mike, Art, Howles, Linus & Ksharp!
--Roger
It's finally time to hack! Remember to visit the SAS Hacker's Hub regularly for news and updates.
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.
