## DATA Step, Macro, Functions and more

Solved
Occasional Contributor
Posts: 16

Hi Team,

I was wondering if I could get help to convert Rows to Columns. Below is the input table:

 Location Type Combined Population Collin   County, TX Uninsured_Pop_Num 500 Collin   County, TX Uninsured_Pop_Den 15000 Plano,   TX Hispanic_Male_Num 200 Plano,   TX Hispanic_Male_Den 10000

I would need to convert it to as follows:

 Location Type Numerator Denominator Collin   County, TX Uninsured_Pop 500 15000 Plano,   TX Hispanic_Male 200 10000

Accepted Solutions
Solution
‎10-29-2011 04:01 PM
Valued Guide
Posts: 765

Hi ... if your data are always in pairs of locations, you can one-to-one merge with no by.

Using Ksharp's data ...

data temp;

input Location & \$200. Type : \$40. Population ;

cards;

Collin County, TX   Uninsured_Pop_Num 500

Collin County, TX   Uninsured_Pop_Den 15000

Plano, TX       Hispanic_Male_Num    200

Plano, TX       Hispanic_Male_Den   10000

;

run;

data x;

merge

temp (where=(type like '%Num') rename=(population=num))

temp (where=(type like '%Den') rename=(population=den))

;

type = tranwrd(type,'_Den','');

run;

All Replies
Super User
Posts: 5,884

First you need to split the column Type_combined. It seems from your sample data it would be a great job for the scan function (used in SQL or in a data step).

Done with that, you could use proc transpose to convert your columns, see online help for syntax and samples.

/Linus

Data never sleeps
Super User
Posts: 10,784

Sure . Of course.

```data temp;
input Location & \$200. Type : \$40. Population ;
cards;
Collin County, TX   Uninsured_Pop_Num 500
Collin County, TX   Uninsured_Pop_Den 15000
Plano, TX       Hispanic_Male_Num    200
Plano, TX       Hispanic_Male_Den   10000
;
run;
proc sort data=temp; by location;run;
data _null_;
set temp end=last;
by location;
if _n_ eq 1 then call execute('data want;length location type \$ 200;');
if first.location then
call execute('location="'||location||'"; type="'||substr(type,1,findc(type,'_','b')-1)||'";');
call execute(scan(type,-1,'_')||'='||population||';');
if last.location then call execute('output;call missing(of _all_);');
if last then call execute('run;');
run;

```

Ksharp

Regular Contributor
Posts: 184

This should do it in one step:

data want(drop = Type_Combined Population) ;

do until (last.Location) ;

set have ;

by Location ;

Type = substr(Type_Combined,1,length(Type_Combined)-4) ;

select ( scan(Type_Combined,-1,'_') ) ;

when ('Num') Numerator   = Population ;

when ('Den') Denominator = Population ;

otherwise ;

end ;

end;

run ;

Super User
Posts: 10,784

Hi. Howles.

Your code can work because you have already know that it will generate two variables and these two variable 's name is Num or Den. But if You do not know how many variables will be generated and you also do not know what variable's name it is. Your way can't walk.

Ksharp

Solution
‎10-29-2011 04:01 PM
Valued Guide
Posts: 765

Hi ... if your data are always in pairs of locations, you can one-to-one merge with no by.

Using Ksharp's data ...

data temp;

input Location & \$200. Type : \$40. Population ;

cards;

Collin County, TX   Uninsured_Pop_Num 500

Collin County, TX   Uninsured_Pop_Den 15000

Plano, TX       Hispanic_Male_Num    200

Plano, TX       Hispanic_Male_Den   10000

;

run;

data x;

merge

temp (where=(type like '%Num') rename=(population=num))

temp (where=(type like '%Den') rename=(population=den))

;

type = tranwrd(type,'_Den','');

run;

Occasional Contributor
Posts: 16

Hi Team,

Thanks everyone for your time. It was my fault thatI didn’t provide the details about the data source.

The provided code only shows one location if the locations are repeated. My data source has some locations that are repeated several times as below:

data new;

input Location: \$200. Type : \$40.Population ;

cards;

Collin,TX Uninsured_Pop_Num 500

Collin,TX Uninsured_Pop_Den 15000

Collin,TX Insured_Pop_Num 14500

Collin,TX Insured_Pop_Den 15000

Collin,TX White_Female_Num 500

Collin,TX White_Female_Den 15000

Plano,TX Hispanic_Female_Num 200

Plano,TX Hispanic_Femlale_Den 10000

Plano,TX Hispanic_Male_Num 300

Plano,TX Hispanic_Male_Den 10000

Richardson,TX Flu_Patient_Num 145

Richardson,TX Flu_Patient_Den 12000

;

run;

Is there a way to get all the locations with your code?

Below is the output that I’m looking for:

 Location Type Numerator Denominator Collin, TX Uninsured_Pop 500 15000 Collin, TX Insured_Pop 200 10000 Collin, TX White_Female 500 15000 Collin, TX White_Male 12000 15000 Plano, TX Hispanic_Female 200 10000 Plano, TX Hispanic_Male 300 10000 Richardson,   TX Flu_Patient 145 12000

Thanks again.

Roger

PROC Star
Posts: 8,165

Roger,

Is there any reason why you couldn't have tested the suggestions yourself?

Regardless, I didn't test all of the suggestions, but Mike suggested code appears to do what you want.  Although, to test it, I had to correct one of your data entries and your desired results aren't 'totally' there, as one category doesn't exist in you sample data.  Below is the code I ran:

data new;

input Location: \$200. Type : \$40.Population ;

cards;

Collin,TX Uninsured_Pop_Num 500

Collin,TX Uninsured_Pop_Den 15000

Collin,TX Insured_Pop_Num 14500

Collin,TX Insured_Pop_Den 15000

Collin,TX White_Female_Num 500

Collin,TX White_Female_Den 15000

Plano,TX Hispanic_Female_Num 200

Plano,TX Hispanic_Female_Den 10000

Plano,TX Hispanic_Male_Num 300

Plano,TX Hispanic_Male_Den 10000

Richardson,TX Flu_Patient_Num 145

Richardson,TX Flu_Patient_Den 12000

;

run;

data want;

merge

new (where=(type like '%Num')

rename=(population=numerator))

new (where=(type like '%Den')

rename=(population=denominator))

;

type = tranwrd(type,'_Den','');

run;

Occasional Contributor
Posts: 16

You guys are awesome!

I didn't realize that "%Num' & '%Den' were case sensitive. My original data set had _num & _den. It's finally working. Thanks to Mike, Art, Howles, Linus & Ksharp!

--Roger

🔒 This topic is solved and locked.