Building models with SAS Enterprise Miner, SAS Factory Miner, SAS Visual Data Mining and Machine Learning or just with programming

Discussion on Proc transpose

Accepted Solution Solved
Reply
Contributor
Posts: 57
Accepted Solution

Discussion on Proc transpose

Dear Sir/Friends,

   I am Anil working in Some international research institute i have one question regarding an Proc transpose. That is there are variables called person  ID, Item name, quantity, Value in  dataset1 and there are person ID and some information on person family  in  dataset2. So in  dataset2 contains 40 observations and  dataset1 contains 1600 observations so i want to merge these two datasets but we can't because in dataset1 the ID repeats 40 times so i planned to transpose dataset1 like all itemnames  as column names and using by variable as a Person ID.Here is the  small sample of the dataset1.

proc transpose data=dataset1 out=dataset2;
id Item_name;
var  Quantity Total value;
by Vdsid_hhid ;
run;

Vdsid_hhid               Item_name    Quantity    Total value

INMHKAL100213000    Sorghum    66    528

INMHKAL100213000    Wheat    15    75

INMHKAL100213000    Rice    10    60

INMHKAL100213000    Pigeonpea    4    119

INMHKAL100213000    Chickpea    1    12

INMHKAL100213000    Matki/Cowpea    1    28

INMHKAL100213000    Edible oil    4    208

INMHKAL100213000    All types of vegetables    0    115

INMHKAL100213000    All types of fruits    0    30

INMHKAL100214000    PDS Rice    10    60

INMHKAL100214000    Sorghum    165    1155

INMHKAL100214000    Wheat    20    100

INMHKAL100214000    Pigeonpea    5    133

INMHKAL100214000    Chickpea    2    48

INMHKAL100214000    Matki/Cowpea    4    94.5

INMHKAL100214000    Groundnut oil    6    276

INMHKAL100214000    All types of vegetables    0    70

INMHKAL100214000    All types of fruits    216    216

so i need transpose like this

Vdsid_hhid   Sorghum            Wheat Rice        Pigeonpea          Chickpea       

                  quantity  value    quantity  value    quantity  value    quantity  value

I tried in many ways but i am failed to get these results. Please can anybody give some idea on this.

here i attached the data for your kind reference.

Thank you

Best Regards,

Anil

Attachment

Accepted Solutions
Solution
‎08-14-2013 05:26 AM
Super Contributor
Posts: 297

Re: Discussion on Proc transpose

How about this?

Create test scenario.

DATA HAVE;

LENGTH VDSID_HHID $40. ITEM_NAME $40.;

INFILE DATALINES DELIMITER=",";

INPUT  VDSID_HHID $ ITEM_NAME $ QUANTITY TOTAL_VALUE;

DATALINES;

INMHKAL100213000,SORGHUM,66,528

INMHKAL100213000,WHEAT,15,75

INMHKAL100213000,RICE,10,60

INMHKAL100213000,PIGEONPEA,4,119

INMHKAL100213000,CHICKPEA,1,12

INMHKAL100213000,MATKI/COWPEA,1,28

INMHKAL100213000,EDIBLE OIL,4,208

INMHKAL100213000,ALL TYPES OF VEGETABLES,0,115

INMHKAL100213000,ALL TYPES OF FRUITS,0,30

INMHKAL100214000,PDS RICE,10,60

INMHKAL100214000,SORGHUM,165,1155

INMHKAL100214000,WHEAT,20,100

INMHKAL100214000,PIGEONPEA,5,133

INMHKAL100214000,CHICKPEA,2,48

INMHKAL100214000,MATKI/COWPEA,4,94.5

INMHKAL100214000,GROUNDNUT OIL,6,276

INMHKAL100214000,ALL TYPES OF VEGETABLES,0,70

INMHKAL100214000,ALL TYPES OF FRUITS,216,216

RUN;

Summarise multiple observations with the same VDSID_HHID and ITEM_NAME.

PROC SUMMARY DATA=HAVE NWAY MISSING;

  CLASS VDSID_HHID ITEM_NAME;

  VAR QUANTITY TOTAL_VALUE;

  OUTPUT OUT=HAVESUM (drop = _Smiley Happy SUM=;

RUN;

Transpose data to have a single column called COL1 containing the values of both QUANTITY and  TOTAL_VALUE and a variable called _NAME_ containing the variable names transposed into COL1.

PROC TRANSPOSE DATA=HAVESUM OUT=TRANSPOSE;

BY ITEM_NAME VDSID_HHID NOTSORTED;

VAR QUANTITY TOTAL_VALUE;

RUN;

Transpose the data a second time, using the ITEM_NAME and _NAME_ variables in the ID statement to name the created variables in a format similar to what you desired.

PROC TRANSPOSE DATA=TRANSPOSE OUT=TRANSPOSE1 (DROP=_NAME_) DELIMITER=_;

ID ITEM_NAME _NAME_;

VAR COL:;

BY VDSID_HHID NOTSORTED;

RUN;

You can now merge your 2 datasets.

You could do something similar in a datastep, however I am sure you know how to create a series of IF ELSE statements.

Regards,

Scott

View solution in original post


All Replies
Contributor
Posts: 45

Re: Discussion on Proc transpose

Hi Anil,

From your description of the problem it seems like a simple DATA step merge would work, although if you could attack your dataset2 that would help to clarify things.

The following program should add further details from dataset2 onto each Product ID in dataset1.

data one;

input PID $1-16 NAME $ quantity value;
datalines;
INMHKAL100213000 Sorghum 66 528
INMHKAL100213000 Wheat 15 75
INMHKAL100213000 Chickpea 1 12
INMHKAL100213000 Matki/Cowpea 1 28
INMHKAL100213000 Edible 4 208
INMHKAL100213000 All_types_of 0 115
INMHKAL100213000 All_types_of 0 30
INMHKAL100213000 Milk/Cur 30 300
INMHKAL100213000 Sugar 5 100
INMHKAL100213000 Groundnut 3 75
INMHKAL100214000 PDS_Rice 10 60
INMHKAL100214000 Sorghum 165 1155
INMHKAL100214000 Wheat 20 100
INMHKAL100214000 Pigeonpea 5 133
INMHKAL100214000 Chickpea 2 48

;

data two;

input PID $ 1-16 Additional $17-35;
datalines;
INMHKAL100213000 Info on first PID
INMHKAL100214000 Info on second PID
;

data three;
merge one two;
by PID;
run;

PIDNAMEquantityvalueAdditional
INMHKAL100213000Sorghum66528Info on first PID
INMHKAL100213000Wheat1575Info on first PID
INMHKAL100213000Chickpea112Info on first PID
INMHKAL100213000Matki/Co128Info on first PID
INMHKAL100213000Edible4208Info on first PID
INMHKAL100213000All_type0115Info on first PID
INMHKAL100213000All_type030Info on first PID
INMHKAL100213000Milk/Cur30300Info on first PID
INMHKAL100213000Sugar5100Info on first PID
INMHKAL100213000Groundnu375Info on first PID
INMHKAL100214000PDS_Rice1060Info on second PID
INMHKAL100214000Sorghum1651155Info on second PID
INMHKAL100214000Wheat20100Info on second PID
INMHKAL100214000Pigeonpe5133Info on second PID
INMHKAL100214000Chickpea248Info on second PID
Contributor
Posts: 57

Re: Discussion on Proc transpose

Dear Murry,

Thanks a lot for your reply actually my problem is not merging i need to transpose the item names in to column names. Please give me an solution for this.

Actual results i need like this

PID                                   Sorghum            Wheat                Chickpea               PDS_Rice

                                  quantity   value     quantity   value     quantity   value     quantity   value

INMHKAL100213000      66           528        21         434        12           121

INMHKAL100214000     

So for this i used sas codes like this

proc transpose data=dataset1 out=transpose;

id Item_name;

var  Quantity Value;

by HHID;

run;

Please give me yourmail ID i will forward the sample dataset for your reference.

Thanks

Regards,

Anil

Super User
Super User
Posts: 6,500

Re: Discussion on Proc transpose

Looks like a report and not a transpose.  Use PROC REPORT.

Contributor
Posts: 45

Re: Discussion on Proc transpose

So dataset 2 is not relevant to this operation? I will have another look at this on my lunch break. Thanks for additional info.

Contributor
Posts: 57

Re: Discussion on Proc transpose

Ya dataset 2 is not relevant i need only transpose for dataset 1 which i have sent by attachment earlier.

Thank you

Anil

Super User
Posts: 5,256

Re: Discussion on Proc transpose

To me your desired result looks like a report, not a database table.

You might want to look at PROC TABULATE or PROC REPORT. From these you could even store the result in an output table if you wish.

Data never sleeps
Solution
‎08-14-2013 05:26 AM
Super Contributor
Posts: 297

Re: Discussion on Proc transpose

How about this?

Create test scenario.

DATA HAVE;

LENGTH VDSID_HHID $40. ITEM_NAME $40.;

INFILE DATALINES DELIMITER=",";

INPUT  VDSID_HHID $ ITEM_NAME $ QUANTITY TOTAL_VALUE;

DATALINES;

INMHKAL100213000,SORGHUM,66,528

INMHKAL100213000,WHEAT,15,75

INMHKAL100213000,RICE,10,60

INMHKAL100213000,PIGEONPEA,4,119

INMHKAL100213000,CHICKPEA,1,12

INMHKAL100213000,MATKI/COWPEA,1,28

INMHKAL100213000,EDIBLE OIL,4,208

INMHKAL100213000,ALL TYPES OF VEGETABLES,0,115

INMHKAL100213000,ALL TYPES OF FRUITS,0,30

INMHKAL100214000,PDS RICE,10,60

INMHKAL100214000,SORGHUM,165,1155

INMHKAL100214000,WHEAT,20,100

INMHKAL100214000,PIGEONPEA,5,133

INMHKAL100214000,CHICKPEA,2,48

INMHKAL100214000,MATKI/COWPEA,4,94.5

INMHKAL100214000,GROUNDNUT OIL,6,276

INMHKAL100214000,ALL TYPES OF VEGETABLES,0,70

INMHKAL100214000,ALL TYPES OF FRUITS,216,216

RUN;

Summarise multiple observations with the same VDSID_HHID and ITEM_NAME.

PROC SUMMARY DATA=HAVE NWAY MISSING;

  CLASS VDSID_HHID ITEM_NAME;

  VAR QUANTITY TOTAL_VALUE;

  OUTPUT OUT=HAVESUM (drop = _Smiley Happy SUM=;

RUN;

Transpose data to have a single column called COL1 containing the values of both QUANTITY and  TOTAL_VALUE and a variable called _NAME_ containing the variable names transposed into COL1.

PROC TRANSPOSE DATA=HAVESUM OUT=TRANSPOSE;

BY ITEM_NAME VDSID_HHID NOTSORTED;

VAR QUANTITY TOTAL_VALUE;

RUN;

Transpose the data a second time, using the ITEM_NAME and _NAME_ variables in the ID statement to name the created variables in a format similar to what you desired.

PROC TRANSPOSE DATA=TRANSPOSE OUT=TRANSPOSE1 (DROP=_NAME_) DELIMITER=_;

ID ITEM_NAME _NAME_;

VAR COL:;

BY VDSID_HHID NOTSORTED;

RUN;

You can now merge your 2 datasets.

You could do something similar in a datastep, however I am sure you know how to create a series of IF ELSE statements.

Regards,

Scott

Contributor
Posts: 57

Re: Discussion on Proc transpose

Dear Scott,

Thanks a lot for your idea now it's working. i will keep in touch with you.

thanks a lot.

My mail id - anilgvdbm@gmail.com

Super Contributor
Posts: 297

Re: Discussion on Proc transpose

Glad that I could help.  Good luck with the rest of your task.

Super Contributor
Posts: 307

Re: Discussion on Proc transpose

Maybe I'm asking something that everyone already knows the answer to, so I apologize if I'm asking out-of-turn.

Is your data ("dataset1") already in a SAS dataset? Or, is it in the form of text as per your "Real data.txt.zip"?

Contributor
Posts: 57

Re: Discussion on Proc transpose

dataset1 is in the form of text which i attached in Zip file as a text file. Dataset2 i forgot to attach. even i need transpose for dataset 1. so in this operation dataset2 is not required.

Super Contributor
Posts: 307

Re: Discussion on Proc transpose

Ok - so the first thing you need to do is to turn your text file into a SAS dataset. Your text file appears to be tab delimited. Try the following code. Once you have a SAS dataset, then try PROC TRANSPOSE.

(Second try - code didn't show up the first time . . . )

proc import datafile='some path\real data.txt'

     out=datafile1

     dbms=dlm

     replace;

     delimiter='09'x;

run;

Super Contributor
Posts: 307

Re: Discussion on Proc transpose

I forgot to mention that your variable names will not show up as you expect them to when you use the proc import code.

☑ This topic is SOLVED.

Need further help from the community? Please ask a new question.

Discussion stats
  • 13 replies
  • 582 views
  • 3 likes
  • 6 in conversation