BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
anilgvdbm
Quartz | Level 8

Dear Sir/Friends,

   I am Anil working in Some international research institute i have one question regarding an Proc transpose. That is there are variables called person  ID, Item name, quantity, Value in  dataset1 and there are person ID and some information on person family  in  dataset2. So in  dataset2 contains 40 observations and  dataset1 contains 1600 observations so i want to merge these two datasets but we can't because in dataset1 the ID repeats 40 times so i planned to transpose dataset1 like all itemnames  as column names and using by variable as a Person ID.Here is the  small sample of the dataset1.

proc transpose data=dataset1 out=dataset2;
id Item_name;
var  Quantity Total value;
by Vdsid_hhid ;
run;

Vdsid_hhid               Item_name    Quantity    Total value

INMHKAL100213000    Sorghum    66    528

INMHKAL100213000    Wheat    15    75

INMHKAL100213000    Rice    10    60

INMHKAL100213000    Pigeonpea    4    119

INMHKAL100213000    Chickpea    1    12

INMHKAL100213000    Matki/Cowpea    1    28

INMHKAL100213000    Edible oil    4    208

INMHKAL100213000    All types of vegetables    0    115

INMHKAL100213000    All types of fruits    0    30

INMHKAL100214000    PDS Rice    10    60

INMHKAL100214000    Sorghum    165    1155

INMHKAL100214000    Wheat    20    100

INMHKAL100214000    Pigeonpea    5    133

INMHKAL100214000    Chickpea    2    48

INMHKAL100214000    Matki/Cowpea    4    94.5

INMHKAL100214000    Groundnut oil    6    276

INMHKAL100214000    All types of vegetables    0    70

INMHKAL100214000    All types of fruits    216    216

so i need transpose like this

Vdsid_hhid   Sorghum            Wheat Rice        Pigeonpea          Chickpea       

                  quantity  value    quantity  value    quantity  value    quantity  value

I tried in many ways but i am failed to get these results. Please can anybody give some idea on this.

here i attached the data for your kind reference.

Thank you

Best Regards,

Anil

1 ACCEPTED SOLUTION

Accepted Solutions
Scott_Mitchell
Quartz | Level 8

How about this?

Create test scenario.

DATA HAVE;

LENGTH VDSID_HHID $40. ITEM_NAME $40.;

INFILE DATALINES DELIMITER=",";

INPUT  VDSID_HHID $ ITEM_NAME $ QUANTITY TOTAL_VALUE;

DATALINES;

INMHKAL100213000,SORGHUM,66,528

INMHKAL100213000,WHEAT,15,75

INMHKAL100213000,RICE,10,60

INMHKAL100213000,PIGEONPEA,4,119

INMHKAL100213000,CHICKPEA,1,12

INMHKAL100213000,MATKI/COWPEA,1,28

INMHKAL100213000,EDIBLE OIL,4,208

INMHKAL100213000,ALL TYPES OF VEGETABLES,0,115

INMHKAL100213000,ALL TYPES OF FRUITS,0,30

INMHKAL100214000,PDS RICE,10,60

INMHKAL100214000,SORGHUM,165,1155

INMHKAL100214000,WHEAT,20,100

INMHKAL100214000,PIGEONPEA,5,133

INMHKAL100214000,CHICKPEA,2,48

INMHKAL100214000,MATKI/COWPEA,4,94.5

INMHKAL100214000,GROUNDNUT OIL,6,276

INMHKAL100214000,ALL TYPES OF VEGETABLES,0,70

INMHKAL100214000,ALL TYPES OF FRUITS,216,216

RUN;

Summarise multiple observations with the same VDSID_HHID and ITEM_NAME.

PROC SUMMARY DATA=HAVE NWAY MISSING;

  CLASS VDSID_HHID ITEM_NAME;

  VAR QUANTITY TOTAL_VALUE;

  OUTPUT OUT=HAVESUM (drop = _:) SUM=;

RUN;

Transpose data to have a single column called COL1 containing the values of both QUANTITY and  TOTAL_VALUE and a variable called _NAME_ containing the variable names transposed into COL1.

PROC TRANSPOSE DATA=HAVESUM OUT=TRANSPOSE;

BY ITEM_NAME VDSID_HHID NOTSORTED;

VAR QUANTITY TOTAL_VALUE;

RUN;

Transpose the data a second time, using the ITEM_NAME and _NAME_ variables in the ID statement to name the created variables in a format similar to what you desired.

PROC TRANSPOSE DATA=TRANSPOSE OUT=TRANSPOSE1 (DROP=_NAME_) DELIMITER=_;

ID ITEM_NAME _NAME_;

VAR COL:;

BY VDSID_HHID NOTSORTED;

RUN;

You can now merge your 2 datasets.

You could do something similar in a datastep, however I am sure you know how to create a series of IF ELSE statements.

Regards,

Scott

View solution in original post

13 REPLIES 13
Murray_Court
Quartz | Level 8

Hi Anil,

From your description of the problem it seems like a simple DATA step merge would work, although if you could attack your dataset2 that would help to clarify things.

The following program should add further details from dataset2 onto each Product ID in dataset1.

data one;

input PID $1-16 NAME $ quantity value;
datalines;
INMHKAL100213000 Sorghum 66 528
INMHKAL100213000 Wheat 15 75
INMHKAL100213000 Chickpea 1 12
INMHKAL100213000 Matki/Cowpea 1 28
INMHKAL100213000 Edible 4 208
INMHKAL100213000 All_types_of 0 115
INMHKAL100213000 All_types_of 0 30
INMHKAL100213000 Milk/Cur 30 300
INMHKAL100213000 Sugar 5 100
INMHKAL100213000 Groundnut 3 75
INMHKAL100214000 PDS_Rice 10 60
INMHKAL100214000 Sorghum 165 1155
INMHKAL100214000 Wheat 20 100
INMHKAL100214000 Pigeonpea 5 133
INMHKAL100214000 Chickpea 2 48

;

data two;

input PID $ 1-16 Additional $17-35;
datalines;
INMHKAL100213000 Info on first PID
INMHKAL100214000 Info on second PID
;

data three;
merge one two;
by PID;
run;

PIDNAMEquantityvalueAdditional
INMHKAL100213000Sorghum66528Info on first PID
INMHKAL100213000Wheat1575Info on first PID
INMHKAL100213000Chickpea112Info on first PID
INMHKAL100213000Matki/Co128Info on first PID
INMHKAL100213000Edible4208Info on first PID
INMHKAL100213000All_type0115Info on first PID
INMHKAL100213000All_type030Info on first PID
INMHKAL100213000Milk/Cur30300Info on first PID
INMHKAL100213000Sugar5100Info on first PID
INMHKAL100213000Groundnu375Info on first PID
INMHKAL100214000PDS_Rice1060Info on second PID
INMHKAL100214000Sorghum1651155Info on second PID
INMHKAL100214000Wheat20100Info on second PID
INMHKAL100214000Pigeonpe5133Info on second PID
INMHKAL100214000Chickpea248Info on second PID
anilgvdbm
Quartz | Level 8

Dear Murry,

Thanks a lot for your reply actually my problem is not merging i need to transpose the item names in to column names. Please give me an solution for this.

Actual results i need like this

PID                                   Sorghum            Wheat                Chickpea               PDS_Rice

                                  quantity   value     quantity   value     quantity   value     quantity   value

INMHKAL100213000      66           528        21         434        12           121

INMHKAL100214000     

So for this i used sas codes like this

proc transpose data=dataset1 out=transpose;

id Item_name;

var  Quantity Value;

by HHID;

run;

Please give me yourmail ID i will forward the sample dataset for your reference.

Thanks

Regards,

Anil

Tom
Super User Tom
Super User

Looks like a report and not a transpose.  Use PROC REPORT.

Murray_Court
Quartz | Level 8

So dataset 2 is not relevant to this operation? I will have another look at this on my lunch break. Thanks for additional info.

anilgvdbm
Quartz | Level 8

Ya dataset 2 is not relevant i need only transpose for dataset 1 which i have sent by attachment earlier.

Thank you

Anil

LinusH
Tourmaline | Level 20

To me your desired result looks like a report, not a database table.

You might want to look at PROC TABULATE or PROC REPORT. From these you could even store the result in an output table if you wish.

Data never sleeps
Scott_Mitchell
Quartz | Level 8

How about this?

Create test scenario.

DATA HAVE;

LENGTH VDSID_HHID $40. ITEM_NAME $40.;

INFILE DATALINES DELIMITER=",";

INPUT  VDSID_HHID $ ITEM_NAME $ QUANTITY TOTAL_VALUE;

DATALINES;

INMHKAL100213000,SORGHUM,66,528

INMHKAL100213000,WHEAT,15,75

INMHKAL100213000,RICE,10,60

INMHKAL100213000,PIGEONPEA,4,119

INMHKAL100213000,CHICKPEA,1,12

INMHKAL100213000,MATKI/COWPEA,1,28

INMHKAL100213000,EDIBLE OIL,4,208

INMHKAL100213000,ALL TYPES OF VEGETABLES,0,115

INMHKAL100213000,ALL TYPES OF FRUITS,0,30

INMHKAL100214000,PDS RICE,10,60

INMHKAL100214000,SORGHUM,165,1155

INMHKAL100214000,WHEAT,20,100

INMHKAL100214000,PIGEONPEA,5,133

INMHKAL100214000,CHICKPEA,2,48

INMHKAL100214000,MATKI/COWPEA,4,94.5

INMHKAL100214000,GROUNDNUT OIL,6,276

INMHKAL100214000,ALL TYPES OF VEGETABLES,0,70

INMHKAL100214000,ALL TYPES OF FRUITS,216,216

RUN;

Summarise multiple observations with the same VDSID_HHID and ITEM_NAME.

PROC SUMMARY DATA=HAVE NWAY MISSING;

  CLASS VDSID_HHID ITEM_NAME;

  VAR QUANTITY TOTAL_VALUE;

  OUTPUT OUT=HAVESUM (drop = _:) SUM=;

RUN;

Transpose data to have a single column called COL1 containing the values of both QUANTITY and  TOTAL_VALUE and a variable called _NAME_ containing the variable names transposed into COL1.

PROC TRANSPOSE DATA=HAVESUM OUT=TRANSPOSE;

BY ITEM_NAME VDSID_HHID NOTSORTED;

VAR QUANTITY TOTAL_VALUE;

RUN;

Transpose the data a second time, using the ITEM_NAME and _NAME_ variables in the ID statement to name the created variables in a format similar to what you desired.

PROC TRANSPOSE DATA=TRANSPOSE OUT=TRANSPOSE1 (DROP=_NAME_) DELIMITER=_;

ID ITEM_NAME _NAME_;

VAR COL:;

BY VDSID_HHID NOTSORTED;

RUN;

You can now merge your 2 datasets.

You could do something similar in a datastep, however I am sure you know how to create a series of IF ELSE statements.

Regards,

Scott

anilgvdbm
Quartz | Level 8

Dear Scott,

Thanks a lot for your idea now it's working. i will keep in touch with you.

thanks a lot.

My mail id - anilgvdbm@gmail.com

Scott_Mitchell
Quartz | Level 8

Glad that I could help.  Good luck with the rest of your task.

Fugue
Quartz | Level 8

Maybe I'm asking something that everyone already knows the answer to, so I apologize if I'm asking out-of-turn.

Is your data ("dataset1") already in a SAS dataset? Or, is it in the form of text as per your "Real data.txt.zip"?

anilgvdbm
Quartz | Level 8

dataset1 is in the form of text which i attached in Zip file as a text file. Dataset2 i forgot to attach. even i need transpose for dataset 1. so in this operation dataset2 is not required.

Fugue
Quartz | Level 8

Ok - so the first thing you need to do is to turn your text file into a SAS dataset. Your text file appears to be tab delimited. Try the following code. Once you have a SAS dataset, then try PROC TRANSPOSE.

(Second try - code didn't show up the first time . . . )

proc import datafile='some path\real data.txt'

     out=datafile1

     dbms=dlm

     replace;

     delimiter='09'x;

run;

Fugue
Quartz | Level 8

I forgot to mention that your variable names will not show up as you expect them to when you use the proc import code.

sas-innovate-2024.png

Available on demand!

Missed SAS Innovate Las Vegas? Watch all the action for free! View the keynotes, general sessions and 22 breakouts on demand.

 

Register now!

How to choose a machine learning algorithm

Use this tutorial as a handy guide to weigh the pros and cons of these commonly used machine learning algorithms.

Find more tutorials on the SAS Users YouTube channel.

Discussion stats
  • 13 replies
  • 1794 views
  • 3 likes
  • 6 in conversation