I need some help combing data in two columns into one. Below is an example.
I have below.
ID | curmonth | prmonth | test1 | test2 | vege |
34158693 | 201608 | 201608 | 1 | 0 | Potato |
34158693 | 201608 | 201609 | 1 | 0 | Potato |
34158693 | 201610 | 201611 | 1 | 0 | Potato |
I would like to convert as below.
ID | yearmonth | test1 | test2 | vege |
34158693 | 201608 | 1 | 0 | Potato |
34158693 | 201609 | 1 | 0 | Potato |
34158693 | 201610 | 1 | 0 | Potato |
34158693 | 201611 | 1 | 0 | Potato |
I used transpose but it does not work.
proc transpose data=data1 out= data2 (rename=(col1=yearmonth) drop=_name_);
var curmonth prmonth;
by id vege test1 test2;
run;
Please help.
Thanks
Proc expand should be good
data have;
input ID curmonth prmonth test1 test2 vege$;
yearmonth=input(cats(put(prmonth,best.),'01'),yymmdd8.);
cards;
34158693 201608 201608 1 0 Potato
34158693 201608 201609 1 0 Potato
34158693 201610 201611 1 0 Potato
;
proc expand data=have out=want(drop=curmonth prmonth) to=month;
id yearmonth;
by id;
format yearmonth yymmn.;
run;
Hi Jagdish, thanks for your reply.
I have not tried your solution yet because in you code you are reading the data where as I already have 'have' table so trying to figure out how to add the 'yearmonth' field in the 'have' table.
Also, looking at the way you are creating the 'yearmonth' field, would it be putting value '201610' in the 'yearmonth' field because the '201610' is in the 'curmonth' field not the 'prmonth'. I would like to see below values in 'yearmonth' field.
yearmonth |
201608 |
201609 |
201610 |
201611 |
Thanks again for your reply.
Please try the below code to get the expected output as you mentioned
data have;
input ID curmonth prmonth test1 test2 vege$;
cards;
34158693 201608 201608 1 0 Potato
34158693 201608 201609 1 0 Potato
34158693 201610 201611 1 0 Potato
;
data want1;
set have;
yearmonth=input(cats(put(prmonth,best.),'01'),yymmdd8.);
run;
proc expand data=want1 out=want2(drop=curmonth prmonth) to=month;
id yearmonth;
by id;
format yearmonth yymmn.;
run;
Thanks Jagdish.
I tried to run the proc expand code and I got below error.
ERROR: Procedure EXPAND not found.
Looks like, I my SAS EG does not support proc expand. Strange
I will have to ask my coworkers about the it.
any other way this can be done?
Thanks
It is difficult to propose a good solution without knowing what do the variables mean.
Is curmonth means period start ? - I shall assume - YES;
Is prmonth means period end ? - I shall assume - YES;
Are those two sas date variables with yymmn6. format
or just numeric variables or just strings ?
If dates are strings the use: input(datex||'01', yymmdd.) - to convert it into sas date;
If date is just numeric use: input(put(daten,6.)||'01', yymmdd.) - to convert it into sas date;
Having above assumptions, try next code:
data temp;
set have;
by id;
yearmonth = curmonth;
do while (yearmonth le prmonth);
output;
yearmonth = intnx('month',yearmonth,1);
end;
format yearmonth yymmn6.;
drop curmonth prmonth;
run;
proc sort data=temp out=want nodupkey;
by id yearmonth;
run;
Thanks Shumel for your reply.
The curmonth and prmonth are character fields. Looking back at my table, looks like below is how the field names and data should look like.
ID | prmonth | curmonth | test1 | test2 | vege |
34158693 | 201608 | 201608 | 1 | 0 | Potato |
34158693 | 201608 | 201609 | 1 | 0 | Potato |
34158693 | 201610 | 201611 | 1 | 0 | Potato |
I have switched the labels for second and third columns. Both month fields are like start and end month. Below is what I would like to see.
ID | yearmonth | test1 | test2 | vege |
34158693 | 201608 | 1 | 0 | Potato |
34158693 | 201609 | 1 | 0 | Potato |
34158693 | 201610 | 1 | 0 | Potato |
34158693 | 201611 | 1 | 0 | Potato |
Thanks
Is your table a sas dataset ? - assuming yes, then as the dates are given in character variables,
here is the adapted code:
data temp;
set have(rename=(curmonth=date_from prmonth=date_upto));
by id;
curmonth = input(date_from||'01', yymmdd.);
prmonth = input(date_upto||'01', yymmdd.);
yearmonth = curmonth;
do while (yearmonth le prmonth);
output;
yearmonth = intnx('month',yearmonth,1);
end;
format yearmonth yymmn6.;
drop curmonth prmonth;
run;
proc sort data=temp out=want nodupkey;
by id yearmonth;
run;
Run this code.
If need post both your log and your output (of the sample);
Thanks Shumel for the new version of the code. Sorry for not replying sooner. I was on pto.
I tried your logic but for some reason, it ran for a longtime and it came back with the insufficient space for the temp table. The table has only 8,000+ records so not sure the reason. I ended up seperating the two date fields in two tables. After that I appended one table into another. I believe that worked.
Thanks again for your help.
If you just append 2nd date to 1st date then you miss yearmonth in between.
I assume that TEMP file overflow is a result of overlapping periods in observations of same id.
Then I'll try overcome it by slight changes in my code:
proc sort data=have; by id curmonth prmonth;
data want;
set have(rename=(curmonth=date_from prmonth=date_upto));
by id;
retain yearmonth;
curmonth = input(date_from||'01', yymmdd.);
prmonth = input(date_upto||'01', yymmdd.);
if first.id then yearmonth = curmonth;
else if curmonth < yearmonth then
curmonth = yearmonth.
do while (yearmonth le prmonth);
output;
yearmonth = intnx('month',yearmonth,1);
end;
format yearmonth yymmn6.;
drop curmonth prmonth;
run;
Thanks for your reply. Unfortunately, I do not have license for SAS/ETS.
Available on demand!
Missed SAS Innovate Las Vegas? Watch all the action for free! View the keynotes, general sessions and 22 breakouts on demand.
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.