As a new user, I have a problem. Then I discribe it in brief.
data t1;
input id $;
cards;
A20152017
B20162018
C20152018
;
run;
data t2;
input year A B C;
cards;
2014 1 1 1
2015 2 1 3
2016 1 1 2
2017 3 2 3
2018 1 2 1
2019 2 3 3
;
run;
I have 2 tables, t1 and t2. In t1, the first charactor(A, B and C) of id is name, the second to the fifth and the sixth to the ninth mean starting year and ending year respectively. In t2, there are points in these years for A, B and C.
I want to get the average points during the years shown in t1 and merge them with set1.
Thank you for your help!
I am sorry that there are someting wrong with the codes, but we understand it.
@cx2019 wrote:
As a new user, I have a problem. Then I discribe it in brief.
data t1; input id $; cards; A20152017 B20162018 C20152018 ; run; data t2; input year A B C; cards;
2014 1 1 1 2015 2 1 3 2016 1 1 2 2017 3 2 3 2018 1 2 1
2019 2 3 3 ; run;I have 2 tables, t1 and t2. In t1, the first charactor(A, B and C) of id is name, the second to the fifth and the sixth to the ninth mean starting year and ending year respectively. In t2, there are points in these years for A, B and C.
I want to get the average points during the years shown in t1 and merge them with set1.
Are you receiving the data in this bizarre format? If so you have no choice but to work with in this format. Or are you creating the data in this format? (in which case, don't do it like this, there are much better formats).
With the data as is, some re-arranging of the data will be needed.
data t1;
input id $10.;
startyear=input(substr(id,2,4),4.);
endyear=input(substr(id,6,5),4.);
id=substr(id,1,1);
cards;
A20152017
B20162018
C20152018
;
run;
data t2;
input year A B C;
value=a; id='A'; output;
value=b; id='B'; output;
value=c; id='C'; output;
drop a b c;
cards;
2014 1 1 1
2015 2 1 3
2016 1 1 2
2017 3 2 3
2018 1 2 1
2019 2 3 3
;
run;
proc sql;
create table t3 as select t2.id,mean(t2.value) as mean from t2
left join t1 on t1.id=t2.id
where t2.year>=t1.startyear and t2.year<=t1.endyear
group by t2.id;
quit;
Thank you! One question of my homework is like this and actually, data were given in 2 excel books in this format. You re-arranged the data when creating table2. What should I do if I use "infile" to import data or the data have been imported through menu.
INFILE doesn't change the way the data has to be re-arranged.
There are more 1000 observations(or 1000 people) in table1, so it is not easy to re-arrange table2 like this.
How to simplify the code ? with a loop?
I am sorry that I did not tell all at one time.
So instead of t2 in real life, you have an Excel spreadsheet where there are (approximately) 1000 columns, each one with a header of it's ID value? Can you speak to the person who created this monstrosity and ask them to re-arrange it into a format similar to what I show in t2?
Otherwise:
UNTESTED CODE
proc transpose data=t2 out=t2t;
by year;
run;
proc sql;
create table t3 as select t2t._name_,mean(t2t.col1) as mean from t2t
left join t1 on t1.id=t2t._name_
where t2t.year>=t1.startyear and t2t.year<=t1.endyear
group by t2t._name_;
quit;
My mistakes.
Table1 needs change like following
A20152017
B20162018
C20152018
A20142018
D20132016
...
A person appears more than one time. There are more than 1000 rows in table1 and more than 30 columns in table2.
Please explain how to handle the start year and end year when an ID appears more than once.
I need average points during different periods(startyear to endyear) for each person.
So in your example, ID A would get an average for 2015 to 2017, and another average for 2014 to 2018?
Yes. The reality is a little different and I just take it as an example.
Excel books t1 and t2 are attached.
In t1, first two letters are the province and last twelve digits are startyear-month and endyear-month.
In t2, there are gdp growth rates.
calculation is the same(simple arithmetic mean).
I am sorry to trouble you.
My company does not allow me to download Excel files, they are a security risk.
You can make them .csv files, and provide code to read the files, that ought to work. (and chop them down some, I don't need all 1000 rows)
proc import datafile='E:\table1.csv' out=dataset1; run;
proc import datafile='E:\table2.csv' out=dataset2; run;
table1
EventID
SC00XSQ0199302200102
QH00ZYZ0200301200702
TJ00CWD0197810198006
SD00LRN0197701197903
UB00GDC0197705197906
SJ00RBS0197904198303
JS00WMS0197509197810
YN00APS0197702198507
LN00CPR0197702198004
LN00LH00197803197909
UB00XDQ0197903198312
SH00PC00197901198003
AH00WL00197706197912
SC00ZZY0197510198003
QH00TQL0197702197912
GD00YSK0197912198011
GD00XZX0197812198011
NX00HSL0197701197902
LN00RZY0197712198011
table2
year BJ TJ SH HB NM UN SD GD UB XJ AH HN FJ GS YN GZ GX HA HL JL JS JX LN NX QH SX SJ SC XZ ZJ CQ
1978 20.9 15.8 14.5 8 16.4 10.11 1 13.5 0.39 11.3 17.8 13.21 21.67 23.7 11.7 11.1 12.77 24.7 13.3 10.7 8.9 14.93 11 17.6 0 7.2 21.9
1979 9.7 10 7.4 6.2 9.8 9.1 6.6 8.5 15.6 12.3 9.9 8.7 5.5 1.41 3.11 11 3.4 3.1 5.57 12 15.8 4.9 1.1 0 7.5 2.34 10.2 7.8 13.5
1980 11.8 10 8.4 3.2 1.7 5.2 12.24 16.6 6.4 7.2 2.38 15.4 18.4 9.08 8.52 4.4 10.2 10 6.45 4.8 4.2 9.2 7.9 17.78 7.3 2 9.5 22.4 16.2
1981 -1.5 4.8 5.6 1.02 10.6 5.5 5.77 9 6.5 8.5 17.78 7.8 15.5 -8.44 7.8 6.5 8 3.8 5.8 10.8 5.6 -1.6 2 -1.4 4.5 3.92 4.1 20.5 11.5
1982 7.4 4.3 7.2 11.84 18.6 9.4 11.25 23 12.1 9.8 9.37 4.3 9.3 8.92 15.5 15.8 12.5 6.6 7.6 9.8 9.3 5.3 8.9 11.8 9.1 6.35 10.9 0.3 11.4
1983 15.8 8.3 7.8 11.5 9.8 9.2 13.88 7.3 5.8 13.4 8.78 23.8 6.2 14.86 8.4 12.6 3.3 8.6 21.7 12.3 6.8 13.3 15.4 10.6 7.3 8.3 11.2 -5.2 8
1984 17.6 19.3 11.6 14.36 16.1 9.4 17.37 15.6 20.9 14 20.05 10.1 17.9 13.76 14.45 19.8 6.9 11.1 12.49 15.7 15.4 16.8 13.6 13.6 17.8 13.6 12.3 25.3 21.7
1985 8.6 10.6 13.4 12.51 17.2 12 11.41 18 16.2 16.9 15.45 13.5 17.6 13.19 13.03 7.9 11 6 6.8 17.3 14.8 13.3 17.8 10.84 16.5 7.1 11.9 15.4 21.7
1986 8.9 5.8 4.4 5.05 5.9 8.1 6.25 12.7 5.5 11.5 11.09 4.6 5.7 11 4.3 5.6 6.4 3.5 7.3 10.4 6.7 8.3 8.3 8.1 8.7 6.5 5.5 -9.2 12.1
1987 9.6 7.6 7.5 16.62 9 9.3 13.83 19.6 8.4 9.7 4.81 15 13.6 8.9 12.3 10.8 9.2 8.6 18.8 13.4 8.3 14.1 7.9 5.7 10 5.2 8.7 0.1 11.8
1988 12.8 5.8 10.1 12.22 9.8 8.2 12.54 15.8 7.8 9.6 5.39 9.8 14.3 13.7 16 8.6 4.5 10.3 8.6 15.9 19.6 11.4 11.7 12.1 7.7 21 7.8 7.5 4.2 11.2
1989 4.4 1.6 3 16.22 2.7 3.6 3.97 7.2 4.5 5.9 5.17 4.4 7.8 8.7 5.8 4.5 3.6 5.7 6.3 -2.5 2.5 6.1 3.1 7.7 1.2 3.3 5.2 2.6 8.4 -0.6
1990 5.4 5.4 3.5 7.9 7.5 4 5.25 11.6 5 11.7 2.87 4.5 7.5 5.6 8.7 4.3 7 10.6 5.8 3.4 5 4.5 1.1 3.7 3.7 3.4 5 4.3 8.9 3.9
1991 9.5 6 7.1 9.4 7.5 7.9 14.62 17.7 6.6 14.4 -0.93 6.9 14.2 6.6 6.6 9.2 12.7 14.9 6.9 6 8.3 8.2 6.1 4.8 4.7 7 4.2 8.7 1.6 17.8
1992 11.6 11.7 14.9 13.9 11 11.1 16.91 22.1 14.1 13.1 16.77 13.7 20.3 9.9 10.9 8.1 18.3 40.2 6.8 12.2 25.6 14.8 12.1 8.6 7.4 9.8 13.8 12.6 7.1 19
1993 12.1 12.1 14.9 16.5 10.6 12.6 21.88 22.3 14.3 10.3 20.99 15.8 24.1 11.6 10.6 9.9 21.2 20.9 7.6 12.8 19.8 13.7 14.9 10.1 9.6 12 12.2 13.9 8.2 22
1994 13.5 14.3 14.3 14.9 10.1 10.6 16.26 19.1 15.2 10.9 20.65 13.8 21.1 10.4 11.6 8.5 16 16 8.7 14.3 16.5 17 11.2 8.2 8.2 8.1 9.4 11.1 15.6 20
1995 12.5 14.9 14.1 13.8 9.1 10.4 14.15 14.9 14.6 9 14.3 14.8 15.1 9.9 11.2 7.5 11.4 4.3 9.6 9.7 15.4 14.5 7.1 9 8 9 11.1 10 17.9 16.7
1996 9.2 14.3 13 13.5 12.7 12.1 12.24 10.7 13.2 6.4 14.4 13.9 13.9 11.5 10.4 8.9 10.3 4.8 10.5 13.7 12.2 13.4 8.6 18.1 8.6 10.2 11 10.1 13.2 12.7
1997 9.6 12.1 12.7 12.5 9.7 10.7 11.24 10.6 13 11 12.7 10.4 14.6 8.5 9.4 9 8.1 6.7 10 9.2 12 11.5 8.9 7.6 9 9.2 10.5 10.2 11.3 11.1 11
1998 9.8 9.3 10.1 10.7 9.6 8.5 10.84 10.2 10.3 7.3 8.5 8.7 11.1 9.2 8 8.5 9.1 8.3 8.3 9 11 8.2 8.3 8.5 9 9.1 9 9.1 10.2 10.1 8.4
1999 10.2 10 10.2 9.1 7.8 8.3 10.14 9.5 8.3 7.1 8.1 8 10 8.3 7.2 8.3 7.7 8.6 7.5 8.1 10.1 7.8 8.2 8.7 8.2 8.4 5.1 5.6 9.6 10 7.6
2000 11 10.8 10.8 9.5 9.7 9 10.5 10.8 9.3 8.2 8.3 9.4 9.5 8.7 7.1 8.7 7.3 8.8 8.2 9.2 10.6 8 8.9 9.8 9 9 7.8 9 9.4 11 8.5
2001 11.23 12.04 10.2 8.7 9.56 9 10.05 9.61 9.1 8.1 8.63 9.05 9 9.36 6.5 8.85 8.2 8.87 9.3 9.35 10.2 8.8 9 10.1 12 9.1 8.4 9.2 12.8 10.5 9
2002 10.4 12.5 10.9 9.6 12.1 9 11.6 11.7 9.1 8.1 8.9 9.5 10.5 9.4 8.2 9.1 10.5 9.3 10.3 9.5 11.6 10.5 10.2 10.2 12.4 9.7 11.7 10.6 12.9 12.5 10.3
2003 10.7 14.8 11.8 11.6 16.8 9.6 13.7 14.3 9.4 10.8 9.2 10.8 11.6 10.1 8.6 10.1 10.2 10.5 10.3 10.2 13.6 13 11.5 12.2 12.1 10.9 13.9 11.8 12.1 14.4 11.5
2004 13.23 15.72 13.6 12.53 19.41 12.02 15.35 14.22 11.26 11.11 12.52 13.74 12.08 10.92 11.5 11.43 11.83 10.39 11.66 12.16 14.89 13.2 12.8 11 12.3 12.9 14.1 12.7 12.2 14.3 12.2
2005 11.1 14.5 11.1 13.4 21.6 11.6 15.2 12.5 11.4 10.9 11.8 14.1 11.3 11.7 9 11.5 12.7 10.1 11.6 12 14.5 12.8 12.3 10.3 12.2 12.6 12.5 12.6 12.2 12.4 11.5
2006 12.1 14.4 12.2 13.2 18.6 12 15 14.1 12.1 11.2 12.9 14 13.4 10.6 11.9 10.1 13.5 12.5 12 13 15 12.3 13 12.4 12.2 12.5 11.7 12.3 13.4 13.9 12.3
some data in table2 are missing
SAS Innovate 2025 is scheduled for May 6-9 in Orlando, FL. Sign up to be first to learn about the agenda and registration!
SAS' Charu Shankar shares her PROC SQL expertise by showing you how to master the WHERE clause using real winter weather data.
Find more tutorials on the SAS Users YouTube channel.