cx2019
Obsidian | Level 7

As a new user, I have a problem. I will describe it briefly below.

data t1;
input id $;
cards;
A20152017
B20162018
C20152018
;
run;

data t2;
input year A B C;
cards;
2014 1 1 1 2015 2 1 3 2016 1 1 2 2017 3 2 3 2018 1 2 1
2019 2 3 3 ; run;

I have two tables, t1 and t2. In t1, the first character of id (A, B, or C) is the name; the second through fifth characters and the sixth through ninth characters are the starting year and ending year, respectively. In t2, there are points in these years for A, B, and C.

 

I want to get the average points during the years shown in t1 and merge them with t1.

 

Thank you for your help!

20 REPLIES
cx2019
Obsidian | Level 7

I am sorry that there is something wrong with the code, but I hope it is still understandable.

PaigeMiller
Diamond | Level 26

@cx2019 wrote:

As a new user, I have a problem. I will describe it briefly below.

data t1;
input id $;
cards;
A20152017
B20162018
C20152018
;
run;

data t2;
input year A B C;
cards;
2014 1 1 1 2015 2 1 3 2016 1 1 2 2017 3 2 3 2018 1 2 1
2019 2 3 3 ; run;

I have two tables, t1 and t2. In t1, the first character of id (A, B, or C) is the name; the second through fifth characters and the sixth through ninth characters are the starting year and ending year, respectively. In t2, there are points in these years for A, B, and C.

 

I want to get the average points during the years shown in t1 and merge them with t1.


Are you receiving the data in this bizarre format? If so, you have no choice but to work with it as is. Or are you creating the data in this format? (In which case, don't do it like this; there are much better formats.)

 

With the data as is, some re-arranging of the data will be needed.

 

data t1;
    input id $10.;
    startyear=input(substr(id,2,4),4.);
    endyear=input(substr(id,6,4),4.);
    id=substr(id,1,1);
cards;
A20152017
B20162018
C20152018
;
run;

data t2;
input year A B C;
value=a; id='A'; output;
value=b; id='B'; output;
value=c; id='C'; output;
drop a b c;
cards;
2014 1 1 1
2015 2 1 3
2016 1 1 2
2017 3 2 3
2018 1 2 1
2019 2 3 3
;
run;

proc sql;
	create table t3 as select t2.id,mean(t2.value) as mean from t2 
            left join t1 on t1.id=t2.id 
            where t2.year>=t1.startyear and t2.year<=t1.endyear
            group by t2.id;
quit;

 

 

--
Paige Miller
cx2019
Obsidian | Level 7

Thank you! One question in my homework is like this; the data were actually given in two Excel workbooks in this format. You re-arranged the data when creating table t2. What should I do if I use INFILE to import the data, or if the data have already been imported through the menu?

PaigeMiller
Diamond | Level 26

INFILE doesn't change the way the data has to be re-arranged.
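
To make this concrete, here is a minimal, untested sketch of the same parsing done with INFILE instead of in-line CARDS data; the path E:\table1.txt is a placeholder for wherever the raw file actually lives, and the SUBSTR logic is unchanged from the earlier data step:

```sas
/* Sketch only: same re-arranging as before, but the records come from  */
/* an external raw file. 'E:\table1.txt' is a hypothetical path.        */
data t1;
    infile 'E:\table1.txt';
    input id $10.;
    startyear = input(substr(id,2,4), 4.);   /* chars 2-5: start year   */
    endyear   = input(substr(id,6,4), 4.);   /* chars 6-9: end year     */
    id = substr(id,1,1);                     /* char 1: the name        */
run;
```

The downstream PROC SQL step does not change at all; only the source of the raw records does.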

--
Paige Miller
cx2019
Obsidian | Level 7

There are more than 1,000 observations (or 1,000 people) in table1, so it is not easy to re-arrange table2 like this.

How can I simplify the code? With a loop?

I am sorry that I did not explain everything at once.

PaigeMiller
Diamond | Level 26

So instead of t2 in real life, you have an Excel spreadsheet with (approximately) 1000 columns, each one with a header of its ID value? Can you speak to the person who created this monstrosity and ask them to re-arrange it into a format similar to what I show in t2?

 

Otherwise:

 

UNTESTED CODE

 

proc transpose data=t2 out=t2t;
    by year;
run;

proc sql;
	create table t3 as select t2t._name_,mean(t2t.col1) as mean from t2t 
            left join t1 on t1.id=t2t._name_
            where t2t.year>=t1.startyear and t2t.year<=t1.endyear
            group by t2t._name_;
quit;
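
One caveat on the sketch above: PROC TRANSPOSE with a BY statement requires the input to be sorted by the BY variable. The in-line t2 happens to be in year order already, but data imported from a spreadsheet may not be, so a defensive sort first costs little (untested sketch):

```sas
/* Defensive sort before the BY-group transpose; harmless if t2 is      */
/* already in year order, required if it is not.                        */
proc sort data=t2;
    by year;
run;

/* Transpose the wide A/B/C columns into long form: one row per         */
/* (year, _NAME_) pair, with the value in COL1.                         */
proc transpose data=t2 out=t2t;
    by year;
run;
```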

 

--
Paige Miller
cx2019
Obsidian | Level 7

My mistakes.

Table1 should be changed to the following:

A20152017
B20162018
C20152018

A20142018

D20132016

...

A person can appear more than once. There are more than 1,000 rows in table1 and more than 30 columns in table2.

PaigeMiller
Diamond | Level 26

Please explain how to handle the start year and end year when an ID appears more than once.

--
Paige Miller
cx2019
Obsidian | Level 7

I need the average points during different periods (startyear to endyear) for each person.

PaigeMiller
Diamond | Level 26

So in your example, ID A would get an average for 2015 to 2017, and another average for 2014 to 2018?

--
Paige Miller
cx2019
Obsidian | Level 7

Yes. The real data are a little different; I just used this as an example.
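
In that case, one untested sketch building on @PaigeMiller's transposed data: drive the join from t1 and carry startyear and endyear into the GROUP BY, so each (id, period) row of t1 gets its own mean even when an ID repeats. This assumes t1 already has startyear/endyear parsed out and t2t is the long form produced by PROC TRANSPOSE:

```sas
/* Sketch only: one output row per t1 row (id + period), each with its  */
/* own mean, so repeated IDs with different periods are handled.        */
proc sql;
    create table t3 as
    select t1.id, t1.startyear, t1.endyear,
           mean(t2t.col1) as mean
    from t1
         left join t2t
           on t1.id = t2t._name_
              and t2t.year between t1.startyear and t1.endyear
    group by t1.id, t1.startyear, t1.endyear;
quit;
```

Putting the year condition in the ON clause (rather than WHERE) keeps t1 rows in the output even when no years match, which a WHERE filter after a left join would silently drop.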

cx2019
Obsidian | Level 7

Excel workbooks t1 and t2 are attached.

In t1, the first two letters are the province and the last twelve digits are the start year-month and end year-month.

In t2, there are GDP growth rates.

The calculation is the same (simple arithmetic mean).

I am sorry to trouble you.

PaigeMiller
Diamond | Level 26

My company does not allow me to download Excel files; they are a security risk.

 

You can make them .csv files and provide code to read the files; that ought to work. (And chop them down some; I don't need all 1000 rows.)

--
Paige Miller
cx2019
Obsidian | Level 7
proc import datafile='E:\table1.csv' out=dataset1 dbms=csv replace; run;
proc import datafile='E:\table2.csv' out=dataset2 dbms=csv replace; run;

table1

 

EventID
SC00XSQ0199302200102
QH00ZYZ0200301200702
TJ00CWD0197810198006
SD00LRN0197701197903
UB00GDC0197705197906
SJ00RBS0197904198303
JS00WMS0197509197810
YN00APS0197702198507
LN00CPR0197702198004
LN00LH00197803197909
UB00XDQ0197903198312
SH00PC00197901198003
AH00WL00197706197912
SC00ZZY0197510198003
QH00TQL0197702197912
GD00YSK0197912198011
GD00XZX0197812198011
NX00HSL0197701197902
LN00RZY0197712198011

 

table2

 

year	BJ	TJ	SH	HB	NM	UN	SD	GD	UB	XJ	AH	HN	FJ	GS	YN	GZ	GX	HA	HL	JL	JS	JX	LN	NX	QH	SX	SJ	SC	XZ	ZJ	CQ
1978		20.9	15.8	14.5	8	16.4	10.11	1	13.5		0.39	11.3	17.8	13.21	21.67	23.7	11.7		11.1	12.77	24.7	13.3	10.7	8.9	14.93	11	17.6	0	7.2	21.9	
1979	9.7	10	7.4	6.2	9.8	9.1	6.6	8.5	15.6	12.3	9.9	8.7	5.5	1.41	3.11	11	3.4		3.1	5.57	12	15.8	4.9	1.1	0	7.5	2.34	10.2	7.8	13.5	
1980	11.8	10	8.4	3.2	1.7	5.2	12.24	16.6	6.4	7.2	2.38	15.4	18.4	9.08	8.52	4.4	10.2		10	6.45	4.8	4.2	9.2	7.9	17.78	7.3	2	9.5	22.4	16.2	
1981	-1.5	4.8	5.6	1.02	10.6	5.5	5.77	9	6.5	8.5	17.78	7.8	15.5	-8.44	7.8	6.5	8		3.8	5.8	10.8	5.6	-1.6	2	-1.4	4.5	3.92	4.1	20.5	11.5	
1982	7.4	4.3	7.2	11.84	18.6	9.4	11.25	23	12.1	9.8	9.37	4.3	9.3	8.92	15.5	15.8	12.5		6.6	7.6	9.8	9.3	5.3	8.9	11.8	9.1	6.35	10.9	0.3	11.4	
1983	15.8	8.3	7.8	11.5	9.8	9.2	13.88	7.3	5.8	13.4	8.78	23.8	6.2	14.86	8.4	12.6	3.3		8.6	21.7	12.3	6.8	13.3	15.4	10.6	7.3	8.3	11.2	-5.2	8	
1984	17.6	19.3	11.6	14.36	16.1	9.4	17.37	15.6	20.9	14	20.05	10.1	17.9	13.76	14.45	19.8	6.9		11.1	12.49	15.7	15.4	16.8	13.6	13.6	17.8	13.6	12.3	25.3	21.7	
1985	8.6	10.6	13.4	12.51	17.2	12	11.41	18	16.2	16.9	15.45	13.5	17.6	13.19	13.03	7.9	11		6	6.8	17.3	14.8	13.3	17.8	10.84	16.5	7.1	11.9	15.4	21.7	
1986	8.9	5.8	4.4	5.05	5.9	8.1	6.25	12.7	5.5	11.5	11.09	4.6	5.7	11	4.3	5.6	6.4		3.5	7.3	10.4	6.7	8.3	8.3	8.1	8.7	6.5	5.5	-9.2	12.1	
1987	9.6	7.6	7.5	16.62	9	9.3	13.83	19.6	8.4	9.7	4.81	15	13.6	8.9	12.3	10.8	9.2		8.6	18.8	13.4	8.3	14.1	7.9	5.7	10	5.2	8.7	0.1	11.8	
1988	12.8	5.8	10.1	12.22	9.8	8.2	12.54	15.8	7.8	9.6	5.39	9.8	14.3	13.7	16	8.6	4.5	10.3	8.6	15.9	19.6	11.4	11.7	12.1	7.7	21	7.8	7.5	4.2	11.2	
1989	4.4	1.6	3	16.22	2.7	3.6	3.97	7.2	4.5	5.9	5.17	4.4	7.8	8.7	5.8	4.5	3.6	5.7	6.3	-2.5	2.5	6.1	3.1	7.7	1.2	3.3	5.2	2.6	8.4	-0.6	
1990	5.4	5.4	3.5	7.9	7.5	4	5.25	11.6	5	11.7	2.87	4.5	7.5	5.6	8.7	4.3	7	10.6	5.8	3.4	5	4.5	1.1	3.7	3.7	3.4	5	4.3	8.9	3.9	
1991	9.5	6	7.1	9.4	7.5	7.9	14.62	17.7	6.6	14.4	-0.93	6.9	14.2	6.6	6.6	9.2	12.7	14.9	6.9	6	8.3	8.2	6.1	4.8	4.7	7	4.2	8.7	1.6	17.8	
1992	11.6	11.7	14.9	13.9	11	11.1	16.91	22.1	14.1	13.1	16.77	13.7	20.3	9.9	10.9	8.1	18.3	40.2	6.8	12.2	25.6	14.8	12.1	8.6	7.4	9.8	13.8	12.6	7.1	19	
1993	12.1	12.1	14.9	16.5	10.6	12.6	21.88	22.3	14.3	10.3	20.99	15.8	24.1	11.6	10.6	9.9	21.2	20.9	7.6	12.8	19.8	13.7	14.9	10.1	9.6	12	12.2	13.9	8.2	22	
1994	13.5	14.3	14.3	14.9	10.1	10.6	16.26	19.1	15.2	10.9	20.65	13.8	21.1	10.4	11.6	8.5	16	16	8.7	14.3	16.5	17	11.2	8.2	8.2	8.1	9.4	11.1	15.6	20	
1995	12.5	14.9	14.1	13.8	9.1	10.4	14.15	14.9	14.6	9	14.3	14.8	15.1	9.9	11.2	7.5	11.4	4.3	9.6	9.7	15.4	14.5	7.1	9	8	9	11.1	10	17.9	16.7	
1996	9.2	14.3	13	13.5	12.7	12.1	12.24	10.7	13.2	6.4	14.4	13.9	13.9	11.5	10.4	8.9	10.3	4.8	10.5	13.7	12.2	13.4	8.6	18.1	8.6	10.2	11	10.1	13.2	12.7	
1997	9.6	12.1	12.7	12.5	9.7	10.7	11.24	10.6	13	11	12.7	10.4	14.6	8.5	9.4	9	8.1	6.7	10	9.2	12	11.5	8.9	7.6	9	9.2	10.5	10.2	11.3	11.1	11
1998	9.8	9.3	10.1	10.7	9.6	8.5	10.84	10.2	10.3	7.3	8.5	8.7	11.1	9.2	8	8.5	9.1	8.3	8.3	9	11	8.2	8.3	8.5	9	9.1	9	9.1	10.2	10.1	8.4
1999	10.2	10	10.2	9.1	7.8	8.3	10.14	9.5	8.3	7.1	8.1	8	10	8.3	7.2	8.3	7.7	8.6	7.5	8.1	10.1	7.8	8.2	8.7	8.2	8.4	5.1	5.6	9.6	10	7.6
2000	11	10.8	10.8	9.5	9.7	9	10.5	10.8	9.3	8.2	8.3	9.4	9.5	8.7	7.1	8.7	7.3	8.8	8.2	9.2	10.6	8	8.9	9.8	9	9	7.8	9	9.4	11	8.5
2001	11.23	12.04	10.2	8.7	9.56	9	10.05	9.61	9.1	8.1	8.63	9.05	9	9.36	6.5	8.85	8.2	8.87	9.3	9.35	10.2	8.8	9	10.1	12	9.1	8.4	9.2	12.8	10.5	9
2002	10.4	12.5	10.9	9.6	12.1	9	11.6	11.7	9.1	8.1	8.9	9.5	10.5	9.4	8.2	9.1	10.5	9.3	10.3	9.5	11.6	10.5	10.2	10.2	12.4	9.7	11.7	10.6	12.9	12.5	10.3
2003	10.7	14.8	11.8	11.6	16.8	9.6	13.7	14.3	9.4	10.8	9.2	10.8	11.6	10.1	8.6	10.1	10.2	10.5	10.3	10.2	13.6	13	11.5	12.2	12.1	10.9	13.9	11.8	12.1	14.4	11.5
2004	13.23	15.72	13.6	12.53	19.41	12.02	15.35	14.22	11.26	11.11	12.52	13.74	12.08	10.92	11.5	11.43	11.83	10.39	11.66	12.16	14.89	13.2	12.8	11	12.3	12.9	14.1	12.7	12.2	14.3	12.2
2005	11.1	14.5	11.1	13.4	21.6	11.6	15.2	12.5	11.4	10.9	11.8	14.1	11.3	11.7	9	11.5	12.7	10.1	11.6	12	14.5	12.8	12.3	10.3	12.2	12.6	12.5	12.6	12.2	12.4	11.5
2006	12.1	14.4	12.2	13.2	18.6	12	15	14.1	12.1	11.2	12.9	14	13.4	10.6	11.9	10.1	13.5	12.5	12	13	15	12.3	13	12.4	12.2	12.5	11.7	12.3	13.4	13.9	12.3

Some data in table2 are missing.

 

