Hi all,
I'm working with a raw dataset regarding U.S. foreign trade that includes a list of monthly export and import values from 1987 to 2015. I'm trying to create a new dataset that lists the total import values for each year. I've tried using do loops, first. and last. statements, but to no avail. After reading in the raw dataset, here is a sample screenshot.
My goal is to have a dataset that lists the year and its corresponding total import values. How should I go about this? Thanks.
Use PROC MEANS or a Summary Task (via the point and click interface) to add up the data to each year.
Apply a format to the date variable as a year to get the annual values.
Here's an example using the stock data aggregated to years.
proc means data=sashelp.stocks N NMISS min max mean median STACKODS;
class date;
format date year4.;
var open high low;
output out=want n= mean= sum= median= mean= / autoname autolabel;
ods output summary = want2;
run;
@rsiddh2 wrote:
Hi all,
I'm working with a raw dataset regarding U.S. foreign trade that includes a list of monthly export and import values from 1987 to 2015. I'm trying to create a new dataset that lists the total import values for each year. I've tried using do loops, first. and last. statements, but to no avail. After reading in the raw dataset, here is a sample screenshot.
My goal is to have a dataset that lists the year and its corresponding total import values. How should I go about this? Thanks.
Hi @rsiddh2 Not sure why there would be a need for loops, if my understanding is correct and the below is perhaps what you are after?
proc sql;
create table want as
select year(date) as year, sum(import) as total
from your_dataset
group by year;
quit;
data temp;
set your_dataset;
year=year(date);
run;
data want;
set temp;
by year;
if first.year then total=import;
else total+import;
if last.year;
run;
This returns a table that lists the year, but for some reason, it does not create the new total variable and the imports given are from the last month. Here is the output after running your code:
I think there is perhaps a gross misunderstanding on my part on the requirement. Can you plz post a sample as plain text(not pics) and a corresponding sample of your required output
Here is each month in 1987 with its corresponding import values. I want to create a new data set that lists the total imports for each year from 1987 to 2015.
Month Imports
January 1987 30,798.5
February 1987 29,810.3
March 1987 32,464.3
April 1987 32,291.1
May 1987 33,170.0
June 1987 35,358.0
July 1987 35,809.9
August 1987 33,826.2
September 1987 33,667.5
October 1987 38,075.6
November 1987 35,210.9
December 1987 35,758.7
TOTAL 1987 406,241.0
@rsiddh2 wrote:
Thank you! As a side note, is it possible to accomplish this using a data step? There are additional things I wish to perform with this new data.
Many times it is easier to break a problem into components, such as the total calculation, using an appropriate tool, like Proc Means and then merging the result together instead of attempting to force everything through a single data step.
And it may not hurt to describe what some of those "additional things" might be. For instance your picture of the data shows two other variables, Export and Balance. If I wanted to get sums of those I would use @Reeza's solution adding Export and Balance to the VAR statement and get the statistics for all 3 variables at one pass.
If you happened to have another variable such as Country the trade was with then you could add a CLASS statement to Proc Means/ or Summary with that variable and you would get the summary by country as well as an overall total.
So Output would look similar to:
1987 | 345,990.70
1988 | 324,383.90
@novinosrin posted the solution then.
Use PROC MEANS or a Summary Task (via the point and click interface) to add up the data to each year.
Apply a format to the date variable as a year to get the annual values.
Here's an example using the stock data aggregated to years.
proc means data=sashelp.stocks N NMISS min max mean median STACKODS;
class date;
format date year4.;
var open high low;
output out=want n= mean= sum= median= mean= / autoname autolabel;
ods output summary = want2;
run;
@rsiddh2 wrote:
Hi all,
I'm working with a raw dataset regarding U.S. foreign trade that includes a list of monthly export and import values from 1987 to 2015. I'm trying to create a new dataset that lists the total import values for each year. I've tried using do loops, first. and last. statements, but to no avail. After reading in the raw dataset, here is a sample screenshot.
My goal is to have a dataset that lists the year and its corresponding total import values. How should I go about this? Thanks.
Build your skills. Make connections. Enjoy creative freedom. Maybe change the world. Registration is now open through August 30th. Visit the SAS Hackathon homepage.
Register today!Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.