Good morning!
I've taken SAS Base Programming 1 and 2, and am working through a few different SAS practice books. Now I have my first work project, and am struggling to find the right direction.
My work is tracking fluctuations in confidential quarterly data. With SAS 9.4 (Windows) I need the following results:
1) isolate observations(by accoun) that fluctuate +/- 40% from the prior quarter in four variables(each month plus a total).
2) compare macro changes current to prior quarter, with aggregated changes in those variables +/- 10% by region and SUBJECT variables
Here's the last of the code I wrote so far- this is reduced from a massive .txt file to include only relevant variables, I made extensive use of SAS community posts and other online resources to get this far.
/*this creates new variables for 2 digit,3 digit level SUBJECT codes, MONTH totals, and percent of quarterly SALES change*/
Data practice1_2;
Set practice1_2 (obs=5000);
SUBJ2=substr(SUBJ,1, 2);
SUBJ=substr(SUBJ,1, 3);
ave_TRANS= sum(MONTH1, MONTH2, MONTH3)/3;
pctchng = dif( SALES ) / lag( SALES ) * 100;
label pctchng = "quarterly SALES Change";
quartern=input(quarter, 2.);
pctchngn=input(pctchng, 6.2);
run;
This produces barely useful output, I can manually see changes per observation in a spreadsheet, but its clumsy, and I'm not sure how to get it into manageable output (total observations are approx 300K). On the macro level, 2 and 3 digit subject codes are needed to determine how narrowly/ broadly each subject is analysed.
The goal is to produce a report that tracks regional fluctuation and pinpoints which client account is causing it so the individual account can be researched. Sample data with relevant variables is attached.
Proc report seems like a good option, but I'm getting bogged down in where to start.
I would greatly appreciate any pointers on how to approach this- it feels fairly straightforward, but being new to programming and SAS, my vocabulary obviously has some gaps and perhaps I'm over complicating or looking in the wrong place.
Many many thanks!
Artifact
Showing variables like Month1, Month2 and Month3 may be an indication that your data is "spreadsheet" thinking and not in the best form for working with SAS data. You might be better off restructuring your data to have a date of value, as in an actual SAS date value for the value. Then getting summaries by calendar quarter can be pretty easy.
Example using the data set Sashelp.Stocks that you should have available:
proc means data=sashelp.stocks max min mean; class stock date; format date yyq6.; var open high low; run;
In this case each date represents one month in the data. The group created by the format summarizes values by calendar quarter. Similar output could be sent to a data set.
Then use a data step on the summarized data by your id variables
Your logic is not taking into account the changes of subject as well.
proc summary data=sashelp.stocks nway; class stock date; format date yyq6.; var open; output out=work.summary (drop= _type_ _freq_) mean=; run; data example; set work.summary; by stock; pctchng = dif( open ) / lag( open ) * 100; if first.stock then pctchng=.; run;
I'm not sure why you are playing games with multiplying the percent by 100 and then using an input .
If you want to ROUND a value then use the Round function: Pctchng= round(pctchng, 0.01) ; to round the value to 100ths. This could be done on the calculation that creates the value.
Attempting to get everything into one data step may be adding complexity as well.
@Artifact wrote:
Thank you. I'll look at this example. Feeling the learning curve right now for sure.
Probably similar to the one I felt when I was forced to do things in a spread sheet after I had used SAS for more than 10 years.
I was constantly driven nuts that there wasn't , at least not easily, a way to do By group processing.
And that the number of records (rows) was so limited (64K) that forced the proliferation of columns to do some things.
SAS display formats are one of the very powerful and less obvious tools. Groups created by format assignment are honored by most analysis, reporting and graphing procedures. So with a change of format you can, with date date, change from summaries by calendar day, year/week with some serious caveats about the definition of week, month and year, year, or with data that has multiple years all of the months (ALL Januaries for example). No additional variables need to be created (most times).
Suppose you have data with peoples age. You can get summaries by all sorts of age combinations just by creating a custom format to reflect the boundaries of the age groups. Or for the occasions that you may actually need to create a new variable using a custom format + Put function can keep code cleaner by having the group logic in a format, which is often much more compact then a lot of if/then/else statements. Which also has the added benefit of not having to change code in a bunch of places. Change the format in one place and every place you need it the new values are used.
Thank you so much for the additional comments. I'm pretty solid in Excel, so yes--
Our data, and my brain, definitely are stuck in spreadsheet mode.
The data comes from an ancient Progress database as .txt, and I modified an existing very long SAS program to bring only the variables needed into my dataset. From there, its been trial and error, and searching the SAS communities for ideas.
The Progress data is from drop down fields and/or codes for confidential data, along with counts and dollar amounts.
the spreadsheet attached with the OP shows the variables needed. I've used formats in the class/ practice setting, but hadn't considered that changing formats would simplify this process. I'll look into date formats, and look more closely at the example you shared, which I ran yesterday.
The goal is to track overall quarterly trends by subject, count, and $$, and be able to drill down into region, and account(if one account is drastically effecting the trend). We have the same process every quarter, so its well worth creating a program, or group of programs that can benefit the whole organization.
I realize my shared code and even my question may have looked like jibberish to skilled programmers, and greatly appreciate you taking the time to respond.
Registration is now open for SAS Innovate 2025 , our biggest and most exciting global event of the year! Join us in Orlando, FL, May 6-9.
Sign up by Dec. 31 to get the 2024 rate of just $495.
Register now!
SAS' Charu Shankar shares her PROC SQL expertise by showing you how to master the WHERE clause using real winter weather data.
Find more tutorials on the SAS Users YouTube channel.