BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
Christobal
Calcite | Level 5

Hello All,

I have the following question: I have a table that contains the utilisation of a server per day. This table contains data for several servers:

DateServerUsage
04.09.2013server1104
05.09.2013server1120
06.09.2013server1114
07.09.2013server1100
08.09.2013server199
09.09.2013server1100
10.09.2013server1103
11.09.2013server1103
12.09.2013server1100
04.09.2013server240
05.09.2013server241
06.09.2013server244
07.09.2013server239
08.09.2013server240
09.09.2013server240
10.09.2013server249
11.09.2013server241
12.09.2013server243

Now I want to find out the difference between the utilisation of each server per day in %. See this example:

DateServerUsage %
04.09.2013server1104
05.09.2013server1120 15,38%
06.09.2013server1114 -5,00%
07.09.2013server1100 -12,28%
08.09.2013server199 -1,00%
09.09.2013server1100 1,01%
10.09.2013server1103 3,00%
11.09.2013server1103 0,00%
12.09.2013server1100 -2,91%
04.09.2013server240
05.09.2013server241 2,50%
06.09.2013server244 7,32%
07.09.2013server239 -11,36%
08.09.2013server240 2,56%
09.09.2013server240 0,00%
10.09.2013server249 22,50%
11.09.2013server241 -16,33%
12.09.2013server243 4,88%

Is there any Procedure or something else that can help me with this question?

Background is, that I want to make an analysis about all the servers that have an increase/decrease in usage of more that 10% compared with the day before.

Thanks in advance, best regards

Christobal

1 ACCEPTED SOLUTION

Accepted Solutions
data_null__
Jade | Level 19

I think you could use PROC EXPAND but I don't have the proper SAS bits that includes PROC EXPAND.  It is easy to do with data step.

data cpu;
   input Date:ddmmyy10. Server $ Usage;
   format date ddmmyyp10.;
  
cards;
04.09.2013 server1 104
05.09.2013 server1 120
06.09.2013 server1 114
07.09.2013 server1 100
08.09.2013 server1 99
09.09.2013 server1 100
10.09.2013 server1 103
11.09.2013 server1 103
12.09.2013 server1 100
04.09.2013 server2 40
05.09.2013 server2 41
06.09.2013 server2 44
07.09.2013 server2 39
08.09.2013 server2 40
09.09.2013 server2 40
10.09.2013 server2 49
11.09.2013 server2 41
12.09.2013 server2 43
;;;;
   run;
proc print;
  
run;
data cpu2;
   set cpu;
   by server date;
   dif = dif(usage);
   lag = lag(usage);
  
if first.server then call missing(dif,lag);
   pchg = dif/lag;
  
format pchg percentn10.2;
  
run;
proc print;
  
run;

View solution in original post

7 REPLIES 7
DJM_SAS
Calcite | Level 5

The first thing that comes to my mind is to sort by date/server, then use the data step with something like the lag function.

So for example, something like:


PROC SORT data=yourdataset;

BY ASCENDING date server; /* hope I've got that ascending syntax right */

run;

Data yourresults;

set yourdataset;

BY date server;

/* formula for a percentage as per your example...just format it into your required percentage format */

percentage = (usage/lag(usage))-1;

/* just a throwaway conditional to make the first observation missing rather than calculate the percentage change from the previous server when you're on the first observation of a new server */

/* I can't actually remember what happens when using the lag function on the first observation, so you might have to change all this, but it should be a good working framework solution */

if first.server then call missing(percentage);

run;

Then you can filter on those observations showing more than a 10% movement if you want to do further analysis...

Edit:  There's a SAS global forum paper on the lag function that appears to be doing almost exactly what you want to do here.

Edit2: Forgot the BY statement, woops!

DanK
Calcite | Level 5

Consider using a SAS Hash Object.  You can define your server list as a hash object.  Read in the same list.  For each server read from the SET, find the corresponding Hash record based on day-1 then calculate your percent difference and save the observation.

data_null__
Jade | Level 19

I think you could use PROC EXPAND but I don't have the proper SAS bits that includes PROC EXPAND.  It is easy to do with data step.

data cpu;
   input Date:ddmmyy10. Server $ Usage;
   format date ddmmyyp10.;
  
cards;
04.09.2013 server1 104
05.09.2013 server1 120
06.09.2013 server1 114
07.09.2013 server1 100
08.09.2013 server1 99
09.09.2013 server1 100
10.09.2013 server1 103
11.09.2013 server1 103
12.09.2013 server1 100
04.09.2013 server2 40
05.09.2013 server2 41
06.09.2013 server2 44
07.09.2013 server2 39
08.09.2013 server2 40
09.09.2013 server2 40
10.09.2013 server2 49
11.09.2013 server2 41
12.09.2013 server2 43
;;;;
   run;
proc print;
  
run;
data cpu2;
   set cpu;
   by server date;
   dif = dif(usage);
   lag = lag(usage);
  
if first.server then call missing(dif,lag);
   pchg = dif/lag;
  
format pchg percentn10.2;
  
run;
proc print;
  
run;
P_Sharma
Calcite | Level 5

You can use Lag function for that. proc sort data=cpu; by server date; run; data test (drop=pc); set cpu; by server date; u_lag=lag(usage); if first.server then u_lag = .; if u_lag ne . and usage ne . then pct=((usage/u_lag)-1)*100; if pct ne . then pc=abs(pct); if pc ge 10; run;

Haikuo
Onyx | Level 15

If there is no skipping in dates, the following Proc sQL seems to work:

data have;

input Date :ddmmyy10. Server$ Usage;

format date ddmmyy10.;

cards;

04.09.2013 server1 104

05.09.2013 server1 120

06.09.2013 server1 114

07.09.2013 server1 100

08.09.2013 server1 99

09.09.2013 server1 100

10.09.2013 server1 103

11.09.2013 server1 103

12.09.2013 server1 100

04.09.2013 server2 40

05.09.2013 server2 41

06.09.2013 server2 44

07.09.2013 server2 39

08.09.2013 server2 40

09.09.2013 server2 40

10.09.2013 server2 49

11.09.2013 server2 41

12.09.2013 server2 43

;

/*no skipping in dates*/

proc sql;

  create table _10percent as

select a.*, (a.usage-b.usage)/b.usage as Percent format=percent10.2 from have a

  left join have b

on a.server=b.server

  and a.date-b.date=1

/* and abs(calculated percent) >0.1*/ /*add this line if only want those larger than 10%*/

;

quit;

If there is skipping in dates, and you want to compare between rows, then you would need an additional step or monotonic() to put in an index for your data before using Proc SQL.

Christobal
Calcite | Level 5

Hello all,

thanks a lot for your answers and sorry for not answering earlier. I will try out your suggestions and give a feedback then.

Best regards

Christobal

Christobal
Calcite | Level 5

Ok, now I tried this solution:

data cpu2;
   set cpu;
   by server date;
   dif = dif(usage);
   lag = lag(usage);
  
if first.server then call missing(dif,lag);
   pchg = dif/lag;
  
format pchg percentn10.2;
  
run;


This is exactly what I was looking for. Thanks a lot again to all of you for your answers!!!!!!!!!!


Best regards


Christbal

sas-innovate-2024.png

Join us for SAS Innovate April 16-19 at the Aria in Las Vegas. Bring the team and save big with our group pricing for a limited time only.

Pre-conference courses and tutorials are filling up fast and are always a sellout. Register today to reserve your seat.

 

Register now!

What is Bayesian Analysis?

Learn the difference between classical and Bayesian statistical approaches and see a few PROC examples to perform Bayesian analysis in this video.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 7 replies
  • 1036 views
  • 3 likes
  • 6 in conversation