
Answering our initial question

We have finally arrived at some models that we think can represent the underlying process best. It is now a simple matter of finding out when our infrastructure will reach 100,000 requests per hour. We have to calculate when our model function reaches the value of 100,000. Because both models (degree 2 and 3) were so close together, we will do it for both.

With a polynomial of degree 2, we could simply compute the inverse of the function and calculate its value at 100,000. However, we would prefer an approach that is just as easily applicable to any model function.

This can be done by subtracting 100,000 from the polynomial, which results in another polynomial, and finding its root. SciPy's optimize module provides the fsolve function for this; it needs an initial starting position, which is passed with the x0 parameter. As every entry in our input data file corresponds to one hour, and we have 743 of them, we set the starting position to some value after that. Let fbt2 be the winning polynomial of degree 2:

>>> fbt2 = np.poly1d(np.polyfit(xb[train], yb[train], 2))
>>> print("fbt2(x)= \n%s" % fbt2)
fbt2(x)=
         2
0.05404 x - 50.39 x + 1.262e+04
>>> print("fbt2(x)-100,000= \n%s" % (fbt2-100000))
fbt2(x)-100,000=
         2
0.05404 x - 50.39 x - 8.738e+04

>>> from scipy.optimize import fsolve
>>> reached_max = fsolve(fbt2-100000, x0=800)/(7*24)

>>> print("100,000 hits/hour expected at week %f" % reached_max[0])
100,000 hits/hour expected at week 10.836350

Our model expects us to hit 100,000 hits/hour at around week 10.8. As our 743 hours of data cover roughly 4.4 weeks, the model tells us that, given the current user behavior and traction of our start-up, it will take about six more weeks for us to reach our capacity threshold.
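Because fsolve only needs a callable, the same root-finding step can be wrapped in a small helper that works for any model function. The following is a minimal sketch, not the book's code: the helper name hours_until is hypothetical, and fbt2 is rebuilt here from the rounded coefficients printed above, so the result can differ slightly from the one obtained with the full-precision fit. For the degree 2 case, the largest real root of the shifted polynomial serves as a cross-check.

```python
import numpy as np
from scipy.optimize import fsolve

# Assumption: rounded coefficients copied from the printed fbt2
# (highest power first), not the full-precision fit.
fbt2 = np.poly1d([0.05404, -50.39, 1.262e+04])


def hours_until(model, target, x0):
    """Return the x (in hours) near x0 at which model(x) equals target.

    model can be any callable, not only a polynomial.
    """
    return fsolve(lambda x: model(x) - target, x0=x0)[0]


hours = hours_until(fbt2, 100000, x0=800)

# Cross-check for the polynomial case: largest real root of fbt2 - 100000.
roots = (fbt2 - 100000).roots
largest = max(r.real for r in roots if abs(r.imag) < 1e-9)

print("fsolve:   week %.2f" % (hours / (7 * 24)))
print("np.roots: week %.2f" % (largest / (7 * 24)))
```

Both numbers should agree with each other and stay close to the week reported above. A starting position that lies to the left of the parabola's minimum could steer fsolve toward the other, meaningless root, which is why x0 is chosen after the last observed hour.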

Of course, our prediction carries some uncertainty. To get a realistic picture of it, one could bring in more sophisticated statistics to estimate the variance we can expect when looking further and further into the future.
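One simple way to get a feeling for that variance is a bootstrap: refit the model on resampled data many times and look at how much the predicted crossing point moves. The sketch below only illustrates the idea and is not the book's method. It runs on synthetic data (the rounded fbt2 coefficients plus made-up Gaussian noise), the starting hour 588 and the noise level 200 are assumptions, and the helper crossing_week is hypothetical, so the printed interval says nothing about the real traffic.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic stand-in for the hourly data (NOT the real traffic data):
# rounded fbt2 coefficients plus made-up noise. The start hour and the
# noise level are assumptions chosen for illustration only.
x = np.arange(588, 743, dtype=float)
y = np.polyval([0.05404, -50.39, 1.262e+04], x) + rng.normal(0, 200, x.size)


def crossing_week(xs, ys, target=100000, degree=2):
    """Fit a polynomial and return the first week after the last observed
    hour at which it reaches target, or None if it never does."""
    model = np.poly1d(np.polyfit(xs, ys, degree))
    roots = (model - target).roots
    real = roots[np.isreal(roots)].real
    future = real[real > xs.max()]
    return future.min() / (7 * 24) if future.size else None


estimates = []
for _ in range(500):
    idx = rng.integers(0, x.size, x.size)  # resample hours with replacement
    week = crossing_week(x[idx], y[idx])
    if week is not None:
        estimates.append(week)

estimates = np.array(estimates)
low, high = np.percentile(estimates, [2.5, 97.5])
print("crossing week: median %.1f, 95%% interval [%.1f, %.1f]"
      % (np.median(estimates), low, high))
```

The width of the interval shows how sensitive a far extrapolation is to small changes in the fitted curvature, even when the fit looks good on the observed range.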

There are also the user and underlying user behavior dynamics that we cannot model accurately. However, at this point, we are fine with the current prediction as it is good enough to answer our initial question of when we would have to increase the capacity of our system. If we then monitor our web traffic closely, we will see in time when we have to allocate new resources.
