- Building Machine Learning Systems with Python
- Luis Pedro Coelho Willi Richert Matthieu Brucher
- 2021-07-23 17:11:20
Answering our initial question
We have finally arrived at some models that we think can represent the underlying process best. It is now a simple matter of finding out when our infrastructure will reach 100,000 requests per hour. We have to calculate when our model function reaches the value of 100,000. Because both models (degree 2 and 3) were so close together, we will do it for both.
With a polynomial of degree 2, we could simply compute the inverse of the function and calculate its value at 100,000. Of course, we would like to have an approach that is easily applicable to any model function.
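For the degree-2 case, the direct approach amounts to the quadratic formula. The following is a minimal sketch, not the book's code. It assumes the rounded coefficients printed for fbt2 later in this section, and the helper name `hours_until` is hypothetical.

```python
import math

# Rounded coefficients of the degree-2 model as printed later in this
# section (assumption: the unrounded fit would differ slightly).
a, b, c = 0.05404, -50.39, 1.262e+04
target = 100000  # requests per hour

def hours_until(target, a, b, c):
    """Return the larger root of a*x**2 + b*x + c = target (assumes a > 0)."""
    disc = b * b - 4 * a * (c - target)
    if disc < 0:
        raise ValueError("the model never reaches the target")
    return (-b + math.sqrt(disc)) / (2 * a)

hours = hours_until(target, a, b, c)
weeks = hours / (7 * 24)  # one data point per hour
print("target reached at about week %.1f" % weeks)
```

This works only because a quadratic has a closed-form solution, which is why a general root-finding approach is preferable for arbitrary model functions.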
This can be done by subtracting 100,000 from the polynomial, which results in another polynomial, and finding its root. SciPy's optimize module has the fsolve function to achieve this, when provided an initial starting position with the x0 parameter. As every entry in our input data file corresponds to one hour, and we have 743 of them, we set the starting position to some value after that. Let fbt2 be the winning polynomial of degree 2:
>>> fbt2 = np.poly1d(np.polyfit(xb[train], yb[train], 2))
>>> print("fbt2(x)= \n%s" % fbt2)
fbt2(x)=
         2
0.05404 x - 50.39 x + 1.262e+04
>>> print("fbt2(x)-100,000= \n%s" % (fbt2-100000))
fbt2(x)-100,000=
         2
0.05404 x - 50.39 x - 8.738e+04
>>> from scipy.optimize import fsolve
>>> reached_max = fsolve(fbt2-100000, x0=800)/(7*24)
>>> print("100,000 hits/hour expected at week %f" % reached_max[0])
100,000 hits/hour expected at week 10.836350
The model predicts 100,000 hits/hour at week 10.836350. Our data covers 743 hours, or about 4.4 weeks, so given the current user behavior and the traction of our start-up, we have roughly six more weeks before we reach our capacity threshold.
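Because the model here is a polynomial, the result of fsolve can be cross-checked without a starting position by asking NumPy for all roots of the shifted polynomial. This is a sketch under assumptions: it rebuilds the polynomial from the rounded coefficients printed above rather than from the fitted data, so the last digits may differ from the fsolve result.

```python
import numpy as np

# Rounded coefficients taken from the printed fbt2 (assumption: the
# unrounded fit is not available here).
p = np.poly1d([0.05404, -50.39, 1.262e+04])
shifted = p - 100000  # zero where the model hits 100,000 requests/hour

# Keep only real roots that lie after the 743 hours of observed data.
future = [r.real for r in shifted.roots
          if abs(r.imag) < 1e-9 and r.real > 743]

hours = min(future)       # first crossing after the observed period
weeks = hours / (7 * 24)  # one data point per hour
print("100,000 hits/hour expected at about week %.1f" % weeks)
```

Unlike fsolve, this returns every root, so the filter on real and future values matters: the quadratic also has a root at a negative hour, which is meaningless here.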
Of course, our prediction carries some uncertainty. To get a realistic picture of it, one could bring in more sophisticated statistics to estimate the variance we can expect when looking further and further into the future.
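One simple way to get a feel for that variance is a bootstrap: refit the model on resampled data many times and look at the spread of the predicted crossing time. The sketch below is only an illustration. The data is synthetic (a made-up quadratic trend plus noise, not the book's traffic data), and the helper `weeks_to_target` is a hypothetical name.

```python
import numpy as np

rng = np.random.default_rng(0)

# SYNTHETIC stand-in for the hourly traffic data (NOT the book's data set).
x = np.arange(743, dtype=float)
y = 0.05 * x**2 - 50.0 * x + 13000.0 + rng.normal(0.0, 500.0, size=x.size)

def weeks_to_target(xs, ys, target=100000, degree=2):
    """Fit a polynomial; return the first crossing after the data, in weeks."""
    shifted = np.poly1d(np.polyfit(xs, ys, degree)) - target
    future = [r.real for r in shifted.roots
              if abs(r.imag) < 1e-9 and r.real > xs.max()]
    return min(future) / (7 * 24) if future else None

estimates = []
for _ in range(200):
    idx = rng.integers(0, x.size, size=x.size)  # resample with replacement
    w = weeks_to_target(x[idx], y[idx])
    if w is not None:  # skip fits that never reach the target
        estimates.append(w)

low, high = np.percentile(estimates, [2.5, 97.5])
print("95%% bootstrap interval: week %.2f to %.2f" % (low, high))
```

The interval only reflects noise around the assumed model shape. It says nothing about whether a degree-2 polynomial remains the right model weeks beyond the observed data.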
There are also the dynamics of our users and their underlying behavior, which we cannot model accurately. However, at this point, we are fine with the current prediction, as it is good enough to answer our initial question of when we would have to increase the capacity of our system. If we then monitor our web traffic closely, we will see in time when we have to allocate new resources.