- Learning Quantitative Finance with R
- Dr. Param Jeet, Prashant Vats
- 2021-07-09 19:06:52
Sampling
When building a model in finance, the dataset may be so large that model building becomes very time-consuming. Each time the model needs to be tweaked, the full volume of data has to be processed again. It is therefore better to draw a random or proportionate sample from the population data, on which model building is easier and faster. This section discusses how to select a random sample and a stratified sample from the data, which is a critical step when building a model on sample data drawn from the population.
Random sampling
Random sampling selects a sample such that every observation in the population has an equal chance of being chosen. It can be done in two ways: without replacement and with replacement.
A random sample without replacement can be drawn by executing the following code:
> RandomSample <- Sampledata[sample(1:nrow(Sampledata), 10,
+ replace=FALSE),]
This generates the following output:
Figure 2.6: Table showing a random sample without replacement
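Because `Sampledata` is not defined in this section, the following self-contained sketch uses a hypothetical toy data frame (`toy`, with made-up columns `Id`, `Flag`, and `Sentiments`) to illustrate the same indexing pattern. The call to `set.seed()` is an addition that makes the draw reproducible; the seed value is arbitrary.

```r
# Hypothetical toy data standing in for Sampledata (columns are assumptions)
toy <- data.frame(
  Id         = 1:50,
  Flag       = rep(c(0, 1), each = 25),
  Sentiments = rep(c("Good", "Bad"), times = 25)
)

set.seed(123)  # arbitrary seed, used only to make the draw reproducible

# Draw 10 distinct row indices, then subset the data frame with them
idx <- sample(seq_len(nrow(toy)), size = 10, replace = FALSE)
random_sample <- toy[idx, ]

nrow(random_sample)   # number of sampled rows
anyDuplicated(idx)    # 0 means no row was drawn twice
```

Using `seq_len(nrow(toy))` instead of `1:nrow(toy)` is a small defensive choice: it behaves sensibly even if the data frame has zero rows.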
A random sample with replacement can be drawn by executing the following code. Replacement means that an observation can be drawn more than once: once a particular observation is selected, it is put back into the population and can be selected again:
> RandomSample <- Sampledata[sample(1:nrow(Sampledata), 10,
+ replace=TRUE),]
This generates the following output:
Figure 2.7: Table showing random sampling with replacement
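To make the effect of replacement visible, the sketch below draws 10 rows from a hypothetical data frame of only 5 rows (`small`, an illustrative name not taken from the text). With `replace=TRUE` this works and necessarily repeats some rows, whereas the same request with `replace=FALSE` fails because the sample is larger than the population.

```r
# Hypothetical 5-row data frame (names and values are assumptions)
small <- data.frame(Id = 1:5, Value = c(10, 20, 30, 40, 50))

set.seed(1)  # arbitrary seed for reproducibility

# 10 draws from 5 rows: only possible when rows can be drawn repeatedly
idx_with <- sample(seq_len(nrow(small)), size = 10, replace = TRUE)
sample_with <- small[idx_with, ]

# At least one row must appear more than once
has_repeats <- anyDuplicated(idx_with) > 0

# The same request without replacement raises an error, caught here
without_ok <- tryCatch({
  sample(seq_len(nrow(small)), size = 10, replace = FALSE)
  TRUE
}, error = function(e) FALSE)

has_repeats   # TRUE
without_ok    # FALSE
```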
Stratified sampling
In stratified sampling, we divide the population into separate groups, called strata. Then a probability sample (often a simple random sample) is drawn from each group. Stratified sampling has several advantages over simple random sampling; for example, it can often achieve the same precision with a smaller sample size, or better precision with the same sample size.
Now let us see how many groups exist by using Flag and Sentiments, as given in the following code:
> library(sampling)
> table(Sampledata$Flag, Sampledata$Sentiments)
The output is as follows:
Figure 2.8: Table showing the frequencies across different groups
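Now you can select the sample from the different groups according to your requirement. The size argument gives the number of observations to draw from each stratum, in the order in which the strata appear in the data, and method="srswor" requests simple random sampling without replacement:
> Stratsubset <- strata(Sampledata, c("Flag","Sentiments"), size=c(6,5,
+ 4,3), method="srswor")
> Stratsubset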
The output is as follows:
Figure 2.9: Table showing output for stratified sampling
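The idea behind the stratified draw can also be sketched in base R, without the sampling package. This is not how `strata()` is implemented; it is a minimal illustration, on a hypothetical toy data frame (`toy`, with assumed `Flag` and `Sentiments` values), of drawing a simple random sample without replacement inside each group. The per-stratum sizes reuse the 6, 5, 4, 3 from the text, but their assignment to particular groups here is an assumption.

```r
# Hypothetical toy data: 4 strata of 20 rows each (values are assumptions)
toy <- data.frame(
  Id         = 1:80,
  Flag       = rep(c(0, 1), each = 40),
  Sentiments = rep(c("Bad", "Good"), times = 40)
)

set.seed(42)  # arbitrary seed for reproducibility

# Row indices grouped by Flag/Sentiments combination, e.g. "0.Bad"
groups <- split(seq_len(nrow(toy)),
                interaction(toy$Flag, toy$Sentiments, drop = TRUE))

# Sample size per stratum, matched by name rather than by position
sizes <- c("0.Bad" = 6, "1.Bad" = 5, "0.Good" = 4, "1.Good" = 3)

# Simple random sample without replacement inside each stratum
picked <- unlist(lapply(names(groups), function(g) {
  rows <- groups[[g]]
  rows[sample.int(length(rows), size = sizes[[g]])]
}))

strat_sample <- toy[picked, ]

# Frequencies per stratum in the sample
table(strat_sample$Flag, strat_sample$Sentiments)
```

Matching the sizes to strata by name avoids depending on the order in which the groups happen to be listed.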