How it works...

We begin the recipe by importing Markovify, a library for Markov chain computations, and reading in the text that will train our Markov model (step 1). In step 2, we create a Markov chain model from the text. The following is a relevant snippet from the initialization code of Markovify's Text class:

class Text(object):

    reject_pat = re.compile(r"(^')|('$)|\s'|'\s|[\"(\(\)\[\])]")

    def __init__(self, input_text, state_size=2, chain=None, parsed_sentences=None, retain_original=True, well_formed=True, reject_reg=''):
        """
        input_text: A string.
        state_size: An integer, indicating the number of words in the model's state.
        chain: A trained markovify.Chain instance for this text, if pre-processed.
        parsed_sentences: A list of lists, where each outer list is a "run"
              of the process (e.g. a single sentence), and each inner list
              contains the steps (e.g. words) in the run. If you want to simulate
              an infinite process, you can come very close by passing just one, very
              long run.
        retain_original: Indicates whether to keep the original corpus.
        well_formed: Indicates whether sentences should be well-formed, preventing
              unmatched quotes, parenthesis by default, or a custom regular expression
              can be provided.
        reject_reg: If well_formed is True, this can be provided to override the
              standard rejection pattern.
        """

The most important parameter to understand is state_size, which defaults to 2. With this setting, each state of the Markov chain consists of two consecutive words, and the model learns which words tend to follow each such pair. Increasing this parameter yields more realistic sentences, at the cost of making them less original, because longer states leave fewer possible continuations and the output follows the source text more closely. Next, we apply the Markov chain model we have trained to generate a few example sentences (steps 3 and 4). The generated sentences capture the tone and style of the text. Finally, in step 5, we create a few tweets in the style of the airport reviews using our Markov chain model.
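To make the role of state_size more concrete, the following is a minimal pure-Python sketch of a word-level Markov chain. It is not Markovify's implementation. The function names build_chain and generate and the toy corpus are assumptions made up for illustration. It only shows how a larger state narrows the set of possible next words.

```python
import random
from collections import defaultdict


def build_chain(text, state_size=2):
    """Map each tuple of `state_size` consecutive words to the words seen after it."""
    words = text.split()
    chain = defaultdict(list)
    for i in range(len(words) - state_size):
        state = tuple(words[i:i + state_size])
        chain[state].append(words[i + state_size])
    return dict(chain)


def generate(chain, start, max_words=20, seed=None):
    """Walk the chain from `start` until a dead end or `max_words` is reached."""
    rng = random.Random(seed)
    state = tuple(start)
    output = list(state)
    while len(output) < max_words and state in chain:
        next_word = rng.choice(chain[state])
        output.append(next_word)
        # Slide the window: drop the oldest word, append the new one.
        state = state[1:] + (next_word,)
    return " ".join(output)


# Hypothetical toy corpus, standing in for the airport reviews.
corpus = "the gate was far and the gate was crowded and the staff was kind"

chain_1 = build_chain(corpus, state_size=1)
chain_2 = build_chain(corpus, state_size=2)

# With one word of context, "was" has three possible continuations.
print(chain_1[("was",)])
# With two words of context, "gate was" has only two.
print(chain_2[("gate", "was")])
print(generate(chain_2, ("the", "gate"), seed=0))
```

In this sketch, the state ("was",) can be followed by any word that ever came after "was" in the corpus, whereas ("gate", "was") only allows the words that followed that exact pair. This is the trade-off described above: more context gives more coherent output but fewer novel combinations.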
