- Learning Apache Apex
- Thomas Weise Munagala V. Ramanath David Yan Kenneth Knowles
- 276字
- 2021-07-02 22:38:40
CheckpointListener/CheckpointNotificationListener
This interface provides hooks for the operator developer to align behavior with the engine's checkpointing mechanism (which will be covered in detail in the Chapter 5, Fault Tolerance and Reliability). For now, it will be sufficient to understand that during checkpointing the state of the operator is externalized and saved to durable storage. In order to optimize this and align any additional fault tolerance support that the operator requires, the interfaces provide:
- beforeCheckpoint: This is called before the engine extracts the state from the operator. This is an opportunity to update the state. For example, suppose the operator implementation needs to write state to a write ahead log and needs to remember the file name(s) and offset range(s) so that when there is a need to recover, it knows what to read. In this example, the operator would flush pending data to the file and the update it's state containing the offset information, so that the meta information is checkpointed and available during recovery.
- checkpointed: This is called immediately after a checkpoint is completed. This could be used to optimize resources or record information about a completed checkpoint. In most cases, the operator or developer would implement beforeCheckpoint, however.
- committed: This is called with the latest known streaming window interval identifier that was fully processed in the entire DAG and therefore will never need to be replayed during recovery. This is an opportunity for operator implementations to finalize state and make it available to external consumers. For example, a file writer could close intermediate files and move/rename them to the final location. Alternatively, incremental state saving mechanisms can materialize that state and drop recovery logs.
推薦閱讀
- Ansible Configuration Management
- 軟件架構設計
- Seven NoSQL Databases in a Week
- 程序設計缺陷分析與實踐
- Matplotlib 3.0 Cookbook
- 完全掌握AutoCAD 2008中文版:綜合篇
- DevOps:Continuous Delivery,Integration,and Deployment with DevOps
- 大數據驅動的設備健康預測及維護決策優化
- ESP8266 Home Automation Projects
- 水下無線傳感器網絡的通信與決策技術
- DevOps Bootcamp
- Chef:Powerful Infrastructure Automation
- 智能生產線的重構方法
- 大數據案例精析
- 重估:人工智能與賦能社會