- Learning Apache Apex
- Thomas Weise, Munagala V. Ramanath, David Yan, Kenneth Knowles
- 2021-07-02 22:38:35
Real-time insights for Advertising Tech (PubMatic)
Companies in the advertising technology (AdTech) industry must cope with data volumes growing at breakneck speed while their customers demand faster insights and analytical reporting.
PubMatic is a leading AdTech company providing marketing automation for publishers and is driven by data at a massive scale. On a daily basis, the company processes over 350 billion bids, serves over 40 billion ad impressions, and processes over 50 terabytes of data. Through real-time analytics, yield management, and workflow automation, PubMatic enables publishers to make smarter inventory decisions and improve revenue performance. Apex is used for real-time reporting and for the allocation engine.
In PubMatic's legacy batch processing system, updated data for the key metrics (revenues, impressions, and clicks) could be delayed by five hours, and auction log data by nine hours.
PubMatic decided to pursue a real-time streaming solution so that it could provide publishers, demand-side platforms (DSPs), and agencies with actionable insights as close to the time of event generation as possible. PubMatic's streaming implementation had to achieve the following:
- Ingest and analyze a high volume of clicks and views (200,000 events/sec) to help their advertising customers improve revenues
- Utilize auction and client log data (22 TB/day) to report critical metrics for campaign monetization
- Handle rapidly increasing network traffic with efficient utilization of resources
- Provide a feedback loop to the ad server for making efficient ad serving decisions
This high-volume data would need to be processed in real time to derive actionable insights, such as campaign decisions and audience targeting.
PubMatic decided to implement its real-time streaming solution with Apex based on the following factors:
- Time to value: the solution could be implemented within a short time frame
- The Apex applications could run on PubMatic's existing Hadoop infrastructure
- Apex had important connectors (files, Apache Kafka, and so on) available out of the box
- Apex supported event-time dimensional aggregations with real-time query capability
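To make the last factor concrete, the following is a minimal sketch of what an event-time dimensional aggregation computes. It is plain Python rather than Apex code and does not use any Apex or Malhar API. The `AdEvent` fields, the `aggregate` function, and the 60-second bucket size are all hypothetical choices made for illustration; they are not PubMatic's schema or configuration.

```python
from collections import defaultdict
from typing import Dict, Iterable, NamedTuple, Tuple


class AdEvent(NamedTuple):
    """Hypothetical ad event; the field names are illustrative only."""
    event_time: int   # epoch seconds at which the event occurred
    publisher: str    # dimension key
    advertiser: str   # dimension key
    impressions: int
    clicks: int
    revenue: float


# (bucket start in epoch seconds, publisher, advertiser)
BucketKey = Tuple[int, str, str]


def aggregate(events: Iterable[AdEvent],
              bucket_seconds: int = 60) -> Dict[BucketKey, Dict[str, float]]:
    """Sum metrics per (time bucket, publisher, advertiser).

    The bucket is derived from the timestamp carried by the event,
    not from arrival order, so a late event still updates the bucket
    in which it actually occurred.
    """
    totals = defaultdict(
        lambda: {"impressions": 0, "clicks": 0, "revenue": 0.0})
    for e in events:
        # Round the event time down to the start of its bucket.
        bucket_start = e.event_time - (e.event_time % bucket_seconds)
        cell = totals[(bucket_start, e.publisher, e.advertiser)]
        cell["impressions"] += e.impressions
        cell["clicks"] += e.clicks
        cell["revenue"] += e.revenue
    return dict(totals)
```

A reporting query then reduces to a lookup on the aggregated keys, for example `aggregate(events)[(60, "pub1", "adv1")]` for one publisher and advertiser pair in the minute starting at second 60. A streaming engine would maintain such totals incrementally and continuously instead of over a finished list, but the grouping logic is the same idea.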
With the Apex-based solution, deployed to production in 2014, PubMatic's end-to-end latency to obtain updated data and metrics for its two use cases fell from hours to seconds. This enabled real-time visibility into the successes and shortcomings of its campaigns and timely tuning of models to maximize successful auctions.
Additional Resources
- Video: PubMatic presents High Performance AdTech Use Cases with Apache Apex at https://www.youtube.com/watch?v=JSXpgfQFcU8
- Slides: https://www.slideshare.net/ashishtadose1/realtime-adtech-reporting-targeting-with-apache-apex