Python Engineer to Architect High-Volume Data Pipeline (Social Engagement Data)
We are a data agency looking to replace an expensive legacy vendor with an in-house solution. We need a Senior Python Developer to build a high-efficiency data pipeline that aggregates public engagement data (Likes/Comments) from professional social networks.

The Goal: Build a "Glass Box" scraper that runs on our cloud infrastructure. We want full ownership of the code and direct billing for the underlying resources (Proxies/APIs).

The Specs (Must Have):
- Volume: Capability to process 200,000-300,000 lookups per week.
- Inputs: We provide post URLs or keywords.
- Outputs: CSV/JSON with User Name, Headline, and Profile URL.

Cost Constraint: The system must operate (infrastructure-wise) for under $1,200/month at full volume. At roughly 1.3M lookups per month, that is a ceiling of about $0.001 per lookup (see the back-of-envelope math at the end of this post).

The Architecture: We believe the best approach is a Python script that leverages enterprise scraping APIs to handle the heavy lifting (e.g., Apify, Scrapingdog, or Bright Data). We do not want a Selenium bot running on a laptop; we want a cloud-deployed script (AWS Lambda/DigitalOcean) that manages rotation and rate limits through those APIs. A rough sketch of the shape we have in mind follows at the end of this post.

Requirements:
- Deep experience with Apify Actors or Scrapingdog.
- Experience with Residential Proxies (configuring bandwidth to minimize waste).
- Ability to parse large JSON datasets efficiently (a streaming example is sketched below).

Ownership: You build it, we own the code.

To Apply: Please tell us which API or proxy provider you would recommend to hit a volume of 300k lookups per week while keeping ongoing tech costs under $1,200/month.
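To make the cost constraint concrete, here is the back-of-envelope math behind the $1,200/month ceiling. Everything below is straight arithmetic on the figures stated above; nothing else is assumed.

```python
# Back-of-envelope numbers behind the cost constraint.
WEEKLY_LOOKUPS = 300_000   # upper end of the stated volume
MONTHLY_BUDGET = 1_200.0   # USD, infrastructure only

monthly_lookups = WEEKLY_LOOKUPS * 52 / 12          # ~1,300,000 lookups/month
cost_per_lookup = MONTHLY_BUDGET / monthly_lookups  # ~$0.0009 per lookup
sustained_rate = WEEKLY_LOOKUPS / (7 * 24 * 3600)   # ~0.5 lookups/second, 24/7

print(f"{monthly_lookups:,.0f} lookups/month")
print(f"${cost_per_lookup:.5f} ceiling per lookup")
print(f"{sustained_rate:.2f} lookups/second sustained")
```

In practice this means we will evaluate providers on cost per result at volume, not on headline proxy pricing.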
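For illustration, here is a minimal sketch of the kind of cloud worker we mean, assuming the apify-client Python package. The actor ID, the run input schema, and the result field names are placeholders, not a specific actor we have chosen; treat this as the shape of the solution, not the solution itself.

```python
# Minimal sketch of the Lambda-style worker we have in mind.
# ASSUMPTIONS: the `apify-client` package; ACTOR_ID, the run_input
# schema, and the result field names are placeholders, not a specific
# actor we have selected.
import csv
import io
import os

from apify_client import ApifyClient

APIFY_TOKEN = os.environ["APIFY_TOKEN"]
ACTOR_ID = os.environ.get("ACTOR_ID", "vendor/engagement-scraper")  # placeholder


def handler(event, context):
    """Lambda entry point: takes a batch of post URLs, returns CSV text."""
    post_urls = event["post_urls"]  # e.g. a few hundred URLs per invocation

    client = ApifyClient(APIFY_TOKEN)
    # The actor (and its proxy configuration) handles rotation and rate
    # limits on the provider's side; this worker just submits a batch,
    # waits for the run to finish, and drains the results.
    run = client.actor(ACTOR_ID).call(run_input={"postUrls": post_urls})

    buf = io.StringIO()
    writer = csv.DictWriter(buf, fieldnames=["user_name", "headline", "profile_url"])
    writer.writeheader()
    rows = 0
    # Stream items out of the run's default dataset instead of loading
    # the whole result set into memory at once.
    for item in client.dataset(run["defaultDatasetId"]).iterate_items():
        writer.writerow({
            "user_name": item.get("userName", ""),
            "headline": item.get("headline", ""),
            "profile_url": item.get("profileUrl", ""),
        })
        rows += 1
    return {"rows": rows, "csv": buf.getvalue()}
```

At roughly 0.5 lookups/second sustained, throughput is not the hard part; batching invocations behind a simple queue spreads the weekly volume evenly and amortizes per-run actor overhead.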
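And by "parse large JSON datasets efficiently" we mean streaming, not loading whole exports into memory. Below is one way to do that with ijson; the top-level array layout and the field names are assumptions about the provider payload, not a known schema.

```python
# Streaming a large JSON export without loading it into memory, using ijson.
# ASSUMPTIONS: the file is a top-level JSON array of records, and the
# field names are placeholders for whatever the provider actually returns.
import ijson


def iter_profiles(path):
    """Yield one record at a time from a multi-gigabyte JSON array."""
    with open(path, "rb") as f:
        # 'item' selects each element of the top-level array in turn,
        # so memory use stays flat regardless of file size.
        for record in ijson.items(f, "item"):
            yield {
                "user_name": record.get("userName", ""),
                "headline": record.get("headline", ""),
                "profile_url": record.get("profileUrl", ""),
            }


if __name__ == "__main__":
    count = 0
    for row in iter_profiles("engagement_export.json"):
        count += 1
    print(f"{count} profiles parsed")
```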