Job Description
Job Responsibilities
Build, operate, and maintain web scraping/crawling systems for sports-related data sources such as live scores, match schedules, betting odds, livestreams, news, player/team statistics, etc.
Develop real-time data collection pipelines with low latency (sub-second for livescore systems) while ensuring 24/7 stability and reliability.
Design and optimize scraping workflows including distributed jobs, queues, retry logic, monitoring, and alerting when data sources fail or change structure.
Implement anti-detection and anonymity techniques, including:
Proxy rotation (residential/datacenter/mobile)
User-agent & fingerprint rotation
TLS/JA3 spoofing
Bypassing Cloudflare, Akamai, PerimeterX, and DataDome
CAPTCHA solving
Headless browser stealth
Reverse-engineer private APIs, m3u8/HLS streams, and WebSocket feeds from sports data providers.
Standardize, clean, and push collected data into storage systems such as MySQL, MongoDB, Redis, RabbitMQ, etc.
Monitor source structure changes and quickly patch scrapers when targets update DOM structures, APIs, or anti-bot systems.
Automate SEO workflows, including:
Bulk keyword research
SERP crawling
Keyword rank tracking
On-page SEO audits
Index checking
Detecting Google algorithm changes
Participate in developing domain evaluation and domain hunting systems similar to SpamZilla / ODYS / DomCop / [protected info]:
Collect expired/dropped/auction domains from sources such as GoDaddy Auctions, NameJet, SnapNames, DropCatch, Dynadot, Sav, [protected info], Pool, etc.
Evaluate domain quality and domain age
Detect spam signals
Generate scoring and reporting systems for domains
Automate repetitive SEO-related tasks:
Submit URLs for indexing
Create & verify Search Console / Analytics properties
Check [protected info] / sitemap / lang
Monitor uptime and Core Web Vitals
Optimize infrastructure costs (proxy, server, browser pool) while maintaining throughput and system stability.
Collaborate with Backend, Data, and DevOps teams to integrate collected data into final products.
Requirements
Mandatory Requirements
Send your ENGLISH CV for this position
Minimum 2 years of experience in production-scale scraping/automation systems (not standalone scripts).
Strong proficiency in Python (Scrapy, Playwright, Selenium, httpx/aiohttp, asyncio) or [protected info] (Puppeteer, Playwright, Crawlee, undici).
Deep understanding of:
HTTP/HTTPS
TLS handshake
HTTP/2
WebSocket
Cookie/session handling
CORS
Hands-on experience bypassing common anti-bot systems:
Cloudflare (Turnstile, JS Challenge)
DataDome
PerimeterX
Akamai
Imperva
reCAPTCHA v2/v3
hCaptcha
Experience managing proxy pools (residential/mobile/ISP) and strong understanding of fingerprinting concepts:
Canvas
WebGL
Audio
Fonts
TLS/JA3
HTTP/2 frame order
Experience with headless browser stealth tools:
puppeteer-extra-plugin-stealth
playwright-stealth
patchright
camoufox
undetected-chromedriver
Ability to parse HLS/DASH streams (.m3u8, .mpd) and handle tokens and signed URLs for streaming.
Strong understanding of core SEO metrics:
DA/PA
DR/UR
TF/CF
Topical Trust Flow
Anchor ratio
Link velocity
Spam score
Experience integrating APIs from SEO platforms:
As
DataForSEO
SerpApi
WhoisXML
DomainTools
Wayback Machine CDX API
Strong experience with Linux servers, Docker, cron jobs, queue systems (Redis/RabbitMQ/Kafka), logging, and monitoring tools (Grafana/Prometheus/Sentry).
Understanding of:
Rate limiting
Distributed scraping
IP reputation
Request shaping
Preferred Qualifications
Previous experience building scraping systems for sports, betting, or livescore platforms.
Ability to reverse-engineer mobile applications using mitmproxy, Charles, or Frida to extract private APIs.
Experience with Kubernetes and autoscaling browser pools.
Familiarity with GraphQL, gRPC, and Protobuf endpoints.
Understanding of the WordPress ecosystem:
WP-CLI
REST API
XML-RPC
WordPress Database
Basic hooks/filters
Familiarity with fingerprint/cookie isolation tools:
AdsPower
GoLogin
Kameleo
Octo Browser
Desired Qualities
Strong problem-solving mindset with patience and persistence.
Proactive, creative, and adaptable at work.
Careful and detail-oriented with strong awareness of system stability, monitoring, alerting, and recovery processes.
Benefits
Salary: 13 - 18 million VND (negotiable based on capability).
Full participation in social insurance, health insurance, and unemployment insurance according to Vietnamese labor law.
Public holidays and Tet holidays in accordance with Vietnamese regulations.
Opportunity to work in a highly practical environment with large-scale scraping and automation challenges.
General Information
- Income: 13 - 18 million VND
Work Location
- AB04, Alley 09, Street 66, An Khánh Ward, Ho Chi Minh City
- (Before the merger: Thủ Đức, Ho Chi Minh City | After the merger: An Khánh, Ho Chi Minh City)
How to Apply
Candidates apply online by clicking the Apply button below:
Application deadline: 20/06/2026