Full Stack Engineer (Web Crawling and Automation)

Full-time

Remote

Time is our most precious resource. Recs is a recommendation layer built to help people spend theirs on what's actually worth it. We cut through the noise so people can find what fits their lives, make better choices, and avoid wasting time on things that don't deliver.

About the role

We’re looking for a Senior Full Stack Engineer focused on web crawling and automation to build API-driven, end-to-end data acquisition systems that interact with real-world websites using appropriate, target-specific interaction strategies.

Our crawling systems power other internal services via on-demand REST APIs. They handle a wide range of targets, including openly accessible websites, login-protected platforms, and paywalled content. Depending on the site, crawlers may use anything from high-throughput programmatic access to more stateful or behavior-aware automation when required.

This role goes beyond basic browser automation. You’ll own the full lifecycle of crawled data, from interaction and extraction to parsing, normalization, indexing, and API delivery, ensuring the data is reliable, structured, and usable by downstream systems.

What you’ll do

Before diving into the technical responsibilities, here are the traits we value most:

  • Candor: You communicate directly and honestly in service of better outcomes.

  • Conscientiousness: You take ownership, respect teammates, and build systems others can rely on.

  • First-principles thinking: You question assumptions and make decisions grounded in evidence.

In this role, you will:

  • Design and build end-to-end web crawling systems exposed as REST APIs

  • Implement browser-based automation using tools such as Playwright, Puppeteer, or similar

  • Build crawlers that adapt their interaction model based on the target, which may include:

    • Programmatic, high-throughput access where appropriate

    • Session-aware and stateful navigation flows

    • Realistic timing or behavior-aware interaction when required

  • Handle sites that require:

    • Authentication and login workflows

    • Persistent sessions and identity management

    • JavaScript-heavy or dynamically rendered content

  • Develop robust parsing and extraction pipelines to convert raw web data into structured formats

  • Design and maintain data normalization, enrichment, and validation workflows

  • Implement indexing strategies to make crawled data searchable, performant, and reliable

  • Build backend services and APIs that expose crawled and indexed data to internal consumers

  • Monitor, debug, and improve crawl correctness, stability, and cost efficiency

  • Collaborate with other engineers to integrate crawling pipelines into larger product workflows

  • Contribute to CI/CD pipelines, observability, and operational tooling

Who you are

You think of crawling as a system, not a script. You understand that different targets require different approaches and enjoy reasoning about trade-offs between speed, reliability, realism, and cost.

You’re comfortable debugging non-deterministic failures, working with imperfect or inconsistent data, and owning systems end-to-end, from first request to final API response.

You care about data quality and long-term maintainability. You think about schemas, indexing, and downstream consumers as part of the core problem, not an afterthought.

Required qualifications

  • Strong professional experience with JavaScript/TypeScript and/or Python

  • Proven experience building production-grade crawling or browser automation systems

  • Hands-on experience with Playwright, Puppeteer, Selenium, or similar

  • Experience designing API-driven crawling services

  • Strong understanding of:

    • Browser behavior and JavaScript execution

    • Sessions, cookies, headers, and authentication flows

  • Experience building parsing, normalization, and data processing pipelines

  • Backend experience building services and REST and/or GraphQL APIs

  • Experience working with relational and/or NoSQL databases

  • Proficiency with Git and collaborative development workflows

Nice-to-have skills

  • Experience designing adaptive interaction strategies for complex or sensitive websites

  • Experience crawling large or complex platforms (dynamic, authenticated, or paywalled)

  • Search and indexing systems (Elasticsearch, OpenSearch, or similar)

  • Distributed or queue-based processing systems

  • Experience with Rust for performance-critical components

  • Containerization and cloud infrastructure (Docker, AWS, GCP, or similar)

  • Observability tooling (logging, metrics, tracing)

  • CI/CD pipeline experience

  • Experience integrating AI/ML services into extraction, enrichment, or classification workflows

What we offer

  • A high-trust, remote-first engineering culture

  • End-to-end ownership of complex, business-critical systems

  • A team that values clear thinking, technical rigor, and direct communication

  • Room to influence architecture and technical direction

  • Competitive compensation based on experience and impact
