
Staff Data Engineer
brightwheel
Posted 2/14/2025

Job Summary
Brightwheel is seeking a Staff Data Engineer to join its team. You will play a key role in the implementation and evolution of its web scraping and data platform, crafting and implementing a best-in-class web scraping strategy and infrastructure. You will build and scale pipelines that gather millions of records across hundreds of sites, storing them as structured data that powers insights for the Analytics team and for customers. With 5+ years of experience in data engineering, coding in Python, and building web scraping tools, you will be a technical leader responsible for delivering value to the business. Brightwheel offers a competitive compensation package, including base salary, equity, and benefits, as well as flexible remote work options and a commitment to diversity and inclusion.
Job Description
What You'll Do
- Use modern tooling to build a robust, extensible, and performant web scraping platform
- Build thoughtful and reliable data acquisition and integration solutions to meet business requirements and data sourcing needs
- Deliver best-in-class infrastructure solutions for flexible and repeatable applications across disparate sources
- Troubleshoot, improve, and scale existing data pipelines, models, and solutions
- Build upon data engineering's CI/CD deployments and infrastructure-as-code for provisioning AWS and 3rd-party (Apify) services
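As a minimal, dependency-free sketch of the kind of extraction logic such a platform automates (in production, libraries like Beautiful Soup or Scrapy would typically handle this; the page markup, class name, and record values below are hypothetical examples, not brightwheel data):

```python
from html.parser import HTMLParser


class RecordParser(HTMLParser):
    """Collect the text of <span class="record"> elements.

    The "record" class name is a hypothetical example of a
    site-specific selector a scraping pipeline would target.
    """

    def __init__(self):
        super().__init__()
        self._in_record = False
        self.records = []

    def handle_starttag(self, tag, attrs):
        if tag == "span" and ("class", "record") in attrs:
            self._in_record = True

    def handle_data(self, data):
        if self._in_record:
            self.records.append(data.strip())

    def handle_endtag(self, tag):
        if tag == "span":
            self._in_record = False


# Hypothetical fetched page fragment.
page = (
    '<div><span class="record">Sunshine Preschool</span>'
    '<span class="record">Little Sprouts</span></div>'
)
parser = RecordParser()
parser.feed(page)
print(parser.records)  # → ['Sunshine Preschool', 'Little Sprouts']
```

At scale, this per-site parsing step would sit behind retry, rate-limiting, and storage layers rather than run standalone.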
Qualifications, Skills, & Abilities
- 5+ years of work experience as a data or full-stack engineer, coding in Python
- 5+ years of experience building web scraping tools in Python, using Beautiful Soup, Scrapy, Selenium, or similar tooling
- 3-5 years of deployment experience with CI/CD
- Strong knowledge of HTML, CSS, JavaScript, and browser behavior
- Experience with RESTful APIs and JSON/XML data formats
- Knowledge of cloud platforms and containerization technologies (e.g., Docker, Kubernetes)
- Advanced understanding of how at least one big data processing technology works under the hood (e.g., Spark, Hadoop/HDFS, Redshift, BigQuery, Snowflake)
- Excellent analytical, problem-solving, and troubleshooting skills to manage complex process and technology issues without guidance
- 2+ years of experience developing in Airflow
- 2+ years deploying Infrastructure as Code within AWS or similar
- 2+ years deploying microservices and/or APIs within a cloud environment
- 1+ years using ML/AI workflows for data enrichment and/or sentiment analysis by integrating scraped data into ML pipelines
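To illustrate the API-integration side of the role, a small sketch of flattening a RESTful JSON response into tabular rows ready for a pipeline (the payload, field names, and values are hypothetical, not a real brightwheel API):

```python
import json

# Hypothetical API response; field names and values are illustrative only.
payload = json.loads("""
{
  "results": [
    {"name": "Bright Beginnings", "city": "Austin", "capacity": 45},
    {"name": "Tiny Tots", "city": "Denver"}
  ]
}
""")


def to_rows(doc, fields=("name", "city", "capacity")):
    """Flatten each result into a fixed-width row, with None for missing fields."""
    return [tuple(rec.get(f) for f in fields) for rec in doc.get("results", [])]


rows = to_rows(payload)
print(rows)  # → [('Bright Beginnings', 'Austin', 45), ('Tiny Tots', 'Denver', None)]
```

Normalizing records to a fixed schema like this is the usual first step before loading scraped or API-sourced data into a warehouse or ML pipeline.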