Data Engineer · HQ - San Francisco
We're looking to hire our first data engineer to lay the foundation for Mixmax's data infrastructure end to end. You'll be a key communicator, working both cross-functionally in our San Francisco office and across our distributed organization.
As a data engineer you'll help start a team specifically focused on ensuring the company runs on accurate and repeatable data. This means being a member of a team that values continuous and collective learning, culture over process, data-driven development, and always asking tons of questions. We actively blog about our work, contribute to open source, sponsor Open Collectives, and host and present at meetups, and we encourage you to do the same under your own name.
As a data engineer, you’ll:
- Design, build, and support data-centric services including but not limited to event streaming, ETL pipelines, distributed data storage, and real-time data processing.
- Work on high-impact projects that optimize data availability and quality, and provide reliable access to data across the company.
- Collaborate with partner teams to understand their business contexts and analytical challenges, and sprinkle data-driven fairy dust on their products.
- Develop machine-learning software using analytical data models that generalize across Mixmax customers but automatically adapt to each customer's individual features.
- Communicate strategies and processes around data modeling and architecture to cross-functional groups.
- Work with fellow engineers to build out other parts of the data infrastructure, effectively communicating your needs and understanding theirs.
Requirements and skills you possess:
- Exceptional coding and design skills, particularly in Java/Scala or Python.
- Extensive previous experience working with large data volumes, including processing, transforming, and transporting large-scale datasets for analytics and business purposes.
- Extensive experience with data warehousing and ETL pipelines.
- Great communication and collaboration skills.
- Previous experience with AWS services such as EC2, RDS, S3, Redshift, SNS, and SQS.
- Previous experience with high-volume heterogeneous data, preferably with distributed systems such as Hadoop, BigTable, and Cassandra.
- Previous experience building out high-volume, distributed event systems (such as Kafka or similar).
Bonus points ++:
- Experience using Terraform or other infrastructure-as-code tools.
- Contributions to open-source technologies.
Diversity and inclusion are core to our culture, and we are actively committed to building a more inclusive work environment. If you are a member of an underrepresented group in technology, we strongly encourage you to apply.