Lead Data Engineer

New York, US

Permanent

$140k - $180k

About Us:

The company is a high-growth marketing technology company based in New York City, founded in 2013. They have developed a proprietary social graph platform that uses publicly available data to map over 25 billion social relationships between more than 250 million adult Americans. Their platform helps major corporations, national non-profit organizations, and high-profile political campaigns with their highest-stakes marketing challenge: reaching hard-to-reach audiences of decision-makers, spanning corporate executives, legislators, regulators, investors, members of the media, and more.

A few quick facts about them:

  • They have generated over $30M in revenue, with annual revenues in excess of $10M.
  • Their senior leadership team includes two members of the 2021 Forbes 30 Under 30 list for Marketing and Advertising, a former senior White House advisor, one of the earliest sales leaders at Google and Twitter, and two of the Democratic Party’s most successful pollsters and strategists.
  • Their investors include a global sports, entertainment and marketing giant, a founder of Palantir, senior engineering leaders at Twitter and Yelp, past CEOs of Fortune 50 companies, and a number of well-known venture capitalists.
  • Current and past clients include Blackstone, Boeing, KKR, the Environmental Defense Fund, the Business Roundtable, Governor John Kasich, Congressman Conor Lamb and Republican Voters Against Trump.
  • Their work has been featured on Morning Joe on MSNBC, Bloomberg and The Colbert Report, and in Axios, BusinessWeek, the Associated Press, Forbes, the Washington Post, and Politico, among many others.

About the Role:

As Lead Data Engineer, you will join a team of talented developers and architect the core infrastructure supporting our B2B data products. You will collaborate closely with application developers to align our data strategy with business requirements.

Who You Are:

You are a software developer with broad experience who specializes in data modeling, ETL processes and distributed computing. You work with data warehouses, data lakes and all flavors of databases, and you know how to orchestrate and automate jobs that manage big data.

Qualifications:

  • You have 4+ years of professional software development experience
  • You have strong computer science fundamentals
  • You have strong software development skills in a server-side language (Python, Java, Scala, etc.)
  • You have designed relational schemas and worked with NoSQL databases, and you are intimately familiar with data storage formats (JSON, Parquet, Avro, etc.)
  • You have experience tuning and indexing databases, including column-oriented systems
  • You are a SQL expert
  • You have worked with distributed computing frameworks such as Spark and Hadoop
  • You provision your own cloud resources
  • You have architected data infrastructure and ETL pipelines
  • You are an excellent verbal and written communicator

Preferred Qualifications:

  • Your language of choice is Python
  • You have ample experience with PySpark
  • You have a background in data science and know entity resolution and graph theory
  • You have an interest in application development
  • You are most familiar with AWS
