Senior Data Engineer

Full time
Software Development
United States
Hiring from: United States

About this job

Location options: Remote
Job type: Full-time
Role: System Administrator
Industry: Computer Software, Digital Pathology, Health Care
Company size: 51–200 people
Company type: VC Funded


Technologies: java, linux, hadoop, python, redhat

Job description

We’re seeking a Senior Data Engineer to work on the development and support of software applications, tools and data management pipelines for research and clinical purposes. Following modern product development practices, you will also assist in the design, implementation and maintenance of tools that extract and manipulate data from various sources, including in-house and external databases. This is an extraordinary opportunity to be part of a high-performing team and to pursue a life-changing mission with unique technical challenges!

This position can be fully remote for Canada- and US-based applicants. We are working remotely now, but when it is safe to return, you may also opt to work from our NYC office.


  • Work on Data Warehouse, Data Lake and BI projects and architectures at Paige.

  • Create and implement ETL pipelines that enable the extraction, transformation and transfer of large amounts of structured and unstructured data from various filesystems and databases, destined for the development of computational pathology algorithms.

  • Handle the challenges that come with managing terabytes of data.

  • Build tools to manage, automate and monitor our data and data processing infrastructure.

  • Design and develop software tools and integrate them into existing resources. Be responsible for the design, coding, testing, packaging, debugging, documentation and deployment of software systems.

  • Work independently to produce the required functional, technical and user documentation (e.g., business requirements, functional and technical specifications, system architecture, data flows, end-user training requirements) on assigned projects.

  • Collaborate with data engineers, scientists, engineers, IT operations and medical doctors to build data manipulation tools that support a new generation of artificial intelligence applications for cancer detection and treatment.


  • Experience in architecting, implementing and testing data processing pipelines (e.g. Spark, Beam, ...) and data mining / data science algorithms, either on-premise or in a cloud environment.

  • Experience in administering and ingesting data into standard data warehouses (e.g. Amazon Redshift, Microsoft SQL Server, Google BigQuery or Snowflake).

  • Experience architecting data warehouses and/or data lakes for large amounts of structured and unstructured data.

  • Experience with data lakes and expertise in designing and maintaining BI solutions.

  • Experience with workflow management tools and platforms, such as Airflow.

  • Extensive experience in Python programming, or related languages.

  • Experience with RDBMS and NoSQL databases (e.g. MongoDB).

  • Experience in packaging and deploying applications on-premise and in the cloud (e.g. AWS).

  • Familiarity with modern development practices and DevOps.

  • Interest in building non-standard medical software applications in collaboration with medical partners. Strong cross-disciplinary and analytical skills.

  • Master’s degree in computer science or a related field, or equivalent years of experience.

  • 6+ years of industry experience as a software/data engineer.
