Research Engineer

Wikimedia Foundation, Inc.
Full time hadoop machine-learning apache-spark python javascript
Software Development
No Location
Hiring from: U.S. / Canada, Europe, East Asia, Southeast Asia, South Asia, Middle East, Central Asia, North America, South America, North Africa, East Africa, Central Africa, West Africa, Southern Africa

About this job

Location options: Remote
Job type: Full-time
Experience level: Mid-Level, Senior
Role: Data Scientist
Industry: Education Technology, eLearning, Non-Profit
Company size: 201–500 people



Job description


We’re hiring a Research Engineer strongly committed to the principles of free knowledge, open source and open data, transparency, privacy, and collaboration to join the Research team. As a Research Engineer on our team, you will support the research scientists in addressing knowledge gaps on the Wikimedia projects, supporting the Wikimedia volunteers in improving knowledge integrity, and building a more global community of Wikimedia researchers. We’re accepting applications until the 31st of August, with a start date on or before October 30th.

You’ll work remotely with a distributed team, with members spread between Europe and North America. Here are some things we’ve worked on recently that might give you a better sense of what you could be working on:

  • We built a hyperlink recommendation algorithm (by building on past research) to support the Growth team in their newcomer task recommendations.

  • We used readers’ trajectories on Wikipedia to inform Wikipedia editors about the COVID-19-related pages that readers turn to for information. (code)

  • We worked with the Analytics, Legal, and Security teams to find a privacy-respectful way to store COVID-19 related page-view traces beyond the 90-day limit that is our standard for purging this data. (code)

  • We ran surveys on Wikipedia across 14 languages, collecting demographic data from Wikipedia readers along with their motivations and needs, to study the effect of demographics on reader behavior. (ongoing results)

  • We built an NLP model to identify unsourced statements in Wikipedia articles. (paper, code)

You can learn more about what we have done in the past six months by reading our biannual report.

You will be responsible for:

  • Defining engineering projects to improve the research scientists’ workflows.  For example, in collaboration with the Legal, Security, and Analytics teams you will be developing a process for public data releases by the team.

  • Collaborating with the Analytics Engineering and Machine Learning Platform teams to improve data collection, sanitization, and processing

  • Building experimental APIs for the models developed by the team

  • Writing distributed computing code in Spark for the algorithms developed by the research scientists

  • Acting as the Research team’s engineering contact for internal and external conversations and decision making

Skills and experience:

  • Experience working as a research or data engineer on complex applied research projects

  • Comfortable with mathematics and the basics of statistics 

  • Strong understanding of Computer Science fundamentals such as: algorithms, data structures and complexity

  • Strong real-world experience writing applications in one or more of the following programming languages: Python, JavaScript, PHP, or Scala

  • Familiarity with scientific computing libraries in Python. Experience with open source machine learning libraries such as scikit-learn and deep learning frameworks such as Keras, TensorFlow, or PyTorch

  • Experience with Hadoop and related technologies: HDFS, YARN, MapReduce, Hive, Spark, etc. (more info about our Hadoop cluster and analytics servers)

  • Experience with MySQL/Postgres technologies

  • Experience developing RESTful APIs for data retrieval

  • Strong written and oral communication skills in English, including the ability to communicate complex technical issues to a cross-team and cross-functional audience

  • BS, MS, or PhD in Computer Science, Mathematics, Statistics, or a closely related engineering field; or the equivalent in related work experience

 We know that you won’t know how all of our systems work on day one. With solid fundamentals and teamwork, you will get there.

Qualities that are important to us:

  • Commitment to the mission of the organization 

  • Commitment to our guiding principles

  • Ability to disagree respectfully and still work towards a solution

  • Willingness to understand math and algorithms 

  • Good at async communication 

  • Solution-focused. The Wikimedia ecosystem is complex, resources are limited, and our guiding principles are ambitious. We want you to work to find solutions embracing these factors.

  • Self-motivated

  • Ability to navigate through ambiguity and bring a project to completion with limited direction

  • Curiosity and commitment to learn 

Additionally, we’d love it if you have:

  • A portfolio of open source programming projects

  • Experience in label collection using crowdsourcing platforms or large-scale systems

  • Production-level experience with Hadoop, Spark, Flink, Hive, Kafka, etc.

  • Experience with web UI development (JavaScript, HTML, CSS)

  • Experience working with volunteers

  • Experience editing Wikipedia or other Wikimedia or open data / knowledge projects

The Wikimedia Foundation is... 

...the nonprofit organization that hosts and operates Wikipedia and the other Wikimedia free knowledge projects. Our vision is a world in which every single human can freely share in the sum of all knowledge. We believe that everyone has the potential to contribute something to our shared knowledge, and that everyone should be able to access that knowledge, free of interference. We host the Wikimedia projects, build software experiences for reading, contributing, and sharing Wikimedia content, support the volunteer communities and partners who make Wikimedia possible, and advocate for policies that enable Wikimedia and free knowledge to thrive. The Wikimedia Foundation is a charitable, not-for-profit organization that relies on donations. We receive financial support from millions of individuals around the world, with an average donation of about $15. We also receive donations through institutional grants and gifts. The Wikimedia Foundation is a United States 501(c)(3) tax-exempt organization with offices in San Francisco, California, USA.

As an equal opportunity employer, the Wikimedia Foundation values having a diverse workforce and continuously strives to maintain an inclusive and equitable workplace. We encourage people with a diverse range of backgrounds to apply. We do not discriminate against any person based upon their race, traits historically associated with race, religion, color, national origin, sex, pregnancy or related medical conditions, parental status, sexual orientation, gender identity, gender expression, age, status as a protected veteran, status as an individual with a disability, genetic information, or any other legally protected characteristics.

If you are a qualified applicant requiring assistance or an accommodation to complete any step of the application process due to a disability, you may contact us at (415) 839-6885.

U.S. Benefits & Perks*

  • Fully paid medical, dental and vision coverage for employees and their eligible families (yes, fully paid premiums!)

  • The Wellness Program provides reimbursement for mind, body, and soul activities such as fitness memberships, babysitting, continuing education, and much more

  • The 401(k) retirement plan offers matched contributions at 4% of annual salary

  • Flexible and generous time off - vacation, sick and volunteer days, plus 19 paid holidays - including the last week of the year.

  • Family friendly! 100% paid new parent leave for seven weeks plus an additional five weeks for pregnancy, flexible options to phase back in after leave, fully equipped lactation room.

  • For those emergency moments - long and short term disability, life insurance (2x salary) and an employee assistance program

  • Pre-tax savings plans for health care, child care, elder care, public transportation and parking expenses

  • Telecommuting and flexible work schedules available

  • Appropriate fuel for thinking and coding (aka, a pantry full of treats) and monthly massages to help staff relax

  • Great colleagues - diverse staff and contractors speaking dozens of languages from around the world, fantastic intellectual discourse, mission-driven and intensely passionate people

*Please note that for remote roles located outside of the U.S., we defer to our PEO to ensure alignment with local labor laws.

