Data Engineer
Company: Oregon Health & Science University
Location: Washington
Posted on: October 20, 2024
Job Description:
The CDC Foundation helps the Centers for Disease Control and
Prevention (CDC) save and improve lives by unleashing the power of
collaboration between CDC, philanthropies, corporations,
organizations and individuals to protect the health, safety and
security of America and the world. The CDC Foundation is the go-to
nonprofit authorized by Congress to mobilize philanthropic partners
and private-sector resources to support CDC's critical health
protection mission. Since 1995, the CDC Foundation has raised over
$1.9 billion and launched more than 1,300 programs impacting a
variety of health threats from chronic disease conditions including
cardiovascular disease and cancer, to infectious diseases like
rotavirus and HIV, to emergency responses, including COVID-19 and
Ebola. The CDC Foundation managed hundreds of programs in the
United States and in more than 90 countries last year. Visit
www.cdcfoundation.org for more information.
Overview
The Data Engineer will play a crucial role in advancing the CDC
Foundation's mission by designing, building, and maintaining modern
data infrastructure for the Northwest Portland Area Indian Health
Board (NPAIHB) Data Hub project. This role is aligned to the
Workforce Acceleration Initiative (WAI). WAI is a federally funded
CDC Foundation program with the goal of helping the nation's public
health agencies by providing them with the technology and data
experts they need to accelerate their information system
improvements.
Working closely with the Data Hub Team, the Data Engineer will
create the architecture needed for data storage, processing,
analysis, and secure transfer to Tribal Leaders and public health
professionals. The Data Engineer will collaborate with
epidemiologists, data content experts, IT staff, the Data Hub
Project Director, and others to develop and implement scalable
solutions that align with the objectives of the NPAIHB's Data Hub
project.
NPAIHB's Data Hub Team is currently developing a system, "The NW
Tribal Data Hub," to provide comprehensive, user-friendly public
health data dashboards for its 43 member Tribes. The Data Engineer
will ensure the successful design and implementation of a newly
created public health database and the ingestion of additional data
into the system, and will create tables, views, and other database
structures to support epidemiological analysis, visualization, and
reporting to Tribes. The data, sourced primarily from state and
federal agencies, include vital statistics (births, deaths), cancer
registries, emergency department data, clinical service data, and
other sources. The Data Engineer's work will be pivotal in enhancing the
capacity of Tribal public health departments to conduct data-driven
activities, advancing Tribal data sovereignty, and empowering
Tribes to improve health outcomes within their communities.
NPAIHB is a tribally owned and operated non-profit organization
serving the 43 federally recognized Tribes in the states of Idaho,
Oregon, and Washington. Led by the organization's Board of
Directors, NPAIHB's mission is to "eliminate health disparities and
improve the quality of life of American Indians and Alaska Natives
by supporting Northwest Tribes in their delivery of culturally
appropriate, high-quality health programs and services." NPAIHB is
a mission-driven organization with a staff of over 120
professionals dedicated to advancing Tribal health for the 7th
generation in the Pacific Northwest.
The Data Engineer will be hired by the CDC Foundation and assigned
to the Data Hub Team at NPAIHB. This position is eligible for a
fully remote work arrangement for U.S.-based candidates.
Responsibilities
- Design a data hub roadmap to streamline secure and reliable
data management, including ingestion, processing, and storage
through enhancements or implementation of new systems and
pipelines.
- Load data into storage systems or data warehouses,
transforming, cleaning, and organizing with dimensional modeling
techniques to ensure accuracy, consistency, and efficient
querying.
- Transform and structure data to ensure it is optimized for use
in data visualization software, enabling accurate and effective
visual representations of epidemiological data.
- Collaborate closely with the project epidemiologist to ensure
they gain a comprehensive understanding of the data pipeline
architecture and data engineering methods to support long-term
maintenance and sustainability of the system.
- Collaborate closely with the project epidemiologist to understand
data requirements and ensure that data infrastructure and workflows
align with epidemiological needs.
- Ensure thorough and clear documentation of database
architecture and workflows to promote sustainability, consistency,
and ease of maintenance.
- Define business rules around data governance for the Data Hub.
Apply rigorous data quality checks and validation processes to
guarantee the accuracy and reliability of the data released,
emphasizing the importance of delivering correct and trustworthy
data to support public health initiatives.
- Optimize data pipelines, infrastructure, and workflows for
performance and scalability.
- Monitor data pipelines and systems for performance issues,
errors, and anomalies, and implement solutions to address
them.
- Analyze and interpret datasets to identify data management
needs and advise on data management strategy.
- Implement security measures to protect sensitive
information.
- Collaborate with epidemiologists, analysts, and other partners
to understand current and future data needs and requirements, and
to ensure that the data infrastructure supports the organization's
goals and objectives.
- Collaborate with cross-functional teams to understand data
requirements and to design and implement scalable database solutions
in accordance with end users' business needs.
- Implement and maintain ETL processes to ensure the accuracy,
completeness, and consistency of data.
- Design and manage data storage systems, including migration of
SAS datasets to a PostgreSQL relational database.
- Apply knowledge about industry trends, best practices, and
emerging technologies in data engineering, and incorporate the
trends into the organization's data infrastructure.
- Provide technical guidance to other staff on preparing and
structuring data for visualization, leveraging knowledge of
visualization tools to support the creation of meaningful and
insightful visual outputs.
- Communicate effectively with partners at all levels of the
organization to gather requirements, provide updates, and present
findings.
Qualifications
Required Qualifications
- Bachelor's degree in Computer Science, Information Technology,
Data Science, or a related field.
- Minimum of five (5) years of related informatics experience,
preferably with three (3) years of experience in a lead data
engineer position.
- Demonstrated expertise in building SQL relational databases and
transitioning non-relational data into a structured relational
format, ensuring seamless integration and optimized
performance.
- Proficiency in SQL programming and other languages commonly
used in data engineering, such as Python, Java, or Scala. Candidates
should be able to implement data automations within existing
frameworks as opposed to writing one-off scripts.
- Experience transforming and preparing data into formats
suitable for data visualization software, ensuring it is structured
for optimal use in dashboards and other visual outputs.
- Strong understanding of database systems, including relational
databases (e.g., MySQL, PostgreSQL) and NoSQL databases (e.g.,
MongoDB, Cassandra), with PostgreSQL preferred.
- Experience with engineering best practices such as source
control, automated testing, continuous integration and deployment,
and peer review, and serving as a subject matter expert on these
topics.
- Knowledge of data warehousing concepts and tools.
- Experience with cloud computing platforms, with preference for
experience in an AWS environment.
- Expertise in data modeling, ETL (Extract, Transform, Load)
processes, and data integration techniques.
- Familiarity with agile development methodologies, software
design patterns, and best practices.
- Strong analytical thinking and problem-solving
abilities.
- Excellent verbal and written communication skills, including
the ability to convey technical concepts to non-technical partners
effectively.
- Flexibility to adapt to evolving project requirements and
priorities.
- Outstanding interpersonal and teamwork skills, and the ability
to develop productive working relationships with colleagues and
partners.
- Experience working in a virtual environment with remote
partners and teams.
- Proficiency in Microsoft Office.
- Ability to travel occasionally for in-person meetings (travel
costs will be covered by NPAIHB).
Preferred Qualifications
- Experience facilitating data requirements gathering sessions to
support data modeling plans.
- Experience planning and designing database models based on
business data requirements.
- Experience working with complex public health, health care, or
other non-business data requiring advanced processing and analysis
techniques.
- Experience transitioning SAS datasets and analyses into
relational database structures.
- Experience building data pipelines within Amazon Web Services
(AWS), such as Amazon Relational Database Service (RDS), Amazon
Aurora Serverless, AWS Glue, and AWS Lambda.
- Experience creating complex fields and visuals in Amazon
QuickSight or similar data visualization tools (Tableau, Microsoft
Power BI, etc.).
- Experience with dimensional modeling in scenarios where
dimensions and fields change over time.
- Experience with implementing data suppression techniques and
familiarity with HIPAA, PHI, and other data confidentiality
regulations.
- Experience providing mentorship, training, and knowledge
transfer of data engineering techniques to build the organization's
capacity for ongoing system management and development.