A leading global technology company is seeking an experienced Data Engineer to support large-scale data operations for machine learning workflows. You will work closely with external data vendors and internal teams to ingest, validate, curate, and organize high-quality datasets that enable downstream ML model development. This role requires a strong background in Python and experience working with AWS S3-based pipelines. All qualified candidates are welcome to apply!
Job Responsibilities:
• Collaborate with external data collection vendors to track and ingest incoming datasets.
• Design and execute robust data validation and curation pipelines to ensure data quality and consistency.
• Implement logic to bin and categorize data according to project-specific criteria.
• Run pseudo-labeling workflows on newly ingested data using pre-trained ML models.
• Maintain clear status and versioning of datasets throughout their lifecycle.
• Distribute and deliver validated data assets to various internal product and ML teams.
• Maintain logs and reports to ensure traceability and accountability across data operations.
Candidate Requirements:
• 5+ years of industry experience in data engineering, data pipelines, or ML infrastructure.
• Strong proficiency in Python, including data processing and scripting.
• Experience working with AWS S3 for managing and organizing large-scale datasets.
• Familiarity with data quality assurance and curation processes.
• Comfortable operating in Unix/Linux environments and familiar with command-line tools.
• Strong communication and coordination skills, especially when collaborating with external vendors and distributed teams.
• Self-driven, organized, and able to handle multiple data workflows in parallel.
Nice to Have:
• Experience with ML pipelines, especially pseudo-labeling or active learning.
• Familiarity with data versioning tools or frameworks (e.g., DVC, LakeFS).
• Prior experience in managing vendor relationships or annotation workflows.
• Ability to speak multiple languages.
Type: Contract
Duration: 12 months (with the possibility of extension)
Work Location: Seattle, WA (On-site)
Pay Rate: $68.00 - $83.00 (DOE)