Data Integration Engineer
Remote
Contracted
Experienced
Position Type: Remote/Hybrid, with quarterly onsite attendance required in Richmond, VA
VA or EST candidates only
Contract Length: 10 months
Position Overview:
The Lead Agentic Data Engineer will design, develop, and deploy data pipelines that power agentic AI solutions for real-world challenges. This role requires expertise in data processing for agentic systems, ensuring data quality, optimizing storage and retrieval, and facilitating seamless interaction between AI agents and data sources.
Required Skills:
- Understanding of big data technologies
- Experience developing ETL and ELT pipelines
- Experience with Spark, GraphDB, Azure Databricks
- Expertise in data partitioning
- 3 years of experience with data conflation
- 3 years of experience developing Python scripts
- 2 years of experience training LLMs with structured and unstructured data sets
- 3 years of experience with GIS spatial data
- Guiding and mentoring AI engineers, helping them develop their skills and knowledge in the field.
- Leading and managing AI projects, ensuring they stay on track, meet deadlines, and the findings are actionable and relevant.
- Contributing to the creation and implementation of AI strategies that align with the organization's goals and objectives.
- Designing and developing data pipelines for agentic systems, including robust data flows that handle complex interactions between AI agents and data sources.
- Ability to use advanced mathematical modeling, statistical analysis, and optimization techniques to gather and analyze data, identify problems, and develop solutions that improve prompt efficiency.
- Ability to train and fine-tune large language models, and to design and build the data architecture (databases, data warehouses, and data lakes) that supports various data engineering tasks.
- Develop and manage Extract, Load, Transform (ELT) processes to ensure data is moved accurately and efficiently from source systems to the analytical platforms used in data science.
- Implement data pipelines that facilitate feedback loops, allowing human input to improve system performance in human-in-the-loop systems.
- Work with vector databases to store and retrieve embeddings efficiently.
- Collaborate with data scientists and engineers to preprocess data, train models, and integrate AI into applications.
- Optimize data storage and retrieval for high performance.