
Data Scientist - AI Agent Development
Binance
Posted 4/16/2025

Job Summary
We are seeking a highly skilled Data Scientist to join Binance and develop domain-specific AI agents for particular operational tasks. The ideal candidate has expertise in traditional Optical Character Recognition (OCR) and computer vision technologies. The role involves designing, fine-tuning, and deploying Large Language Models (LLMs) for AI agent applications, implementing Retrieval-Augmented Generation (RAG) pipelines, and developing and optimizing OCR systems. It also requires collaborating with cross-functional teams to identify use cases, gather requirements, and deliver tailored AI solutions. Candidates should have strong experience in LLM fine-tuning and traditional OCR techniques, strong Python programming skills, and familiarity with cloud platforms for deploying machine learning models at scale. Binance offers a competitive salary, company benefits, and a work-from-home arrangement.
Job Description
Responsibilities
- Design, fine-tune, and deploy Large Language Models (LLMs) for AI agent applications.
- Implement Retrieval-Augmented Generation (RAG) pipelines so that AI agents retrieve relevant information and generate contextually accurate responses.
- Develop and optimize traditional OCR systems for document digitization and text extraction tasks.
- Integrate OCR outputs with LLM-based AI agents to create seamless workflows for information processing.
- Collaborate with cross-functional teams to identify use cases, gather requirements, and deliver tailored AI solutions.
- Conduct data preprocessing, feature engineering, and model evaluation to ensure robust performance.
- Stay updated on the latest advancements in LLMs, RAG frameworks, computer vision, and OCR technologies.
- Document methodologies, experiments, and results for internal knowledge sharing.
Requirements
- Master’s or Ph.D. in Computer Science, Data Science, Machine Learning, or a related field.
- Strong experience with LLM fine-tuning (e.g., GPT models, BERT) and familiarity with frameworks and APIs such as Hugging Face or OpenAI.
- Proficiency in traditional OCR techniques using tools such as PaddleOCR.
- Strong programming skills in Python; familiarity with libraries like PyTorch, TensorFlow, Scikit-learn, or similar.
- Experience with cloud platforms (AWS, Azure, GCP) for deploying machine learning models at scale.
- Excellent problem-solving skills and the ability to work independently or as part of a team.