Lilly is a global healthcare leader headquartered in Indianapolis, Indiana, focused on discovering and delivering life-changing medicines. They are seeking a Data Scientist to apply data science, machine learning, and AI techniques to solve business problems, collaborating with stakeholders to identify opportunities for data-driven solutions.
Responsibilities
- Partnering with business stakeholders across GRA, GSC, and GSS to understand their workflows, identify high-impact problems, and frame them as data science opportunities — translating business questions into analytical approaches
- Developing and deploying machine learning models — classification, regression, clustering, time-series forecasting — to solve problems such as submission timeline prediction, document classification, regulatory risk scoring, and resource optimization
- Building and evaluating NLP and generative AI solutions — leveraging LLMs, RAG architectures, text extraction, entity recognition, and document summarization to automate regulatory authoring, scientific literature analysis, and content generation workflows
- Designing and executing experiments to evaluate model performance — using rigorous statistical methods, A/B testing, and evaluation frameworks (including RAGAS for RAG systems) to ensure solutions meet quality and accuracy thresholds before deployment
- Designing and building AI agents and agentic workflows: creating multi-step, tool-using systems that autonomously execute complex tasks such as regulatory document drafting, data extraction and transformation, and cross-system orchestration, and moving beyond single-prompt interactions to production-grade agent architectures that operate reliably in a validated environment
- Collaborating with full stack engineers and platform teams to productionize models — building APIs, integrating into existing applications, deploying on AWS infrastructure (Lambda, EKS, SageMaker, Databricks), and monitoring model performance in production
- Communicating findings and recommendations to both technical and non-technical audiences — using data visualization, storytelling, and clear business-impact framing to ensure your work drives actual decisions
- Staying current with emerging techniques in machine learning, generative AI, and data science — evaluating new tools, frameworks, and approaches for applicability to the GRA/GSC/GSS portfolio and sharing knowledge with the broader team
Skills
- Bachelor's degree in Data Science, Statistics, Computer Science, Mathematics, or a related quantitative field
- 1 year of professional data science experience with Python, R, and core data science libraries
- Qualified applicants must be authorized to work in the United States on a full-time basis. Lilly will not provide support for or sponsor work authorization now or in the future for this role, including but not limited to F-1 CPT, F-1 OPT, F-1 STEM OPT, J-1, H-1B, TN, O-1, E-3, H-1B1, or L-1
- Experience with machine learning frameworks and model deployment patterns
- Academic background in data science
- Hands-on experience with NLP techniques and/or generative AI — LLM APIs (OpenAI, Anthropic), RAG architectures, vector databases, prompt engineering
- Familiarity with cloud data platforms — AWS (SageMaker, Lambda, S3), Databricks, or similar
- Knowledge of statistical methods — hypothesis testing, experimental design, Bayesian methods, regression analysis
- Experience with SAS programming
- Strong communication skills — ability to present technical findings to non-technical audiences and translate business questions into analytical frameworks
- Collaborative mindset and experience working with cross-functional teams including engineers, product owners, and business partners
Benefits
- Company bonus
- 401(k)
- Pension
- Vacation benefits
- Eligibility for medical, dental, vision and prescription drug benefits
- Flexible benefits (e.g., healthcare and/or dependent day care flexible spending accounts)
- Life insurance and death benefits
- Certain time off and leave of absence benefits
- Well-being benefits (e.g., employee assistance program, fitness benefits, and employee clubs and activities)
Company Overview
- BioSpace is the leading online community for industry news and careers for life science professionals. Founded in 1988 and headquartered in San Francisco, California, USA, it has a workforce of 11-50 employees. Its website is http://www.biospace.com/.