ABOUT ME


My name is Nina Menezes Cunha.

I am a lifelong learner with a relentlessly curious mind and a proactive problem-solving approach.
My Stanford PhD in Economics of Education fuels my passion for turning complex data into social impact.

For more than 10 years, I've deployed machine learning and causal inference across more than 10 countries, leading large-scale education experiments with 100,000+ students. My technical expertise spans Python (including Pandas, NumPy, Scikit-learn), SQL (BigQuery, PostgreSQL), and cloud platforms (GCP), combined with rigorous econometrics to build solutions from predictive models to AI applications.

As Senior Researcher at FHI 360 and World Bank consultant, I've designed end-to-end data systems - from statistical analysis and natural language processing to interactive dashboards and policy recommendations. I now channel this expertise into Amooora, my startup developing AI-powered solutions for the lesbian community, enhanced by my recent completion of Le Wagon's Data Science Bootcamp.

When I'm not building data solutions, you'll find me biking coastal trails 🌊, enjoying the beach 🏖️, or catching the latest films 🎬. An old soul at heart, I treasure quiet evenings with tea 🫖—whether playing trumpet or percussion 🎺🥁 to jazz and Brazilian popular music, or turning in early 🌙. Proudly Brazilian 🇧🇷 and openly lesbian 🏳️‍🌈, I thrive where culture, nature, and community intersect.

PROFESSIONAL EXPERIENCE

  • AMOOORA
    AMOOORA - Founder/CEO (Dec 2024 - Present | São Paulo, Brazil)
    • Developing a data-driven app for the lesbian community using ML techniques to optimize content delivery and engagement
    • Implementing predictive analytics for user retention strategies and community-building initiatives
    • Leveraging data-driven insights to improve platform accessibility and tailor content
  • FHI 360
    FHI 360 - Senior Research Associate (May 2018 - Aug 2023 | Washington, DC)
    • Led impact evaluations using causal inference methods (DID, IV, synthetic control) across Ghana, Malawi, and Latin America
    • Designed psychometric tools using factor analysis to measure teacher well-being in Uganda/Guatemala/El Salvador
    • Managed analysis of 50K+ student datasets to drive education policy recommendations
  • Stanford University
    STANFORD UNIVERSITY - Senior Researcher (Jan 2013 - Apr 2018 | Stanford, CA)
    Research
    • Led RCTs with 25,000+ students using causal inference and machine learning
    • Designed behavioral nudges that improved student attendance and test scores
    • Published peer-reviewed research on teacher effectiveness interventions
    Teaching
    • Co-led a yearlong seminar on Topics in Brazilian Education (2013-2015), designing course content and facilitating student engagement
    • Used diverse teaching strategies, including student presentations, guest speakers, and policy development exercises
    • Assisted in a flipped classroom statistics course (2014-2015) for Master's students, coaching them in applying statistical models to research questions
  • World Bank
    WORLD BANK - Consultant (Aug 2015 - Feb 2016 | Ceará, Brazil)
    • Designed data collection pipelines for education assessments across 350 schools
    • Trained enumerators and ensured data integrity for large observational studies
  • MOVVA
    MOVVA - Consultant (Feb 2015 - Dec 2016 | São Paulo, Brazil)
    • Supervised data collection and analysis for 400 public schools (30,000 students)
    • Developed data visualization dashboards for policy stakeholders
  • Federal University of Minas Gerais
    FEDERAL UNIVERSITY OF MINAS GERAIS - Research Assistant (Feb 2010 - Apr 2012 | Minas Gerais, Brazil)
    • Conducted econometric modeling using longitudinal data of 3,500 students
    • Collaborated on education policy research projects

EDUCATION

  • Le Wagon
    LE WAGON - Data Science Bootcamp (Jan 2025 - Mar 2025)
    • Data Science & ML: Analyzed large datasets (SQL, Python, BigQuery), built statistical and ML models (classification, NLP, deep learning), and deployed them (GCP, Docker, FastAPI).
    • Project Leadership: Led a team project from data engineering to ML pipelines, delivering actionable business insights.
  • Stanford University
    STANFORD UNIVERSITY - Ph.D. in Economics of Education (Sep 2012 - Apr 2018)
    • Large-Scale Education Experiments: Designed and executed 3 RCTs (A/B tests across 289 schools, 25,000+ students) that tested behavioral nudges, with effects estimated by regression analysis.
    • Data Science for Education Policy: Transformed statistical insights into actionable ed-tech solutions adopted by government partners.
  • Federal University of Minas Gerais
    FEDERAL UNIVERSITY OF MINAS GERAIS - M.A. in Economics (Jan 2010 - Apr 2012)
    • Religion & Education Causal Analysis: Applied econometric techniques (quantile regression, OLS) to Brazilian longitudinal youth data, revealing how religious socialization improves academic performance.
  • São Paulo School of Economics (FGV)
    SÃO PAULO SCHOOL OF ECONOMICS (FGV) - B.A. in Economics (Jan 2006 - Dec 2009)
    • Economics of Education Research: Programmed large-scale data analysis in Stata using econometric techniques (fixed-effects models, instrumental variables) on Brazilian education panel data (1992-2007).

TECHNICAL SKILLS

Data Engineering & Processing

  • Python: Pandas, NumPy, BeautifulSoup, Requests
  • SQL: PostgreSQL, BigQuery
  • Data Storage: SQLite, MySQL
  • Data Pipelines: Prefect, FastAPI

Statistics & Visualization

  • Statistical Analysis: Regression, A/B Testing, Causal Inference
  • Data Visualization: Matplotlib, Seaborn, Plotly
  • Dashboards: Streamlit
  • Business Intelligence: Data Storytelling

Machine Learning & AI

  • ML & AI Models: Classification, Regression, Clustering, Time Series, NLP, Multi-Agent Systems
  • Feature Engineering: Dimensionality Reduction, Class Imbalance Handling, Feature Selection, Model Tuning
  • Performance & Explainability: AI Metrics, Model Interpretability (SHAP, LIME, Attention Mechanisms)
  • Frameworks: Scikit-learn, XGBoost, TensorFlow, Keras, PyTorch, Hugging Face, Transformers, RNNs

Cloud & Deployment

  • Cloud Computing: Google Cloud Platform, Compute Engine, Cloud Storage
  • MLOps: MLflow, Docker, CI/CD
  • Version Control: Git, GitHub, GitLab
  • APIs & Deployment: FastAPI, Streamlit, Cloud-based AI Solutions

CERTIFICATIONS

SOFT SKILLS

Analytical & Problem-Solving

  • Analytical Thinking
  • Critical Thinking
  • Proactive Problem-Solving
  • Data Storytelling

Leadership & Collaboration

  • Leadership & Mentorship
  • Collaboration & Teamwork
  • Cross-Cultural Collaboration
  • Stakeholder Management

Adaptability & Growth

  • Resilience & Growth Mindset
  • Adaptability
  • Lifelong Learning
  • Multidisciplinary Agility

Languages

  • English: Fluent
  • Portuguese: Native
  • Spanish: Advanced
  • French: Basic

FEATURED DATA SCIENCE PROJECTS

Amooora Connection Algorithm Project

Amooora Connection Algorithm

March 2025

Developed a next-generation matching system for LGBTQ+ women and non-binary individuals using deep learning and natural language processing. Leveraged an OkCupid dataset of 24,000+ profiles to build a three-pillar solution: (1) A density-based DBSCAN clustering model that identifies organic communities with 32% better cohesion than traditional approaches, (2) An optimized text processing pipeline using LDA topic modeling to extract meaningful connection signals from open-ended responses, and (3) A synthetic image generation system (proof-of-concept) for UI prototyping. The final algorithm prioritizes authentic connections over demographic filters, achieving a 0.51 silhouette score while intentionally breaking conventional matching boundaries to foster unexpected but meaningful relationships.

This project demonstrates how machine learning can create more inclusive social platforms by challenging traditional matching paradigms. Key innovations include our hybrid approach combining DBSCAN's density-based clustering with LDA topic modeling for text reduction, and the ethical decision to exclude gender/orientation filters after quantitative analysis showed they created artificial barriers. The system serves as both a technical foundation for Amooora's future platform and a case study in building connection algorithms that prioritize community belonging over categorical matching. Implemented as an interactive Streamlit demo showcasing how data science can drive social impact.

Methods & Tools:

  • Clustering algorithms (DBSCAN, K-Means comparison).
  • Natural Language Processing (LDA topic modeling, BERT embeddings).
  • Model evaluation (silhouette scoring, cluster validation).
  • Python (TensorFlow, Scikit-learn, Gensim, NLTK, spaCy).
  • Google Cloud Platform (Compute Engine, Cloud Storage).
  • Containerization (Docker, Docker Compose).
  • API development (FastAPI).
  • Interactive dashboards (Streamlit).
  • Computer vision (OpenCV, Keras for synthetic images).
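
The silhouette score reported above summarizes how well separated the clusters are. As a rough illustration only, here is a minimal pure-Python sketch of that metric. The points and labels are made-up toy data, not the project's profiles, and the function name is hypothetical; the project itself lists Scikit-learn for this step.

```python
from math import dist
from statistics import mean


def silhouette_score(points, labels):
    """Mean silhouette coefficient over all points (Euclidean distance)."""
    clusters = {}
    for point, label in zip(points, labels):
        clusters.setdefault(label, []).append(point)
    if len(clusters) < 2:
        raise ValueError("silhouette needs at least two clusters")

    scores = []
    for point, label in zip(points, labels):
        own = clusters[label]
        if len(own) == 1:
            scores.append(0.0)  # convention: a singleton cluster scores 0
            continue
        # a: mean distance to the other members of the point's own cluster
        # (the point's distance to itself is 0, so it adds nothing to the sum)
        a = sum(dist(point, other) for other in own) / (len(own) - 1)
        # b: smallest mean distance to the members of any other cluster
        b = min(
            mean(dist(point, other) for other in members)
            for other_label, members in clusters.items()
            if other_label != label
        )
        denom = max(a, b)
        scores.append((b - a) / denom if denom else 0.0)
    return mean(scores)


# Toy data (illustrative assumption): two tight, well-separated groups.
toy_points = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10)]
good_labels = [0, 0, 0, 1, 1, 1]
mixed_labels = [0, 1, 0, 1, 0, 1]

good_score = silhouette_score(toy_points, good_labels)
mixed_score = silhouette_score(toy_points, mixed_labels)
print(f"well-separated labels: {good_score:.2f}")
print(f"mixed-up labels:       {mixed_score:.2f}")
```

Scores close to 1 indicate compact, well-separated clusters; scores near 0 or below indicate overlapping or misassigned points.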
Educational Intervention Study

Parental Monitoring & Student Outcomes

Accepted at American Economic Journal: Economic Policy (2025)

This large-scale randomized experiment studied how information interventions affect parental monitoring and student achievement across 289 Brazilian schools (25,000+ students). Using A/B testing methodology, we compared two treatment arms: (1) An information group receiving weekly SMS updates with child-specific attendance/effort data, and (2) A salience group receiving attention-redirecting messages without personalized data. Our causal inference analysis revealed both interventions improved test scores by 0.3 standard deviations, despite only the information group developing accurate beliefs about attendance levels.

The study employed machine learning techniques to analyze monitoring patterns from parent surveys and administrative data. Regression analysis showed both treatments increased parental monitoring intensity, with feature importance analysis identifying specific behavioral changes driving outcomes. An additional experiment using message frequency randomization demonstrated parents optimize monitoring effort under attentional constraints. Results inform predictive model development for educational interventions targeting parental engagement.

Methods & Tools:

  • Randomized controlled trial (RCT) design.
  • Causal inference (DID, IV regression).
  • Large-scale data collection (289 schools).
  • Text message intervention system.
  • Statistical modeling (OLS, logistic regression).
  • Feature selection for behavioral predictors.
  • Performance metric analysis (test scores, promotion rates).
  • Python and R for data analysis.
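
Effects "in standard deviations", as reported above, are usually obtained by dividing the treatment-control difference in means by a standard deviation. The sketch below assumes the control-group standard deviation is used as the scale, which the study description does not specify. The scores are hypothetical toy values, not study data.

```python
from statistics import mean, stdev


def standardized_effect(treated, control):
    """Difference in means, expressed in control-group standard deviations."""
    return (mean(treated) - mean(control)) / stdev(control)


# Hypothetical test scores, for illustration only.
control_scores = [48.0, 50.0, 52.0, 46.0, 54.0]
treated_scores = [49.0, 51.0, 53.0, 47.0, 55.0]

effect = standardized_effect(treated_scores, control_scores)
print(f"effect size: {effect:.2f} SD")
```

In a randomized design, this simple comparison of group means is an unbiased estimate of the average treatment effect; regression adjustment mainly improves precision.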
Youth Soft Skills Assessment Tool

Developing a New Tool for International Youth Programs

Peer-Reviewed Publications | 2021-2023

Developed a machine learning-powered assessment tool to measure social-emotional skills in 1,794+ youth across Uganda and Guatemala. Using dimensionality reduction techniques (PCA and factor analysis), we transformed 160+ initial survey questions into a validated 48-item instrument measuring four core competencies: Positive Self-Concept, Negative Self-Concept, Higher-Order Thinking, and Social-Communication Skills.

Our multi-stage validation pipeline included: (1) Exploratory Factor Analysis to identify latent constructs from high-dimensional survey data, (2) Confirmatory Factor Analysis to test measurement models, and (3) Multi-Group Invariance Testing demonstrating cross-cultural validity (CFI > 0.95 across all subgroups). The system achieved strong measurement invariance (ΔCFI < 0.01) across country, gender, and socioeconomic status - enabling reliable program evaluation in diverse low-resource settings. The instrument is publicly available in English and Spanish for use and adaptation, with full documentation provided in the development paper linked below.

Methods & Tools:

  • Dimensionality reduction (PCA, EFA, CFA).
  • Measurement invariance testing (multi-group CFA).
  • Psychometric validation pipelines.
  • Stata for data cleaning and preparation.
  • R (lavaan, psych packages) for factor analysis.
  • Mplus for advanced structural equation modeling.
  • Cross-cultural validation frameworks.
  • Survey data quality control systems.
Teacher Wellbeing Assessment Project

Teacher Wellbeing Measurement & Intervention

Peer-Reviewed Publications | 2021-2024

Developed and validated a machine learning-powered assessment tool (WHAT) to measure teacher wellbeing in conflict-affected areas, using dimensionality reduction techniques (PCA/EFA) on survey data from 1,659 Salvadoran educators. Our factor analysis pipeline identified key wellbeing constructs with strong psychometric properties (CFI = 0.92, RMSEA = 0.04), enabling precise measurement in high-stress environments.

In the cluster-randomized controlled trial (N=430 treatment, 398 control), we applied causal inference methods to evaluate a social-emotional learning intervention. Despite null effects on most outcomes, our mixed-effects modeling revealed important insights about intervention delivery modes and teacher stress patterns. The system demonstrated strong measurement invariance across diverse educator populations.

Methods & Tools:

  • Dimensionality reduction (PCA, EFA, CFA).
  • A/B testing framework (cluster-RCT design).
  • Causal inference (difference-in-differences).
  • Psychometric validation pipelines.
  • Mixed-methods analysis (quant + qualitative).
  • Stata/R for statistical modeling.
  • Measurement invariance testing.
  • Survey data quality control systems.
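
Difference-in-differences, listed among the methods above, can be illustrated with the basic two-group, two-period case. This is a minimal sketch with hypothetical outcome values and a hypothetical function name; it is not the estimation code used in the study, which would typically use regression with clustered standard errors.

```python
from statistics import mean


def diff_in_diff(treated_pre, treated_post, control_pre, control_post):
    """2x2 difference-in-differences estimate from group outcome lists."""
    treated_change = mean(treated_post) - mean(treated_pre)
    control_change = mean(control_post) - mean(control_pre)
    # The control group's change stands in for the trend the treated group
    # would have followed without the intervention (parallel-trends assumption).
    return treated_change - control_change


# Hypothetical outcome scores, for illustration only.
did_estimate = diff_in_diff(
    treated_pre=[10, 12, 14],
    treated_post=[15, 17, 19],
    control_pre=[11, 13, 15],
    control_post=[13, 15, 17],
)
print(f"DID estimate: {did_estimate}")
```

Here the treated group improves by 5 points and the control group by 2, so the estimate attributes 3 points to the intervention.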
Educational Resource Equity Analysis

Educational Resource Equity Analysis

Peer-Reviewed Publication | 2021

Developed a novel methodological framework to quantify and compare educational resource allocation equity across 53,469 Brazilian public schools (30% of national coverage). Using SAEB/Prova Brasil 2015 data, we standardized resources into three dimensions: teacher quality, school physical environment, and instructional environment, then contrasted allocations between high- and low-needs schools via multidimensional disparity indices.

Our outputs-driven approach identified systemic inequities, with high-needs schools receiving 15-30% fewer resources per student despite greater need. The framework's adaptability allows subnational comparisons (e.g., Northeast vs. Southeast Brazil) and integration with international datasets for cross-country equity benchmarking.

Methods & Tools:

  • Large-scale data integration (SAEB/Prova Brasil census).
  • Resource standardization frameworks (3-dimension model).
  • Equity metrics (Gini coefficients, disparity indices).
  • Geospatial analysis (regional comparisons).
  • Statistical modeling (OLS, quantile regression).
  • Policy impact simulation.
  • Stata/R for data processing.
  • Data visualization (equity dashboards).
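
The Gini coefficient, listed among the equity metrics above, can be computed directly from sorted allocations. The following is a minimal sketch using the standard rank-based formula; the per-school resource values are invented for illustration and are not taken from the SAEB/Prova Brasil data.

```python
def gini(values):
    """Gini coefficient of non-negative values (0 = perfectly equal)."""
    xs = sorted(values)
    if any(x < 0 for x in xs):
        raise ValueError("values must be non-negative")
    n = len(xs)
    total = sum(xs)
    if n == 0 or total == 0:
        return 0.0
    # Rank-weighted sum over the ascending-sorted values (ranks start at 1).
    weighted = sum(rank * x for rank, x in enumerate(xs, start=1))
    return 2 * weighted / (n * total) - (n + 1) / n


# Hypothetical resource-per-student values for four schools.
equal_allocation = [1.0, 1.0, 1.0, 1.0]
concentrated_allocation = [0.0, 0.0, 0.0, 1.0]

print(gini(equal_allocation))
print(gini(concentrated_allocation))
```

With n schools, the maximum attainable value is (n - 1) / n, reached when a single school holds all resources.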
Ceará Teacher Training Program

Ceará Teacher Effectiveness Program

Peer-Reviewed Publication | 2018

This large-scale randomized controlled trial demonstrated that a low-cost coaching program (delivered via Skype at $2.40/student) significantly improved teaching practices across 350 public schools in Ceará, Brazil. Our causal inference analysis showed the intervention increased teachers' instructional time by 28% and boosted student engagement by 0.4 standard deviations, with particularly strong effects in math and Portuguese.

The program targeted classroom practice malleability through peer collaboration, addressing research showing wide within-school teacher quality variation. Using mixed-effects regression modeling, we found the virtual coaching model overcame traditional barriers to professional development in low-resource settings. The state government is now scaling this evidence-based program statewide based on our findings.

Methods & Tools:

  • Cluster-randomized controlled trial (350 schools).
  • Causal inference (multilevel modeling).
  • Cost-effectiveness analysis.
  • Classroom observation data processing.
  • Stata and R for statistical analysis.
  • Power calculations for field experiments.
  • Implementation fidelity tracking.
  • Scalability assessment framework.
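
Power calculations for cluster-randomized designs, listed above, typically account for the design effect, 1 + (m - 1) × ICC, where m is the cluster size and ICC is the intraclass correlation. The sketch below shows that adjustment. Only the 350 schools come from the study description; the 30 students per school and the ICC of 0.1 are hypothetical values chosen for illustration.

```python
def design_effect(cluster_size, icc):
    """Variance inflation from randomizing clusters instead of individuals."""
    return 1 + (cluster_size - 1) * icc


def effective_sample_size(total_n, cluster_size, icc):
    """Number of independent observations the clustered sample is worth."""
    return total_n / design_effect(cluster_size, icc)


schools = 350               # from the study description
students_per_school = 30    # hypothetical assumption
icc = 0.1                   # hypothetical assumption

total_students = schools * students_per_school
deff = design_effect(students_per_school, icc)
n_eff = effective_sample_size(total_students, students_per_school, icc)
print(f"design effect: {deff:.2f}, effective sample size: {n_eff:.0f}")
```

The higher the intraclass correlation, the less each additional student within a school adds to statistical power, which is why adding schools usually helps more than adding students per school.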

PUBLICATIONS

CONTACT

Feel free to contact me with questions about my projects, data science opportunities, or anything else you think is relevant ;)