My passion lies in harnessing the power of data to create better lives for all.

I currently live in San Francisco, enjoying my daily commute to Palo Alto. My expertise and interests include statistical machine learning, predictive modeling and visualizations, and product management.

I'm also an advocate of the data revolution for sustainable development. When I'm not playing with data, I enjoy running, painting, and spending quality time with loved ones. For more information, please contact me at moorissa.tjokro@columbia.edu.

Featured Work

Machine Learning

Building Spotify’s “Discover Weekly” with Spark

An MLlib & PySpark implementation of an audio recommendation system using a collaborative filtering algorithm.

Developing a Matching Algorithm

An entity resolution technique using various classification models, feature importances, and pairwise comparison.

Direct Marketing Optimization using Mobile Data

A classification approach to strategize projected growth in user subscriptions.

Leveraging Philanthropy Impacts with Data Mining

An overview of the Beyond Profit Project sponsored by Bloomberg Philanthropies.

Meta-Learning for Credit Card Fraud Detection

A research project on improving a fraud detection technique with Bayesian Nets, kNN, and decision trees. The poster presentation can be found here.

Predicting NYC Renting Prices using Lasso Regression

A linear model built to accurately predict the monthly price of an apartment listing.

Deep Learning

Starting out with Keras

An attempt to use multilayer perceptrons & convolutional neural networks for building deep learning models.

Natural Language Processing

Exploring Trending Topic Bias in News vs. Social Media

An NLP-based analysis of differences between topics in a traditional newspaper (i.e., The New York Times) and an online platform (i.e., Twitter).

Making Boston Safer using Natural Language Processing

A set of classification methods to predict text data & model semantic categories.

Topic Modeling for The New York Times News Dataset

A Nonnegative Matrix Factorization (NMF) approach for classifying news topics.

Data Visualizations

Legacy of a Century: South Africa Today

A statistical attempt to explore the nation's journey after the life of Nelson Mandela. Please open the link using Chrome on your desktop.

Comparing Marvel and DC Superheroes

An attempt to settle the age-old fight with data and D3.

Exploratory Data Analysis & Visualization Resources

A site repository for visualization resources using JavaScript, HTML, CSS, & SVG.

How does Trump's budget cut affect you?

A report, built with R, Python, Carto, & Processing, on how Trump's billion-dollar cuts to city transportation affect locations nationwide.

Ranking the Top 100 Sci-fi Books

Using D3 and JavaScript to observe patterns.

Visualizations with R

A compilation of STAT GR5702 course assignments in descriptive statistics.

Visualizing the World's Poverty Rates

A small attempt to understand poverty rates using D3 & UN open source data.

What Makes Us Happy?

A data visualization to compare happiness across countries around the world.


  • It's the possibility of having a dream come true that makes life interesting.

    Paulo Coelho
  • If I had asked people what they wanted, they would've said faster horses.

    Henry Ford
  • We make a living by what we get. We make a life by what we give.

    Winston Churchill


Columbia University

M.S. in Data Science Dec '17

GPA: 3.6 / 4.0
Coursework: Machine Learning, Applied Machine Learning, Deep Learning & Neural Networks, Algorithms, Exploratory Data Analysis & Visualizations, Computer Systems, Bayesian Modeling, Storytelling with Data, Tech Entrepreneurship.

Georgia Institute of Technology

B.S. in Industrial Engineering & specialization in Statistics May '14

GPA: 3.8 / 4.0 (Summa Cum Laude)
Relevant Coursework: Probability Theory, Statistical Inference and Modeling, Database Systems Design and Manipulation, Regression and Forecasting, Quality Control, Optimization, Reliability Engineering (graduate level), Stochastic and Queueing Theory.



Data Scientist Intern Mar '18 - present


Building internal end-to-end products within the Charging Infrastructure team (real-time availability dashboard, sites absency tool, etc.). Developed time series modeling and machine learning algorithms (NLP, regression, classification/clustering, sentiment analysis).

NASA Goddard Institute for Space Studies

Machine Learning Intern Oct '17 - Dec '17


Constructed an unsupervised clustering algorithm to assess ocean carbon cycle models and their atmospheric properties for Model E simulations.

NBC Universal — Comcast

Data Scientist Intern May '17 - Aug '17


Performed statistical inference, sampling methods, multivariate analyses, and supervised learning for modeling and extracting insights from high-dimensional Nielsen data. Developed aggregation algorithms, automated an R&D process, and built visualization tools using Spark & Python.

Columbia University

Graduate Teaching Assistant May '17 - Aug '17


Facilitated graduate student learning in the Applied Analytics course (~140 students), covering scenario modeling, pattern detection, A/B testing, cluster analysis, sentiment analysis, time series analysis, prediction, NLP and IR, and graph and information network mining.

Target Marketeam

Data Analyst Jul '14 - Jun '16


Managed data warehouses, built statistical models, and developed interactive decision support tools for the nation's leading nonprofits. Selected as a lead analyst to collaborate closely with the Senior VP of Analytics and the Head of Analytics in integrating strategy execution, methodologies, risk, and analytics.

United Nations World Food Programme

Research Assistant Aug '13 - May '14


Constructed a multivariate hub model for Specialized Nutritious Foods with Dr. Nazzal and the Spatial Risk Calendar (SPARC) team at WFP. Resulted in a 30% decrease in malnutrition rates and commodity shortages across sub-Saharan regions, and in model adoption by the Zambian Ministry of Health.

Georgia Institute of Technology

Computer Science & Statistics Teaching Assistant Dec '12 - Dec '13


Led weekly recitations, graded exams, and tutored students for the Data Manipulation & Database Systems and Applied Statistics courses (~650 students).


On a day-to-day basis, I play with Python and Git.

  • Python, Scikit-Learn, TensorFlow, Spark, Keras
  • SQL, JavaScript, D3, HTML/CSS/SVG, Carto
  • R, Hadoop/MapReduce
  • SAS, JMP

Selected Honors

Columbia Annual Data Science Hackathon, 1st Place Winner

Data Science Institute, 2017

Columbia Impact Hackathon, 1st Place Winner

Columbia Business School, 2016

Helen Grenga Nominee for Outstanding Woman Engineer

Georgia Institute of Technology, 2014

Rockwell Automation

Society of Women Engineers, 2013

Shannon & Wilson Technology Scholar

Shannon & Wilson, Inc., 2011

International Leadership Award

The International House NY, 2017

Toyota Scholarship

The International House NY, 2016

President’s Undergraduate Research Award

Georgia Tech Research Institute, 2014

Faculty Honors

Georgia Institute of Technology, 2012

Dean's List

Georgia Institute of Technology, 2011-2014

Student Spotlight

Seattle Colleges Foundation, 2011


Fun Fact

Having almost double-majored in fine arts as an undergrad, I have definitely thought of getting a certification in painting one day!