My Exploration and Journey in Data Analytics

A list of my top data analysis projects, built with Jupyter Notebook, R, and SQL. Click the GitHub button to visit my GitHub repository.

 
Handy Functions for A/B Testing (Python)

Documented the handy functions from DataCamp's A/B Testing course, tested their usability, and drew some insights from the results.
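As an illustration of the kind of helper such a collection might contain, here is a minimal sketch of a two-sided, two-proportion z-test using only the standard library. The function name `two_proportion_ztest` and the conversion counts are hypothetical; they are not taken from the DataCamp course or from the project notebook.

```python
from math import sqrt
from statistics import NormalDist


def two_proportion_ztest(conv_a, n_a, conv_b, n_b):
    """Two-sided z-test for a difference in conversion rates.

    conv_a / conv_b: number of conversions in each group.
    n_a / n_b: number of visitors in each group.
    """
    p_a = conv_a / n_a
    p_b = conv_b / n_b
    # Pooled rate under the null hypothesis that both groups convert equally.
    p_pool = (conv_a + conv_b) / (n_a + n_b)
    se = sqrt(p_pool * (1 - p_pool) * (1 / n_a + 1 / n_b))
    z = (p_b - p_a) / se
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))
    return z, p_value


# Made-up numbers: control converts 200 of 2000, variant converts 250 of 2000.
z, p_value = two_proportion_ztest(conv_a=200, n_a=2000, conv_b=250, n_b=2000)
print(f"z = {z:.2f}, p-value = {p_value:.4f}")
```

A small p-value suggests the difference in conversion rates is unlikely to be due to chance alone. The pooled standard error used here is one common choice, not the only one.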

AWS Real-Time Tweet Dashboard (SQL, Kinesis, QuickSight)

Programmed a tweet-streaming ETL pipeline and visualized review sentiment for three leading smartphone brands (iPhone, Samsung, OnePlus) across different regions.

Topic Modeling on Social Media Posts (Python)

Using truncated SVD, I decomposed the selected topics for different social media platforms and made further suggestions on topic selection based on the desired metrics during my internship.

Medium.com Post Metrics Displayer for Writers (Python)

I write articles on Medium.com, so I decided to contribute my data skills to the writer community by building a data product that lets writers analyze post performance more easily.

Extract Twitter Followers with the Tweepy Package (Python)

Tweepy is a useful tool for working with tweet data in Python. I used Tweepy to build a better understanding of the followers during my internship.

RFM Analysis & Customer Churn Analysis for a Hotel/Mall Enterprise in China (Python)

RFM (recency, frequency, monetary) analysis is a well-known method for identifying high-value customers. Using Python, I built RFM metrics for a hotel dataset and then clustered hotel guests based on these metrics.
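To make the three metrics concrete, here is a minimal sketch that derives recency, frequency, and monetary value per customer from a transaction list. The data, the snapshot date, and the function name `rfm_table` are made up for illustration and do not come from the hotel dataset. The clustering step is not shown.

```python
from collections import defaultdict
from datetime import date

# Hypothetical transactions: (customer_id, purchase_date, amount)
transactions = [
    ("A", date(2023, 1, 5), 120.0),
    ("A", date(2023, 3, 1), 80.0),
    ("B", date(2022, 11, 20), 300.0),
    ("C", date(2023, 2, 14), 45.0),
    ("C", date(2023, 2, 28), 60.0),
    ("C", date(2023, 3, 10), 30.0),
]
snapshot = date(2023, 3, 31)  # assumed "today" for the recency calculation


def rfm_table(rows, snapshot_date):
    """Return {customer: {"recency", "frequency", "monetary"}}."""
    last_purchase = {}
    frequency = defaultdict(int)
    monetary = defaultdict(float)
    for customer, day, amount in rows:
        # Track the most recent purchase date seen for each customer.
        last_purchase[customer] = max(last_purchase.get(customer, day), day)
        frequency[customer] += 1
        monetary[customer] += amount
    return {
        customer: {
            "recency": (snapshot_date - last_purchase[customer]).days,
            "frequency": frequency[customer],
            "monetary": monetary[customer],
        }
        for customer in last_purchase
    }


rfm = rfm_table(transactions, snapshot)
for customer, metrics in sorted(rfm.items()):
    print(customer, metrics)
```

Lower recency and higher frequency and monetary values generally point to more valuable customers. The resulting table can then be scaled and passed to a clustering algorithm.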

Customer Funnel Analysis for Online Retailer (Python) 

Web traffic analysis is a basic skill required by many tech companies. I used a simple pivot table in Python to summarize traffic contribution.

Cluster Analysis on Medium Article Titles (Python)

I conducted a cluster analysis on Medium article titles to provide an alternative approach for its recommendation system.

Web Scraping on H1B Online Database (Python)

I built a web scraper to scrape data from an H1B database and wrote functions to visualize analyses of different positions, especially in the business analytics profession.

Detecting Earnings Manipulation and Fraud III: On a Real Company (Python & R)

Using the tools and models introduced in Parts I and II, I put these methods into practice to investigate whether a real-world pharmaceutical company shows signs of earnings manipulation.

Detecting Earnings Manipulation and Fraud II: M-Score Generator (Python)

The M-Score is widely used for detecting earnings manipulation. I created several useful functions in Python to query and visualize the M-Score for a specific company.
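For readers unfamiliar with the metric, here is a minimal sketch of the score itself, assuming the M-Score here refers to the eight-variable Beneish M-Score. The function name `beneish_m_score` and the input values are hypothetical and are not taken from the project code. The -1.78 cutoff is a commonly cited threshold, not a rule from this project.

```python
def beneish_m_score(dsri, gmi, aqi, sgi, depi, sgai, lvgi, tata):
    """Eight-variable Beneish M-Score from precomputed indices.

    dsri: days sales in receivables index
    gmi:  gross margin index
    aqi:  asset quality index
    sgi:  sales growth index
    depi: depreciation index
    sgai: SG&A expense index
    lvgi: leverage index
    tata: total accruals to total assets
    """
    return (
        -4.84
        + 0.920 * dsri
        + 0.528 * gmi
        + 0.404 * aqi
        + 0.892 * sgi
        + 0.115 * depi
        - 0.172 * sgai
        + 4.679 * tata
        - 0.327 * lvgi
    )


# Made-up baseline: every index unchanged year over year, no accruals.
m = beneish_m_score(dsri=1.0, gmi=1.0, aqi=1.0, sgi=1.0,
                    depi=1.0, sgai=1.0, lvgi=1.0, tata=0.0)
THRESHOLD = -1.78  # assumed cutoff; scores above it are often flagged
print(f"M-Score = {m:.2f}, flagged = {m > THRESHOLD}")
```

Each index compares a financial ratio in the current year with the prior year, so in practice the inputs would be computed from two consecutive annual statements.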

Detecting Earnings Manipulation and Fraud I: Useful Tools (Python & R)

I introduce several models and tools for detecting earnings manipulation, which investors can use to investigate potential manipulation.

Association Rules on Soccer Players from Team Roma (R)

I applied association rule mining to the arrangement of soccer players on Team Roma.

Text Analytics on Jack Ma’s Retirement Letter and Elon Musk’s Privatization Letter (Python)

I conducted sentiment analysis on the two letters to look for clues about future developments.