I invest (modestly) in both the US and Indian stock markets. Over the last few days I observed that on days when the US market was down, the Indian market was not necessarily down. This is somewhat unexpected given my past experience (I used to work in the financial sector back in India). When I posted my observation on Facebook, people asked me for more concrete evidence of this lack of correlation.
Translating Between Statistics and Machine Learning Summary: If, like me, you were trained in statistics and econometrics, not all the terminology used in machine learning is easily understandable. I think that machine learning guys are good marketers and they know how to name their techniques! For example, creating plain vanilla ‘dummy variables’ becomes ‘one hot encoding’ in machine learning :) There are some confusing things too. In statistics, bias typically refers to the bias in the estimates.
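The dummy variable versus one hot encoding point above can be seen in a few lines of base R. This is only a minimal sketch: the `df` data frame and its `color` column are made-up names for illustration, not something from the linked article.

```r
# Hypothetical toy data: one categorical variable with three levels
df <- data.frame(color = factor(c("red", "green", "blue", "green")))

# Statistics convention: an intercept plus k - 1 dummy variables
# (R drops the first level, "blue", as the reference category)
dummies <- model.matrix(~ color, data = df)

# Machine learning convention: one hot encoding, one column per level
# (removing the intercept with "- 1" keeps all k indicator columns)
one_hot <- model.matrix(~ color - 1, data = df)

print(colnames(dummies))
print(colnames(one_hot))
```

Both matrices carry the same information; the only difference is whether a reference level is absorbed into the intercept.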
In this post I show how to make a simple animation depicting the changing composition of the Indian Parliament between 1998 and 2014, which was the last election year. In India there are several political parties and two major coalitions: the National Democratic Alliance (NDA) and the United Progressive Alliance (UPA). The NDA leans right while the UPA leans left. However, each coalition is made up of several parties with different ideologies.
In this week’s digest I am posting NLP-related articles.
Detecting Sarcasm with Deep Convolutional Neural Networks: This article talks about a paper from 2017 that used Twitter data to build a deep learning model for sarcasm detection. I found that there is another more recent paper [PDF] that does sarcasm detection.
An NLP Approach to Mining Online Reviews using Topic Modeling (with Python codes): This is a simple tutorial that does topic modeling on online reviews.
This week’s articles:
A 60-Minutes Course on Fairness in Machine Learning Summary: The course focuses on the bias in machine learning because of humans! I think this is an important area of work.
Model-Based Machine Learning Book Summary: This is actually not an article but an entire book. I have read a few pages of the book but I am not at a point where I can summarize anything!
Whenever I get time, I am going to post articles on machine learning that I read during a week. I thought today is a good day to start doing it.
Using machine learning to index text from billions of images Summary: The article describes how Dropbox built a system to index images based on the text in those images. Dropbox used TensorFlow.
Rosetta: Understanding text in images and videos with machine learning Summary: From the article - “[Rosetta] extracts text from more than a billion public Facebook and Instagram images and video frames (in a wide variety of languages), daily and in real time, and inputs it into a text recognition model that has been trained on classifiers to understand the context of the text and the image together.”
The colors are from a WSJ article: https://www.wsj.com/articles/the-kavanaugh-effect-political-debates-shake-up-the-workplace-1538949970
I will use the colors extracted from the solid bar at the bottom of the image. How did I do that? I used the ColorZilla add-on for Firefox: https://addons.mozilla.org/en-US/firefox/addon/colorzilla/
Here are the hex codes for the 7 colors I extracted:
    # Create a color palette from WSJ article
    wsjPal <- c('#65C1E8', '#D85B63', '#D680AD', '#5C5C5C', '#C0BA80', '#FDC47D', '#EA3B46')

Try out a graph using the new color palette
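One quick way to try the palette out is a base R swatch plot that draws one equal-height bar per color. This is a sketch under my own assumptions, not necessarily the graph used in the original post; the bar heights are placeholders and carry no data.

```r
# Palette copied from the hex codes listed above
wsjPal <- c('#65C1E8', '#D85B63', '#D680AD', '#5C5C5C',
            '#C0BA80', '#FDC47D', '#EA3B46')

# Draw one bar per color, labelled with its hex code.
# All heights are 1 on purpose: the plot only previews the colors.
barplot(rep(1, length(wsjPal)),
        col = wsjPal,
        names.arg = wsjPal,
        las = 2,        # rotate labels so the hex codes stay readable
        border = NA,
        axes = FALSE,
        main = "WSJ-inspired palette")
```

The same vector can be passed to any plotting function that accepts a vector of colors.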
Submission guideline

You will submit: all code (not applicable if you used only Tableau); datasets; slides in PowerPoint, Keynote, or HTML (if used); and the Tableau workbook (if used). All the above files should be zipped and submitted on Blackboard before 10:30 pm on 10/13 (i.e., the deadline is 2 days after the last class).
The Zip file will have to be named as follows if you are from the day cohort:
In this post I will show how to create an RStudio project and manage it. My main reason for working in such projects is that I am able to manage all my code and content from one single folder on the computer. I also use the here package to manage file paths in the code.
For this tutorial you will need RStudio installed on your computer. The operating system doesn’t matter. I am using my MacBook Pro to record the videos.
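To give a flavor of what the here package does inside a project, here is a minimal sketch. It assumes the package is already installed (`install.packages("here")`), and the `data` folder and `survey.csv` file are hypothetical names used only for illustration.

```r
# Assumes the 'here' package is installed: install.packages("here")
library(here)

# here() builds a path starting from the project root (for example, the
# folder that contains the .Rproj file) instead of the current working
# directory, so the same code works wherever the project folder lives.
data_path <- here("data", "survey.csv")  # hypothetical folder and file

print(data_path)
```

Because the path is built from the project root, there is no need for `setwd()` calls or absolute paths tied to one computer.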