In this week’s digest I am posting NLP-related articles.
Detecting Sarcasm with Deep Convolutional Neural Networks: This article talks about a paper from 2017 that used Twitter data to build a deep learning model for sarcasm detection. I found that there is another more recent paper [PDF] that does sarcasm detection.
An NLP Approach to Mining Online Reviews using Topic Modeling (with Python codes): This is a simple tutorial that does topic modeling on online reviews.
This week’s articles:
A 60-Minutes Course on Fairness in Machine Learning
Summary: The course focuses on bias in machine learning that originates with humans! I think this is an important area of work.
Model-Based Machine Learning Book
Summary: This is actually not an article but an entire book. I have read a few pages of the book, but I am not yet at a point where I can summarize anything!
Whenever I get time, I am going to post the machine learning articles that I read during the week. I thought today was a good day to start doing it.
Using machine learning to index text from billions of images
Summary: The article describes how Dropbox built a system to index images based on the text in those images. Dropbox used TensorFlow.
Rosetta: Understanding text in images and videos with machine learning
Summary: From the article: “[Rosetta] extracts text from more than a billion public Facebook and Instagram images and video frames (in a wide variety of languages), daily and in real time, and inputs it into a text recognition model that has been trained on classifiers to understand the context of the text and the image together.”
The colors are from this WSJ article: https://www.wsj.com/articles/the-kavanaugh-effect-political-debates-shake-up-the-workplace-1538949970
I will use the colors extracted from the solid bar at the bottom of the image. How did I do that? I used the ColorZilla add-on for Firefox: https://addons.mozilla.org/en-US/firefox/addon/colorzilla/
Here are the hex codes for the 7 colors I extracted:
# Create a color palette from WSJ article
wsjPal <- c('#65C1E8', '#D85B63', '#D680AD', '#5C5C5C', '#C0BA80', '#FDC47D', '#EA3B46')
Try out a graph using the new color palette
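One quick way to try the palette is a swatch plot. This is only a minimal sketch that assumes base R graphics; the plot title is made up for illustration, and it is not necessarily the graph from the original post.

```r
# Palette copied from the post; the plot itself is an illustrative assumption.
wsjPal <- c('#65C1E8', '#D85B63', '#D680AD', '#5C5C5C', '#C0BA80', '#FDC47D', '#EA3B46')

# Draw one equal-height bar per color so each swatch is easy to inspect.
barplot(
  rep(1, length(wsjPal)),
  col = wsjPal,          # fill each bar with its palette color
  names.arg = wsjPal,    # label each bar with its hex code
  las = 2,               # rotate labels so the hex codes fit
  axes = FALSE,          # bar heights carry no meaning, so hide the axis
  border = NA,
  main = "WSJ-inspired palette"
)
```

Any other plot works just as well; the point is only to see the seven colors side by side before using them in a real chart.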
Submission guideline
You will submit:
All code (not applicable if you used only Tableau)
Datasets
Slides in PowerPoint, Keynote, or HTML (if used)
Tableau workbook (if used)
All the above files should be zipped and submitted on Blackboard before 10:30 pm on 10/13 (i.e., the deadline is 2 days after the last class).
The zip file must be named as follows if you are in the day cohort:
In this post I will show how to create an RStudio project and manage it. My main reason for working in such projects is that I can manage all my code and content from a single folder on the computer. I also use the here package to manage file paths in the code.
For this tutorial you will need RStudio installed on your computer. The operating system doesn’t matter. I am using my MacBook Pro to record the videos.
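As a small illustration of the here package mentioned above, the sketch below builds a file path relative to the project root. The folder name data and the file name example.csv are hypothetical and only serve as placeholders.

```r
library(here)

# Hypothetical layout: <project root>/data/example.csv
# here::here() joins its arguments onto the project root, so the same
# code works no matter which subfolder the script is run from.
data_path <- here::here("data", "example.csv")
print(data_path)
```

Because the path is resolved from the project root rather than the current working directory, the same line keeps working when the project folder is moved to another computer.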
This homework is based on Shiny apps.
Q1. Create the following 3 Shiny apps.
Print a table of the first n observations from the mpg data set, where n is the number of observations to print. The default value of n is 6, and users can choose any value between 1 and 20 through a numeric input. (3 points)
The Shiny app is visible here: https://malshe.shinyapps.io/problem_1_1/
Let users select a city from among the five Texan cities and then print “You selected [name of the city]”.
This note is pretty old. I have modified it to use the dplyr package, but plenty of the code is still base R. The original note is available here: http://rpubs.com/malshe/224660. This file is just a small part of the original file.
The original homework questions are available here: http://rpubs.com/malshe/224662
library(dplyr)
library(here)
Get the data in
red <- read.csv(here::here("static", "data", "winequality-red.csv"), stringsAsFactors = F)
red$wine <- "red"
white <- read.csv(here::here("static", "data", "winequality-white.
Q1 A. Using the presidential and economics data frames from the ggplot2 package, recreate the following graphs. Before you start making the plots, first understand the data sets well.
Start by checking the help panel: type ?presidential and ?economics in the console. This will describe the variables in your data sets.
Take a peek at the data using head()
Get a summary of the variables
Note that you DON’T have to include all this in the homework.
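The exploration steps above can be sketched as follows. This is only a minimal example that assumes the ggplot2 package is installed; the help pages are left as comments because they open interactively.

```r
library(ggplot2)  # provides the presidential and economics data frames

# Help pages describing the variables (run these in the console):
# ?presidential
# ?economics

# Peek at the first few rows of each data set.
head(presidential)
head(economics)

# Summarize every variable in each data set.
summary(presidential)
summary(economics)
```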