In this post, we will visualize the spread of worldwide COVID-19 cases through time. I obtained the data from Rami Krispin’s website: https://ramikrispin.github.io/coronavirus/ using the coronavirus package. I also decided to experiment with John Coene’s fantastic echarts4r package, which gives us access to the echarts API.
Load the libraries and get the data into the R session.
library(dplyr)
library(echarts4r)
library(coronavirus)

# Get the data
data("coronavirus")

Data Preparation

Check out the first 6 observations.
This post will help you set up RStudio Cloud for the UTSA-GenAI workshop to be held on February 14, 2020. In the workshop I will show you a lesson that involves topic modeling on Amazon.com product reviews using Latent Dirichlet Allocation. The lesson requires a code file, a data file, and a word list. I have set up an R project on RStudio Cloud, which you can start working on right away by following the instructions below.
I came across the visualization of the US opioid epidemic made by Kieran Healy in his book “Data Visualization: A Practical Introduction” (Link). He used data through 2014, but in recent years the epidemic has become worse. So I extended the data to 2017 by downloading it from the Kaiser Family Foundation’s website. After cleaning up the data, I ended up with an unbalanced panel of 50 states from 1999 to 2017.
I came across a few nice applications-related posts.
Machine Learning Music Composed From Re-Synthesized Fragments From 100s Of Terabytes Of LA Phil Recordings: This post has “a new high def version of the dazzling 3D video/AI-driven performance displayed on the Walt Disney Concert Hall last year.” AI-driven music is nothing new. About 3 years ago I showed my students a video of a computer algorithm creating fantastic music, and some of them became upset!
I came across this Science article someone shared on Twitter: Plastic waste inputs from land into the ocean
In this post I am going to make a bar graph using the top 10 countries listed in Table 1. Here is the screenshot of that table.
The ranking in the table is based on the last column. As that column shows interval estimates, I decided to use the midpoints of those intervals.
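The midpoint step can be sketched roughly as follows. This is only an illustration: the country labels, the column names `lower` and `upper`, and all the numbers are hypothetical placeholders, not the values from Table 1 of the article.

```r
# Hypothetical interval estimates; placeholder numbers, NOT the values
# from the article's Table 1.
intervals <- data.frame(
  country = c("A", "B", "C"),
  lower   = c(0.5, 1.0, 0.2),
  upper   = c(1.5, 3.0, 0.6),
  stringsAsFactors = FALSE
)

# Midpoint of each interval
intervals$midpoint <- (intervals$lower + intervals$upper) / 2

# Sort by midpoint, largest first, so the bar graph follows the ranking
intervals <- intervals[order(-intervals$midpoint), ]
print(intervals)
```

Using the midpoint treats each interval as symmetric around its center, which is a simplification; the bars then show a single number per country rather than the full range.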
I invest (modestly) in both the US and Indian stock markets. Over the last few days I observed that on the days when the US market was down, the Indian market was not necessarily down. This is somewhat unexpected given my past experience (I used to work in the financial sector back in India). When I posted my observation on Facebook, people asked me for more concrete evidence of this lack of correlation.
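One way to look for that kind of evidence is to compare daily returns of the two markets. The sketch below uses simulated closing prices only to show the mechanics; the series `us_close` and `in_close` are hypothetical stand-ins for real index data, so the resulting numbers say nothing about the actual markets.

```r
set.seed(1)
n <- 250  # roughly one year of trading days

# Simulated closing prices; hypothetical stand-ins for real index data
us_close <- 100 * cumprod(1 + rnorm(n, mean = 0, sd = 0.01))
in_close <- 100 * cumprod(1 + rnorm(n, mean = 0, sd = 0.01))

# Daily log returns
us_ret <- diff(log(us_close))
in_ret <- diff(log(in_close))

# Pearson correlation between the two return series
rho <- cor(us_ret, in_ret)

# Share of US down days on which the other series was also down
both_down <- mean(in_ret[us_ret < 0] < 0)

print(c(correlation = rho, share_both_down = both_down))
```

With real data, the two series would first need to be aligned by trading date, since the two markets do not share the same holidays.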
Translating Between Statistics and Machine Learning Summary: If, like me, you were trained in statistics and econometrics, not all the terminology used in machine learning is easy to follow. I think that machine learning guys are good marketers and they know how to name their techniques! For example, creating plain vanilla ‘dummy variables’ becomes ‘one hot encoding’ in machine learning :) There are some confusing things too. In statistics, bias typically refers to the bias in the estimates.
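To make the dummy variable example concrete, here is a small base R sketch. The data frame and the `color` variable are made up for illustration. It builds both the full set of indicator columns (what machine learning calls one hot encoding) and the version more common in regression, where one reference level is dropped.

```r
# Made-up categorical variable for illustration
df <- data.frame(color = factor(c("red", "green", "blue", "green")))

# One indicator column per level: 'one hot encoding'
one_hot <- model.matrix(~ color - 1, data = df)

# Regression-style dummies: keep the intercept in the formula, then drop
# its column, so the first level (blue) becomes the reference category
dummies <- model.matrix(~ color, data = df)[, -1, drop = FALSE]

print(one_hot)
print(dummies)
```

Dropping a reference level matters in a regression with an intercept, where the full set of indicators would be perfectly collinear with the constant.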
In this post I show how to make a simple animation depicting the changing composition of the Indian Parliament between 1998 and 2014, which was the last election year. In India there are several political parties and two major coalitions: the National Democratic Alliance (NDA) and the United Progressive Alliance (UPA). The NDA leans right while the UPA leans left. However, each of the coalitions is made up of several parties that have different ideologies.
In this week’s digest I am posting NLP-related articles.
Detecting Sarcasm with Deep Convolutional Neural Networks: This article talks about a paper from 2017 that used Twitter data to build a deep learning model for sarcasm detection. I found that there is another more recent paper [PDF] that does sarcasm detection.
An NLP Approach to Mining Online Reviews using Topic Modeling (with Python codes): This is a simple tutorial that does topic modeling on online reviews.
This week’s articles:
A 60-Minutes Course on Fairness in Machine Learning Summary: The course focuses on the bias in machine learning because of humans! I think this is an important area of work.
Model-Based Machine Learning Book Summary: This is actually not an article but an entire book. I have read a few pages of the book but I am not at a point where I can summarize anything!