Posts

In this post I will show how to create and manage an RStudio project. My main reason for working in projects is that I can manage all my code and content from a single folder on the computer. I also use the here package to manage file paths in the code. For this tutorial you will need RStudio installed on your computer. The operating system doesn’t matter. I am using my MacBook Pro to record the videos.

This homework is based on Shiny apps. Q1. Create the following 3 Shiny apps. Print a table of the first n observations from the mpg data set, where n is the number of observations to print. The default value of n is 6, and users can select a numeric input anywhere between 1 and 20 observations. (3 points) The Shiny app is visible here: https://malshe.shinyapps.io/problem_1_1/ Let users select a city from among the five Texan cities and then print “You selected [name of the city]”.

This note is pretty old. I have updated it using the dplyr package, but plenty of the code is still base R. The original note is available here: http://rpubs.com/malshe/224660. This file is just a small part of the original file. The original homework questions are available here: http://rpubs.com/malshe/224662
library(dplyr)
library(here)
Get the data in
red <- read.csv(here::here("static", "data", "winequality-red.csv"), stringsAsFactors = FALSE)
red$wine <- "red"
white <- read.csv(here::here("static", "data", "winequality-white.

Q1 A. Using the presidential and economics data frames from the ggplot2 package, recreate the following graphs. Before you start making the plots, first understand the data sets well. Start by checking the help panel: type ?presidential and ?economics in the console. This will describe the variables in your data sets. Take a peek at the data using head() and get a summary of the variables. Note that you DON’T have to include all of this in the homework.

In this post, I am covering a few things that I didn’t touch upon in class previously. For this we will use the mpg data from the ggplot2 package. If you want to know more about the data and the variables, run the following command in your R console:
?ggplot2::mpg
Get structure of the mpg data
head(mpg)
## # A tibble: 6 x 11
##   manufacturer model displ year cyl trans drv cty hwy fl class
##   <chr> <chr> <dbl> <int> <int> <chr> <chr> <int> <int> <chr> <chr>
## 1 audi a4 1.

There are numerous ways to extend the ggplot2 package. In this post, I am going to talk about 3 packages that are immediately relevant to us.
Extrafont
The first package is extrafont, which lets you import font files from your computer into R. You have to do this only once after you install the package; from then on, whenever you want to use a different font, you can simply call it by name in ggplot2.

Cross-validation error is an estimate of the out-of-sample error. Cross-validation is a great tool for helping modelers select a model with low out-of-sample error. The objective of this note is to show you how to write simple code to carry out cross-validation in R. I will post similar code for SAS later. K-fold cross-validation involves splitting the sample into K equal, non-overlapping subsamples (i.e., no observation appears in more than one subsample).
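As a rough illustration of the K-fold idea described above, here is a minimal sketch in base R. It is not the code from the full note: the built-in mtcars data, the formula mpg ~ wt + hp, K = 5, RMSE as the error measure, and the helper name k_fold_cv are all assumptions chosen for illustration.

```r
# K-fold cross-validation for a linear model, using only base R.
# Assumptions (not from the original note): the built-in mtcars data,
# the formula mpg ~ wt + hp, K = 5, and RMSE as the error measure.

set.seed(123)  # make the random fold assignment reproducible

# Hypothetical helper: returns the fold labels, per-fold RMSE, and their mean
k_fold_cv <- function(formula, data, K = 5) {
  n <- nrow(data)
  # Assign each row to exactly one of K folds; fold sizes differ by at most one
  folds <- sample(rep(seq_len(K), length.out = n))
  response <- all.vars(formula)[1]

  fold_rmse <- vapply(seq_len(K), function(k) {
    train <- data[folds != k, , drop = FALSE]  # fit on the other K - 1 folds
    test  <- data[folds == k, , drop = FALSE]  # hold out fold k
    fit   <- lm(formula, data = train)
    pred  <- predict(fit, newdata = test)
    sqrt(mean((test[[response]] - pred)^2))
  }, numeric(1))

  list(folds = folds, fold_rmse = fold_rmse, cv_rmse = mean(fold_rmse))
}

cv <- k_fold_cv(mpg ~ wt + hp, data = mtcars, K = 5)
cv$fold_rmse  # one out-of-sample RMSE per held-out fold
cv$cv_rmse    # their average: the cross-validation error estimate
```

To compare candidate models, you could call the helper once per formula, resetting the seed before each call so that every model is scored on the same folds, and prefer the one with the lowest cv_rmse.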

I woke up to great news today! Five judges of the Supreme Court of India have unanimously decriminalized homosexuality. A few months back, I made a simple t-shirt design using R. At that time, it was an R exercise for me and I didn’t share it with many people. This is my small gift to LGBTQ Indians. Although I have made this using “Om”, many religious symbols are possible if you know the correct character in the Wingdings font.

The MS in Data Analytics (MSDA) program at the University of Texas at San Antonio (UTSA) is all set to welcome the third batch of students this fall. The program attracts many talented applicants globally. However, due to resource constraints, we can admit only a small number of students. The program has two cohorts: daytime and evening. This year we expect to admit between 80 and 100 students in the two cohorts combined.

This is a short tutorial for the incoming students of UTSA’s MS in Data Analytics program. I am going to assume that the reader has no knowledge of R or of RStudio, the Integrated Development Environment (IDE) that we use to code. If you are a Mac user and you are comfortable with command line tools in Terminal, I suggest taking a more systematic route and preparing macOS for the R installation using Homebrew.
