I (Meaghan) have spent a lot of time in R. Like, a LOT. Not really
accomplishing lots and lots, mind you, but just kind of fucking around
hopelessly most of the time. Meaghan of 3 months ago looked at R and thought
"is there any way I can avoid using that program?" while Meaghan of
today thinks "is there anything I can do with that program that will
permit me to feel useful while procrastinating on something else?" As it
turns out, there definitely is: Meaghan of today will now be presenting a
wonderful R tutorial on how to scatterplot some shit and then make it pretty
without getting entangled in ggplot2 (which is another code word for the bowels of hell).
One of the beautiful and fucking awful things about R is
that for any one way of doing something, there are about 60 others. I'm going to
tell you how to do things that you could probably do in other ways. These ways
make sense to me, but if they don't work for you cognitively, I'm sure you could
find another 6+ ways of accomplishing the same goal. Also, everything I'm
reporting here comes from a place of necessity: I'm sure there are other useful
things we could talk about with scatterplots, but since I didn't have to think
about them... I'm not going to talk about them!
R: the very basics
It's free and available on the internet, and very powerful.
It isn't user-friendly, unless your user is the Lorax of computer programming.
I use RStudio, which makes R a little prettier and makes it look
more like a normal computer program. I also like it because it tells me what
type of data I have and acts as a good reminder of what I have named it,
because my normal naming schemes are pretty incomprehensible to everyone, including myself once the coffee wears off.
In RStudio, however, the rules still apply: you have to
type most things to make things happen. I like to think of R as an extremely
drunk friend wandering through the woods: you need to be very specific and
precise when you speak, or your friend will either ignore you or do something
very unpredictable and unhelpful. So to start off, tell your drunk friend where
you're going by setting your working directory.
setwd("C:/Users/Meaghan/Dropbox/Dissertation Stuffs/Chapter 1 Variation of Teeth/Data Analysis Files")
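If you want to double-check that your drunk friend actually went where you told them to go, something like the sketch below should do it. The folder here is just R's temporary directory (my stand-in, since your Dropbox path is not my Dropbox path), so swap in your own.

```r
# tempdir() is only a placeholder folder that exists on every machine;
# replace it with your own path.
old_dir <- setwd(tempdir())  # setwd() hands back the folder you were in before

getwd()       # where does R think it is now?
list.files()  # which files can R see from here?

setwd(old_dir)  # wander back to where you started
```

If `list.files()` doesn't show the file you are about to import, the import is going to fail, so it's worth the two seconds.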
Then, you're going to need to import your data. You know how sometimes your drunk friends think that
closets are toilets until they are angrily corrected? Don't let the drunk
friend pee on your shoes, just tell them what the file type is to begin with.
Is it an xlsx file? Use read.xlsx (it isn't part of base R, so you'll need to install
and load an add-on package such as xlsx or openxlsx first). What about a csv? Use read.csv,
which works out of the box. Also, tell them what to name it. In this case, we'll call it
procrastination, since that is ultimately what all this is.
procrastination <- read.csv("procrastination.csv")
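If you don't have my file (you don't), here is a rough sketch that fakes one so you can follow along. Every number below is made up for illustration, and the `projects` column name is my guess; only `week`, `fucks`, `netflix` and `hate` show up in the code in this post.

```r
# Made-up numbers, for illustration only: this is NOT the real dataset.
fake <- data.frame(
  week     = 1:11,
  fucks    = c(10, 9, 9, 8, 7, 6, 5, 4, 3, 2, 1),
  projects = c(0, 0, 1, 1, 2, 2, 3, 3, 4, 4, 5),   # hypothetical column name
  netflix  = c(5, 10, 10, 20, 25, 30, 40, 50, 60, 70, 80),
  hate     = c(1, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10)
)

# Write it to a temporary csv, then import it the same way as above.
csv_path <- file.path(tempdir(), "procrastination.csv")
write.csv(fake, csv_path, row.names = FALSE)

procrastination <- read.csv(csv_path)
str(procrastination)   # column names and data types
head(procrastination)  # first six rows
```

Running `str()` right after an import is a cheap way to catch a column that came in as text when you expected numbers.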
This imports a lovely dataset which includes columns
of the weeks of term (1-11), the number of fucks I give about things, how many
important projects I have due, % of time I spend binge watching Netflix, and
amount I hate the world on a scale of 1-10.
What a lovely table! So concise, so descriptive, so helpful!
So that's what my data looks like when imported into
R. Some people like to then go and turn components into vectors for easy
reference, but I work with lots of huge, different datasets and it doesn't work well for
me mentally. You can tell your drunk friend where to focus by using the dollar
sign, sort of literally dangling money in your friend's face until they focus.
procrastination$week tells R to look at the week column in the dataset
procrastination (spelling and capitalization have to match exactly). Pretty simple. So let's do a basic scatterplot!
plot(procrastination$week, procrastination$fucks)
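For the curious, here is a small sketch of what the dollar sign is actually doing, plus one of those other 60 ways. The data frame is made up so the snippet runs on its own.

```r
# Made-up stand-in for the real dataset.
procrastination <- data.frame(week = 1:11, fucks = 11:1)

names(procrastination)  # the column names you are allowed to put after the $

weeks_by_dollar <- procrastination$week       # dollar-sign way
weeks_by_name   <- procrastination[["week"]]  # double-bracket way, same column
identical(weeks_by_dollar, weeks_by_name)     # TRUE

# with() lets you name the dataset once instead of on every argument.
with(procrastination, plot(week, fucks))
```

Whether `with()` is nicer is a matter of taste; I find the dollar sign easier to read when I'm juggling several datasets.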
Yes, so pretty. Actually pretty hideous, I
think we can all agree, so what can we do to fix it? Well, first off, I hate the
label names. Drunk friend is not good at naming things. So let's tell drunk
friend here what to call everything, and we will be reaaaaally precise here. For
the x label, I want it to say Weeks of Term. Y label should be # of Fucks
Given, and the title should be # of Fucks Decreases Over Course of Term.
plot(procrastination$week, procrastination$fucks,
     xlab="Weeks of Term", ylab="# of Fucks Given",
     main="# of Fucks Decreases Over Course of Term")
But I can't see those dumb little dots. Drunk friend is still being unhelpfully obtuse.
I'd like to have closed circles. To do that, tell R the exact code for the symbol you want. These
symbol codes are called pch codes (pch is short for plotting character), and the one I want is 16.
plot(procrastination$week, procrastination$fucks,
     xlab="Weeks of Term", ylab="# of Fucks Given",
     main="# of Fucks Decreases Over Course of Term", pch=16)
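If you can't remember which number is which symbol (I never can), a quick cheat sheet like this sketch draws codes 0 through 25 with their numbers. The grid layout is just my own arrangement, nothing official.

```r
# Lay the 26 standard symbol codes out on a 7-wide grid.
codes <- 0:25
x <- codes %% 7        # column position
y <- 3 - codes %/% 7   # row position, top row first

plot(x, y, pch = codes, cex = 2, bg = "grey",  # bg fills symbols 21 to 25
     xlim = c(-0.5, 6.5), ylim = c(-0.5, 3.8),
     axes = FALSE, xlab = "", ylab = "", main = "pch codes 0 to 25")
text(x, y + 0.4, labels = codes)  # print each code above its symbol
```

Codes 15 through 20 are the solid ones, which is why 16 gives the closed circle used above.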
Better, much better, but I still can't see them. How
do I make the points bigger? Imagine your drunk friend comes with a pre-set
volume: unfortunately, you can tell your drunk friend to be louder or quieter
in relation to that volume, but you can't just tell them "be an appropriate
volume, a volume which is close to silence." Same with size: drunk friend gets proportionality but not etiquette.
The argument is cex, and it's a multiplier on the default size: cex=.5 makes
the points 50% of their default size, while cex=2 makes them twice as large.
plot(procrastination$week, procrastination$fucks,
     xlab="Weeks of Term", ylab="# of Fucks Given",
     main="# of Fucks Decreases Over Course of Term", pch=16, cex=2)
Let's imagine however that you wanted to scale the
size of different points by something – like, maybe the amount you hate your
life. That would work a little differently. Remember to be specific: tell it
the other variable you want it to scale by.
plot(procrastination$week, procrastination$fucks,
     xlab="Weeks of Term", ylab="# of Fucks Given",
     main="# of Fucks Decreases Over Course of Term", pch=16, cex=(procrastination$hate))
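One thing to watch: a hate score of 10 means a point ten times the default size, which can swallow the plot. If that bothers you, one possible workaround (not what I did above, just a sketch with made-up numbers) is to squash the variable into a gentler range first, say 1 to 3.

```r
# Made-up stand-in data, for illustration only.
procrastination <- data.frame(
  week  = 1:11,
  fucks = c(10, 9, 9, 8, 7, 6, 5, 4, 3, 2, 1),
  hate  = c(1, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10)
)

# Rescale hate so the smallest value maps to cex 1 and the largest to cex 3.
hate <- procrastination$hate
size <- 1 + 2 * (hate - min(hate)) / (max(hate) - min(hate))
range(size)  # 1 3

plot(procrastination$week, procrastination$fucks,
     xlab = "Weeks of Term", ylab = "# of Fucks Given",
     main = "# of Fucks Decreases Over Course of Term",
     pch = 16, cex = size)
```

The 1 and the 2 in that formula are arbitrary choices; change them until the biggest dot stops eating its neighbors.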
Color is pretty easy. You can tell R the normal name of a color (like "red"), or you can use
the fancy hex codes from the internet (like "#FF0000"). It doesn't matter too much.
plot(procrastination$week, procrastination$fucks,
     xlab="Weeks of Term", ylab="# of Fucks Given",
     main="# of Fucks Decreases Over Course of Term", pch=16,
     cex=(procrastination$hate), col="red")
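To convince yourself that the name and the internet code really are the same thing, here is a quick sketch using a few helpers that ship with R. The three dots are only there to show the colors.

```r
head(colors())    # a few of the color names R understands

col2rgb("red")    # red = 255, green = 0, blue = 0

# Build the hex code for pure red from its red/green/blue amounts.
hex_red <- rgb(255, 0, 0, maxColorValue = 255)
hex_red           # "#FF0000"

# A name, the matching hex code, and a different hex code, side by side.
plot(1:3, rep(1, 3), pch = 16, cex = 3,
     col = c("red", hex_red, "#0000FF"),
     xlab = "", ylab = "")
```

The first two dots should come out identical, and the third one blue.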
But that's boring. Why not make it a little fancier?
What about if you wanted to make it a gradient of colors relating to the amount
of time you spend on Netflix? Let's make a gradient to start with, a beautiful
gradient that goes from blue to red. We'll call it color (lowercase, because R cares), because we are feeling highly descriptive and unoriginal.
color <- colorRampPalette(c("blue", "red"))
Next, we want to divvy up that gradient according to
a variable. You want to tell it how many breaks you need as well: too many, and
neighboring shades all look like the same color. In this case, we will go with 10.
Netflix <- color(10)[as.numeric(cut(procrastination$netflix, breaks = 10))]
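That line does a lot at once, so here is the same thing pulled apart into steps, as a sketch. The Netflix percentages are made up, and `shades` and `bins` are names I invented for the in-between pieces.

```r
color <- colorRampPalette(c("blue", "red"))

shades <- color(10)  # ten hex codes running from blue to red
shades[1]            # "#0000FF", pure blue
shades[10]           # "#FF0000", pure red

# Made-up Netflix percentages, for illustration only.
netflix <- c(5, 10, 10, 20, 25, 30, 40, 50, 60, 70, 80)

# cut() sorts each value into one of 10 equal-width bins;
# as.numeric() turns the bin labels into bin numbers 1 to 10.
bins <- as.numeric(cut(netflix, breaks = 10))
bins

# Use the bin numbers to pick one shade per data point.
Netflix <- shades[bins]
Netflix
```

The lowest value lands in bin 1 (blue) and the highest in bin 10 (red), with everything else spread in between.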
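Now, color using our new Netflix code! Remember to capitalize: R is case-sensitive, so Netflix (our new vector of colors) and netflix (the column in the dataset) are two different things.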
plot(procrastination$week, procrastination$fucks,
     xlab="Weeks of Term", ylab="# of Fucks Given",
     main="# of Fucks Decreases Over Course of Term", pch=16,
     cex=(procrastination$hate), col=Netflix)
Now, I was going to teach you how to put in a legend, but to be honest, I think that is a different blog post. Also, I've procrastinated way too much already. Like... way too much. Shit. I guess I'd better go edit something.
Well, join our blog next time for "Meaghan Procrastinates on Important Dissertation Progress By Half-Assing an R Tutorial!"