The EPA air quality data set is massive: working with it may require data.frames on the order of 30 GB. Since R holds a data.frame entirely in memory, that is more than many machines can handle. I have been looking into the R literature on the practical limits of standard computations in R and on how to work effectively with “big data.” This post may eventually become a proper tutorial, but for now it is a list of useful links.