Showing posts with label R project. Show all posts
Showing posts with label R project. Show all posts

Thursday, March 10, 2011

Statistical models for tropical cyclone activity


In collaboration with Gabriele Villarini and as part of the U.S. CLIVAR working group on hurricanes, I've written a short summary paper on statistical models for tropical cyclone activity. The paper is available here. The data for the R code are available here. The document was created using Sweave, LaTeX and R. The work is part of a larger project to publish a book on this topic with Oxford University Press. Comments and suggestions are certainly welcome.

Monday, November 23, 2009

Reproducible Codes for Climate Science

Foundational pillars of science include transparency and reproducibility. Unfortunately too few climatologists take advantage of the powerful R language for statistical computing. It's a shame because it makes developing, maintaining and documenting code easy, thus facilitating replication. As an example see our blog entry for July 5th 2008 on how to use quantile regression to quantify the increasing trends in hurricane intensities over the period 1981-2006. This is obviously the future for climate research and the faster we move in this direction, the better off climate science will be.

Saturday, January 10, 2009

AMS Annual Meeting, Phoenix, AZ

The 2009 addition of the American Meteorological Society's annual meeting is being held here in Phoenix, Arizona. I am one of the instructors for a short course on the topic of statistics and extreme events. The Use R for Climate Research tutorial that I use for the course is available here.

Download R here and work through the tutorial by copying and pasting the code into an R session. Feel free to email me or Thomas Jagger questions.

Saturday, July 05, 2008

Quantile Regression for Trend Analysis

Introduction
Quantile regression extends ordinary least-squares regression to quantiles of the response variable. Ordinary regression is a model for the conditional mean, where the mean is conditional on the value of the explanatory variable. Likewise, quantile regression is a model for the conditional quantiles. For trend analysis the explanatory variable is time. Quantiles are points taken at regular intervals from the cumulative distribution function of a random variable. The quantiles mark a set of ordered data into equal-sized data subsets.

The software is downloaded from the internet and installed on a computer. A data set from the internet is imported into a software session. An exploratory plot of the data is created to visualize the trends. A quantile regression model is run on the data to quantify the trends and determine their statistical significance.

Equipment


1. Computer running Macintosh, Linux/Unix, or Windows

2. Internet connection

3. R for statistical computing (http://www.r-project.org/) [1]

Time Needed
Approximately 20 minutes.

Procedure
1. Download and install R.
Tip: Only the base directory is needed.

2. Click on the icon to start R. With Linux/Unix, type the letter R from a command window.

3. Read the data into an R session by typing on the command line:
StormMax=read.csv("http://myweb.fsu.edu/jelsner/
extspace/extremedatasince1899.csv",T)

Caution: Type it all on a single line. The quotes must be bidirectional.

4. Subset the cyclones by basin and by year after 1977 (satellite era) by typing:
StormMaxBasin=subset(StormMax,Region=="Basin")
StormMaxBasin=subset(StormMaxBasin,Yr>1977)

5. Make the columns of the data set available by name by typing:
attach(StormMaxBasin)
6. Create an exploratory plot of the annual lifetime maximum wind speed (intensity) as a function of year by typing:
x=boxplot(Wmax~as.factor(Yr),plot=F)
boxplot(Wmax~as.factor(Yr),ylim=c(35,175),
xlab="Year",ylab="Intensity (kt)")
xx=1:32
abline(lm(x$stats[5,]~xx),col="red")
abline(lm(x$stats[4,]~xx),col="blue")
abline(lm(x$stats[3,]~xx),col="green")

7. Install and load the quantreg package developed by Roger Koenker [2]. Then print the reference citation.
install.packages("quantreg")
library(quantreg)
citation("quantreg")

8. Summarize the results of the quantile regression model at the upper quantiles 0.75, 0.9, and 0.95.
summary(rq(Wmax~Yr,tau=c(0.75,0.9,0.95)),se="iid")
Tip: The standard errors (se= argument) can be estimated using other methods, type ?summary.rq

9. Plot the model results.
model=rq(Wmax~Yr,tau=seq(0.2,0.8,0.1))
plot(summary(model,alpha=0.05,se="iid"),
parm=2,pch=19,cex=1.2,mar=c(5,5,4,2)+0.1,
ylab="Trend (kt/yr)",xlab="Quantile")

Troubleshooting

Make sure there is permission on the computer to install software.


Make sure the commands are typed exactly as shown with bidirectional double quotes.


If copy/paste is used, make sure to change the quotations to bidirectional.

Anticipated Results
The exploratory plot should verify the lack of trend in the median lifetime maximum intensity. It should show a tendency for the strongest cyclones (higher quantiles) to get stronger over time. The statistical significance of the trends is assessed with a quantile regression model and the results are plotted.


References


[1] R Development Core Team, R: A language and environment for statistical computing. R Foundation for Statistical Computing, Vienna, Austria. ISBN 3-900051-07-0, URL http://www.R-project.org (2006).


[2] Koenker, R. quantreg: Quantile Regression. R package version 4.10. http://www.r-project.org (2007)

Acknowledgments
Thanks go to all involved with the R project for statistical computing. Special thanks go to Thomas Jagger for his statistical help. The work is supported by the U.S. National Science Foundation, Risk Prediction Initiative of the Bermuda Institute for Ocean Studies, and the Florida Catastrophic Storm Risk Management Center of Florida State University.