Katherine Lothrop Statistical Research Update

Over the past few weeks, I have been building my knowledge of RStudio and big data analysis (with a ton of troubleshooting).

I am currently working with Tawnya Peterson to help her prepare her current research paper on M. bacteria growth and its link to chlorophyll.  Over the past several weeks, I have been learning how to pull the data off the web and analyze a year's worth of it.

Imagine this: a sensor records salinity, temperature, RFU, fluorescence, etc. every 3 seconds. How do you analyze a year's worth of data? If you were to import it into Excel, it would take an entire day, if not longer.  With R, a free programming language built for statistics and data analysis (I work in it through RStudio), it takes only 4 minutes.

This has been my project.  For the first couple of weeks, I really struggled with pulling the data and time off the web, and with converting the time from millions of seconds (ex: 11231120 seconds) to a date format (February 28, 2014).  After finally figuring out how to format this using the ggplot2 package, I moved on to struggling with plotting the data against time on a time series graph covering a full year.  Now it is week 6, and I have finally produced more than 15 acceptable plots for Tawnya to use in her research paper.  We are about to move on to regression analysis of the data and to exploring different statistical analysis techniques.
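
As a rough illustration of the seconds-to-date step, here is a minimal sketch in base R. The variable names and the reference time (`origin_time`) are made up for the example. The sensor's actual starting point is not given in this post, so it would need to be swapped in before the dates mean anything.

```r
# Hypothetical sensor timestamps: elapsed seconds since a reference time.
elapsed_seconds <- c(0, 3, 11231120)

# ASSUMPTION: this reference time is a placeholder, not the real sensor origin.
origin_time <- as.POSIXct("2013-01-01 00:00:00", tz = "UTC")

# Adding a number to a POSIXct value adds that many seconds.
timestamps <- origin_time + elapsed_seconds

# format() turns each timestamp into a readable "Month day, Year" string
# (month names follow the system locale).
readable_dates <- format(timestamps, "%B %d, %Y")
print(readable_dates)
```

Once the time column is stored as POSIXct like this, it can be placed on the x-axis of a ggplot2 time series plot, and `scale_x_datetime()` can control how the dates are labeled.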

This does not include the work I did before joining Tawnya out in Portland.  Before arriving, I worked with her to create graphs of tidal cycles using salinity, temperature, electrical conductivity, etc. for a random day of a month, a random month of the year, and a random 14-day tidal cycle.

This experience will be very useful in my future career, as companies are never at a loss for information and data to be analyzed.  I hope to use my background in statistics to one day provide useful analyses of product design and manufacturing efficiency.