Davejacobs Stats

An experiment with stats, the Ruby way

Stats

Description

This is a prototype of a statistical library for Ruby. To start, the library aims to be readable (for people studying statistics), well-tested (against R and Python statistical functions), and useful for Small Data. Big Data can come later, if I have enough fun. With stats, I aim to create an API that makes statistics intuitive and harder to mess up. For example, I'd like to take a stab at an assumption framework that tags specific functions with assumptions and throws a warning whenever one of them is not met.
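To make the assumption idea concrete, here is a minimal sketch of how a function could be tagged with checks that warn when they fail. Everything in it (the `Assumptions` module, the `assume` macro, the `Sample` class, and the check names) is hypothetical and chosen for illustration; it is not the library's actual design.

```ruby
# Hypothetical sketch: none of these names come from the stats library.
module Assumptions
  # Registry of named checks; each takes the data and returns true or false.
  CHECKS = {
    non_empty: ->(data) { !data.empty? },
    at_least_two_values: ->(data) { data.size >= 2 }
  }.freeze

  # Wrap an already-defined method so each named check runs before it.
  # A failed check only emits a warning; the original method still runs.
  def assume(method_name, *check_names)
    original = instance_method(method_name)
    define_method(method_name) do |data, *args|
      check_names.each do |name|
        unless CHECKS.fetch(name).call(data)
          warn "#{method_name}: assumption #{name} not met"
        end
      end
      original.bind(self).call(data, *args)
    end
  end
end

class Sample
  extend Assumptions

  # Sample variance with an n - 1 denominator.
  def variance(data)
    mean = data.sum.to_f / data.size
    data.sum { |x| (x - mean)**2 } / (data.size - 1)
  end
  assume :variance, :at_least_two_values
end
```

Calling `Sample.new.variance([5])` under this sketch still returns a value, but first prints a warning to standard error that `at_least_two_values` was not met.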


Try it out

Once this is stable and fully tested (it is so far for all the functions listed below), I'll consider publishing it as a gem. Until then, you can play around with master:

brew install gsl
git clone https://github.com/davejacobs/stats.git
cd stats
bundle

Running tests

I've started integrating R into my tests to make testing as easy and repeatable as possible. I'm also planning to incorporate something like Rantly to expand the values that I test.

To run tests:

brew install homebrew/science/r
rspec
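
As a rough illustration of what checking against R can look like, the sketch below shells out to `Rscript` and compares the printed value with a Ruby result. The `RReference` module and its method names are hypothetical and are not taken from this project's test suite; the sketch also assumes `Rscript` is on the PATH.

```ruby
require "open3"

# Hypothetical helper, not part of the stats test suite.
module RReference
  module_function

  # Build the Rscript invocation that prints the expression's value
  # with 15 significant digits.
  def command(expression)
    ["Rscript", "-e", "cat(format(#{expression}, digits = 15))"]
  end

  # Run R and parse the printed number; raises if Rscript exits with an error.
  def value(expression)
    output, status = Open3.capture2(*command(expression))
    raise "Rscript failed for: #{expression}" unless status.success?
    Float(output.strip)
  end

  # Compare two floats within an absolute tolerance.
  def close?(actual, expected, tolerance = 1e-9)
    (actual - expected).abs <= tolerance
  end
end
```

A spec could then assert something like `RReference.close?(ruby_mean, RReference.value("mean(c(1, 2, 3))"))`, where `ruby_mean` stands in for whatever the library computes.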

Progress

For developers

  • Get Ruby GSL bindings (gem install gsl) to work on Ruby 2.0/OS X
  • Implement gemspec so this is installable via git URL

Distribution functions

I've added a wrapper around GSL distribution functions, for more intuitive access and testing.

  • Normal distribution - PDF & CDF
  • Chi-square distribution - PDF & CDF
  • T distribution - PDF & CDF
  • F distribution - PDF & CDF
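
For readers who want to see what a PDF and CDF pair computes, here is a minimal pure-Ruby sketch for the normal distribution. It uses `Math.erf` from the standard library rather than GSL, and the `NormalSketch` module and its keyword arguments are assumptions made for illustration, not the wrapper's actual interface.

```ruby
# Hypothetical module; the library itself wraps GSL rather than using Math.erf.
module NormalSketch
  module_function

  # Probability density of a normal distribution at x.
  def pdf(x, mean: 0.0, sd: 1.0)
    z = (x - mean) / sd.to_f
    Math.exp(-0.5 * z * z) / (sd * Math.sqrt(2 * Math::PI))
  end

  # Cumulative probability P(X <= x), expressed through the error function.
  def cdf(x, mean: 0.0, sd: 1.0)
    z = (x - mean) / sd.to_f
    0.5 * (1 + Math.erf(z / Math.sqrt(2)))
  end
end
```

Values such as `NormalSketch.cdf(1.96)` (roughly 0.975) are the kind of result that can be compared against R's `pnorm` in tests.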

Basic functions

  • Mean, arithmetic
  • Mean, geometric
  • Median
  • Mode
  • Variance
  • Standard deviation
  • Standard error of the mean (for samples only)
  • Relative standard error of the mean (for samples only)
  • Coefficient of variation
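
The following sketch shows how several of these descriptive statistics relate to one another. The `BasicSketch` module and its method names are hypothetical, and the variance uses the sample (n - 1) denominator, which is an assumption about the convention rather than something this README states.

```ruby
# Hypothetical module for illustration; not the library's API.
module BasicSketch
  module_function

  def mean(xs)
    xs.sum.to_f / xs.size
  end

  def median(xs)
    sorted = xs.sort
    mid = sorted.size / 2
    # Odd length: the middle value. Even length: average of the two middle values.
    sorted.size.odd? ? sorted[mid].to_f : (sorted[mid - 1] + sorted[mid]) / 2.0
  end

  # Sample variance (n - 1 denominator).
  def variance(xs)
    m = mean(xs)
    xs.sum { |x| (x - m)**2 } / (xs.size - 1)
  end

  def standard_deviation(xs)
    Math.sqrt(variance(xs))
  end

  # Standard error of the mean: sd / sqrt(n).
  def standard_error(xs)
    standard_deviation(xs) / Math.sqrt(xs.size)
  end

  # Coefficient of variation: sd relative to the mean.
  def coefficient_of_variation(xs)
    standard_deviation(xs) / mean(xs)
  end
end
```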

Significance tests

  • Chi-square
  • T-test, single sample
  • T-test, two-sample
  • T-test, repeated measures
  • Wilcoxon rank sum test
  • Wilcoxon signed rank test
  • Median test
  • Kruskal-Wallis H test
  • Friedman test
  • ANOVA, one-way
  • Factorial ANOVA, two-way
  • Factorial ANOVA, three-way
  • ANOVA, repeated measures
  • MANOVA
  • ANCOVA
  • Welch's ANOVA
  • Fisher's least significant difference
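
As one worked example from this list, the sketch below computes the statistic for a single-sample t-test. The `TTestSketch` module and its return shape are hypothetical; it stops at the t statistic and degrees of freedom, since turning those into a p-value needs the t distribution's CDF.

```ruby
# Hypothetical module for illustration; not the library's API.
module TTestSketch
  module_function

  # Returns [t statistic, degrees of freedom] for H0: population mean == mu0.
  def one_sample(xs, mu0)
    n = xs.size
    mean = xs.sum.to_f / n
    # Sample standard deviation (n - 1 denominator).
    sd = Math.sqrt(xs.sum { |x| (x - mean)**2 } / (n - 1))
    # t = (sample mean - hypothesized mean) / standard error of the mean
    t = (mean - mu0) / (sd / Math.sqrt(n))
    [t, n - 1]
  end
end
```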

Regressions

  • Linear regression
  • Multiple linear regression
  • Pearson's correlation
  • Spearman correlation
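
A minimal sketch of Pearson's correlation and a simple least-squares line follows, to show how closely the two are related. The `RegressionSketch` module, its method names, and the hash it returns are assumptions made for illustration, not the library's interface.

```ruby
# Hypothetical module for illustration; not the library's API.
module RegressionSketch
  module_function

  def mean(xs)
    xs.sum.to_f / xs.size
  end

  # Pearson's r: covariance of x and y scaled by both spreads.
  def pearson(xs, ys)
    mx = mean(xs)
    my = mean(ys)
    sxy = xs.zip(ys).sum { |x, y| (x - mx) * (y - my) }
    sxx = xs.sum { |x| (x - mx)**2 }
    syy = ys.sum { |y| (y - my)**2 }
    sxy / Math.sqrt(sxx * syy)
  end

  # Least-squares fit of y = intercept + slope * x.
  def linear_fit(xs, ys)
    mx = mean(xs)
    my = mean(ys)
    sxy = xs.zip(ys).sum { |x, y| (x - mx) * (y - my) }
    sxx = xs.sum { |x| (x - mx)**2 }
    slope = sxy / sxx
    { slope: slope, intercept: my - slope * mx }
  end
end
```

Both functions share the same cross-product term, which is why a perfectly linear dataset gives a correlation of 1 and a fit that passes through every point.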

Support & other

  • Basic assumption framework
  • Confidence intervals (general idea)
  • Basic data structures
  • Significance methods on data structures
  • Test using R integration and something like Rantly

Open Source Agenda is not affiliated with "Davejacobs Stats" Project. README Source: davejacobs/stats