Compare commits

...

28 Commits

Author SHA1 Message Date
Arthur Poulet 924893d02d migrate 2019-02-18 12:09:58 +01:00
Arthur POULET 2c76ea54fc Update readme 2018-11-30 00:56:32 +01:00
Arthur POULET 7c8a0e01ca Merge branch 'fix-big' 2018-11-30 00:55:54 +01:00
Arthur POULET 0d9304d183 Bump v0.3.0 2018-11-30 00:55:40 +01:00
Arthur POULET 15aad6fd73 Fix some big test and functions 2018-11-30 00:55:08 +01:00
Arthur POULET 9739e3d42c Fix coding style 2018-11-29 00:17:23 +01:00
Arthur POULET 7ba3ee31ee Update README freq exemple 2018-11-29 00:08:19 +01:00
Arthur POULET 59fa29a95d Update readme development section 2018-11-29 00:05:17 +01:00
Arthur POULET 2f0a0ad399 Dump version v0.2.6 2018-11-29 00:04:03 +01:00
Arthur POULET e82eac24e1 Add all_frequencies 2018-11-29 00:03:38 +01:00
Arthur POULET 6af7fa3a3e Bump v0.2.5 2018-11-28 19:48:20 +01:00
Arthur POULET 9a3b930177 Add frequency of 2018-11-28 19:47:52 +01:00
Arthur POULET 4e9af8e8a3 Bump v0.2.4 2018-11-21 21:10:49 +01:00
Arthur POULET 5b1f28512e Merge branch 'median_quartiles' of git://github.com/pyrokar/stats 2018-11-21 21:09:48 +01:00
Gunter Solf e009c2943d Add method quartiles 2018-11-14 22:18:47 +01:00
Arthur POULET d2408b7c66 Remove changelog file 2018-11-06 00:31:36 +01:00
Arthur POULET da2d6566c1 Update README.md 2018-11-06 00:31:28 +01:00
Arthur POULET 78025d09be Dump version 0.2.3 2018-10-28 02:22:58 +01:00
Arthur POULET 93a1c0841c Merge branch 'pyrokar-median_quartiles' 2018-10-28 02:21:47 +01:00
Gunter Solf 3b032c8973 Add median / quartiles 2018-10-20 15:26:35 +02:00
Arthur POULET 7b82b989bc Dump version v0.2.2 2018-10-17 10:01:59 +02:00
Arthur POULET 072ed3e455 Update and fix the README 2018-08-13 14:41:37 +02:00
Arthur POULET 4b7435ad9a Merge branch 'kojix2-patch-1' 2018-02-03 16:32:11 +01:00
Arthur POULET f586adfe1c Update README 2018-02-03 16:31:46 +01:00
Arthur POULET 90da6764cf Update specs to 0.24.1 2018-02-03 16:31:14 +01:00
kojix2 c9f356795b fix typo in README.md 2018-02-03 22:49:50 +09:00
kojix2 b85095c51d change require big_int or big_float to big 2018-02-03 22:46:54 +09:00
Arthur POULET 8d37e7f5d6 Improve a bit the documentation, specs, minor bugfix 2017-07-08 02:57:07 +01:00
24 changed files with 396 additions and 61 deletions

@ -1,30 +0,0 @@
-# v0.1.6
-- Improve factorial (handle BigInt)
-- Improve distributions (handle BigFloat and BigInt)
-- Define more specs
-- Improve Normale Distribution parameters (min-max instead of a-b)
-# v0.1.5
-- add specs for every function
-- improve file architecture
-# v0.1.4
-- fix binomial distribution
-# v0.1.3
-- renamed crystal_proba to stats
-# v0.1.2
-- renamed CrystalProba to crystal_proba
-# v0.1.0
-## Added
-- Initialization of the project
-- The Binomial Distribution and Normale Distribution are added
-- Basic specs are provided as documentation and unitary tests
-- License set to MIT
-- Compatibility with crystal v0.18
-## Notes
-- Project inherited from RubyBinomial and RubyNormale, from myself

@ -1,7 +1,9 @@
+**Migrated to <https://git.sceptique.eu/Sceptique/stats>**
# stats
An expressive implementation of statistical distributions.
-Compatible with crystal v0.23
+Compatible with crystal v0.27.0.
## Installation
@ -9,8 +11,8 @@ Add this to your application's `shard.yml`:
```yaml
dependencies:
-Stats:
-github: Nephos/stats
+stats:
+git: https://git.sceptique.eu/Sceptique/stats
```
@ -20,12 +22,13 @@ dependencies:
```crystal
require "stats"
include Stats
```
### Normal distribution
```crystal
-NormaleDistribution::between # less_than, greater_than
+NormaleDistribution.between # less_than, greater_than
standard_deviation: 15,
esperance: 100,
min: 85,
@ -82,9 +85,53 @@ Math.factorial(4) # => 24
[1,2,3,4].correlation_coef [-14,14,101,-100] + 1 > 1.5 # => false
```
### Median
```crystal
[1, 2, 5].median # => 2.0
[42, 1337].median # => 689.5
```
### Quartiles & Boxplot
*Note: not big compatible yet*
```crystal
[1, 3, 5].first_quartile # => 2.0 (alias of lower_quartile)
[1, 3, 5].second_quartile # => 3.0 (alias of median)
[1, 3, 5].third_quartile # => 4.0 (alias of upper_quartile)
[1, 3, 5].quartiles # => [2.0, 3.0, 4.0] ([Q1, Q2, Q3])
```
```crystal
arr = [-23, -5, 2, 5, 5, 6, 7, 8, 14, 15, 42, 1337]
arr.first_quartile # => 3.5 (Q1)
arr.second_quartile # => 6.5 (Q2)
arr.third_quartile # => 14.5 (Q3)
arr.interquartile_range # => 11.0 (alias of iqr) (IQR = Q3 - Q1)
# Tukey's fences with k = 1.5 (default parameter value)
arr.lower_fence # => -13.0 (Q1 - 1.5 * IQR)
arr.upper_fence # => 31 (Q3 + 1.5 * IQR)
arr.lower_outliers # => [-23]
arr.upper_outliers # => [42, 1337]
# Tukey's fences with k = 3 for "far out" outliers
arr.upper_fence(3) # => 47.5 (Q3 + 3 * IQR)
arr.upper_outliers(3) # => [1337]
```
### Frequency
```crystal
[0, 1, 2, 3].frequency_of(0) # => 0.25 (occurrences of the value divided by the population size)
[0, 0, 1, 2, 3].all_frequencies # => { 0 => 0.4, 1 => 0.2, 2 => 0.2, 3 => 0.2}
```
## Development
- The lib is adapted to be usable with BigInt and BigFloat values
- The lib should take care of "big" numbers
## Contributing

@ -1,5 +1,5 @@
name: stats
-version: 0.1.6
+version: 0.3.0
authors:
- Arthur Poulet <arthur.poulet@mailoo.org>

@ -21,9 +21,9 @@ describe BinomialDistribution do
end
it "initialize errors" do
-expect_raises { BinomialDistribution(Int32, Float64).new(-1) }
-expect_raises { BinomialDistribution.new(0, -1) }
-expect_raises { BinomialDistribution.new(0, 1.5) }
+expect_raises(Math::DomainError) { BinomialDistribution(Int32, Float64).new(-1) }
+expect_raises(Math::DomainError) { BinomialDistribution.new(0, -1) }
+expect_raises(Math::DomainError) { BinomialDistribution.new(0, 1.5) }
end
it "distribute" do

@ -1,3 +1,5 @@
require "big"
describe Math::Correlation do
it "test basic confidence interval" do
arr1 = [1, 2, 2.5, 3, 3.5, 3.8, 4, 4.2, 4.4, 4.5]
@ -9,4 +11,11 @@ describe Math::Correlation do
arr1.covariance(arr2).round(4).should eq 4.2215
arr1.correlation_coef(arr2).round(4).should eq 0.8906
end
it "test big" do
[BigInt.new(1), 2].correlation_coef([2, 3])
[1, 2].correlation_coef([BigInt.new(2), 3])
[BigInt.new(1), 2].correlation_coef([BigInt.new(2), 3])
[1, 2].correlation_coef([BigInt.new(2), BigInt.new(3)])
end
end

@ -10,8 +10,8 @@ describe Math do
Math.factorial(6).should eq 720
end
-it "factorial bigint" do
-res = (1..20).to_a.reduce(BigInt.new 1) { |e, i| e * BigInt.new(i) }
-Math.factorial(BigInt.new 20).should eq res
+it "factorial big" do
+res = (1..20).to_a.reduce(BigInt.new(1)) { |e, i| e * BigInt.new(i) }
+Math.factorial(BigInt.new(20)).should eq res
end
end

spec/math/frequency.cr (new file, 34 lines)

@ -0,0 +1,34 @@
FREQ_LIMIT = 100
describe Math::Frequency do
it "test trivia" do
([] of Int32).frequency_of(0).should eq 0.0
[0, 1, 2, 3].frequency_of(0).should eq 0.25
[0, 0, 1, 2, 3].frequency_of(0).should eq 0.40
[0, 0, 1, 2, 3].all_frequencies.should eq({0 => 0.4, 1 => 0.2, 2 => 0.2, 3 => 0.2})
# allfreq1 = [0, 0, 1, 2, 3].all_frequencies(2)
# allfreq1[1].should eq 0.4
# allfreq1.size.should eq 2
# expect_raises(Error) { [0, 0, 1, 2, 3].all_frequencies(2, true) }
end
it "test basic" do
FREQ_LIMIT.times do |modi|
modulo = modi + 1
# we should have the same or more because we don't / modulo in the freq
arr = FREQ_LIMIT.times.to_a.map { |e| e % modulo }
(arr.frequency_of(0) >= (1.0f64 / modulo)).should be_true
next if modulo <= 20
# make an array with less or equal amount of iterations
arr_less = FREQ_LIMIT.times.to_a.map { |e| e % (modulo + 5) }
(arr.frequency_of(0) >= arr_less.frequency_of(0)).should be_true
# make an array with more or equal amount of iterations
arr_more = FREQ_LIMIT.times.to_a.map { |e| e % (modulo - 5) }
(arr.frequency_of(0) <= arr_more.frequency_of(0)).should be_true
end
[0, 0, 0, 0, 1, 1, 1, 2, 2, 3].all_frequencies.should eq({0 => 0.4, 1 => 0.3, 2 => 0.2, 3 => 0.1})
end
end

@ -1,5 +1,11 @@
require "big"
describe Math::MACD do
it "test basic macd" do
[1, 2, 3, 2, 1].macd(3).map { |e| e.round(3) }.should eq [2, 2.333, 2]
end
it "test big basic macd" do
puts [BigInt.new(1), BigInt.new(2), BigInt.new(3), BigInt.new(2), 1].macd(3).map { |e| e.round(3) }.should eq [2, 2.333, 2]
end
end

@ -1,16 +1,29 @@
require "big"
describe Math::Mean do
it "test several basic mean special case" do
arr = ([] of Int32)
arr.mean.should eq 0.0
arr.quadratic_mean.should eq 0.0
arr.harmonic_mean.should eq 0.0
arr.geometric_mean.should eq 0.0
end
it "test mean on Array(Float64)" do
([] of Float64).mean.should eq 0.0
[1.0, 2.0, 3.0].mean.should eq 2.0
[1.0, 2.0, -3.0].mean.should eq 0.0
end
it "test mean on Array(Int32)" do
([] of Int32).mean.should eq 0.0
[1, 2, 3].mean.should eq 2.0
[1, 2, -3].mean.should eq 0.0
end
it "test mean on big" do
[BigInt.new(1), 2, 3].mean.should eq 2.0
[BigInt.new(1), BigInt.new(2), BigFloat.new(-3)].mean.should eq 0.0
end
it "test quadratic mean" do
[1, 2, 3, 2].quadratic_mean.round(4).should eq(2.1213)
[1, 2, 1, 5, 10, 9, 1, -13, 2].quadratic_mean.round(4).should eq(6.549)

spec/math/median.cr (new file, 23 lines)

@ -0,0 +1,23 @@
require "big"
describe Math::Median do
it "test trivia" do
arr = ([] of Int32)
arr.median.should eq 0.0
end
it "test basic" do
[1.0, 2.0].median.should eq 1.5
[42, 1337].median.should eq 689.5
[1, 2, 5].median.should eq 2.0
[2, 5, 1].median.should eq 2.0
[4, 1, 1, 1, 2].median.should eq 1.0
end
it "test big" do
[BigInt.new(1.0), 2.0].median.should eq 1.5
[BigInt.new(1.0), BigFloat.new(2.0)].median.should eq 1.5
end
end

spec/math/quartile.cr (new file, 98 lines)

@ -0,0 +1,98 @@
require "big"
module Math::Quartile
it "trivial" do
arr = [1, 3, 5]
arr.first_quartile.should eq 2.0
arr.second_quartile.should eq 3.0
arr.third_quartile.should eq 4.0
arr.quartiles.should eq [2.0, 3.0, 4.0]
arr.iqr.should eq 2.0
end
# TODO
# it "big" do
# arr = [BigInt.new(1), BigFloat.new(3), 5]
#
# arr.first_quartile.should eq 2.0
# arr.second_quartile.should eq 3.0
# arr.third_quartile.should eq 4.0
#
# arr.quartiles.should eq [2.0, 3.0, 4.0]
#
# arr.iqr.should eq 2.0
# end
it "odd size input" do
arr = [6, 7, 15, 36, 39, 40, 41, 42, 43, 47, 49]
arr.first_quartile.should eq 25.5
arr.second_quartile.should eq 40
arr.third_quartile.should eq 42.5
arr.quartiles.should eq [25.5, 40, 42.5]
arr.iqr.should eq 17.0
end
it "even size input" do
arr = [7, 15, 36, 39, 40, 41]
arr.first_quartile.should eq 15.0
arr.second_quartile.should eq 37.5
arr.third_quartile.should eq 40.0
arr.quartiles.should eq [15, 37.5, 40]
arr.iqr.should eq 25.0
end
it "complex" do
arr = [7, 7, 31, 31, 47, 75, 87, 115, 116, 119, 119, 155, 177]
arr.first_quartile.should eq 31
arr.second_quartile.should eq 87
arr.third_quartile.should eq 119
arr.quartiles.should eq [31, 87, 119]
arr.iqr.should eq 88
end
it "complex 2" do
# https://en.wikipedia.org/wiki/Quartile#Example_1
arr = [6, 7, 15, 36, 39, 40, 41, 42, 43, 47, 49]
arr.first_quartile.should eq 25.5
arr.second_quartile.should eq 40
arr.third_quartile.should eq 42.5
arr.quartiles.should eq [25.5, 40, 42.5]
arr.iqr.should eq 17
end
it "boxplot values" do
arr = [-23, -5, 2, 5, 5, 6, 7, 8, 14, 15, 42, 1337]
arr.first_quartile.should eq 3.5
arr.second_quartile.should eq 6.5
arr.third_quartile.should eq 14.5
arr.quartiles.should eq [3.5, 6.5, 14.5]
arr.iqr.should eq 11.0
arr.lower_fence.should eq -13.0
arr.upper_fence.should eq 31
arr.lower_outliers.should eq [-23]
arr.upper_outliers.should eq [42, 1337]
arr.upper_fence(3).should eq 47.5
arr.upper_outliers(3).should eq [1337]
end
end

@ -9,6 +9,16 @@ describe Math::StandardDeviation do
arr.standard_deviation.should eq(standard_deviation)
end
it "test big" do
arr = [BigInt.new(1), BigFloat.new(2), 3, 4.0, 4]
mean = (4 + 4 + 3 + 2 + 1) / 5.0
variance = ((4 - mean)**2 + (4 - mean)**2 + (3 - mean)**2 + (2 - mean)**2 + (1 - mean)**2) / 5.0
standard_deviation = Math.sqrt(variance)
arr.mean.should eq(mean)
arr.variance.should eq(variance)
arr.standard_deviation.should eq(standard_deviation)
end
it "test standard deviation without explanations" do
arr = [1, 5, 23, 2, 0, 0, 1]
arr.mean.round(2).should eq 4.57
@ -18,4 +28,10 @@ describe Math::StandardDeviation do
[1.0, 2.0, 3.0].variance.round(4).should eq(0.6667)
[1.0, 2.0, 3.0].standard_deviation.round(4).should eq(0.8165)
end
it "test several special case" do
arr = [] of Int32
arr.variance.should eq 0.0
arr.standard_deviation.should eq 0.0
end
end

@ -41,7 +41,7 @@ describe NormaleDistribution::Persistant do
end
it "must fail" do
-expect_raises { NormaleDistribution::Persistant(Int32, Float64).new standard_deviation: -1 }
-expect_raises { NormaleDistribution::Persistant(Int32, Float64).new standard_deviation: 0 }
+expect_raises(ArgumentError) { NormaleDistribution::Persistant(Int32, Float64).new standard_deviation: -1 }
+expect_raises(ArgumentError) { NormaleDistribution::Persistant(Int32, Float64).new standard_deviation: 0 }
end
end

@ -1,9 +1,7 @@
# This file defines new operations on big numbers with native data types
# :nodoc:
-require "big_int"
-# :nodoc:
-require "big_float"
+require "big"
{% for klass in [Float32, Float64, Int8, Int16, Int32, Int64, UInt8, UInt16, UInt32, UInt64] %}
# :nodoc:

@ -1,5 +1,5 @@
# :nodoc:
-require "big_int"
+require "big"
require "./factorial"

src/lib/math/frequency.cr (new file, 18 lines)

@ -0,0 +1,18 @@
module Math::Frequency(T)
def frequency_of(value : T) : Float64
return 0.0f64 if empty?
count { |curr| curr == value }.to_f64 / size.to_f64
end
def all_frequencies : Hash(T, Float64)
values = to_set
frequencies = Hash(T, Float64).new(0.0f64, values.size)
each { |value| frequencies[value] += 1 }
frequencies.each { |k, _| frequencies[k] = frequencies[k] / size }
frequencies
end
end
module Enumerable(T)
include Math::Frequency(T)
end
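
Note: the relative-frequency computation added above can also be sketched as a standalone function that counts occurrences in one pass and then divides by the population size. This is only an illustration; `relative_frequencies` is a hypothetical name that does not exist in the shard.

```crystal
# Sketch only: `relative_frequencies` is a hypothetical helper, not part of the shard.
def relative_frequencies(values : Array(T)) : Hash(T, Float64) forall T
  # Count how many times each distinct value occurs (missing keys default to 0).
  counts = Hash(T, Int32).new(0)
  values.each { |value| counts[value] += 1 }

  # Divide each count by the population size to get a relative frequency.
  total = values.size.to_f64
  counts.transform_values { |count| count / total }
end

puts relative_frequencies([0, 0, 1, 2, 3]) # => {0 => 0.4, 1 => 0.2, 2 => 0.2, 3 => 0.2}
```

An empty input yields an empty hash, since no key is ever inserted and no division takes place.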

@ -1,8 +1,17 @@
module Math::MACD
# Computes the MACD (sliding mean) over N values.
# It returns another list of *size - n + 1* elements, because the first and
# last n/2 means cannot be computed.
# NOTE: for now it is impossible to compute a MACD with an even window (n should be odd).
#
# TODO: maybe computing a reduced MACD is a good solution against reduced returns?
def macd(n : Int)
((n.odd?) ? macd_odd(n) : macd_pair(n)).compact
end
# :nodoc:
#
# MACD computation if n is odd
private def macd_odd(n)
map_with_index do |_, i|
next if i < n / 2
@ -12,8 +21,11 @@ module Math::MACD
end
end
# :nodoc:
#
# MACD computation if n is pair
private def macd_pair(n)
-raise ArgumentError.new "The MACD should be computed for an odd window"
+raise ArgumentError.new "The MACD should be computed for an odd window (you should retry with #{n + 1})"
end
end
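
Note: the doc comment describes a sliding mean over windows of n values that returns *size - n + 1* elements. A minimal sketch of that idea, assuming `Float64` input, is shown here; `sliding_mean` is a hypothetical name and is not the shard's `macd` implementation.

```crystal
# Sketch only: `sliding_mean` is a hypothetical helper, not the shard's `macd`.
def sliding_mean(values : Array(Float64), n : Int32) : Array(Float64)
  raise ArgumentError.new("window size must be positive") if n <= 0

  means = [] of Float64
  # each_cons yields every run of n consecutive elements, i.e. size - n + 1 windows.
  values.each_cons(n) { |window| means << window.sum / n }
  means
end

means = sliding_mean([1.0, 2.0, 3.0, 2.0, 1.0], 3)
puts means.size # => 3
puts means.map(&.round(3))
```

The input is the one used in the "test basic macd" spec, so the three means are 2, 7/3 and 2.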

@ -1,22 +1,30 @@
module Math::Mean
# Standard arithmetic mean
+# TODO: Handle big Float/Int
-def mean : Float64
+def mean
return 0.0_f64 if empty?
sum.to_f64 / size.to_f64
end
-# The square root of mean square
-def quadratic_mean : Float64
+# The root square mean of the list.
+# TODO: Handle big Float/Int
+def quadratic_mean
return 0.0_f64 if empty?
Math.sqrt map { |e| e ** 2 }.mean
end
# The geometric mean of the list.
# For [a, b], a/c = c/b; c**2 = a*b
-def geometric_mean : Float64
+# TODO: Handle big Float/Int
+def geometric_mean
return 0.0_f64 if empty?
reduce { |l, r| l * r } ** (1.0 / size.to_f64)
end
-def harmonic_mean : Float64
+# The harmonic mean of the list.
+# TODO: Handle big Float/Int
+def harmonic_mean
return 0.0_f64 if empty?
size.to_f64 / map { |e| 1.0 / e }.sum
end
end
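
Note: the four means touched by this hunk can be written out directly on a small list to see how they relate. The values below are made up for illustration, and the snippet assumes a Crystal version in which `/` between an integer and a float is floating-point division.

```crystal
# Illustrative values only; none of this is taken from the shard's specs.
values = [1.0, 2.0, 4.0]
n = values.size

arithmetic = values.sum / n                           # sum divided by count
quadratic  = Math.sqrt(values.sum { |v| v ** 2 } / n) # square root of the mean of squares
geometric  = values.product ** (1.0 / n)              # n-th root of the product
harmonic   = n / values.sum { |v| 1.0 / v }           # count divided by the sum of reciprocals

puts({harmonic, geometric, arithmetic, quadratic})
```

For positive values the four results are ordered harmonic ≤ geometric ≤ arithmetic ≤ quadratic, which makes a convenient sanity check when changing these methods.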

src/lib/math/median.cr (new file, 13 lines)

@ -0,0 +1,13 @@
module Math::Median
def median
return 0.0_f64 if empty?
sorted = sort
size = size()
return sorted[(size - 1) / 2] / 1.0 if size.odd?
(sorted[(size / 2) - 1] + sorted[size / 2]) / 2.0
end
end
module Enumerable(T)
include Math::Median
end
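
Note: the median logic above indexes the sorted list with the result of `/`, which relies on integer division. A standalone sketch of the same idea, assuming a Crystal version that provides the `//` floor-division operator, could look like this; `median_of` is a hypothetical name, not part of the shard.

```crystal
# Sketch only: `median_of` is a hypothetical helper mirroring the logic in the diff.
def median_of(values : Array(T)) : Float64 forall T
  return 0.0 if values.empty?

  sorted = values.sort
  mid = sorted.size // 2 # floor division keeps the index an integer

  if sorted.size.odd?
    # Odd size: the middle element is the median.
    sorted[mid].to_f64
  else
    # Even size: average the two middle elements.
    (sorted[mid - 1] + sorted[mid]) / 2.0
  end
end

puts median_of([1, 2, 5])        # => 2.0
puts median_of([42, 1337])       # => 689.5
puts median_of([4, 1, 1, 1, 2])  # => 1.0
```

The inputs are taken from the "test basic" spec, so the expected values match the assertions in spec/math/median.cr.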

src/lib/math/quartile.cr (new file, 70 lines)

@ -0,0 +1,70 @@
# There are several methods for computing the quartiles of an array.
#
# This library utilizes the method proposed by John Tukey
# https://en.wikipedia.org/wiki/Quartile#Method_2
#
module Math::Quartile
def lower_quartile : Float64
return 0.0_f64 if empty?
m = self.median
lower_half = self.select { |i| i <= m }
lower_half.median
end
# alias
def first_quartile : Float64
lower_quartile
end
# alias
def second_quartile : Float64
median
end
def upper_quartile : Float64
return 0.0_f64 if empty?
m = self.median
upper_half = self.select { |i| i >= m }
upper_half.median
end
# alias
def third_quartile : Float64
upper_quartile
end
def quartiles : Array(Float64)
[first_quartile, second_quartile, third_quartile]
end
def iqr : Float64
third_quartile - first_quartile
end
# alias
def interquartile_range : Float64
iqr
end
def lower_fence(k : Number = 1.5) : Float64
lower_quartile - k * iqr
end
def upper_fence(k : Number = 1.5) : Float64
upper_quartile + k * iqr
end
def lower_outliers(k : Number = 1.5) : Array
lf = lower_fence k
self.select { |i| i < lf }
end
def upper_outliers(k : Number = 1.5) : Array
uf = upper_fence k
self.select { |i| i > uf }
end
end
module Enumerable(T)
include Math::Quartile
end
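
Note: the quartile and fence computations above can be followed step by step on the boxplot data from the spec. The sketch below assumes `Float64` input and a Crystal version with the `//` operator; `median_of` and `quartiles_of` are hypothetical names used only for this illustration.

```crystal
# Sketch only: hypothetical helpers that mirror the approach taken in the diff.
def median_of(values : Array(Float64)) : Float64
  return 0.0 if values.empty?
  sorted = values.sort
  mid = sorted.size // 2
  if sorted.size.odd?
    sorted[mid]
  else
    (sorted[mid - 1] + sorted[mid]) / 2.0
  end
end

# Q1 and Q3 are the medians of the values at or below / at or above the median.
def quartiles_of(values : Array(Float64)) : Tuple(Float64, Float64, Float64)
  m = median_of(values)
  q1 = median_of(values.select { |v| v <= m })
  q3 = median_of(values.select { |v| v >= m })
  {q1, m, q3}
end

data = [-23, -5, 2, 5, 5, 6, 7, 8, 14, 15, 42, 1337].map(&.to_f64)
q1, q2, q3 = quartiles_of(data)
iqr = q3 - q1

# Tukey's fences with k = 1.5
k = 1.5
lower_fence = q1 - k * iqr
upper_fence = q3 + k * iqr
outliers = data.select { |v| v < lower_fence || v > upper_fence }

puts({q1, q2, q3})               # => {3.5, 6.5, 14.5}
puts({lower_fence, upper_fence}) # => {-13.0, 31.0}
puts outliers                    # => [-23.0, 42.0, 1337.0]
```

Because the halves are selected by value rather than by position, every duplicate of the median lands in both halves, so results can differ from a position-based split on data with repeated median values.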

@ -1,13 +1,14 @@
require "./mean"
module Math::StandardDeviation
-# Squared deviation from the mean
+# Squared deviation from the mean.
def variance
return 0.0_f64 if empty?
mean = mean()
self.map { |e| (e - mean)**2 }.mean
end
-# Population standard deviation
+# Population standard deviation.
def standard_deviation
Math.sqrt variance
end

@ -1,5 +1,4 @@
-# require "big_int"
-# require "big_float"
+# require "big"
require "./stats/*"
require "./lib/*"

@ -1,3 +1,3 @@
module Stats
-VERSION = "0.2"
+VERSION = "0.3"
end