This function takes quality.scores, trims it, and fits the given distribution to the trimmed data. It then iteratively tests the largest data point against a null distribution built from no.simulations simulated datasets. If the largest data point has a significant p-value, it tests the second largest, and so on. The function supports the following distributions:

  • 'weibull'

  • 'norm'

  • 'gamma'

  • 'exp'

  • 'lnorm'

  • 'cauchy'

  • 'logis'
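The trim, fit, simulate, and test-from-the-top loop described above can be sketched in base R. This is an illustration under stated assumptions, not the package's implementation: the helpers trim.values and flag.top.outliers are hypothetical, the input is a plain positive numeric vector rather than a data frame, only the log-normal case is shown, and the test statistic (comparing the i-th largest observed value with the i-th largest value in each simulated dataset) is a stand-in chosen for simplicity.

```r
# Illustrative sketch only; NOT the package's implementation.

# Hypothetical helper: drop the lowest and highest `trim.factor` fraction of values.
trim.values <- function(x, trim.factor = 0.05) {
  x <- sort(x)
  k <- floor(length(x) * trim.factor)
  if (k == 0) return(x)
  x[(k + 1):(length(x) - k)]
}

# Hypothetical helper: count outliers from the largest value downwards,
# stopping at the first value that is not significant.
flag.top.outliers <- function(scores, no.simulations = 1000,
                              trim.factor = 0.05, alpha.significant = 0.05) {
  trimmed <- trim.values(scores, trim.factor)

  # Log-normal parameters estimated from the trimmed data (assumes scores > 0).
  meanlog <- mean(log(trimmed))
  sdlog <- sd(log(trimmed))

  n <- length(scores)
  # One column per simulated dataset, each sorted from largest to smallest.
  simulated <- replicate(
    no.simulations,
    sort(rlnorm(n, meanlog, sdlog), decreasing = TRUE)
  )
  observed <- sort(scores, decreasing = TRUE)

  no.outliers <- 0
  for (i in seq_len(n)) {
    # Empirical p-value: share of simulations whose i-th largest value
    # is at least as large as the i-th largest observed value.
    p.value <- (sum(simulated[i, ] >= observed[i]) + 1) / (no.simulations + 1)
    if (p.value >= alpha.significant) break
    no.outliers <- no.outliers + 1
  }
  no.outliers
}

set.seed(1)
scores <- c(rlnorm(50, meanlog = 0, sdlog = 0.3), 100)  # one planted extreme value
flag.top.outliers(scores)
```

Stopping at the first non-significant value mirrors the "if significant, test the next largest" behaviour described above.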

Usage

cosine.similarity.iterative(
  quality.scores,
  no.simulations,
  distribution = c("lnorm", "weibull", "norm", "gamma", "exp", "cauchy", "logis"),
  trim.factor = 0.05,
  alpha.significant = 0.05
)
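A call might look like the following sketch. The input data frame is made up: it only mimics the 'Sum' and 'Sample' columns described under Arguments, and the values and sample names are hypothetical. The call is guarded so that the snippet runs even when the package providing cosine.similarity.iterative is not attached; the shape of the returned object is not documented here, so it is not inspected.

```r
set.seed(42)

# Hypothetical input: 'Sum' and 'Sample' mimic the columns described for
# the output of accumulate.zscores; the values themselves are made up.
quality.scores <- data.frame(
  Sum = c(rlnorm(30, meanlog = 1, sdlog = 0.4), 60),
  Sample = paste0("S", 1:31),
  stringsAsFactors = FALSE
)

# Only call the function if the package that provides it is attached.
if (exists("cosine.similarity.iterative")) {
  result <- cosine.similarity.iterative(
    quality.scores,
    no.simulations = 1000,
    distribution = "lnorm",
    trim.factor = 0.05,
    alpha.significant = 0.05
  )
}
```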

Arguments

quality.scores

A data frame with columns 'Sum' (the sum of scores) and 'Sample', i.e. the output of accumulate.zscores

no.simulations

The number of datasets to simulate

distribution

The distribution to test; defaults to 'lnorm'
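A vector of choices as the default value, as in the Usage signature, is the usual R idiom for arguments resolved with match.arg, which picks the first element when the argument is omitted. Whether this function uses match.arg internally is an assumption; the sketch below uses a hypothetical function, pick.distribution, to show how the idiom yields 'lnorm' by default.

```r
# Hypothetical function illustrating the match.arg idiom; not part of the package.
pick.distribution <- function(
  distribution = c("lnorm", "weibull", "norm", "gamma", "exp", "cauchy", "logis")
) {
  # With no argument supplied, match.arg returns the first choice.
  # An unsupported name raises an error.
  match.arg(distribution)
}

pick.distribution()           # "lnorm"
pick.distribution("weibull")  # "weibull"
```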

trim.factor

The fraction of values to trim from each end before estimating the distribution parameters, so that extreme values do not influence the fit

alpha.significant

The significance level (alpha) against which each p-value is compared