Minimizing the difference within groups, which Wang & Song refer to as
withinss (within sum-of-squares), makes each group as homogeneous as
possible, so the data is split into representative groups.
This is useful for visualization, where you may want to represent
a continuous variable in discrete color or style groups. This function
provides groups that emphasize the differences in the data.
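To make the withinss quantity concrete, here is a minimal sketch of how it could be computed for a candidate grouping. The helper names `withinss` and `totalWithinss` are illustrative only and are not part of the documented API.

```typescript
// Hypothetical helper: sum of squared deviations of one group from its own mean.
function withinss(group: number[]): number {
  if (group.length === 0) return 0;
  let sum = 0;
  for (const x of group) sum += x;
  const mean = sum / group.length;
  let ss = 0;
  for (const x of group) ss += (x - mean) * (x - mean);
  return ss;
}

// Hypothetical helper: total withinss of a grouping, the quantity being minimized.
function totalWithinss(groups: number[][]): number {
  let total = 0;
  for (const g of groups) total += withinss(g);
  return total;
}

// A grouping that follows the natural gap in the data scores far lower
// than one that pulls 10 into the first group.
const tight = totalWithinss([[1, 2, 3], [10, 11, 12]]); // 4
const loose = totalWithinss([[1, 2, 3, 10], [11, 12]]); // 50.5
```

The lower total for the first grouping is what "optimally homogeneous" means in practice: each group's values sit close to that group's mean.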
Attempts to find a reasonable number of significant digits for the created breaks.
It is probably not possible to be 100% correct for all possible data sets,
but the result should err on the side of too much significance rather than too little.
Parameters
numbers: number[]
Returns number
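The documentation does not say how the digit count is chosen. As one possible heuristic (an assumption, not the library's actual method), the sketch below returns the smallest number of decimal places at which the breaks still format to distinct labels. The name `decimalsToKeepDistinct` and the `maxDecimals` parameter are hypothetical.

```typescript
// Hypothetical heuristic, not the documented implementation: find the fewest
// decimal places that keep every distinct break distinct after formatting.
function decimalsToKeepDistinct(breaks: number[], maxDecimals: number = 10): number {
  // Number of distinct values before any rounding.
  const distinctCount = new Set(breaks).size;
  for (let d = 0; d <= maxDecimals; d++) {
    const labels = new Set(breaks.map((b) => b.toFixed(d)));
    // Stop as soon as rounding no longer merges two different breaks.
    if (labels.size === distinctCount) return d;
  }
  // Fall back to the cap if the breaks differ only beyond maxDecimals.
  return maxDecimals;
}
```

For example, `decimalsToKeepDistinct([0.12, 0.15, 0.4])` needs two decimals, because at one decimal both 0.12 and 0.15 format as "0.1".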
getDecimalSeparator
getDecimalSeparator(): string
Gets the internationalized decimal separator
Returns string
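One common way to obtain a locale's decimal separator in JavaScript runtimes is through the standard `Intl.NumberFormat` API. The sketch below is an assumption about how such a lookup could work, not the documented implementation; the function name `decimalSeparatorFor` and its optional `locale` parameter are hypothetical.

```typescript
// Hypothetical sketch: read the decimal separator from the runtime's Intl data.
// When locale is omitted, the runtime's default locale is used.
function decimalSeparatorFor(locale?: string): string {
  // Format a number with a fractional part and pick out the "decimal" part.
  const parts = new Intl.NumberFormat(locale).formatToParts(1.5);
  const decimal = parts.find((part) => part.type === "decimal");
  // Fall back to "." if the runtime reports no decimal part.
  return decimal ? decimal.value : ".";
}
```

The result depends on the locale data available in the runtime, so a caller should not assume any particular separator for the default locale.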
uniqueCountSorted
uniqueCountSorted(input: any[]): number
For a sorted input, counting the number of unique values
is possible in linear time and constant memory. This is
a simple implementation of the algorithm.
Values are compared with ===, so objects and other non-primitive values
are compared by reference and are not handled in any special way.
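The description above can be sketched as a single pass that compares each element with the previous one. This is a minimal illustration written for this page under the name `countUniqueSorted`, not the library's source.

```typescript
// Illustrative single-pass count of distinct values in an already sorted array.
// Uses strict comparison, so two equal-looking objects count as different values.
function countUniqueSorted<T>(input: T[]): number {
  let uniqueValueCount = 0;
  let lastSeenValue: T | undefined;
  for (let i = 0; i < input.length; i++) {
    // The first element always counts; later ones count only when they change.
    if (i === 0 || input[i] !== lastSeenValue) {
      lastSeenValue = input[i];
      uniqueValueCount++;
    }
  }
  return uniqueValueCount;
}
```

Because only the previous value is remembered, memory use stays constant regardless of input length, while the loop visits every element once.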
Ckmeans clustering is an improvement on heuristic-based clustering approaches like Jenks. The algorithm was developed by Haizhou Wang and Mingzhou Song (http://journal.r-project.org/archive/2011-2/RJournal_2011-2_Wang+Song.pdf) as a dynamic programming (https://en.wikipedia.org/wiki/Dynamic_programming) approach to the problem of clustering numeric data into groups with the least within-group sum-of-squared-deviations.
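To illustrate the dynamic programming idea, here is a deliberately simple O(k·n²) sketch. It is not the optimized algorithm from the paper or from this library: the name `ckmeansSketch` and its structure are assumptions made for the example.

```typescript
// Simplified sketch of optimal 1-D clustering by dynamic programming.
// Returns k groups of sorted values with the least total within-group sum of squares.
function ckmeansSketch(data: number[], k: number): number[][] {
  if (k < 1 || k > data.length) {
    throw new Error("k must be between 1 and data.length");
  }
  const x = data.slice().sort((a, b) => a - b);
  const n = x.length;

  // Prefix sums of values and squared values give the cost of any run in O(1).
  const sum: number[] = [0];
  const sumSq: number[] = [0];
  for (const v of x) {
    sum.push(sum[sum.length - 1] + v);
    sumSq.push(sumSq[sumSq.length - 1] + v * v);
  }
  // Sum of squared deviations of x[i..j] (inclusive) from its own mean.
  const cost = (i: number, j: number): number => {
    const s = sum[j + 1] - sum[i];
    return Math.max(0, sumSq[j + 1] - sumSq[i] - (s * s) / (j - i + 1));
  };

  // best[q][j]: least total cost of splitting x[0..j] into q + 1 groups.
  // start[q][j]: index where the last of those groups begins.
  const best: number[][] = [];
  const start: number[][] = [];
  for (let q = 0; q < k; q++) {
    best.push(new Array<number>(n).fill(Infinity));
    start.push(new Array<number>(n).fill(0));
  }
  for (let j = 0; j < n; j++) best[0][j] = cost(0, j);
  for (let q = 1; q < k; q++) {
    for (let j = q; j < n; j++) {
      for (let i = q; i <= j; i++) {
        const candidate = best[q - 1][i - 1] + cost(i, j);
        if (candidate < best[q][j]) {
          best[q][j] = candidate;
          start[q][j] = i;
        }
      }
    }
  }

  // Walk the start indices backwards to recover the groups.
  const groups: number[][] = [];
  let end = n - 1;
  for (let q = k - 1; q >= 0; q--) {
    const begin = start[q][end];
    groups.unshift(x.slice(begin, end + 1));
    end = begin - 1;
  }
  return groups;
}
```

For instance, `ckmeansSketch([12, 1, 10, 3, 2, 11], 2)` sorts the input and splits it at the large gap, giving `[[1, 2, 3], [10, 11, 12]]`. Unlike a heuristic, the table considers every possible split point, so the result is the optimal grouping for the given k.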
Minimizing the difference within groups, which Wang & Song refer to as withinss (within sum-of-squares), makes each group as homogeneous as possible, so the data is split into representative groups. This is useful for visualization, where you may want to represent a continuous variable in discrete color or style groups. This function provides groups that emphasize the differences in the data.
From the JavaScript implementation by Tom MacWright https://github.com/simple-statistics Original copyright notice follows:
Copyright (c) 2014, Tom MacWright
Permission to use, copy, modify, and/or distribute this software for any purpose with or without fee is hereby granted, provided that the above copyright notice and this permission notice appear in all copies.
THE SOFTWARE IS PROVIDED "AS IS" AND THE AUTHOR DISCLAIMS ALL WARRANTIES WITH REGARD TO THIS SOFTWARE INCLUDING ALL IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS. IN NO EVENT SHALL THE AUTHOR BE LIABLE FOR ANY SPECIAL, DIRECT, INDIRECT, OR CONSEQUENTIAL DAMAGES OR ANY DAMAGES WHATSOEVER RESULTING FROM LOSS OF USE, DATA OR PROFITS, WHETHER IN AN ACTION OF CONTRACT, NEGLIGENCE OR OTHER TORTIOUS ACTION, ARISING OUT OF OR IN CONNECTION WITH THE USE OR PERFORMANCE OF THIS SOFTWARE.