C++ API Reference for Intel® Data Analytics Acceleration Library 2019 Update 5


Contains classes for computing initial centroids for the K-Means algorithm.

References

 Batch
 
 Distributed
 

Namespaces

 daal::algorithms::kmeans::init
 Contains classes for computing initial centroids for the K-Means algorithm.
 
 daal::algorithms::kmeans::init::interface1
 Contains version 1.0 of the Intel(R) Data Analytics Acceleration Library (Intel(R) DAAL) interface.
 

Classes

struct  Parameter
 Base parameters for computing initial centroids for the K-Means algorithm.
 
class  InputIface
 Interface for the K-Means initialization batch and distributed Input classes.
 
class  Input
 Input objects for computing initial centroids for the K-Means algorithm.
 
class  PartialResult
 Partial results obtained with the compute() method of the K-Means algorithm in the batch processing mode.
 
class  Result
 Results obtained with the compute() method that computes initial centroids for the K-Means algorithm in the batch processing mode.
 
class  DistributedStep2MasterInput
 Input objects for computing initial clusters for the K-Means algorithm in the second step of the distributed processing mode.
 
struct  DistributedStep2LocalPlusPlusParameter
 Parameters for computing initial centroids for the K-Means algorithm.
 
class  DistributedStep2LocalPlusPlusInput
 Interface for the K-Means initialization distributed Input classes used with plusPlus and parallelPlus methods only on the 2nd step on a local node.
 
class  DistributedStep3MasterPlusPlusInput
 Interface for the K-Means distributed Input classes used with plusPlus and parallelPlus methods only on the 3rd step on a master node.
 
class  DistributedStep4LocalPlusPlusInput
 Interface for the K-Means distributed Input classes used with plusPlus and parallelPlus methods only on the 4th step on a local node.
 
class  DistributedStep5MasterPlusPlusInput
 Interface for the K-Means distributed Input classes.
 
class  DistributedStep2LocalPlusPlusPartialResult
 Partial results obtained with the compute() method of the K-Means algorithm in the distributed processing mode.
 
class  DistributedStep3MasterPlusPlusPartialResult
 Partial results obtained with the compute() method of the K-Means algorithm in the distributed processing mode.
 
class  DistributedStep4LocalPlusPlusPartialResult
 Partial results obtained with the compute() method of the K-Means algorithm in the distributed processing mode.
 
class  DistributedStep5MasterPlusPlusPartialResult
 Partial results obtained with the compute() method of the K-Means algorithm in the distributed processing mode.
 

Enumerations

enum  Method {
  deterministicDense = 0, defaultDense = 0, randomDense = 1, plusPlusDense = 2,
  parallelPlusDense = 3, deterministicCSR = 4, randomCSR = 5, plusPlusCSR = 6,
  parallelPlusCSR = 7
}
 
enum  InputId { data }
 Available identifiers of input objects for computing initial centroids for the K-Means algorithm.
 
enum  DistributedStep2MasterInputId { partialResults }
 Available identifiers of input objects for computing initial centroids for the K-Means algorithm in the distributed processing mode.
 
enum  DistributedLocalPlusPlusInputDataId { internalInput = lastDistributedStep2MasterInputId + 1 }
 Available identifiers of input objects for computing initial centroids for the K-Means algorithm used with plusPlus and parallelPlus methods only on a local node.
 
enum  DistributedStep2LocalPlusPlusInputId { inputOfStep2 = lastDistributedLocalPlusPlusInputDataId + 1 }
 Available identifiers of input objects for computing initial centroids for the K-Means algorithm used with plusPlus and parallelPlus methods only on the 2nd step on a local node.
 
enum  DistributedStep3MasterPlusPlusInputId { inputOfStep3FromStep2 }
 Available identifiers of input objects for computing initial centroids for the K-Means algorithm used with plusPlus and parallelPlus methods only on the 3rd step on a master node.
 
enum  DistributedStep4LocalPlusPlusInputId { inputOfStep4FromStep3 = lastDistributedLocalPlusPlusInputDataId + 1 }
 Available identifiers of input objects for computing initial centroids for the K-Means algorithm used with plusPlus and parallelPlus methods only on a local node.
 
enum  DistributedStep5MasterPlusPlusInputId { inputCentroids, inputOfStep5FromStep2 }
 Available identifiers of input objects for computing initial centroids for the K-Means algorithm used with parallelPlus method only on a master node.
 
enum  DistributedStep5MasterPlusPlusInputDataId { inputOfStep5FromStep3 = lastDistributedStep5MasterPlusPlusInputId + 1 }
 Available identifiers of input objects for computing initial centroids for the K-Means algorithm used with parallelPlus method only on the 5th step on a master node.
 
enum  PartialResultId { partialCentroids, partialClusters = partialCentroids, partialClustersNumber }
 Available identifiers of partial results of computing initial centroids for the K-Means algorithm in the distributed processing mode.
 
enum  DistributedStep2LocalPlusPlusPartialResultId { outputOfStep2ForStep3, outputOfStep2ForStep5 }
 Available identifiers of partial results of computing initial centroids for the K-Means algorithm in the distributed processing mode used with plusPlus and parallelPlus methods only on the 2nd step on a local node.
 
enum  DistributedStep2LocalPlusPlusPartialResultDataId { internalResult = lastDistributedStep2LocalPlusPlusPartialResultId + 1 }
 Available identifiers of partial results of computing initial centroids for the K-Means algorithm in the distributed processing mode used with plusPlus and parallelPlus methods only on the 2nd step on a local node.
 
enum  DistributedStep3MasterPlusPlusPartialResultId { outputOfStep3ForStep4 }
 Available identifiers of partial results of computing initial centroids for the K-Means algorithm in the distributed processing mode used with plusPlus and parallelPlus methods only on the 3rd step on a master node.
 
enum  DistributedStep3MasterPlusPlusPartialResultDataId { rngState = lastDistributedStep3MasterPlusPlusPartialResultId + 1, outputOfStep3ForStep5 = rngState }
 Available identifiers of partial results of computing initial centroids for the K-Means algorithm in the distributed processing mode used with parallelPlus method only on the 3rd step on a master node.
 
enum  DistributedStep4LocalPlusPlusPartialResultId { outputOfStep4 }
 Available identifiers of partial results of computing initial centroids for the K-Means algorithm in the distributed processing mode used with plusPlus and parallelPlus methods only on the 4th step on a local node.
 
enum  DistributedStep5MasterPlusPlusPartialResultId { candidates, weights }
 Available identifiers of partial results of computing initial centroids for the K-Means algorithm in the distributed processing mode used with parallelPlus method only on the 5th step on a master node.
 
enum  ResultId { centroids }
 Available identifiers of the results of computing initial centroids for the K-Means algorithm.
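
Several identifiers above are defined relative to the last identifier of another enumeration (for example, internalInput = lastDistributedStep2MasterInputId + 1). This appears intended to let identifiers from different enumerations address separate slots of one input object. The following is a minimal sketch of that numbering, not the library's code. It assumes that each last... sentinel equals the final enumerator of its enumeration (this page does not list the sentinels). The sketch namespace and the SlotStore class are hypothetical names used only for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

namespace sketch {

// Enumerator expressions copied from the listing above.
// ASSUMPTION: every `last...` sentinel equals the final enumerator of its enum.
enum InputId { data };

enum DistributedStep2MasterInputId {
    partialResults,
    lastDistributedStep2MasterInputId = partialResults
};

enum DistributedLocalPlusPlusInputDataId {
    internalInput = lastDistributedStep2MasterInputId + 1,
    lastDistributedLocalPlusPlusInputDataId = internalInput
};

enum DistributedStep2LocalPlusPlusInputId {
    inputOfStep2 = lastDistributedLocalPlusPlusInputDataId + 1
};

enum DistributedStep4LocalPlusPlusInputId {
    inputOfStep4FromStep3 = lastDistributedLocalPlusPlusInputDataId + 1
};

// Hypothetical stand-in for an input object: one slot per identifier value.
class SlotStore {
public:
    explicit SlotStore(std::size_t nSlots) : slots_(nSlots) {}

    // Unscoped enumerators convert implicitly to int, so any id type fits.
    void set(int id, const std::string& value) {
        slots_.at(static_cast<std::size_t>(id)) = value;
    }
    const std::string& get(int id) const {
        return slots_.at(static_cast<std::size_t>(id));
    }

private:
    std::vector<std::string> slots_;
};

}  // namespace sketch
```

Under these assumptions, data, internalInput and inputOfStep2 evaluate to 0, 1 and 2, so the three identifiers address three different slots. inputOfStep4FromStep3 also evaluates to 2, which is consistent with steps 2 and 4 using separate input classes.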
 

Enumeration Type Documentation

enum DistributedLocalPlusPlusInputDataId

Enumerator
internalInput 

DataCollection with internal algorithm data calculated by previous steps on this node

enum DistributedStep2LocalPlusPlusInputId

Enumerator
inputOfStep2 

Numeric table with the new centroids calculated by previous steps of the initialization algorithm

enum DistributedStep2LocalPlusPlusPartialResultDataId

Enumerator
internalResult 

DataCollection with internal algorithm data required as an input for the future steps on the node

enum DistributedStep2LocalPlusPlusPartialResultId

Enumerator
outputOfStep2ForStep3 

Numeric table containing output from step 2 on the local node used by step 3 on a master node

outputOfStep2ForStep5 

Numeric table containing output from step 2 on the local node used by step 5 on a master node

enum DistributedStep2MasterInputId

Enumerator
partialResults 

Collection of partial results computed on local nodes

enum DistributedStep3MasterPlusPlusInputId

Enumerator
inputOfStep3FromStep2 

Numeric table with the data calculated in step 2 on local nodes

enum DistributedStep3MasterPlusPlusPartialResultDataId

Enumerator
rngState 

Service data generated as the output of step3Master to be used in step5Master

outputOfStep3ForStep5 

Service data generated as the output of step3Master to be used in step5Master

enum DistributedStep3MasterPlusPlusPartialResultId

Enumerator
outputOfStep3ForStep4 

KeyValueDataCollection with the input for local nodes on step 4

enum DistributedStep4LocalPlusPlusInputId

Enumerator
inputOfStep4FromStep3 

Numeric table with the data calculated in step 3 on the master node

enum DistributedStep4LocalPlusPlusPartialResultId

Enumerator
outputOfStep4 

NumericTable with the new centroids calculated on step 4 on the local node

enum DistributedStep5MasterPlusPlusInputDataId

Enumerator
inputOfStep5FromStep3 

Service data generated as the output of step3Master

enum DistributedStep5MasterPlusPlusInputId

Enumerator
inputCentroids 

DataCollection of NumericTables with the new centroids

inputOfStep5FromStep2 

DataCollection of NumericTables with the rating of the new centroids

enum DistributedStep5MasterPlusPlusPartialResultId

Enumerator
candidates 

NumericTable with the new centroids calculated on the previous steps

weights 

NumericTable with the weights of the new centroids calculated on the previous steps

enum InputId

Enumerator
data 

Input data table

enum Method

Available methods for computing initial centroids for the K-Means algorithm

Enumerator
deterministicDense 

Default: uses the first nClusters points as initial centroids

defaultDense 

Synonym of deterministicDense

randomDense 

Uses nClusters randomly selected points as initial centroids

plusPlusDense 

Kmeans++ algorithm by Arthur and Vassilvitskii (2007): http://ilpubs.stanford.edu:8090/778/1/2006-13.pdf [1]. The first center is selected at random; each subsequent center is selected with a probability proportional to its contribution to the overall error

parallelPlusDense 

Kmeans|| algorithm: scalable Kmeans++ by Bahmani et al. (2012) http://vldb.org/pvldb/vol5/p622_bahmanbahmani_vldb2012.pdf [2]

deterministicCSR 

Uses the first nClusters points as initial centroids for data in a CSR numeric table

randomCSR 

Uses nClusters randomly selected points as initial centroids for data in a CSR numeric table

plusPlusCSR 

Kmeans++ algorithm by Arthur and Vassilvitskii (2007): http://ilpubs.stanford.edu:8090/778/1/2006-13.pdf [1] for the data in a CSR numeric table. The first center is selected at random; each subsequent center is selected with a probability proportional to its contribution to the overall error

parallelPlusCSR 

Kmeans|| algorithm: scalable Kmeans++ by Bahmani et al. (2012) http://vldb.org/pvldb/vol5/p622_bahmanbahmani_vldb2012.pdf [2] for the data in a CSR numeric table
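
The selection rules described for the deterministic and plusPlus methods can be illustrated with a small standalone sketch. This is not the library's implementation and does not use the library's classes. It assumes dense data held as rows of doubles, and it reads "contribution to the overall error" as the squared Euclidean distance from a point to its nearest already chosen center, which is the usual Kmeans++ weighting. The names sketch, Point, squaredDistance, deterministicInit and plusPlusInit are hypothetical.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <limits>
#include <random>
#include <vector>

namespace sketch {

using Point = std::vector<double>;  // one observation (one row of the data)

// Squared Euclidean distance between two points of equal dimension.
inline double squaredDistance(const Point& a, const Point& b) {
    assert(a.size() == b.size());
    double sum = 0.0;
    for (std::size_t j = 0; j < a.size(); ++j) {
        const double d = a[j] - b[j];
        sum += d * d;
    }
    return sum;
}

// Deterministic rule: the first nClusters points become the centroids.
inline std::vector<Point> deterministicInit(const std::vector<Point>& data,
                                            std::size_t nClusters) {
    assert(nClusters <= data.size());
    return std::vector<Point>(
        data.begin(), data.begin() + static_cast<std::ptrdiff_t>(nClusters));
}

// Kmeans++ rule: the first center is drawn uniformly at random; each later
// center is drawn with probability proportional to its squared distance to
// the nearest center chosen so far (ASSUMED meaning of "contribution to the
// overall error").
inline std::vector<Point> plusPlusInit(const std::vector<Point>& data,
                                       std::size_t nClusters,
                                       std::mt19937& rng) {
    assert(nClusters >= 1 && nClusters <= data.size());
    std::uniform_int_distribution<std::size_t> uniform(0, data.size() - 1);

    std::vector<Point> centroids;
    centroids.push_back(data[uniform(rng)]);

    // minDist[i]: squared distance from point i to its nearest chosen center.
    std::vector<double> minDist(data.size(),
                                std::numeric_limits<double>::max());
    while (centroids.size() < nClusters) {
        double total = 0.0;
        for (std::size_t i = 0; i < data.size(); ++i) {
            minDist[i] = std::min(minDist[i],
                                  squaredDistance(data[i], centroids.back()));
            total += minDist[i];
        }
        std::size_t next = 0;
        if (total > 0.0) {
            std::discrete_distribution<std::size_t> weighted(minDist.begin(),
                                                             minDist.end());
            next = weighted(rng);
        } else {
            // Every point coincides with a chosen center: fall back to a
            // uniform draw, because all weights are zero.
            next = uniform(rng);
        }
        centroids.push_back(data[next]);
    }
    return centroids;
}

}  // namespace sketch
```

A point that has already been chosen has weight zero in later draws, so the weighted step favors points far from the current centers. Passing the generator by reference lets the caller fix the seed for reproducible runs.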

enum PartialResultId

Enumerator
partialCentroids 

Table with the sum of observations assigned to centroids

partialClusters 

Table with the sum of observations assigned to centroids

Deprecated: This item will be removed in a future release.

partialClustersNumber 

Table with the number of observations assigned to centroids

Deprecated: This item will be removed in a future release.

enum ResultId

Enumerator
centroids 

Table for cluster centroids
