This is a C++ implementation of scalable DBSCAN and OPTICS, called sDBSCAN and sOPTICS, for high-dimensional space.
sDBSCAN and sOPTICS generate a large number of random projection vectors and exploit the neighborhood-preserving property of a few of them.
This leads to a multi-thread-friendly implementation with more than a 100x speedup compared to scikit-learn.
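
To make the idea of keeping only a few random vectors per point more concrete, here is a minimal sketch in plain C++. It is not the code used in this repository: the helper names (`dot`, `topKIndices`, `closestProjections`) are hypothetical, and it assumes that "a few random vectors" means the k projection vectors with the largest dot product against a point. In practice the projection vectors would be drawn at random (for example from a Gaussian distribution); here they are passed in explicitly so the behavior is easy to check.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <numeric>
#include <vector>

// Dot product of two equal-length vectors.
float dot(const std::vector<float>& a, const std::vector<float>& b) {
    assert(a.size() == b.size());
    return std::inner_product(a.begin(), a.end(), b.begin(), 0.0f);
}

// Indices of the k largest entries of `values`, ordered by descending value.
std::vector<std::size_t> topKIndices(const std::vector<float>& values, std::size_t k) {
    k = std::min(k, values.size());
    std::vector<std::size_t> idx(values.size());
    std::iota(idx.begin(), idx.end(), 0);
    // Only the first k positions need to be sorted.
    std::partial_sort(idx.begin(), idx.begin() + k, idx.end(),
                      [&](std::size_t i, std::size_t j) { return values[i] > values[j]; });
    idx.resize(k);
    return idx;
}

// Hypothetical helper: for one point, return the indices of the k projection
// vectors onto which the point has the largest projection value.
std::vector<std::size_t> closestProjections(const std::vector<float>& point,
                                            const std::vector<std::vector<float>>& projections,
                                            std::size_t k) {
    std::vector<float> scores;
    scores.reserve(projections.size());
    for (const auto& r : projections) {
        scores.push_back(dot(point, r));
    }
    return topKIndices(scores, k);
}
```

Because each point is scored against the projection vectors independently of every other point, a loop over points like this can be split across threads without shared state, which is one plausible reason the approach parallelizes well.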

sDBSCAN and sOPTICS use FFHT (Fast Fast Hadamard Transform) (https://github.com/FALCONN-LIB/FFHT), which provides a heavily optimized C99 implementation of the Fast Hadamard Transform.
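
For readers unfamiliar with the transform, the following is a plain, unoptimized C++ sketch of what a fast Hadamard transform computes. It does not use or reproduce FFHT's API; the function name `hadamardTransform` is hypothetical, and the sketch assumes the unnormalized transform with the input length being a power of two.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// In-place, unnormalized fast Walsh-Hadamard transform.
// data.size() must be a power of two; the cost is O(n log n)
// instead of the O(n^2) of an explicit matrix-vector product.
void hadamardTransform(std::vector<float>& data) {
    const std::size_t n = data.size();
    assert(n != 0 && (n & (n - 1)) == 0);  // power-of-two length required
    for (std::size_t half = 1; half < n; half *= 2) {
        for (std::size_t start = 0; start < n; start += 2 * half) {
            for (std::size_t i = start; i < start + half; ++i) {
                // Butterfly step: replace the pair (a, b) with (a + b, a - b).
                const float a = data[i];
                const float b = data[i + half];
                data[i] = a + b;
                data[i + half] = a - b;
            }
        }
    }
}
```

For the input `{1, 0, 1, 0}` this produces `{2, 2, 0, 0}`, and applying the transform a second time returns the original vector scaled by its length, `{4, 0, 4, 0}`. An optimized library such as FFHT computes the same kind of result far faster than this loop.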

sDBSCAN also needs EigenLib (https://eigen.tuxfamily.org), with vectorization enabled to compute distances quickly, and Boost (https://www.boost.org/) with binary histogram.


# Run sOptics, sDbscan, sDbscan-1NN

./DbscanCEOs --numPts 70000 --numDim 784 --X "data/mnist_all_X" --alg sOptics --eps 1800 --minPts 50 --numEmbed 1024 --numProj 1024 --topKProj 5 --topMProj 50 --dist L2 --output y_optics_sigma_2600 --numThreads 4 --sigma 2600 

./DbscanCEOs --numPts 70000 --numDim 784 --X "data/mnist_all_X" --alg sDbscan --eps 1300 --minPts 50 --numEmbed 1024 --numProj 1024 --topKProj 5 --topMProj 50 --dist L2 --clusterNoise 0 --output y_dbscan --numThreads 64

./DbscanCEOs --numPts 70000 --numDim 784 --X "data/mnist_all_X" --alg sDbscan_1NN --eps 1300 --minPts 50 --numEmbed 1024 --numProj 1024 --topKProj 5 --topMProj 50 --dist L2 --clusterNoise 0 --output y_dbscan --numThreads 64

See Compile-Run.sh for the scripts that compile and run the code on Mnist8m.
Use the test directory to run the algorithms on the small Mnist dataset (70,000 points).




