Algorithms

This page documents library components that implement mathematical functions or algorithms and that don't fit into any of the other pages of the dlib documentation. This includes things like checksums, cryptographic hashes, and sorting.

Tools: bigint, disjoint_subsets, disjoint_subsets_sized, Quantum Computing (quantum_register, gate), hsort_array, isort_array, numeric_constants, put_in_range, qsort_array, split_array, integrate_function_adapt_simp, square_root, Set Utilities (set_intersection_size, set_intersection, set_union, set_difference)

Statistics: rand, median, running_stats, running_stats_decayed, running_scalar_covariance_decayed, running_gradient, running_scalar_covariance, mean_sign_agreement, correlation, covariance, r_squared, mean_squared_error, running_covariance, running_cross_covariance, random_subset_selector, randomly_subsample, find_upper_quantile, count_steps_without_decrease_robust, count_steps_without_decrease, count_steps_without_increase, binomial_random_vars_are_different, event_correlation, max_scoring_element, min_scoring_element

Hashing: md5, crc32, hash, count_bits, hamming_distance, murmur_hash3, murmur_hash3_128bit, gaussian_random_hash, uniform_random_hash, projection_hash, create_random_projection_hash, create_max_margin_projection_hash, hash_samples, hash_similar_angles_64, hash_similar_angles_128, hash_similar_angles_256, hash_similar_angles_512

Filtering: kalman_filter, rls_filter, momentum_filter, rect_filter, find_optimal_rect_filter, find_optimal_momentum_filter

hash_similar_angles_64
Header: dlib/lsh.h
Specification: dlib/lsh/hashes_abstract.h

This object is a tool for computing locality sensitive hashes that give vectors with small angles between each other similar hash values. In particular, this object creates 64 random planes which pass through the origin and uses them to create a 64-bit hash.

hash_similar_angles_128
Header: dlib/lsh.h
Specification: dlib/lsh/hashes_abstract.h

This object is a tool for computing locality sensitive hashes that give vectors with small angles between each other similar hash values. In particular, this object creates 128 random planes which pass through the origin and uses them to create a 128-bit hash.

hash_similar_angles_256
Header: dlib/lsh.h
Specification: dlib/lsh/hashes_abstract.h

This object is a tool for computing locality sensitive hashes that give vectors with small angles between each other similar hash values. In particular, this object creates 256 random planes which pass through the origin and uses them to create a 256-bit hash.

hash_similar_angles_512
Header: dlib/lsh.h
Specification: dlib/lsh/hashes_abstract.h

This object is a tool for computing locality sensitive hashes that give vectors with small angles between each other similar hash values. In particular, this object creates 512 random planes which pass through the origin and uses them to create a 512-bit hash.

hash_samples
Header: dlib/graph_utils_threaded.h
Specification: dlib/graph_utils/find_k_nearest_neighbors_lsh_abstract.h

This is a simple function for hashing a bunch of vectors using a locality sensitive hashing object such as hash_similar_angles_128. It is also capable of running in parallel on a multi-core CPU.

bigint
Header: dlib/bigint.h
Specification: dlib/bigint/bigint_kernel_abstract.h

This object represents an arbitrary precision unsigned integer. It's pretty simple. Its interface is just like a normal int; you don't have to tell it how much memory to use or anything unusual. It just goes :)

bigint_kernel_1
Header: dlib/bigint/bigint_kernel_1.h

This implementation is done using an array of unsigned shorts. It is also reference counted. For further details see the header above. Also note that kernel_2 should be faster in almost every case, so you should really just use that version of the bigint object. kernel_1a is a typedef for bigint_kernel_1.

bigint_kernel_2
Header: dlib/bigint/bigint_kernel_2.h

This implementation is basically the same as kernel_1 except it uses the Fast Fourier Transform to perform multiplications much faster. kernel_2a is a typedef for bigint_kernel_2.

crc32
Header: dlib/crc32.h
Specification: dlib/crc32/crc32_kernel_abstract.h

This object represents the CRC-32 algorithm for calculating checksums.

gaussian_random_hash
Header: dlib/hash.h
Specification: dlib/general_hash/random_hashing_abstract.h

This function uses hashing to generate Gaussian distributed random values with mean 0 and variance 1.

uniform_random_hash
Header: dlib/hash.h
Specification: dlib/general_hash/random_hashing_abstract.h

This function uses hashing to generate uniform random values in the range [0,1).

murmur_hash3
Header: dlib/hash.h
Specification: dlib/general_hash/murmur_hash3_abstract.h

This function takes a block of memory and returns a 32-bit hash. The hashing algorithm used is Austin Appleby's excellent MurmurHash3.

murmur_hash3_128bit
Header: dlib/hash.h
Specification: dlib/general_hash/murmur_hash3_abstract.h

This function takes a block of memory and returns a 128-bit hash. The hashing algorithm used is Austin Appleby's excellent MurmurHash3.

kalman_filter
Header: dlib/filtering.h
Specification: dlib/filtering/kalman_filter_abstract.h

This object implements the Kalman filter, which is a tool for recursively estimating the state of a process given measurements related to that process. To use this tool you will have to be familiar with the workings of the Kalman filter. An excellent introduction can be found in the paper:

An Introduction to the Kalman Filter by Greg Welch and Gary Bishop
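
To make the idea of recursive state estimation concrete, here is a minimal standalone sketch of a scalar Kalman filter. It does not use dlib's kalman_filter and should not be read as its interface: the scalar_kalman type, its members, and the random-walk process model are all assumptions chosen for illustration.

```cpp
#include <cassert>
#include <cmath>

// Hypothetical scalar Kalman filter (not part of dlib). Assumed model:
//   state_{i+1}   = state_i + process_noise
//   measurement_i = state_i + measurement_noise
struct scalar_kalman {
    double estimate;     // current estimate of the state
    double variance;     // uncertainty of that estimate
    double process_var;  // variance of the process noise
    double measure_var;  // variance of the measurement noise

    scalar_kalman(double initial_estimate, double initial_variance,
                  double process_variance, double measurement_variance)
        : estimate(initial_estimate), variance(initial_variance),
          process_var(process_variance), measure_var(measurement_variance)
    {
        assert(initial_variance >= 0 && process_variance >= 0);
        assert(measurement_variance > 0);
    }

    // Fold one measurement z into the estimate and return the new estimate.
    double update(double z) {
        // Predict: the state may have drifted, so uncertainty grows.
        variance += process_var;
        // Correct: blend prediction and measurement using the Kalman gain.
        const double gain = variance / (variance + measure_var);
        estimate += gain * (z - estimate);
        variance *= (1.0 - gain);
        return estimate;
    }
};
```

Starting from a vague prior (large initial variance) and feeding the same measurement repeatedly, the estimate should converge to that measurement while the variance shrinks. A larger process variance makes the filter trust new measurements more.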

momentum_filter
Header: dlib/filtering.h
Specification: dlib/filtering/kalman_filter_abstract.h

This object is a simple tool for filtering a single scalar value that measures the location of a moving object that has some non-trivial momentum. Importantly, the measurements are noisy and the object can experience sudden unpredictable accelerations. To accomplish this filtering we use a simple Kalman filter with a state transition model of:


   position_{i+1} = position_{i} + velocity_{i} 
   velocity_{i+1} = velocity_{i} + some_unpredictable_acceleration

and a measurement model of:


   measured_position_{i} = position_{i} + measurement_noise

Here, some_unpredictable_acceleration and measurement_noise are zero-mean Gaussian noise sources. To allow for really sudden and large but infrequent accelerations, at each step we check if the current measured position deviates from the predicted filtered position by more than a user specified amount, and if so we adjust the filter's state to keep it within these bounds. This allows the moving object to undergo large unmodeled accelerations, far in excess of what would be suggested by the basic Kalman filter's noise model, without then experiencing a long lag time where the Kalman filter has to "catch up" to the new position.

rect_filter
Header: dlib/filtering.h
Specification: dlib/filtering/kalman_filter_abstract.h

This object is just a momentum_filter applied to the four corners of a rectangle. It allows you to filter a stream of rectangles, for instance, bounding boxes from an object detector applied to a video stream.

find_optimal_momentum_filter
Header: dlib/filtering.h
Specification: dlib/filtering/kalman_filter_abstract.h

This function finds the "optimal" settings of a momentum_filter based on unfiltered measurement data.

find_optimal_rect_filter
Header: dlib/filtering.h
Specification: dlib/filtering/kalman_filter_abstract.h

This function finds the "optimal" settings of a rect_filter based on unfiltered measurement data.

rls_filter
Header: dlib/filtering.h
Specification: dlib/filtering/rls_filter_abstract.h

This object is a tool for doing time series prediction using linear recursive least squares. In particular, this object takes a sequence of points from the user and, at each step, attempts to predict the value of the next point.

projection_hash
Header: dlib/lsh.h
Specification: dlib/lsh/projection_hash_abstract.h

This is a tool for hashing elements of a vector space into the integers. It is intended to represent locality sensitive hashing functions such as the popular random projection hashing method.

create_random_projection_hash
Header: dlib/lsh.h
Specification: dlib/lsh/create_random_projection_hash_abstract.h

Creates a random projection based locality sensitive hashing function. The projection matrix is generated by sampling its elements from a Gaussian random number generator.

create_max_margin_projection_hash
Header: dlib/lsh.h
Specification: dlib/lsh/create_random_projection_hash_abstract.h

Creates a random projection based locality sensitive hashing function. This is accomplished using a variation on the random hyperplane generation technique from the paper:

Random Maximum Margin Hashing by Alexis Joly and Olivier Buisson

In particular, we use a linear support vector machine to generate planes. We train it on randomly selected and randomly labeled points from the data to be hashed.

hash
Header: dlib/hash.h
Specification: dlib/general_hash/hash_abstract.h

This is a set of convenience functions for invoking murmur_hash3 on std::strings, std::vectors, std::maps, or dlib::matrix objects.

As an aside, the hash() for matrix objects is defined here. It has the same interface as all the others.
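
The random plane hashing idea described in the locality sensitive hashing entries above can be sketched in a few lines of standalone C++. This is not dlib's implementation or interface: random_plane_hash and count_set_bits are hypothetical names, and the choice of 64 planes with Gaussian coefficients simply mirrors the description of a 64-bit hash built from random planes through the origin.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <random>
#include <vector>

// Hypothetical sketch (not dlib's API): 64 random planes through the origin,
// one hash bit per plane, set when the vector lies on the positive side.
class random_plane_hash {
public:
    explicit random_plane_hash(std::size_t dims, std::uint64_t seed = 0)
        : planes_(64, std::vector<double>(dims))
    {
        std::mt19937_64 rng(seed);
        std::normal_distribution<double> gauss(0.0, 1.0);
        for (auto& plane : planes_)
            for (auto& coeff : plane)
                coeff = gauss(rng);  // Gaussian plane normals
    }

    std::uint64_t operator()(const std::vector<double>& v) const {
        assert(v.size() == planes_.front().size());
        std::uint64_t h = 0;
        for (std::size_t i = 0; i < planes_.size(); ++i) {
            double dot = 0.0;
            for (std::size_t j = 0; j < v.size(); ++j)
                dot += planes_[i][j] * v[j];
            if (dot > 0.0)
                h |= (std::uint64_t(1) << i);
        }
        return h;
    }

private:
    std::vector<std::vector<double>> planes_;
};

// Number of bits set to 1; applied to the XOR of two hashes this gives
// their Hamming distance.
inline int count_set_bits(std::uint64_t x) {
    int n = 0;
    while (x) { x &= x - 1; ++n; }
    return n;
}
```

Because only the sign of each projection matters, scaling a vector by a positive constant leaves its hash unchanged, while negating it flips every bit. Vectors separated by a small angle fall on the same side of most planes, so the Hamming distance between their hashes tends to be small.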

count_bits
Header: dlib/hash.h
Specification: dlib/general_hash/count_bits_abstract.h

This function counts the number of bits in an unsigned integer which are set to 1.

hamming_distance
Header: dlib/hash.h
Specification: dlib/general_hash/count_bits_abstract.h

This function returns the Hamming distance between two unsigned integers. That is, it returns the number of bits which differ in the two integers.

rand
Header: dlib/rand.h
Specification: dlib/rand/rand_kernel_abstract.h

This object represents a pseudorandom number generator.

disjoint_subsets
Header: dlib/disjoint_subsets.h
Specification: dlib/disjoint_subsets/disjoint_subsets_abstract.h

This object represents a set of integers which is partitioned into a number of disjoint subsets. It supports the two fundamental operations of finding which subset a particular integer belongs to as well as merging subsets.

disjoint_subsets_sized
Header: dlib/disjoint_subsets.h
Specification: dlib/disjoint_subsets/disjoint_subsets_sized_abstract.h

This object is just like disjoint_subsets except that it also keeps track of the size of each set.

running_stats
Header: dlib/statistics.h
Specification: dlib/statistics/statistics_abstract.h

This object represents something that can compute the running mean, variance, skewness, and kurtosis statistics of a stream of real numbers.

C++ example programs: running_stats_ex.cpp, kcentroid_ex.cpp

running_stats_decayed
Header: dlib/statistics.h
Specification: dlib/statistics/statistics_abstract.h

This object represents something that can compute the running mean and variance of a stream of real numbers. It is similar to running_stats except that it forgets about data it has seen after a certain period of time. It does this by exponentially decaying old statistics.

running_scalar_covariance_decayed
Header: dlib/statistics.h
Specification: dlib/statistics/statistics_abstract.h

This object represents something that can compute the running covariance of a stream of real number pairs. It is essentially the same as running_scalar_covariance except that it forgets about data it has seen after a certain period of time. It does this by exponentially decaying old statistics.
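
As an illustration of what "exponentially decaying old statistics" can look like, the following standalone sketch keeps an exponentially weighted mean and variance. It is one common formulation, not dlib's code: the decayed_stats class and its forget parameter are hypothetical, and dlib's own objects may parameterize the decay differently.

```cpp
#include <cassert>

// Hypothetical sketch (not dlib's API): exponentially weighted running
// mean and variance. `forget` in (0,1] is the weight given to each new
// sample; larger values forget old data faster.
class decayed_stats {
public:
    explicit decayed_stats(double forget) : forget_(forget) {
        assert(forget > 0.0 && forget <= 1.0);
    }

    void add(double x) {
        if (!seeded_) {          // first sample initializes the state
            mean_ = x;
            var_ = 0.0;
            seeded_ = true;
            return;
        }
        const double delta = x - mean_;
        mean_ += forget_ * delta;
        // Old variance and the new squared deviation are both decayed.
        var_ = (1.0 - forget_) * (var_ + forget_ * delta * delta);
    }

    double mean() const { return mean_; }
    double variance() const { return var_; }

private:
    double forget_;
    double mean_ = 0.0;
    double var_ = 0.0;
    bool seeded_ = false;
};
```

With this update rule the influence of a sample shrinks geometrically with each later sample, so after a level shift in the input the mean moves to the new level instead of averaging over the entire history.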
running_gradient
Header: dlib/statistics/running_gradient.h
Specification: dlib/statistics/running_gradient_abstract.h

This object is a tool for estimating if a noisy sequence of numbers is trending up or down and by how much. It does this by finding the least squares fit of a line to the data and then allows you to perform a statistical test on the slope of that line.

find_upper_quantile
Header: dlib/statistics/running_gradient.h
Specification: dlib/statistics/running_gradient_abstract.h

Finds and returns the scalar value such that a user specified percentage of the values in a container are greater than said value. For example, 0.5 would find the median value in a container, while 0.1 would find the value that lower bounds the 10% largest values in a container.

count_steps_without_increase
Header: dlib/statistics/running_gradient.h
Specification: dlib/statistics/running_gradient_abstract.h

Given a potentially noisy time series, this function returns a count of how long the time series has gone without noticeably increasing in value. It does this by adding the elements of the time series into a running_gradient object and counting how many elements, starting with the most recent, you need to examine before you are confident that the series has been increasing in value.

binomial_random_vars_are_different
Header: dlib/statistics/statistic.h
Specification: dlib/statistics/statistics_abstract.h

This function performs a simple statistical test to check if two binomially distributed random variables have the same parameter (i.e. the chance of "success"). It uses the simple likelihood ratio test discussed in the following paper:

Dunning, Ted. "Accurate methods for the statistics of surprise and coincidence." Computational linguistics 19.1 (1993): 61-74.

For an extended discussion of the method, see the above paper.

event_correlation
Header: dlib/statistics/statistic.h
Specification: dlib/statistics/statistics_abstract.h

This function does a statistical test to determine if two events co-occur in a statistically significant way. It uses the simple likelihood ratio test discussed in the following paper:

Dunning, Ted. "Accurate methods for the statistics of surprise and coincidence." Computational linguistics 19.1 (1993): 61-74.

For an extended discussion of the method, see the above paper.

max_scoring_element
Header: dlib/algs.h
Specification: dlib/algs.h

This function finds the element of a container that has the largest score, according to a user supplied score function, and returns a std::pair containing that maximal element along with the score.

min_scoring_element
Header: dlib/algs.h
Specification: dlib/algs.h

This function finds the element of a container that has the smallest score, according to a user supplied score function, and returns a std::pair containing that minimal element along with the score.

count_steps_without_decrease
Header: dlib/statistics/running_gradient.h
Specification: dlib/statistics/running_gradient_abstract.h

Given a potentially noisy time series, this function returns a count of how long the time series has gone without noticeably decreasing in value. It does this by adding the elements of the time series into a running_gradient object and counting how many elements, starting with the most recent, you need to examine before you are confident that the series has been decreasing in value.

count_steps_without_decrease_robust
Header: dlib/statistics/running_gradient.h
Specification: dlib/statistics/running_gradient_abstract.h

This function behaves just like count_steps_without_decrease except that it ignores time series values that are anomalously large. This makes it robust to sudden noisy but transient spikes in the time series values.

running_covariance
Header: dlib/statistics.h
Specification: dlib/statistics/statistics_abstract.h

This object is a simple tool for computing the mean and covariance of a sequence of vectors.

running_cross_covariance
Header: dlib/statistics.h
Specification: dlib/statistics/statistics_abstract.h

This object is a simple tool for computing the mean and cross-covariance matrices of a sequence of pairs of vectors.

running_scalar_covariance
Header: dlib/statistics.h
Specification: dlib/statistics/statistics_abstract.h

This object is a simple tool for computing the covariance of a sequence of scalar values.
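
For readers who want to see how a covariance can be accumulated one sample at a time, here is a standalone sketch using a Welford-style update. It is not dlib's running_scalar_covariance: the scalar_covariance_accumulator class and its member names are hypothetical and chosen only for this example.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>

// Hypothetical sketch (not dlib's API): streaming covariance and
// correlation of (x, y) pairs using a numerically stable online update.
class scalar_covariance_accumulator {
public:
    void add(double x, double y) {
        ++n_;
        const double n = static_cast<double>(n_);
        const double dx = x - mean_x_;   // deviation from the old mean of x
        const double dy = y - mean_y_;   // deviation from the old mean of y
        mean_x_ += dx / n;
        mean_y_ += dy / n;
        // Co-moment and second moments use one old and one new deviation.
        co_moment_ += dx * (y - mean_y_);
        m2_x_ += dx * (x - mean_x_);
        m2_y_ += dy * (y - mean_y_);
    }

    // Unbiased sample covariance; needs at least two samples.
    double covariance() const {
        assert(n_ > 1);
        return co_moment_ / static_cast<double>(n_ - 1);
    }

    // Pearson correlation; needs non-constant x and y.
    double correlation() const {
        assert(n_ > 1 && m2_x_ > 0.0 && m2_y_ > 0.0);
        return co_moment_ / std::sqrt(m2_x_ * m2_y_);
    }

private:
    std::size_t n_ = 0;
    double mean_x_ = 0.0, mean_y_ = 0.0;
    double co_moment_ = 0.0, m2_x_ = 0.0, m2_y_ = 0.0;
};
```

Feeding the pairs (1,2), (2,4), (3,6) gives a covariance of 2 and a correlation of 1, since y is an exact linear function of x. Only a handful of scalars are stored, so the stream never has to be held in memory.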
mean_sign_agreement
Header: dlib/statistics.h
Specification: dlib/statistics/statistics_abstract.h

This is a function for computing the probability that matching elements of two std::vectors have the same sign.

correlation
Header: dlib/statistics.h
Specification: dlib/statistics/statistics_abstract.h

This is a function for computing the correlation between matching elements of two std::vectors.

covariance
Header: dlib/statistics.h
Specification: dlib/statistics/statistics_abstract.h

This is a function for computing the covariance between matching elements of two std::vectors.

r_squared
Header: dlib/statistics.h
Specification: dlib/statistics/statistics_abstract.h

This is a function for computing the R squared coefficient between matching elements of two std::vectors.

mean_squared_error
Header: dlib/statistics.h
Specification: dlib/statistics/statistics_abstract.h

This is a function for computing the mean squared error between matching elements of two std::vectors.

random_subset_selector
Header: dlib/statistics.h
Specification: dlib/statistics/random_subset_selector_abstract.h

This object is a tool to help you select a random subset of a large body of data. In particular, it is useful when the body of data is too large to fit into memory.

randomly_subsample
Header: dlib/statistics.h
Specification: dlib/statistics/random_subset_selector_abstract.h

This is a set of convenience functions for creating random subsets of data.

hsort_array
Header: dlib/sort.h
Specification: dlib/sort.h

hsort_array is an implementation of the heapsort algorithm. It will sort anything that has an array-like operator[] interface.

put_in_range
Header: dlib/algs.h
Specification: dlib/algs.h

This is a simple function that takes a range and a value and returns the given value if it is within the range. If it isn't in the range then it returns the end of the range that is closest to the value.

isort_array
Header: dlib/sort.h
Specification: dlib/sort.h

isort_array is an implementation of the insertion sort algorithm. It will sort anything that has an array-like operator[] interface.
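
The phrase "anything that has an array-like operator[] interface" can be illustrated with a short standalone sketch of an insertion sort over an index range, together with a clamp helper in the spirit of put_in_range. Both insertion_sort_range and clamp_to_range are hypothetical names with assumed signatures; they are not dlib's functions.

```cpp
#include <cassert>
#include <cstddef>
#include <utility>

// Hypothetical sketch (not dlib's API): insertion sort of the inclusive
// index range [left, right] of any object that supports operator[].
template <typename Array>
void insertion_sort_range(Array& a, std::size_t left, std::size_t right) {
    assert(left <= right);
    for (std::size_t i = left + 1; i <= right; ++i) {
        // Sink a[i] toward the left until its predecessor is not larger.
        for (std::size_t j = i; j > left && a[j] < a[j - 1]; --j) {
            using std::swap;
            swap(a[j], a[j - 1]);
        }
    }
}

// Hypothetical sketch: return value if it lies between the two range ends,
// otherwise the nearest end. The ends may be given in either order.
template <typename T>
T clamp_to_range(const T& a, const T& b, const T& value) {
    const T& lo = (b < a) ? b : a;
    const T& hi = (b < a) ? a : b;
    if (value < lo) return lo;
    if (hi < value) return hi;
    return value;
}
```

Since the sort only relies on operator[], operator<, and swap, the same template works on built-in arrays, std::vector, or a user-defined container. Insertion sort is quadratic in the worst case, so it is mostly a good fit for short or nearly sorted ranges.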
numeric_constants
Header: dlib/numeric_constants.h
Specification: dlib/numeric_constants.h

This is just a header file containing definitions of common numeric constants such as pi and e.

qsort_array
Header: dlib/sort.h
Specification: dlib/sort.h

qsort_array is an implementation of the QuickSort algorithm. It will sort anything that has an array-like operator[] interface. If the quicksort becomes unstable (that is, it starts degrading toward its worst-case running time) then it switches to a heapsort. This way sorting is guaranteed to take on the order of N*log(N) time at most.

split_array
Header: dlib/array.h
Specification: dlib/array/array_tools_abstract.h

This function is used to efficiently split array-like objects into two parts. It uses the global swap() function instead of copying to move elements around, so it works on arrays of non-copyable types.

integrate_function_adapt_simp
Header: dlib/numerical_integration.h
Specification: dlib/numerical_integration/integrate_function_adapt_simpson_abstract.h

Computes an approximation of the integral of a real valued function using the adaptive Simpson method outlined in

Gander, W. and W. Gautschi, "Adaptive Quadrature -- Revisited", BIT, Vol. 40, (2000), pp. 84-101
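
As a rough illustration of adaptive Simpson quadrature, the sketch below implements the classic textbook recursion with a Richardson correction. It is not the Gander and Gautschi variant cited above and not dlib's integrate_function_adapt_simp: the function names, the tolerance handling, and the depth limit are all assumptions made for this example.

```cpp
#include <cassert>
#include <cmath>
#include <functional>

namespace simpson_sketch {

// Simpson's rule on [a, b] given f(a), f(midpoint), f(b).
inline double simpson(double fa, double fm, double fb, double a, double b) {
    return (b - a) / 6.0 * (fa + 4.0 * fm + fb);
}

// Recursively split [a, b] until the two-half estimate agrees with the
// whole-interval estimate to within the tolerance (or depth runs out).
inline double refine(const std::function<double(double)>& f,
                     double a, double b, double fa, double fm, double fb,
                     double whole, double tol, int depth) {
    const double m = 0.5 * (a + b);
    const double flm = f(0.5 * (a + m));
    const double frm = f(0.5 * (m + b));
    const double left = simpson(fa, flm, fm, a, m);
    const double right = simpson(fm, frm, fb, m, b);
    const double delta = left + right - whole;
    if (depth <= 0 || std::fabs(delta) <= 15.0 * tol)
        return left + right + delta / 15.0;  // Richardson correction
    return refine(f, a, m, fa, flm, fm, left, 0.5 * tol, depth - 1) +
           refine(f, m, b, fm, frm, fb, right, 0.5 * tol, depth - 1);
}

}  // namespace simpson_sketch

// Hypothetical entry point: approximate the integral of f over [a, b].
inline double adaptive_simpson(const std::function<double(double)>& f,
                               double a, double b,
                               double tol = 1e-10, int max_depth = 20) {
    assert(tol > 0.0 && max_depth >= 0);
    const double fa = f(a), fb = f(b), fm = f(0.5 * (a + b));
    const double whole = simpson_sketch::simpson(fa, fm, fb, a, b);
    return simpson_sketch::refine(f, a, b, fa, fm, fb, whole, tol, max_depth);
}
```

The recursion only subdivides where the local error estimate is too large, so smooth regions are handled with few function evaluations while difficult regions receive more. The depth limit is a safeguard against runaway recursion on functions that never meet the tolerance.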

C++ example program for integrate_function_adapt_simp: integrate_function_adapt_simp_ex.cpp

md5
Header: dlib/md5.h
Specification: dlib/md5/md5_kernel_abstract.h

This is an implementation of the MD5 Message-Digest Algorithm as described in RFC 1321.

median
Header: dlib/algs.h
Specification: dlib/algs.h

This function takes three parameters and finds the median of the three. The median is swapped into the first parameter, and the original value of the first parameter ends up in one of the other two, unless the first parameter was already the median.

square_root
Header: dlib/algs.h
Specification: dlib/algs.h

square_root is a function which takes an unsigned long and returns its square root. If the root is not an integer, it is rounded up to the next integer.

set_intersection
Header: dlib/set_utils.h
Specification: dlib/set_utils/set_utils_abstract.h

This function takes two set objects and gives you their intersection.

set_union
Header: dlib/set_utils.h
Specification: dlib/set_utils/set_utils_abstract.h

This function takes two set objects and gives you their union.

set_difference
Header: dlib/set_utils.h
Specification: dlib/set_utils/set_utils_abstract.h

This function takes two set objects and gives you their difference.

set_intersection_size
Header: dlib/set_utils.h
Specification: dlib/set_utils/set_utils_abstract.h

This function takes two set objects and tells you how many items they have in common.

quantum_register
Header: dlib/quantum_computing.h
Specification: dlib/quantum_computing/quantum_computing_abstract.h

This object represents a set of quantum bits. It can be used with the quantum gate object to simulate quantum algorithms.

C++ example program: quantum_computing_ex.cpp

gate
Header: dlib/quantum_computing.h
Specification: dlib/quantum_computing/quantum_computing_abstract.h

This object represents a quantum gate that operates on a quantum_register.

C++ example program: quantum_computing_ex.cpp
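
The behavior described for median and square_root above is easy to misread, so here is a standalone sketch of both. These are not dlib's implementations: median_to_front and ceil_sqrt are hypothetical names, and the binary search is just one way to obtain a rounded-up integer square root.

```cpp
#include <cassert>
#include <utility>

// Hypothetical sketch (not dlib's API): move the median of three values
// into `one`; the previous value of `one` lands where the median was.
template <typename T>
void median_to_front(T& one, T& two, T& three) {
    using std::swap;
    if (one < two) {
        if (two < three)      swap(one, two);    // one < two < three
        else if (one < three) swap(one, three);  // one < three <= two
        // otherwise three <= one < two and one is already the median
    } else {
        if (one < three)      { /* two <= one < three: nothing to do */ }
        else if (two < three) swap(one, three);  // two < three <= one
        else                  swap(one, two);    // three <= two <= one
    }
}

// Hypothetical sketch: smallest r with r*r >= value, i.e. the square root
// rounded up. The comparison uses division so r*r can never overflow.
inline unsigned long ceil_sqrt(unsigned long value) {
    unsigned long lo = 0, hi = value;  // hi always satisfies hi*hi >= value
    while (lo < hi) {
        const unsigned long mid = lo + (hi - lo) / 2;
        const bool big_enough =
            (mid == 0) ? (value == 0)
                       : (mid >= value / mid + (value % mid != 0 ? 1UL : 0UL));
        if (big_enough) hi = mid;
        else            lo = mid + 1;
    }
    return lo;
}
```

For example, ceil_sqrt(16) is 4 while ceil_sqrt(17) is 5, and calling median_to_front on the values 3, 1, 2 leaves them as 2, 1, 3. Moving the median of three to the front is the kind of step a quicksort can use to choose a pivot.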