faq.xml

<?xml version="1.0" encoding="ISO-8859-1"?>
<?xml-stylesheet type="text/xsl" href="stylesheet.xsl"?>

<doc>
   <title>Frequently Asked Questions</title>

   <!-- ************************************************************************* -->
   <questions group="General">


      <question text="How can I cite dlib?">
         If you use dlib in your research then please use the following citation:
         <br/>
         <br/>
Davis E. King. <a href="http://jmlr.csail.mit.edu/papers/volume10/king09a/king09a.pdf">Dlib-ml: A Machine Learning Toolkit</a>. 
   <i>Journal of Machine Learning Research</i> 10, pp. 1755-1758, 2009
         <br/>
         <pre>

@Article{dlib09,
  author = {Davis E. King},
  title = {Dlib-ml: A Machine Learning Toolkit},
  journal = {Journal of Machine Learning Research},
  year = {2009},
  volume = {10},
  pages = {1755-1758},
}
         </pre>
      </question>

      <!-- ****************************************** -->


      <question text="Why isn't serialization working?">
         Here are the possibilities:
         <ul>
            <li>You are using a file stream and forgot to put it into binary mode.  
               You need to do something like this:
<code_box>
std::ifstream fin("myfile", std::ios::binary);
</code_box>
or
<code_box>
std::ofstream fout("myfile", std::ios::binary);
</code_box>

If you don't give <tt>std::ios::binary</tt> then the iostream will mess with the binary data and cause serialization
to not work right.  
            </li>

            <br/>
            <li>The iostream is in a bad state.  You can check the state by calling <tt>mystream.good()</tt>.
            If it returns false then the stream is in an error state such as end-of-file or maybe it failed
            to do the I/O.  Also note that if you close a file stream and reopen it you might have to call
            <tt>mystream.clear()</tt> to clear out the error flags.
            </li>
         </ul>

      </question>
      <!-- ****************************************** -->

      <question text="How do I set the size of a matrix at runtime?">
         Long answer, read the <a href="matrix_ex.cpp.html">matrix example program</a>.
         <br/><br/>
         Short answer, here are some examples:
<code_box>
matrix&lt;double&gt; mat;
mat.set_size(4,5);

matrix&lt;double,0,1&gt; column_vect;
column_vect.set_size(6);

matrix&lt;double,0,1&gt; column_vect2(6);  // give size to constructor

matrix&lt;double,1&gt; row_vect;
row_vect.set_size(5);
</code_box>

      </question>
      <!-- ****************************************** -->

      <question text="Where is the documentation for &lt;object/function&gt;?">
         If you can't find something then check the <a href="term_index.html">index</a>.
         <br/><br/>
         Also, the bulk of the documentation can be found by following the 
         <b><font style='font-size:1.3em' color='#0000FF'>Detailed Documentation</font></b> links.   
      </question>


      <!-- ****************************************** -->

      <question text="How does dlib interface with other libraries/tools?">
         There shouldn't ever be anything in dlib that prevents you from using or 
         interacting with other libraries.  However, there are some additional tools
         in dlib to make some interactions easier.  

         <ul>
            <li><b>BLAS and LAPACK libraries</b> are used by the <a href="linear_algebra.html#matrix">matrix</a>
            automatically if you <tt>#define</tt> DLIB_USE_BLAS and/or DLIB_USE_LAPACK and link against
            the appropriate library files.  Note that the CMakeLists.txt file that comes with dlib will
            do this for you automatically in many instances.</li><br/>

            <li><b>Armadillo and Eigen libraries</b> have matrix objects which can be converted into
            dlib matrix objects by calling dlib::mat() on them.</li><br/>

            <li><b>OpenCV</b> image objects can be converted into a form usable by dlib routines
            by using <a href="imaging.html#cv_image">cv_image</a>.  You can also convert from a
            dlib matrix or image to an OpenCV Mat using dlib::<a href="imaging.html#toMat">toMat</a>().</li><br/>
            <li><b>Google Protocol Buffers</b> can be serialized by the dlib 
            <a href="other.html#serialize">serialization</a> routines.  
            This means that, for example, you can pass protocol buffer objects through a 
            <a href="network.html#bridge">bridge</a>.
            </li><br/>

            <li><b>libpng and libjpeg</b> are used by <a
            href="imaging.html#load_image">load_image</a> whenever
            DLIB_PNG_SUPPORT and DLIB_JPEG_SUPPORT are defined respectively.
            You must also tell your compiler to link against these libraries to
            use them. However, CMake will try to link against them
            automatically if they are installed.</li><br/>

            <li><b>SQLite</b> is used by the <a href="other.html#database">database</a> object.  In
            fact, it is just a wrapper around SQLite's C interface which simplifies its use (e.g. 
            makes resource management use RAII).</li>

         </ul>
      </question>

   </questions>


   <!-- ************************************************************************* -->

   <questions group="Machine Learning">

      <question text="Why is RVM training is really slow?">
         The optimization algorithm is somewhat unpredictable.  Sometimes it is fast and 
         sometimes it is slow.  What usually makes it really slow is if you use a radial basis
         kernel and you set the gamma parameter to something too large.  This causes the
         algorithm to start using a whole lot of relevance vectors (i.e. basis vectors) which
         then makes it slow.  The algorithm is only fast as long as the number of relevance vectors
         remains small but it is hard to know beforehand if that will be the case.  
         <p>
            You should try <a href="ml.html#krr_trainer">kernel ridge regression</a> instead since it
            also doesn't take any parameters but is always very fast.
         </p>

      </question>

      <!-- ****************************************** -->

      <question text="Why is cross_validate_trainer_threaded() crashing?">
         This function makes a copy of your training data for each thread.  So you are probably running out
         of memory.  To avoid this, use the <a href="algorithms.html#randomly_subsample">randomly_subsample</a> function
         to reduce the amount of data you are using or use fewer threads. 
         <p>
            For example, you could reduce the amount of data by saying this:
<code_box>
// reduce to only 1000 samples
cross_validate_trainer_threaded(trainer, 
                                randomly_subsample(samples, 1000), 
                                randomly_subsample(labels,  1000), 
                                4,   // num folds
                                4);  // num threads
</code_box>
         </p>
      </question>

      <!-- ****************************************** -->

      <question text="How can I define a custom kernel?">
         See the <a href="using_custom_kernels_ex.cpp.html">Using Custom Kernels</a> example program.
      </question>

      <!-- ****************************************** -->

      <question text="Can you give advice on feature generation/kernel selection?">
         <p>
            Picking the right kernel all comes down to understanding your data, and obviously this is 
            highly dependent on your problem.  
         </p>

         <p>
            One thing that's sometimes useful is to plot each feature against the target value.  You can get an idea of 
            what your overall feature space looks like and maybe tell if a linear kernel is the right solution.  But 
            this still hides important information from you.  For example, imagine you have two diagonal lines which 
            are very close together and are both the same length.  Suppose one line is of the +1 class and the other is the -1 
            class.  Each feature (the x or y coordinate values) by itself tells you almost nothing about which class 
            a point belongs to but together they tell you everything you need to know.  
         </p>

         <p>
            On the other hand, if you know something about the data you are working with then you can also try and 
            generate your own features.  So for example, if your data is a bunch of images and you know that one 
            of your classes contains a lot of lines then you can make a feature that attempts to measure the number 
            of lines in an image using a hough transform or sobel edge filter or whatever.  Generally, try and 
            think up features which should be highly correlated with your target value.  A good way to do this is 
            to try and actually hand code N solutions to the problem using whatever you know about your data or 
            domain.  If you do a good job then you will have N really great features and a linear or rbf kernel 
            will probably do very well when using them.
         </p>

         <p>
            Or you can just try a whole bunch of kernels, kernel parameters, and training algorithm options while 
            using cross validation.  I.e. when in doubt, use brute force :)   There is an example of that kind of 
            thing in the <a href="model_selection_ex.cpp.html">model selection</a> example program. 
         </p>
      </question>

      <!-- ****************************************** -->

      <question text="Why does my decision_function always give the same output?">
         This happens when you use the radial_basis_kernel and you set the gamma value to
         something highly inappropriate.  To understand what's happening lets imagine your
         data has just one feature and its value ranges from 0 to 7.  Then what you want is a 
         gamma value that gives nice Gaussian bumps like the one in this graph: <br/>

         <center><image src="rbf_normal.gif"/></center>

         <br/>
         However, if you make gamma really huge you will get this (it's zero everywhere except for one place):
         <br/>
         <center><image src="rbf_big_gamma.gif"/></center>

         <br/>
         Or if you make gamma really small then it will be 1.0 everywhere:
         <br/>
         <center><image src="rbf_small_gamma.gif"/></center>

         <p>
            So you need to pick the gamma value so that it is scaled reasonably to your data.  A <i><font color="red">good rule of
            thumb (i.e. not the optimal gamma, just a heuristic guess)</font></i> is the following:
         </p>
         <code_box>const double gamma = 1.0/compute_mean_squared_distance(randomly_subsample(samples, 2000));</code_box>
           
      </question>

      <!-- ****************************************** -->

   </questions>

   <!-- ************************************************************************* -->


</doc>