<?xml version="1.0" encoding="ISO-8859-1"?>
<?xml-stylesheet type="text/xsl" href="stylesheet.xsl"?>

<doc>
    <title>How to Contribute</title>



    <!-- ************************************************************************* -->

    <body>
        <br/><br/>
        

        <!--   ****************************   EASY CONTRIBUTIONS  ****************************    -->

         There are some simple ways to contribute to dlib:

         <ul>
            <li> You could make a dlib logo </li>
            <li> Find confusing or incorrect documentation </li>
            <li> Help make the web page prettier </li>
            <li> Link to dlib from your web page </li>
            <li> Add yourself or your project to the list of 
            <a href="http://dclib.wiki.sourceforge.net/dlib_users">dlib users</a> </li>
            <li> Try to compile the dlib regression test suite on any platforms you
            have access to </li>
         </ul>

        <!--   ****************************   CODE CONTRIBUTIONS  ****************************    -->

         Code contributions are also welcome.  However, you should read over the coding guidelines below
         and try to follow them.  It is also probably a good idea to read the books Effective C++ and 
         More Effective C++ by Scott Meyers.   And as always, feel free to contact me if you have any questions.

        
         <h2>Coding Guidelines</h2>

         1. <a href="#1">Use Design by Contract</a><br/>
         2. <a href="#2">Use spaces instead of tabs.</a><br/>
         3. <a href="#3">Use the standard C++ naming convention</a><br/>
         4. <a href="#4">Use RAII</a><br/>
         5. <a href="#5">Don't use pointers</a><br/>
         6. <a href="#6">Don't use #define for constants.</a><br/>
         7. <a href="#7">Don't use stack based arrays.</a><br/>
         8. <a href="#8">Use exceptions, but don't abuse them</a><br/>
         9. <a href="#9">Write portable code</a><br/>
         10. <a href="#10">Set up regression tests</a><br/>
         11. <a href="#11">Use the Boost Software License</a><br/>


         <ul>
        <!--   ****************************  -->
            <anchor>1</anchor>
            <li> <h3> Apply Design by Contract to Your Code  </h3>
               <ul><p>
                  The most important part of a software library isn't the code, it is the set
                  of interfaces the library exposes to the user.  These interfaces need to be easy 
                  to use right, and hard to use wrong.  The only way this
                  happens is if the interfaces are documented in a simple, consistent, and precise way.
               </p>
               <p>
                  The way I design and document these interfaces is known as
                  Design by Contract.   There is a lot that can be said about Design by Contract.  In fact,
                  whole books have been written about it, and programming languages exist which
                  use Design by Contract as a central element.  Here I will just go over some
                  of the basic ways it is used in dlib as well as some of the reasons why it is a Good Thing.
               </p>
               <li> <b>Functions should have documented preconditions which are programmatically verifiable</b>
                  <ul>
                     <p>
                     Many functions have a set of requirements or preconditions that need to be satisfied
                     if they are to be used.  If these requirements are not satisfied 
                     when a function is called then the function will not do what it is supposed to do.  Moreover,
                     any piece of software that calls a function but doesn't make sure all preconditions
                     are satisfied contains a bug, <i>by definition</i>.  
                     </p>
                     <p>
                        This means all functions must precisely document their preconditions if they are to be
                        usable.  In fact, all preconditions should be programmatically verifiable.  Doing this
                        has a number of benefits.  First, it means they are unambiguous.  English
                        can be confusing and vague, but saying "<tt>some_predicate == true</tt>" uses a 
                        formal language, C++, that we all should understand quite well.  Second, it means 
                        you can put checks into the code that will catch <i>all</i> usage errors. 
                     </p>
                     <p>
                        These checks should always be implemented using 
                        <a href="metaprogramming.html#DLIB_ASSERT">DLIB_ASSERT</a> or
                        <a href="metaprogramming.html#DLIB_CASSERT">DLIB_CASSERT</a> and they should always
                        cover all preconditions.   
                        These macros take a boolean argument and if it is false they throw dlib::fatal_error.  So
                        you can use them to check that all your preconditions are true.  Also, don't forget that
                        a violated function precondition indicates a bug in a program.  
                        That is, when dlib::fatal_error is thrown it means a bug has been found and the only thing 
                        an application can do at that point is print an error message and terminate.  
                        In fact, dlib::fatal_error has checks in it to make sure someone doesn't catch the
                        exception and ignore it.  These checks will abruptly terminate any program that attempts
                        to ignore fatal errors.   
                     </p>
                     <p>
                        The above considerations bring me to my next bit of advice.  Developers new to Design by Contract
                        often think input validation should be part of a function's preconditions.
                        They then complain that labeling invalid program input as a bug, throwing fatal_error, and 
                        terminating the application is a very bad thing.  They are right, that would be a bad thing
                        and you should not write software that behaves that way.  The way out of this problem is, of
                        course, to not consider invalid input a bug.  Instead, you should perform explicit input validation 
                        on any
                        data coming into your program <i>before</i> it gets to any functions that have preconditions
                        which demand the validated inputs.  Moreover, if you make your preconditions programmatically verifiable
                        then it should be easy to validate any inputs by simply using whatever it is you
                        use to check your preconditions.  
                     </p>
                     <p>
                        Consider the function <a href="algorithms.html#cross_validate_trainer">cross_validate_trainer</a> as an 
                        example.  One of its requirements is that the input forms a valid binary classification problem.
                        This is documented in the list of preconditions as 
                        "<tt>is_binary_classification_problem(x,y) == true</tt>".  This precondition is just saying 
                        that when you call
                        the <tt>is_binary_classification_problem</tt> function on the x and y inputs it had better return true 
                        if you want to use those inputs with the <tt>cross_validate_trainer</tt> function.   
                        Given this information it is trivial to perform input validation.  All you have to do is
                        call <tt>is_binary_classification_problem</tt> on your input data and you are done.   
                     </p>
                     <p>
                        Using the above technique you have validated your inputs, documented your preconditions, and are
                        buffered by DLIB_ASSERT statements that will catch you if you accidentally forget to validate any
                        inputs.   
                     </p>
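                     <p>
                        The pattern described above can be sketched in plain C++.  This is only an 
                        illustration under stated assumptions:  <tt>is_valid_sample_set</tt>, 
                        <tt>check_precondition</tt>, <tt>mean_of_samples</tt>, and <tt>try_compute_mean</tt> 
                        are hypothetical names that are not part of dlib, and <tt>check_precondition</tt> 
                        is a simple stand-in for DLIB_ASSERT.  The point is that one predicate serves both 
                        as the documented precondition and as the input validation routine.
                     </p>
<pre><![CDATA[
```cpp
#include <cstddef>
#include <stdexcept>
#include <vector>

// Hypothetical predicate named in the precondition of mean_of_samples() below.
// Callers can use the very same function to validate untrusted input.
inline bool is_valid_sample_set(const std::vector<double>& samples)
{
    return !samples.empty();
}

// Hypothetical stand-in for DLIB_ASSERT.  A failure here signals a bug in the caller.
inline void check_precondition(bool condition, const char* description)
{
    if (!condition)
        throw std::logic_error(description);
}

/*
    requires
        - is_valid_sample_set(samples) == true
    ensures
        - returns the arithmetic mean of the elements of samples
*/
inline double mean_of_samples(const std::vector<double>& samples)
{
    check_precondition(is_valid_sample_set(samples),
                       "is_valid_sample_set(samples) == true");

    double sum = 0;
    for (std::size_t i = 0; i < samples.size(); ++i)
        sum += samples[i];
    return sum / samples.size();
}

// Input validation happens before the call, so bad input is never treated as a bug.
// Returns false and leaves result untouched if the input is unusable.
inline bool try_compute_mean(const std::vector<double>& untrusted_input, double& result)
{
    if (!is_valid_sample_set(untrusted_input))
        return false;   // report the bad input to the user in whatever way is appropriate
    result = mean_of_samples(untrusted_input);
    return true;
}
```
]]></pre>
                     <p>
                        Code that receives data from outside the program would call <tt>try_compute_mean</tt>, 
                        while code that already knows its data is valid can call <tt>mean_of_samples</tt> 
                        directly and rely on the precondition check to catch mistakes.
                     </p>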
                     <p>The thing to understand here is that
                        a violation of a function's preconditions means you have a bug on your hands.  Or in other words,
                        you should never intentionally violate any function preconditions.  But of course 
                        it will happen from time to time because bugs are unavoidable.  But at least with 
                        this approach you will get a detailed error message early in development rather than a 
                        mysterious segmentation fault days or weeks later.
                     </p>
                  </ul></li>
               <li> <b>Functions should have documented postconditions  </b>
                  <ul><p>
                     I don't have nearly as much to say about postconditions as I did about function requirements.  You should
                     strive to write programmatically verifiable postconditions because that makes your postconditions
                     more precise.  However, it is sometimes the case that this isn't practical and that is fine.  
                     But whatever you do write needs to clearly communicate to the
                     user what it is your function does.  
                  </p></ul></li>
               <p>
                  Now you may be wondering why this is called <i>Design</i> by Contract and not Documentation
                  by Contract.  The reason is that the process of writing down all these detailed descriptions
                  of what your code does becomes part of how you design software.  For example, often you 
                  will find that when you go to write down the requirements for calling a function you are unable 
                  to do so.  This may be because the requirements are so complex you can't think of a way 
                  to describe them, or you may realize that you yourself don't even know what they are.  Alternatively, 
                  you may know what they are but there isn't any way to verify them programmatically.   All these
                  things are symptoms of a bad <i>design</i> and the reason you became aware of this design problem 
                  was by attempting to apply Design by Contract.  
               </p>
               <p>
                  After you get enough practice with this way of writing software you begin to think a lot
                  more about questions like "how can I design this class such that every member function
                  has a very simple set of requirements and postconditions?"  Once you start doing this
                  you are well on your way to creating software components that are easy to use right, and 
                  hard to use wrong.
               </p>
               <p>
                  The notation dlib uses to document preconditions and postconditions is located in
                  the <a href="intro.html#Notation">introduction</a>.  All code that goes into dlib
                  must document itself using this notation.  You should also separate the implementation
                  and specification of a component into two separate files as described in the introduction.  This
                  way users don't even see implementation details when they look at the documentation for a 
                  component.  
               </p>
               </ul>
            </li>


        <!--   ****************************  -->
            <anchor>2</anchor>
            <li><h3>Use spaces instead of tabs.   </h3>
            <ul> <p>This is just generally good advice but
                  it is especially important in dlib since everything is viewable 
                  as pretty-printed HTML.  Tabs show up as 8 characters in most browsers
                  and this results in the HTML version being difficult to read.  So 
                  don't use tabs.  Additionally, please use 4 spaces for each tab level.</p>
            </ul></li>
           


        <!--   ****************************  -->
            <anchor>3</anchor>
           <li><h3> Never use capital letters in the names of variables, functions, or
              classes.  Use the _ character to separate words.  </h3>
            <ul>
               <p>
                  The reason dlib uses this style is because it is the style used by the
                  C++ standard library.  But more importantly, dlib currently provides
                  an interface to users that has a consistent look and feel and it is
                  important to continue to do so.   
               </p>
                  <p>
                     As for constants, they should usually contain all upper case letters 
                     but all lowercase is ok sometimes.
                  </p>
            </ul></li>

        <!--   ****************************  -->
            <anchor>4</anchor>
            <li> <h3> Don't use manual resource management.  Use RAII
               instead.</h3>
               <ul><p>
                  You should not be calling new and delete in your own code.  You should instead
                  be using objects like the std::vector, <a href="containers.html#scoped_ptr">scoped_ptr</a>,
                  or any number of other objects that manage resources such as memory for you.  If you want
                  an array use std::vector (or the checked <a href="containers.html#std_vector_c">std_vector_c</a>).
                  If you want to make a lookup table use a <a href="containers.html#map">map</a>.  If you want
                  a two dimensional array use <a href="containers.html#matrix">matrix</a> or 
                  <a href="containers.html#array2d">array2d</a>.
               </p>
               <p>
                  These container objects are examples of what is called RAII (Resource Acquisition Is Initialization)
                  in C++.  It is essentially a name for the fact that, in C++, you can have totally automated and
                  deterministic resource management by always associating resource acquisition with the construction
                  of an object and resource release with the destruction of an object.  I say resource management 
                  here rather than memory management
                  because, unlike Java, RAII can be used for more than memory management.  For example, when
                  you use a <a href="dlib/threads/threads_kernel_abstract.h.html#mutex">mutex</a> you first lock
                  it, do something, and then you need to remember to unlock it.  The RAII way of doing this is
                  to use the <a href="api.html#auto_mutex">auto_mutex</a> which will lock a mutex and automatically
                  unlock it for you.   Or suppose you have made a TCP <a href="api.html#sockets">connection</a> 
                  to another machine and you want to be certain the resources associated with that connection 
                  are always released.  You can easily accomplish this with RAII by using the scoped_ptr as
                  shown in <a href="sockets_ex_2.cpp.html">this</a> example program.
               </p>
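               <p>
                  As a rough sketch of the locking case, the following uses only the C++ standard 
                  library.  It assumes a C++11 compiler, and <tt>std::lock_guard</tt> plays the role 
                  that auto_mutex plays in the text above.  The <tt>hit_counter</tt> class is a 
                  hypothetical example and is not part of dlib.
               </p>
<pre><![CDATA[
```cpp
#include <mutex>
#include <vector>

// Hypothetical shared counter protected by a mutex.
class hit_counter
{
public:
    hit_counter() : hits(0) {}

    void record_hit()
    {
        // The lock is acquired here and released when guard goes out of scope,
        // even if an exception is thrown before the end of the function.
        std::lock_guard<std::mutex> guard(m);
        ++hits;
        history.push_back(hits);   // the vector frees its own memory in its destructor
    }

    unsigned long count() const
    {
        std::lock_guard<std::mutex> guard(m);
        return hits;
    }

private:
    mutable std::mutex m;
    unsigned long hits;
    std::vector<unsigned long> history;
};
```
]]></pre>
               <p>
                  Notice that there is no unlock call and no delete anywhere in the class.  Both 
                  the mutex and the memory are released by destructors.
               </p>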
               <p>
                  RAII is a trivial technique to use.  All you have to do is not call new and delete yourself and
                  you will never have another memory leak.  Just use the appropriate <a href="containers.html">container</a>
                  instead.  Finally, if you don't use RAII then your code is almost certainly not exception safe.   
               </p>
               </ul>
            </li>

        <!--   ****************************  -->
            <anchor>5</anchor>
            <li> <h3>Don't use pointers </h3>
               <ul><p>
                  There are a number of reasons to not use pointers.  First, if you are using pointers then
                  you are probably not using RAII.  Second, pointers are ambiguous.  When I see a pointer
                  I don't know if it is a pointer to a single item, a pointer to nothing, or 
                  a pointer to an array of who knows how many things.   On the other hand, when I see a 
                  std::vector I know with certainty that I'm dealing with a kind of array.  Or if I see a 
                  reference to something then I know I'm dealing with exactly one instance of some object.  
               </p>
               <p>
                  Most importantly, it is impossible to validate the state of a pointer.  Consider two
                  functions:  
                  <blockquote><tt>double compute_sum_of_array_elements(const double* array, int array_size);  <br/>
                     double compute_sum_of_array_elements(const std::vector&lt;double&gt;&amp; array); </tt></blockquote>

                  The first function is inherently unsafe.  If the user accidentally passes in an invalid pointer
                  or sets the size argument incorrectly then their program may crash and this will turn into a 
                  potentially hard to find bug.  This is because there is absolutely nothing you can do inside
                  the first function to tell the difference between a valid pointer and size pair and an invalid
                  pointer and size pair.  <b><i>Nothing</i></b>.   The second function has none of these difficulties.
               </p>
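               <p>
                  For illustration, here is one way the two functions might be written.  The 
                  bodies are my own sketch, since only the declarations appear above.  The pointer 
                  version has no choice but to trust its arguments, while the vector version takes 
                  its loop bound from the container itself.
               </p>
<pre><![CDATA[
```cpp
#include <cstddef>
#include <vector>

// Nothing in here can tell whether array really points to array_size doubles.
double compute_sum_of_array_elements(const double* array, int array_size)
{
    double sum = 0;
    for (int i = 0; i < array_size; ++i)
        sum += array[i];    // undefined behavior if the pointer and size pair is invalid
    return sum;
}

// The vector carries its own size, so there is no separate size argument to get wrong.
double compute_sum_of_array_elements(const std::vector<double>& array)
{
    double sum = 0;
    for (std::size_t i = 0; i < array.size(); ++i)
        sum += array[i];
    return sum;
}
```
]]></pre>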
               <p>
                  If you absolutely need pointer semantics then you can usually use a smart pointer like
                  <a href="containers.html#scoped_ptr">scoped_ptr</a> or <a href="containers.html#shared_ptr">shared_ptr</a>.
                  If that still isn't good enough for you and you <i>really</i> need to use a normal C style pointer
                  then isolate your pointers inside a class so that they are contained in a small area of the code.  
                  However, in practice the container classes in dlib and the STL are more than sufficient in nearly 
                  every case where pointers would otherwise be used.
               </p>
               </ul>
            </li>

        <!--   ****************************  -->
            <anchor>6</anchor>
            <li> <h3> Don't use #define for constants.   </h3>
               <ul><p>
                  dlib is meant to be integrated into other people's projects.  Because of this everything
                  in dlib is contained inside the dlib namespace to avoid naming conflicts with users' code.
                  #defines don't respect namespaces at all.  For example, if you #define a constant called SIZE then it
                  will cause a conflict with any piece of code <i>anywhere</i> that contains the identifier SIZE.  
                  This means that #define based constants must be avoided and constants should be created using the
                  const keyword instead.
               </p>
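               <p>
                  A small sketch of the difference, using a hypothetical <tt>my_library</tt> 
                  namespace rather than dlib itself:  because the constant is an ordinary typed 
                  object inside a namespace, user code is free to have its own identifier called 
                  SIZE.  If the library had used <tt>#define SIZE 200</tt> instead, the second 
                  declaration below would not compile.
               </p>
<pre><![CDATA[
```cpp
// Hypothetical library namespace used only for this sketch.
namespace my_library
{
    // A typed constant that lives inside the namespace.
    const unsigned long SIZE = 200;
}

// User code may define its own SIZE without any conflict.
const int SIZE = 3;

inline unsigned long total_size()
{
    return my_library::SIZE + SIZE;
}
```
]]></pre>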
               </ul>
            </li>

        <!--   ****************************  -->
            <anchor>7</anchor>
            <li> <h3>Don't use stack based arrays.   </h3>
               <ul><p>
                  A stack based array, or C style array, is an array declared like this:
                  <blockquote><tt>int array[200];</tt></blockquote>
                  Most of my criticisms of pointers also apply to stack based arrays.  In particular, 
                  if you are passing a stack based array to a function then that means you are probably
                  using functions similar to the unsafe compute_sum_of_array_elements() example above.
               </p>
               <p>
                  The only time it is OK to use this kind of array is when you use it for simple
                  tasks and you don't start passing pointers to the array to other parts of your code.  You
                  should also use a constant to store the array size and use that constant in your loops
                  rather than hard coding the size in numerous places.   
               </p>
               <p>
                  Even so, you should use a container class instead and preferably one with the ability to do range
                  checking such as the  <a href="containers.html#std_vector_c">std_vector_c</a>.   </p>
                  <p>
                     Consider the following two bits of code:
<pre>
   for (int i = 0; i &lt; array_size; ++i) 
      my_c_array[i] = 4;

   for (int i = 0; i &lt; my_std_vector.size(); ++i)
      my_std_vector[i] = 4;

</pre>
                  The second loop clearly doesn't overflow the bounds of the my_std_vector.   On the other 
                  hand, just by looking at the code in the first loop, we cannot tell if it overflows
                  my_c_array.  We have to assume that array_size is the appropriate constant but we could be wrong.
               </p>
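               <p>
                  To give a feel for what range checking buys you, the sketch below uses 
                  <tt>std::vector::at()</tt> from the standard library, which throws 
                  <tt>std::out_of_range</tt> when given a bad index.  It is only an analogy for a 
                  checked container like std_vector_c, and the helper names <tt>fill_with</tt> and 
                  <tt>checked_read</tt> are hypothetical.
               </p>
<pre><![CDATA[
```cpp
#include <cstddef>
#include <stdexcept>
#include <vector>

// Fills every element.  The loop bound comes from the container itself,
// so it cannot disagree with the real size.
inline void fill_with(std::vector<int>& v, int value)
{
    for (std::size_t i = 0; i < v.size(); ++i)
        v[i] = value;
}

// Range checked read.  A bad index is reported with std::out_of_range
// instead of silently reading past the end of the buffer.
inline int checked_read(const std::vector<int>& v, std::size_t index)
{
    return v.at(index);
}
```
]]></pre>
               <p>
                  With a C style array the equivalent out of bounds read would compile and run, 
                  and might even appear to work.
               </p>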
               <p>
                  Buffer overflows are probably the most common kind of bug in C and C++ code.  These bugs also
                  lead to serious exploitable security holes in software.  So please try to avoid stack based arrays.
               </p>
               </ul>
            </li>



        <!--   ****************************  -->
            <anchor>8</anchor>
            <li> <h3> Use exceptions, but don't abuse them. </h3>
               <ul>
                  <p>
                   Exceptions are one of the great features of modern programming languages.  Some 
                   people, however, consider that to be a contentious statement.   But if you accept 
                   the notion that a software library should be hard to use wrong then it 
                   becomes difficult to reject exceptions.  
                  </p>
                  <p>
                   Most of the complaints I hear about exceptions are actually complaints 
                   about their <i>misuse</i> rather than objections to the basic idea.  
                   So before I begin to defend the above
                   paragraph I would like to lay out more clearly when it is appropriate to
                   use exceptions and when it is not.   
                  </p>
                  <p>
                  There are two basic questions you should ask yourself when deciding whether to 
                  throw an exception in response to some event.  The first is (1) "should this event
                  occur in the normal use of my library component?"  The second question is (2) "if this event
                  were to occur, is it likely that the user will want to place the code for dealing 
                  with the event near the invocations of my library component?"
                  </p>
                  <p>
                     If your answers to the above two questions are "no" then you should probably
                     throw an exception in response to the event.  On the other hand, if you answer
                     "yes" to either of these questions then you should probably <i>not</i> throw an exception.
                  </p>

               <p>
                  A good example of an event worth throwing exceptions for is running out of memory.  
                  (1) It doesn't happen very often, and (2) when it does happen it is hardly ever the case that 
                  you want to deal with the out of memory event right next to the place where you are 
                  attempting to allocate memory.  
               </p>
               <p>
                  Alternatively, an example of an event that shouldn't throw an exception comes to 
                  us from the C++ I/O streams.  This part of the standard library allows
                  you to read the contents of a file from disk.  When you hit the end of file they
                  do not throw an exception.  This is appropriate because (1) you usually want to
                  read a file in its entirety, so hitting EOF happens all the time.  Additionally, (2)
                  when you hit EOF you usually want to break out of the loop you are in
                  and continue immediately into the next block of code.
               </p>
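               <p>
                  The usual stream reading idiom shows how this looks in practice.  In the sketch 
                  below, <tt>read_all_integers</tt> is a hypothetical helper.  Reaching the end of 
                  the input simply makes the extraction fail, which ends the loop, and execution 
                  continues with the next statement.
               </p>
<pre><![CDATA[
```cpp
#include <istream>
#include <sstream>
#include <vector>

// Reads whitespace separated integers until the stream stops producing them.
// Reaching end of file simply ends the loop.  No exception is involved.
inline std::vector<int> read_all_integers(std::istream& in)
{
    std::vector<int> values;
    int value;
    while (in >> value)
        values.push_back(value);
    return values;
}
```
]]></pre>
               <p>
                  Any <tt>std::istream</tt> can be passed in, for example a 
                  <tt>std::istringstream</tt> or a <tt>std::ifstream</tt>.
               </p>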
               <p>
                  Usually when someone tells me they don't like exceptions they give reasons like "they make 
                  me put try/catch blocks all over the place and it makes the code hard to read."  Or "it makes
                  it hard to understand the flow of a program with exceptions in it."   Invariably they
                  have been working with bodies of software that disregard the above rules regarding questions
                  1 and 2.  Indeed, when exceptions are used for flow control the results are horrifying.  Using
                  exceptions for events that occur in the normal use of a library component, especially when
                  the events need to be dealt with near where they happen, results in a spaghetti-like mess
                  of throw statements and try/catch blocks.  Clearly, exceptions should be used judiciously.  
                  So please, take my advice regarding questions 1 and 2 to heart. 
               </p>
               <p>
                  Now let's go back to my claim that exceptions are an important part of making
                  a library that is hard to use wrong.  But first let's be honest about one thing:  
                  many developers don't think very hard about error handling and they similarly aren't very
                  careful about checking function return codes.  Moreover, even the most studious of
                  us can easily forget to check error codes.  It is also easy to forget to add 
                  appropriate exception catch blocks.
               </p>
               <p>
                  So what is so great about exceptions then?  Well, let's imagine some error just occurred
                  and it caused an exception to be thrown.   If you forgot to set up catch blocks to deal with
                  the error then your program will be aborted.  Not exactly a great thing.  But you will, however,
                  be able to easily find out what exception was thrown.  Additionally, exceptions typically contain an error
                  message telling you all about the error that caused the exception to be thrown.  Moreover, 
                  any debugger worth its
                  salt will be able to show you a stack trace that lets you see exactly where the exception came from.
                   The exception <i>forces</i> you, the user, to 
                  be aware of this potential error and to add a catch block to deal with it. 
                  This is where the "hard to use wrong" comes from. 
               </p>
               <p>
                  Now let's imagine that we are using return codes to communicate errors to the user and the 
                  same error occurs.  If you forgot to do all your return code checking then you will
                  simply be unaware of the error.  Maybe your program will crash right away.  But more likely, it
                  will continue to run for a while before crashing at some random place far away from the source
                  of the error.  You and your debugger now get to spend a few hours of quality time 
                  together trying to figure out what went wrong.  
               </p>
               <p>
                  The above considerations are why I maintain that exceptions, used properly, contribute to 
                  the "hard to use wrong" factor of a library.  There are however other reasons to use exceptions.
                  They free the user from needing to clutter up code with lots of return code checking.  This makes
                  code easier to read and lets you focus more on the algorithm you are trying to implement and less
                  on the bookkeeping.  
               </p>
               <p>
                  Finally, it is important to note that there is a place for return codes.  When you answer "no"
                  to questions 1 and 2 I suggest using exceptions.  However, if you answer "yes" to even one
                  of them then I would recommend pretty much anything other than throwing an exception.  In this
                  case error codes are often an excellent idea.
               </p>
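               <p>
                  As a sketch of a case where a return code fits, consider looking up a key that 
                  may or may not be present.  A miss is a normal event and the caller wants to 
                  react to it right at the call site, so the answer to both questions is "yes".  
                  The names <tt>try_lookup</tt> and <tt>count_known_words</tt> are hypothetical.
               </p>
<pre><![CDATA[
```cpp
#include <cstddef>
#include <map>
#include <string>
#include <vector>

// A missing key is expected in normal use and is handled on the spot,
// so it is reported through the return value rather than an exception.
inline bool try_lookup(const std::map<std::string,int>& table,
                       const std::string& key,
                       int& value)
{
    std::map<std::string,int>::const_iterator it = table.find(key);
    if (it == table.end())
        return false;
    value = it->second;
    return true;
}

// The caller deals with a miss right where it happens, with no try/catch blocks.
inline int count_known_words(const std::map<std::string,int>& table,
                             const std::vector<std::string>& words)
{
    int known = 0;
    for (std::size_t i = 0; i < words.size(); ++i)
    {
        int unused;
        if (try_lookup(table, words[i], unused))
            ++known;
    }
    return known;
}
```
]]></pre>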


               <p>
                  As an aside, it is also important that your exception classes inherit from 
                  <a href="other.html#error">dlib::error</a> to maintain consistency with the rest of the library.
               </p>
               </ul>
            </li>


        <!--   ****************************  -->
            <anchor>9</anchor>
            <li> <h3>Write portable code</h3>
               <ul>
                  <li> <b>Don't make assumptions about how objects are laid out in memory. </b>
                     <ul> <p>
                         If you have been following the prohibition against messing around with
                         pointers then this won't even be an issue for you.  Moreover, just about the only
                         time this should even come up is when you are casting blocks of 
                         memory into other types or dumping the contents of memory to an I/O channel.
                         All of these things are highly non-portable so don't do them.
                        </p>
                        <p>
                           If you want a portable way to write the state of an object to an
                           I/O channel then I recommend you use the <a href="other.html#serialize">serialization</a>
                           capability in dlib.  If that doesn't suit your needs then do 
                           something else, but whatever you do don't just dump the contents of memory.  
                           Convert your data into some portable format and then output that.
                        </p>
                        <p>
                           As an example of something else you might do, suppose you have a bunch of integers 
                           you want to write to disk.  Assuming all your integers are positive numbers representable 
                           using 32 or fewer bits you could store all your numbers in 
                           <a href="other.html#uint32">dlib::uint32</a> variables and then convert them
                           into either big or little endian byte order and then write them to an output stream.  
                           You could do this using code similar to the following:

                           <pre>
   dlib::<a href="other.html#byte_orderer">byte_orderer</a>::kernel_1a bo;
   ...
   bo.host_to_big(my_uint);
   my_out_stream.write((char*)&amp;my_uint, sizeof(my_uint));
   ...
                           </pre>

                           <p>
                           There are three important things to understand about this process.  First, you need
                           to pick variables that always have the same size on all platforms.  This means you
                           can't use <i>any</i> of the built-in C++ types like int, float, double, long, etc.  All 
                           of these types have different sizes depending on your platform and even compiler settings. 
                           So you need to use something like dlib::uint32 to obtain a type of a known size.
                           </p>
                           <p>
                           Second, you need to convert each thing you write out into either big or little endian byte order.  
                           The reason for this is, again, portability.  If you don't explicitly convert to one
                           of these byte orders then you end up writing data out using whatever the byte order
                           is on your current machine.  If you do this then only machines that have the same
                           byte order as yours will be able to read in your data.  If you use the dlib::byte_orderer
                           object this is easy.  It is very type safe.  In fact, you should have a hard time even getting
                           it to compile if you use it wrong.
                           </p>
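                           <p>
                           The first two points can also be sketched without any library support, as shown
                           below.  This is not how dlib::byte_orderer is implemented.  It is a hand-written 
                           alternative that builds the big endian byte sequence with shifts, and it assumes
                           std::uint32_t from the standard cstdint header as the fixed size type in place of
                           dlib::uint32.  The function names are hypothetical.
                           </p>
                           <pre><![CDATA[
```cpp
#include <cstdint>
#include <istream>
#include <ostream>
#include <sstream>

// Write a 32-bit unsigned value as four bytes, most significant byte first.
// The shifts operate on the value rather than on its memory layout, so the
// output is the same on big and little endian hosts.
void write_uint32_big_endian(std::ostream& out, std::uint32_t value)
{
    for (int shift = 24; shift >= 0; shift -= 8)
        out.put(static_cast<char>((value >> shift) & 0xFF));
}

// Read back a value written by write_uint32_big_endian().  Returns false if
// fewer than four bytes were available.
bool read_uint32_big_endian(std::istream& in, std::uint32_t& value)
{
    std::uint32_t result = 0;
    for (int i = 0; i < 4; ++i)
    {
        const int byte = in.get();
        if (!in)
            return false;
        result = (result << 8) | static_cast<std::uint32_t>(byte & 0xFF);
    }
    value = result;
    return true;
}

// Example use: round trip a value through an in-memory stream.
bool round_trip(std::uint32_t value)
{
    std::stringstream buffer;
    write_uint32_big_endian(buffer, value);
    std::uint32_t restored = 0;
    return read_uint32_big_endian(buffer, restored) && restored == value;
}
```
]]></pre>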
                           <p>
                           The third thing you should understand is that you need to write out each of your
                           variables one at a time.  You can't write out an entire struct in a  
                           single ostream.write() statement because the compiler is allowed to put any
                           kind of padding it feels like between the fields in a struct.  
                           </p>
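                           <p>
                           As a sketch of writing fields one at a time, consider the hypothetical record
                           type below.  The struct and function names are made up for this illustration,
                           and std::uint8_t and std::uint32_t from the standard cstdint header are assumed 
                           as the fixed size types.  The output is always five bytes, whatever
                           sizeof(record) happens to be on your compiler.
                           </p>
                           <pre><![CDATA[
```cpp
#include <cstdint>
#include <ostream>
#include <sstream>
#include <string>

// Hypothetical record type used only for this illustration.
struct record
{
    std::uint8_t  tag;    // 1 byte of data
    std::uint32_t value;  // 4 bytes of data
};
// sizeof(record) is at least 5, but compilers commonly insert padding after
// `tag`, so it is often 8.  The exact figure is not portable.

// Write each field separately so the output never contains padding bytes.
// The multi-byte field is emitted most significant byte first.
void write_record(std::ostream& out, const record& r)
{
    out.put(static_cast<char>(r.tag));
    for (int shift = 24; shift >= 0; shift -= 8)
        out.put(static_cast<char>((r.value >> shift) & 0xFF));
}

// Returns the serialized bytes of a record as a string.
std::string serialize_record(const record& r)
{
    std::ostringstream out;
    write_record(out, r);
    return out.str();
}
```
]]></pre>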
                           <p>
                           You may be aware that compilers usually provide #pragma directives that allow you 
                           to explicitly control this padding.  However, if you want to submit code to dlib 
                           you will not use this feature.  Not all compilers support it in the same way and, 
                           more importantly, not all CPU architectures are even capable of running code that 
                           has had the padding messed with.  This is because it can result in the CPU attempting
                           to perform what is called an "unaligned load" which many CPUs (like the SPARC) are
                           incapable of doing.
                           </p>
                           <p>
                              So in summary, convert your data into a known type with a fixed size, then convert
                              into a specific byte order (like big endian), then write out each variable individually.
                              Or you could just use <a href="other.html#serialize">serialize</a> and not worry about all
                              this horrible stuff. :)
                           </p>
                           
                        </p>
                     </ul>
                  </li>

                  <li> <b> All code that calls functions that aren't in dlib or the C++
                     standard library must be isolated inside the API wrappers.</b>
                     <ul><p>
                        If you want to contribute code to dlib which needs to use something that isn't 
                        in the C++ standard then we need to introduce a new library component
                        in the <a href="api.html">API wrappers</a> section.  The new component would
                        provide whatever functionality you need.  This new component would have
                        to provide at least POSIX and win32 implementations.  
                     </p>
                     <p>
                        It is also worth pointing out that <i>simple</i> wrappers around operating system 
                        specific calls are usually a bad solution.  This is because there are
                        invariably subtle, if not huge, differences between what is available on different 
                        operating systems.
                        So being truly portable takes a lot of work.  It involves reading everything
                        you can find about all the APIs needed to implement the feature on each target platform.
                        In many cases there will be important details that are undocumented and you will
                        only be able to find out about them by searching the internet for other developers
                        complaining about bugs in API functions X, Y, and Z.  All this stuff needs to be abstracted
                        away to put a portable and simple interface in front of it.  So this is a task 
                        that shouldn't be taken lightly.
                     </p>
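                     <p>
                        Purely to show the shape of the isolation, here is a deliberately trivial sketch of
                        one portable function with a win32 and a POSIX implementation behind it.  The function
                        name is hypothetical and this is not an existing dlib component.  It is also exactly 
                        the kind of simple wrapper that rarely survives contact with real platform differences,
                        so treat it as an outline of where the platform specific code lives and not as a 
                        model of a finished component.
                     </p>
                     <pre><![CDATA[
```cpp
// All platform specific includes and calls stay inside the wrapper.
#ifdef _WIN32
    #include <windows.h>
#else
    #include <unistd.h>
#endif

// Hypothetical portable interface: returns an identifier for the calling
// process.  Code outside the wrapper calls only this function.
unsigned long current_process_id()
{
#ifdef _WIN32
    return static_cast<unsigned long>(GetCurrentProcessId());
#else
    return static_cast<unsigned long>(getpid());
#endif
}
```
]]></pre>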
                     </ul>
                  </li>
               </ul></li>


        <!--   ****************************  -->
            <anchor>10</anchor>
            <li> <h3>Library components should have regression tests</h3>
               <ul>
                  <p>
                     dlib has a <a href="other.html#dlib_testing_suite">regression test suite</a> located in 
                     the dlib/test folder.  Whenever possible, library components should have tests
                     associated with them.  GUI components get a pass since it isn't very easy to set up
                     automatic tests for them but pretty much everything else should have some sort
                     of test.
                  </p>
               </ul>
            </li>

        <!--   ****************************  -->
            <anchor>11</anchor>
            <li> <h3>You must use the Boost Software License</h3>
               <ul>
                  <p>
                     Having the library use more than one open source license is confusing
                     so I ask that any code contributions be licensed under the Boost Software
                     License.
                  </p>
               </ul>
            </li>


         </ul>


        <!--   ****************************  -->




    
    </body>



    <!-- ************************************************************************* -->

</doc>