The Craft Of Coding

CPP 29-Feb Designing Programs

Posted by Pete Sun, 28 Feb 2016 21:41:00 GMT

For a small enough task, it is possible to start writing code immediately. This is not necessarily the best plan, but it can be feasible. For larger tasks, where the program is going to be more complex, jumping straight into coding is going to lead to problems.

Understanding the problem

Crazy as it may seem, no matter how clear a problem statement may look on first reading, there is going to be some complexity that is not immediately obvious. So it always pays to spend some time Exploring the Requirements, initially by just asking questions, but more usefully by generating examples that the program should and should not be able to handle.

Consider the case of a simple command line calculator, that takes the input calculation, builds a bracketed expression and then prints the result…

Calc>1+2
Expression : (1+2)=3
Calc>1+(2*3)
Expression : (1+(2*3))=7
Calc>1+2*3
Expression : ((1+2)*3)=9
Calc>

  1. What do we think we understand about the requirements?
  2. What is unclear/under-specified?
  3. What restrictions do we want to put on the program to make it easier to implement?

Using Examples to Explore Requirements

Once you have an idea about what the program is supposed to do, you can test that idea by working through an example. If you think that operator precedence needs to be taken into account, you can create examples where this would matter and work through these examples with the people requesting the program. In the session above, 1+2*3 gives 9 because it is evaluated strictly left to right; with conventional precedence it would give 7.

Note. You will need to write down each of these examples with expected output for use as test cases that you can use to test the program (after it is written) to validate whether it implements the requirements correctly. To do this successfully you will need examples of inputs that the program is not expected to be able to process.

Design is Iterative and Incremental

Whenever you have a design idea, you can use the examples you have collected to check if the proposed mechanism will work for the examples. In most cases this will cause you to refine or rework your design ideas. The benefit of revising the design at this stage is that it is much simpler to revise your design before you have committed any time to writing code, since once code has been written you will find yourself very reluctant to delete that code and start again on a new idea.

Conceptually, most programs are very similar. They have three stages that are repeated indefinitely:

  1. Input – collecting the data from users or devices using some mechanism
  2. Processing – manipulating the inputs in some way, often in combination with stored data
  3. Output – presenting the results of the processing to users (or sending them off to another device)

Historically there have been two ways of designing programs, Top Down and Bottom Up. Both have merits, and these days most people use a mixture of the two.

Top Down Design

From your understanding of the problem, write down three to seven high level steps that could be used to make the program do what it is supposed to do. For each of these high level steps:

  1. Identify the expected inputs to this step.
  2. Identify the information that this step needs to remember.
  3. Specify the outputs from this step.

An obvious check at this stage is to make sure that the inputs to a downstream step are produced by a prior step in the sequence.

Once the consistency of these steps has been confirmed, then the next part of the process is stepwise refinement, where each high level step is further decomposed into three to seven sub-steps, where we can repeat the process of identifying the inputs, stored data and outputs.

The theory behind this is that after several passes of this stepwise refinement, each sub-sub-sub-step is well enough defined for it to be easy to implement in code. All these sub parts should then naturally fit together and as a whole work to deliver the desired functionality.

Bottom Up Design

Starting with low level components that are known to work (or you know how to build), the program is incrementally built up from a very small beginning. In this approach, the complete program always works, even if the implementation is incomplete and only some inputs can be handled correctly.

The program can be continually tested with a wider variety of inputs as the functionality is built up.

The theory behind this is that the developers can make progress on the parts of the problem that they know how to build while, in the background, they think about ways of implementing the more complex, less understood parts. All parts always work together because the program can always be tested to ensure that it works as a whole.

Design is Hard

Although, as simplified prescriptions, both Top Down and Bottom Up design look workable, in practice neither works all that well by itself. Top Down is good for getting the big picture, but Bottom Up experience is always needed or you can end up with a box labelled "the magic happens here" that nobody knows how to implement. Knowing which well tested components are available is also key to success with Top Down design, so that the design steps can be steered to the needs of the existing components. After all, if you need a way to distribute multiple documents rapidly, then email, or a web server coupled with an instant message containing a URL, is a known solution, and using it beats trying to design all the low level steps.

Tasks for the week

  1. Document your understanding of the requirements for the command line calculator.
  2. Document any questions you have about the requirements.
  3. Document your example inputs for the calculator, together with your expected outputs. These need to be in text files with no extraneous characters, so that eventually you can run your program with the input file and then compare the result to the expected output file.

    calc < inputs.txt > outputs.txt

inputs.txt

1+2
1/2

expected.txt

Calc>Expression : (1+2)=3
Calc>Expression : (1/2)=0.5
Calc>

© Pete McBreen 2016 Things are not as they seem. They are what they are. — Terry Pratchett (Thief of Time)

CPP 22-Feb Compiling, Linking, Testing and Debugging

Posted by Pete Sun, 21 Feb 2016 03:40:00 GMT

We are now far enough along in learning C++ for you to have realized that programs can be compelling and infuriating at the same time.

The main problem with software development is how we individually deal with errors and mistakes. Part of the problem is cultural, in that errors are thought of as a bad thing, but in programming, errors are a normal, expected part of the development process. So the challenge is: how can we learn from mistakes?

  1. Finding the learning opportunity in each mistake requires you to think about what led up to the mistake.
  2. Making small mistakes helps.
  3. Only making one mistake at a time makes it easier to locate the mistake.

Compiling

When writing C++ code, your first chance to check that you have not made a syntactic mistake is when you try to compile the code. So a useful strategy is to only type a few lines before trying to compile the program.

Useful habits when writing code

  1. Whenever writing any construct, make sure you terminate it correctly and then go back and fill in the body of the construct. So for class methods, always type the () { } after the method name before doing anything else.
  2. Only write a few lines of code before trying to compile it.
  3. When defining a class, after adding a method signature to the class header file (.h), immediately define that method in the class source file (.cpp)
  4. Whenever you add a variable to a class, make sure you add it to all the constructors (and the destructor if necessary).

Linking

Currently the linking stage is not that visible or obvious in the programs we are creating. When you compile, each .cpp file is converted into an object file (see the obj directory in your project folder). The individual files have a .o extension when using CodeBlocks; other systems use the .obj extension. After all the object files have been created, the linking stage happens: starting with the file that contains the main function, all the references are resolved to the various function definitions in the other object files (and potentially in any other libraries or system libraries that you are using).

You might have noticed that all our programs are much bigger than the sum of the sizes of the object files. This is because the C++ runtime system is included in the executable file.

Mistakes at the linking stage are typically just problems with file paths, or of not having the expected library files in the expected locations.

Testing

Whole books have been written on testing, but the entire body of testing can be reduced to a simple sequence:

  1. Know what the initial state of the program is intended to be.
  2. Specify simple inputs that make it easy for you to determine the expected outputs.
  3. Run the program with the planned, simple inputs.
  4. Compare the expected outputs against the actual results.

Useful habits when testing

  • Use simple inputs
  • Start testing early as you are adding functionality to the program
  • Test your program every time it successfully compiles and links
  • Test with correct input and inputs that should lead to errors
  • Adopt the mindset that the program is wrong, and your inputs are going to show where it is wrong

Debugging

Debugging is the opposite of writing code, which is when you put the bugs into the code. The challenge when debugging lies in understanding how the program is producing actual results that differ from the expected results of your test cases.

  • Only make a change to the program when you understand what is going wrong.
  • Sometimes it is better to revert to an earlier version of a program and try to add the functionality again from that base rather than try to fix what went wrong (but this is normally only a useful strategy if you forgot the hints about compiling).

© 2016 Pete McBreen …it was amazing how intelligent people kept on making the same mistakes. — Terry Pratchett (Small Gods)

CPP 8th Feb Array Initialization and arrays of Pointers

Posted by Pete Sun, 07 Feb 2016 20:09:00 GMT

A partial list of applications and companies that use C++, other than the obvious use to filter out first year engineering students at the University of Calgary.

Array initialization

As discussed last week, it is not a good idea to declare one variable for each quiz:

    Quiz q1("q1");   // not a good idea
    Quiz q2("q2");
    Quiz q3("q3");
    Quiz q4("q4");

Instead we want to create an array of quizzes. To do this you need to use Array Initialization, since Quiz does not have a default constructor (one that takes no parameters).

    Quiz quizzes[4] = { Quiz("q1"), Quiz("q2"), Quiz("q3"), Quiz("q4") };

It is feasible to not specify the size of the array and have the compiler determine how long it needs to be, but your code will often need the size of the array when iterating over the quizzes, so it is often easier to just specify the length in the declaration.

    Quiz quizzes[] = { Quiz("q1"), Quiz("q2"), Quiz("q3"), Quiz("q4") };

Avoiding Magic Numbers

In the context of this code the number 4 is a magic number that signifies interesting things about the code, but just looking at a number 4 in the code you cannot determine why it is 4 and not 3 or 5. The fix for this is to use a constant to represent the magic number and have the name of the constant explain what the value is.

    const int NUM_QUIZZES = 4;
    Quiz quizzes[NUM_QUIZZES] = { Quiz("q1"), Quiz("q2"), Quiz("q3"), Quiz("q4") };

Your code can then use NUM_QUIZZES safely and if the number of quizzes changes, you can safely just change that constant and not need to make any changes elsewhere in the program (other than in the initialization of the array). Now the code can have the loop

    const int NUM_QUIZZES = 4;
    Quiz quizzes[NUM_QUIZZES] = { Quiz("q1"), Quiz("q2"), Quiz("q3"), Quiz("q4") };
    while (input >> qr ) {
        for (int i=0 ; i < NUM_QUIZZES ; i++ ) {
            quizzes[i].add(qr);
    // rest of code omitted ...

Using an array of pointers

The above mechanism works, but you must know at compile time the names of each of the quizzes and students. A better mechanism is to have a dynamic array of Quizzes to which you add each new Quiz as it is discovered.

To do this we allocate an array of pointers to Quiz objects. To help us remember this, we prefix the name of the variable with ap to denote that we have an Array of Pointers, so the variable name becomes apQuizzes, and with this convention we can remember that apQuizzes[1] is still a pointer.

Please note also that it is safest to initialize the array of pointers to 0 so that we KNOW that the pointers are all set to NULL. We should also allocate one more space than we intend to use, so that there is always a null sentinel that stops us iterating past the end of the array, since we all know by now that this will cause problems. It is now safe to massively over-allocate NUM_QUIZZES, since each space in the array of pointers only takes up 8 bytes (assuming a 64-bit machine).

    Quiz *apQuizzes[NUM_QUIZZES + 1] = {0}; // always allocate 1 more as sentinel at end

Inside the read loop, we need to check if the array position we are looking at is empty. If so, we add a new Quiz at that position; otherwise we can just try to add to the Quiz in the current position. Note. To make this code simpler, Quiz::add has been changed to return a bool that is true if the quiz name matched, and false otherwise.

    while (input >> qr ) {
        for (int i=0 ; i < NUM_QUIZZES ; i++ ) {
            if (!apQuizzes[i]) {  // this position blank, so add this quiz name and add result
                apQuizzes[i] = new Quiz(qr.quiz());
                apQuizzes[i]->add(qr);
                break; // break out of the for loop
            }
            if (apQuizzes[i]->add(qr)) break; // found quiz, so done for this input
        }
    }

Printing the results is easy if we remember we have a sentinel at the end of the array. We can just print every position of the array until we find a null pointer that denotes that we are at the end of the data. Because the empty parts of the array hold 0 (a null pointer), and a null pointer is treated as false in a condition in C++, we can just put the pointer into the test part of a while loop.

    int i = 0;
    while(apQuizzes[i]) {
        cout << *(apQuizzes[i++]); // do not forget to increment i
    }

An alternate implementation of the above loop that does assignment inside the while test – hence the extra () – puts the increment in the while test as well to make it easier to see visually that the increment is happening. It also simplifies the print at the expense of declaring an extra variable pQuiz.

    int i = 0;
    Quiz* pQuiz;
    while( (pQuiz = apQuizzes[i++]) ) { // increment i after assign
        cout << *pQuiz; 
    }

In terms of C++ code, both are equivalent functionally, but aesthetically some programmers/instructors will prefer one over the other.

© 2016 Pete McBreen Education had been easy. Learning things had been harder. —Terry Pratchett (Hogfather)