SCOUG-Programming Mailing List Archives
I've had several challenges to the usefulness of automatically
generating test data, from Ben Archer, Steven Levine, and Greg
Smith.
Greg Smith deals with closed systems with feedback related
to a process known as control system synthesis. Ben Archer
works with the creation of algorithms that translate raw,
seemingly random data into a prescribed, meaningful order. I
don't know that Steven and I disagree.
None, however, challenged the ability to automatically
generate test data to enumerate all possibilities. No one
seemed to mention we could enumerate more than we can
possibly read. As humans we have physical limits and time
constraints. We do not enumerate all possible test cases for
a program. We could not read them all even if we did. Instead we
rely on a system of beta testing and associated beta testers,
knowing full well that regardless of their number they will not
have generated all possible test cases nor evaluated all
possible results.
Steven talks about "unit" testing. Here "unit" normally means
a complete program. We call it "unit" because it is part of an
application system of multiple programs. When we go to test
the application system we refer to it as "integration" testing.
Then when we go to test the application system with other
application systems we refer to this as "system" testing.
All I'm suggesting is that the basic "unit" tested here, the
program, is not the smallest unit you could test. The
smallest unit acceptable to a compiler, which for C is an
external procedure, is smaller still. For HPCalc, for
example, we cannot perform a "unit" test until we have
completed five compiles, one for each of the .c source
modules, and linked them together into an executable module.
Yet the smallest executable "unit" in a program is a
statement. The next in size are the control structures and so
on until we have constructed an external procedure. Once we
have compiled and linked all the necessary object modules
into a single executable program, we are limited to using
its i-o interfaces for the input of test data and viewing
of output results. It is this input of test data, the source for
regression test data, that we admit is never complete enough
for exhaustive testing. Nor is it complete enough even when
it involves extremely large numbers of beta testers with their
own sets of regression test data.
You see, if you look at the way we currently do things with
compiled output, our experience says that something, in this
instance exhaustive true/false testing, cannot happen. I
point out that, using a predicate-logic form of logic
programming as demonstrated with Trilogy, it can in fact
occur. Doing it at the program-as-unit level, however, means
that though the software may produce the test data, we cannot
complete the evaluation of the "true" results.
The answer, of course, lies in decomposition of the problem,
here as it does elsewhere: you break it into units whose
results you can handle. You begin at the statement level,
move up to first-level control structures, then second, and
so on until you have reached the highest-level control
structure, the program. In short, you test from the inside
out.
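To make that concrete, here is a minimal sketch in C. It is
my illustration only, not the tool discussed in these posts:
a single decision structure serves as the unit under test,
all 256 of its possible inputs are enumerated, and every
result gets checked. At this scale exhaustive generation and
exhaustive evaluation are both trivial.

    #include <stdio.h>

    /* the unit under test: one decision structure */
    static int is_ascii_digit(unsigned char ch)
    {
        return ch >= '0' && ch <= '9';
    }

    int main(void)
    {
        int failures = 0;

        /* enumerate all 256 possible inputs -- exhaustive, not sampled */
        for (int ch = 0; ch <= 255; ch++) {
            int expected = (ch >= 48 && ch <= 57); /* the rule, stated independently */
            int actual   = is_ascii_digit((unsigned char)ch);

            if (expected != actual) {
                printf("input %3d: expected %d, got %d\n", ch, expected, actual);
                failures++;
            }
        }
        printf("%d of 256 cases failed\n", failures);
        return failures != 0;
    }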
You need a means of doing this based on the internal
structure(s) of the program. The program itself is a control
structure, i.e. it follows the one-in/one-out IPO
(input-process-output) model. It does
this regardless of the number of different i-o data accesses.
The fact that they are fragmented (or segmented) on either
input or output by transaction (a set of input data values)
does not change the fact that for one set of aggregate data
input one and only one set of aggregate data output exists.
So an overall IPO model of a program decomposes into
multiple levels of internal IPO models until eventually each IPO
results in a single HLL statement which in turn decomposes
into an IPO of machine instructions. That is, "logical
equivalency" occurs throughout: the whole at any level is
equal to the sum of its parts and is equal to or greater than
any one of them. I'm sure you've heard that before. The
"equal to or greater than" difference occurs with software to
account for recursion and co-routines.
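As an illustration only (the procedure and its names are
mine, not drawn from any of the systems mentioned), here is a
small C fragment annotated to show that nesting: the
procedure is one IPO with one entry and one exit, each
control structure within it is an IPO, and each statement
within those is an IPO in its own right.

    #include <stdio.h>

    /* level 0: the procedure -- one aggregate input, one aggregate output */
    static long sum_positive(const long *v, int n)
    {
        long total = 0;                    /* level 1: a single statement  */

        for (int i = 0; i < n; i++) {      /* level 1: iteration structure */
            if (v[i] > 0)                  /* level 2: decision structure  */
                total += v[i];             /* level 3: a single statement  */
        }

        return total;                      /* level 1: a single statement  */
    }

    int main(void)
    {
        long sample[] = { 3, -1, 4, -1, 5 };
        printf("%ld\n", sum_positive(sample, 5));   /* prints 12 */
        return 0;
    }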
That's what makes an earlier post on classifying instructions
as "n", "c", and "u" relevant: "n" for followed by Next
Sequential Instruction (NSI) only, "c" for followed by NSI or
not-NSI, and "u" for followed by not-NSI only. It gives us a
first step for determining program structure, for converting
the one-dimensional, linear addressing scheme of memory into
a two-dimensional (and higher) hierarchy of segments
representing the possible paths.
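Here is a sketch of that first step in C, using an invented
three-field instruction record rather than any real
instruction set: each instruction carries its "n", "c", or
"u" classification, and every "c" or "u" closes off a segment
of the linear listing. A full version would also begin a new
segment at every branch target, which this sketch omits.

    #include <stdio.h>

    enum kind { N, C, U };   /* NSI only, NSI or not-NSI, not-NSI only */

    struct insn {
        const char *text;    /* mnemonic, for display only */
        enum kind   kind;
    };

    int main(void)
    {
        /* a toy listing with one conditional and one unconditional branch */
        struct insn code[] = {
            { "load  a",     N },
            { "cmp   a, 10", N },
            { "blt   L1",    C },   /* "c": NSI or not-NSI */
            { "store b",     N },
            { "jmp   L2",    U },   /* "u": not-NSI only   */
            { "store c",     N },   /* L1 */
            { "add   a, 1",  N },   /* L2 */
        };
        int n = (int)(sizeof code / sizeof code[0]);

        /* every "c" or "u" ends a segment of the one-dimensional listing */
        int segment = 1;
        for (int i = 0; i < n; i++) {
            printf("seg %d  %-12s %c\n", segment, code[i].text,
                   "ncu"[code[i].kind]);
            if (code[i].kind != N)
                segment++;
        }
        return 0;
    }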
It works for instructions. It works for statements. We can,
regardless of the spaghetti nature of the source or
executable code, translate it to a hierarchy of IPO (control
structure) units. We can represent this as indented logical
levels, as we might on a manufacturing bill of material, or
we can abstract it as a hierarchical graph of nodes and
(control) connections.
However we represent it, we now have clearly defined the
paths in the program. If we have clearly defined them, then
obviously there are not an infinite number of them as some
programmers are wont to claim. Nor are there an infinite
number of test cases required to exhaustively test the paths.
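One way to see that, sketched in C under my own simplifying
assumption that an iteration counts as a single unit whose
body is either entered or skipped: once the program is a
hierarchy of one-in/one-out structures, the number of paths
is a finite figure computable from the hierarchy itself.
Sequences multiply the counts of their parts; decisions add
them.

    #include <stdio.h>

    enum node_kind { STMT, SEQ, DECISION, LOOP };

    struct node {
        enum node_kind kind;
        const struct node *a, *b;   /* children, where applicable */
    };

    static unsigned long paths(const struct node *n)
    {
        switch (n->kind) {
        case STMT:     return 1;
        case SEQ:      return paths(n->a) * paths(n->b); /* one after the other   */
        case DECISION: return paths(n->a) + paths(n->b); /* either branch         */
        case LOOP:     return paths(n->a) + 1;           /* body taken or skipped */
        }
        return 0;
    }

    int main(void)
    {
        /* if (..) {s1} else {s2};  while (..) {s3};  s4 */
        struct node s1 = { STMT }, s2 = { STMT }, s3 = { STMT }, s4 = { STMT };
        struct node dec  = { DECISION, &s1, &s2 };
        struct node loop = { LOOP, &s3, 0 };
        struct node seq1 = { SEQ, &loop, &s4 };
        struct node prog = { SEQ, &dec, &seq1 };

        printf("paths through the example: %lu\n", paths(&prog)); /* 2 * 2 * 1 = 4 */
        return 0;
    }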
Should all opposition to the views expressed here suddenly
disappear, the "ach" phenomenon occurring simultaneously, and
should we agree that we have found the single language and
the single tool, both written in that language, the next step
lies in translating all existing applications from whatever
language they use to this, the lingua franca, our Esperanto.
I would simply suggest that we have always had a lingua
franca for any hardware platform: its instruction set. We use
that instruction set as the logically equivalent, ultimate
translation of the source. Thus we have only one translation,
from the lingua franca of the executable machine code into
the lingua franca of our source language.
In short, we don't have to write translators for every source
language, particularly for programs whose actual source may
have been lost or disappeared somehow. We only have to
write one (interactive) translator of the executable source,
i.e. machine code, as an incremental option to our single tool.
It doesn't make any difference how many different machine
lingua francas we have to translate. We know that each
machine instruction is an "n", a "c", or a "u". If we can
algorithmically translate the executable code into a
hierarchy of the five basic control structures (sequence,
decision, case, and iteration as do while and do until), we
can generate its logically equivalent source code in our
symbolic source lingua franca: our HLL.
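Purely as a sketch of the direction, and under my own
simplifying assumption that branch direction alone settles
the matter (real executables demand far more analysis than
this), the classified listing already hints at which
structure each branch belongs to: a conditional branch
forward reads as a decision, a conditional branch backward as
a do until with its test at the bottom, and an unconditional
branch backward as the closing jump of a do while whose test
sits at the top.

    #include <stdio.h>

    struct insn {
        int  addr;     /* position in the linear listing     */
        char kind;     /* 'n', 'c', or 'u'                   */
        int  target;   /* branch target for 'c'/'u', else -1 */
    };

    int main(void)
    {
        /* an invented, already-classified listing -- illustrative only */
        struct insn code[] = {
            { 0, 'n', -1 },
            { 1, 'c',  4 },   /* forward conditional    */
            { 2, 'n', -1 },
            { 3, 'n', -1 },
            { 4, 'n', -1 },
            { 5, 'c',  2 },   /* backward conditional   */
            { 6, 'u',  0 },   /* backward unconditional */
        };
        int n = (int)(sizeof code / sizeof code[0]);

        for (int i = 0; i < n; i++) {
            if (code[i].kind == 'c' && code[i].target > code[i].addr)
                printf("insn %d: decision, paths rejoin at %d\n",
                       code[i].addr, code[i].target);
            else if (code[i].kind == 'c' && code[i].target < code[i].addr)
                printf("insn %d: do until, body begins at %d\n",
                       code[i].addr, code[i].target);
            else if (code[i].kind == 'u' && code[i].target < code[i].addr)
                printf("insn %d: closes a do while begun at %d\n",
                       code[i].addr, code[i].target);
        }
        return 0;
    }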
However long it takes you to get to this point, believe me it's
far less than it took me. I will continue to provide a
systemic look at programming.
I will also continue to argue in favor of an internally
integrated toolset over an externally (manually) integrated,
normally non-integrated set of tools. Their
enumerated possibilities far exceed our needs for a functional
(and functioning) subset. To keep them deliberately separate
both physically and functionally contributes only to a loss in
productivity. I think the applicable maxim here is, "Work
smarter, not harder."