![]() This note introduces what we call the recognizor semantics for First Order languages. This is an essential difference between the usual way logicians Although it is of interest to logicians, but I hope is useful to everyone. It uses [sexpression]s to mention [wff]s. Recognizer Semantics for First Order Sentencesor A finitist Interpretation of First Order NotionsThis paper describes how reasoning can be carried out using only finite structures. It reduces reasoning in full classical logic to operations that at first glance are clearly expressable in PRA but is a formalization which 1) imagines that there are other things in the world besides natural numbers and that we can form mental models (or as we sometimes say 'mental images' even though we do not literally mean pictures) 2) that, unlike functions, we introduce programs that compute on these 'mental images', always terminate (or can be interupted) but, if interupted, could be restarted if desirde and 3) that reasoning is carried out in the real world by an agent who can only interact with its world through it's sensors and effectors. We are interested in building a reasoner, not simply discovering a description of how conclusions can be derived from 'facts'. We start by observaing that any understanding we have of our world is entirely in our head. This simple observation has profound consquences with respect to how we can understand the world and what structures can be in our head to realize this understanding.
1) heads are finite in extent. This means that however we imagine building
somethng that can think, it must be built out of a finite number of finite
things. This means that the logicians notion of theory (which, for example,
are deductively closed and thus 'contain' infinitely many things) cannot be
used as a blueprint for building a reasoner. Exactly how to replace theories
is the a major part of this work.
2) we have 'sensors' (like ears) that give us access to information about
the world but what we can know of the world is limited to our interpretations
of these sensors together with whatever understanding machinery we have for
further interpreting this data and what machinery we have to remember our
conclusions. A successful blueprint will have to explain how this sensory
data becomes cognative data.
3) we have 'effectors' (like speech) that allow us to express our ideas and
ask questions about our world and possibly other wany of effecting our
enviroment, but the only way we have of telling if an effector has actually
done anything is through our sensors.
Taken together 2) and 3) allow us to improve our understanding of the world
by talking to people and reading, i.e., through conversation. We believe
that our 'reasoner' must be interactive in this way and that the quality of
this conversation must include the possibility of reading this article and
participating in any discussions about it that might arise.
This view may fall prey to the charge of being 'solipsistic' but we think of
ourselves as realists. The reason we can have meaningful conversations about
the world with other people is that there are other people with whom we share
a long evolutionary history and evolution has 'built' us in such a way that
our models of the world are sufficiently accurate to provide enough survival
value to live relatively long and productive lives.
This view of reasoning has evolved during our attempts to develop computer
systems that can actually carry out general reasoning. We believe this view
provides a finitistically acceptable description of full first order reasoning
by changing how we think about the activity of theorizing. We justify this
view below. In addition, we introduce a new way of looking at compution by
replacing the notion of function (including recursion theory and PRA) with a
theory of restartable computions that has a direct finitist
interpretation.
To summarize we intend to: 1) replace set theory with IBML types, which are
finite structures with an information theoretic semantics that we use to
represent 'mental models' of things; 2) replace functions and recursion
theory with restartable computations, whih are finite structures that allow
us to 'compute' over types; and 3) replace theories with FOL contexts - finite
structures use to reason about these 'mental models' of things using a form
of classical reasoning.
Tait writes[2005] "The crux ofunderstanding Hilbert's conception of finitist
mathematics is the question
In what sense can we prove propositions, such as
(FORALL (x y) (= (+ x y) (+ y x)))
about the natural numbers, without assuming the infinitude of
numbers or some other infinite totality.
... Moreover the "finite" in "finitism means precisely that all reference
to infinite totalities should be rejected."
Tait goes on to say that he presented and defended the point of view that
"finitist reasoning is essentially priMative recurcivr reasoning in the sense
of Skolem [1923]."
Like Tate we start with Skolem [1923] but suggest that there are questions
that need examination before 'proving' even enters the picture. We start
with asking what we mean by the sentence (of a theory we call 'Arithmetic').
(1) (forall (x y) (= (+ x y) (+ y x)))
Consistent with Skolem, we propose reading (1) as "if we can determine that
x and y are natural numbers then adding x to y and adding y to x will result
in the same natural number.
This of course begs lots of questions: what does it mean to determine if an
object is a natural number? what does + mean? what do we mean by (+ x y) and
(+ y x)? Only later can we ask: what do we mean by 'prove' (1).
[Aside: Skolem paper actually could be interpreted in even more radical ways.
To quote its translation in JVH [????] ie the discussion of = re substitution.
His remaarks as translated could suggest thinking of (1) as a meta statement.]
Skolem shows how to 'prove' (1) without assuming that there is a set of
integers. We start by rewriting (1) to have explicitly bounded quantifiers.
(2) (FORALL ((x in [natnum]) (y in [natnum])) (= (+ x y) (+ y x)))
since 'in' suggests set theory and we know there is at least one natural
number, (e.g., 0, we prefer to replace mention of the collection [natnum]
with a predicate symbol and write
(3) (FORALL (x y) (IMPLIES (AND (NATNUM? x) (NATNUM? y)) (= (+ x y) (+ y x)))
By instantiating (3) with x=II and y=IV, we get the sentence
(4) (IMPLIES (AND (NATNUM? II) (NATNUM? IV) (= (+ II IV) (+ IV II)))
where here we have used roman numberals to mention natural numbers.
Now if, in a model of 'arithmetic', the 'meaning' of II is 2, the 'meaning'
of IV is 4 and 2, 4, #'+, #'NATNUM?, and #'DATA-EQUAL were 'in a model', M,
then we can determine if (4) holds in M by seeing if the value of
the program
(5) (IMPLIES (AND (NATNUM? 2) (NATNUM? 4)) (DATA-EQUAL (+ 2 4) (+ 4 2)))
But (5) is just the computational version of 'is (4) satisifiable in M',
where natural numbers 2 and 4 are in M, and #'NATNUM? and #'- are the
interpretations of 'NATNUM?' and '+' respectively,
A closer look
-------------
We start by introducing a collection of finite data objects which we call
types where some of them are thought to be examples of other types.
This use of the word 'type' is more like use of 'datatype' in computer
science and not like the logicians notion in type in 'type theory' and
might be thought
of as an answer to the question 'What kind/type of thing is this?' The
semantics of types is information theoretic and we replace the use of sets
with types.E Among other things, this allows us to mention objects whose type
is not necessarily a set but might, for example, be collection of apples.
The natural numbers form a type, which we call [natnum].
For each type there is an associated program, which we call its recognizer.
If a recognizer for a type is given any allowable datum it will determine if
that datum is an example of that type. This program is itself a datum and
there is another program that will take any type as an argument and return
its recognizer as a value. Of course, there are recognizers for both types
and programs.
This allows us to build a system in which questions of the form
(3) (<thing> in <type>)
can be 'computed' by a program. That is, infrmally, (3) holds if
the value of the program recognizer-of(<type>) when applied
to <thing> is yes.
Again we need to be more careful.
1) For restartable computations applying a program to a list of arguments
results in a value of type [computation]. A computation is like a paused
running program. So
(4) (apply(program,arglist) in [computation])
We can ask a [computation] if it is done in which case we can ask it for
its value. If it is not done we must 'run' the computation some more,
This looks like
(5) run([computation],[resource]) ==> [computation]
The use of [resource] allows us to limit the amount of work 'run' can
do before or returns. If the resulting conputation od done then we
can extract its value otherwise we can run it some more by giving it
another [resource].
Unlike recursion theory, which defines functions, this view of computation
does not assume that running (even PR) programs always give us a value.
Instead it can simple say given the amoumt of [resource] you gave me I
cannot determine the value but here's where I got to if you want me to
try harder (by giving me more [resource]).
This has an impact on our understanding of logic. If we are going to
replace functions by restartable computations, then in our logic we
need to recognize that we might not be able to determine the value
of all terms, ie, if the meaning of '+' id not a function but rather
a restartable computation then even if x and y are recognized to be
integers, we may not be able to determine the value of x+y. This
extends to using the function 'satisfiable' in the metatheory to
determine if a 'sentence' holds in a model (more about what is a
model later). For this reason we need to introduce the notion of
"I don't know", which we abbreviate IDK.
IDK simply means that a value of a 'function' has not been determined
given some amount of effort. It imposes an information theoretic view
of determining the 'truth value' of a 'sentence', ie, a 'sentence' with
respect to a model is either 'true', 'false' or we can't tell (IDK).
It is tempting to think of IDK as a third truth value but htis would
be a mistake. Our semantice says that all 'sentences' are true ot false
but in some cases we cannot tell. Tomorrow might be different and we
might be able to tell that a sentence is false. What we know is that if
it ever is true then it can never later be false (although our ability
to determine this may no longer be possible - for examlpe we forgot
and can no longer remember how we determined it)
The computation of truth valuse follows the truth tables of Kleene
[ref ] but our meaning of uu is IDK as described above. The
interpretation of terms and relations (in the sense of Computer
science) is strict, ie, if any argument evaluates to IDK then the
value of the function or relation is IDK. Quantified sentences
wrt a model are always interpreted as IDK. This is weaker than
absolutely necessary but in practice is enough. This is the sense
in which we think of this as infomation theoretic.
Models. We replace the objects in the domain of a model with types.
This gives us a 'many sorted' logic with a natural partial order on
sorts. We replace functions and relations with computations.
As usual, given a language nd a model with the same signature
defines an implicit map between the symbols in a language and
a model allows us to define satisfaction as a computation, and
define the idea of consequence.
We replace thory with the idea of context which is a triple
C = (L,M,F)
Where F is a set of facts
bib
Skolem 1923
Rose 1961
Tait 2005
Talcott and Weyhrauch ????
Hilbert and Berneys
Kleene 1952
This article owes an enormous debt to The Provenance of Pure Reason (2005)
by William Tait.
The motivation for this work is to provide a blueprint for the development of computer
systems that can understand natural language. We do not want to engage in philosophical
discussion but rather to describe a system that can manage to 'correctly' build world
when it hears (or reads) a text. Of course it must deal with the issues that have
engaged philosophers for centuries about how we come to understand things, but the
amount of literature is massive and, in any event, the 'proof is in the pudding'. If
systems using the principles articulated below can come to know and answer questions
about their world the philosophy will follow.
What Next
---------
These requirements lead us to a really different view of how we come to know about
things. In particular
1) The world is full of things and our 'head' need to have 'mental images' of both
things, how they relate to one another and how they relate to their environment.
This implies that we have some machinery for describing objects and their relation
to one another. To address this we introduce a theory of things that is based on
information theoretic ideas rather than using set theory.
2) Because everything is in our head 'reference' cannot connect us to the outside
world but at best is a conection between 'the words we use' and our 'mental images'
of the things they denote. Implicit in this remark is that 'the words we use' and
'mental images' are things in our ontology. what they are is part of this work.
3) Since we have no direct access to our exterior there is no 'analytic' truth. That
is. every thought/idea/fact is, at the least, relative to us, ie, everything comes
imbedded into a context. We build this notion into our world models at every level
of granularity.
With this in mind we start with ontology. For this we describe an information
theoretic theory of types. A detailed semantics of this type theory are found in
[...] and the IBML manual [...] describes its reference implemation. This document
describes the use of IBML to build 'mental models' of things we imagine might be in
the 'real world'.
|