Quellcodebibliothek Statistik Leitseite products/Sources/formale Sprachen/Isabelle/Doc/Tutorial/Inductive/   (Beweissystem Isabelle Version 2025-1©)  Datei vom 16.11.2025 mit Größe 13 kB image not shown  

Quelle  AB.thy

  Sprache: Isabelle
 

(*<*)theory AB imports Main begin(*>*)

sectionCase Study: A Context Free Grammar

text\label{sec:CFG}
 index{grammars!defining inductively|(}%
  are nothing but shorthands for inductive definitions of nonterminals
  represent sets of strings. For example, the production
 A \to B c$ is short for
 [ w \in B \Longrightarrow wc \in A \]
  section demonstrates this idea with an example
  to Hopcroft and Ullman, a grammar for generating all words with an
  number of $a$'s and~$b$'s:
 begin{eqnarray}
  &\to& \epsilon \mid b A \mid a B \nonumber\\
  &\to& a S \mid b A A \nonumber\\
  &\to& b S \mid a B B \nonumber
 end{eqnarray}
  the end we say a few words about the relationship between
  original proof citep.\ts81 in HopcroftUllman and our formal version.

  start by fixing the alphabet, which consists only of terma's
 ~termb's:
 


datatype alfa = a | b

text\noindent
  convenience we include the following easy lemmas as simplification rules:
 


lemma [simp]: "(x a) = (x = b) (x b) = (x = a)"
by (case_tac x, auto)

text\noindent
  over this alphabet are of type 🍋alfa list, and
  three nonterminals are declared as sets of such words.
  productions above are recast as a \emph{mutual} inductive
 \index{inductive definition!simultaneous}
  termS, termA and~termB:
 


inductive_set
  S :: "alfa list set" and
  A :: "alfa list set" and
  B :: "alfa list set"
where
  "[] S"
"w A ==> b#w S"
"w B ==> a#w S"

"w S ==> a#w A"
"[ vA; wA ] ==> b#v@w A"

"w S ==> b#w B"
"[ v B; w B ] ==> a#v@w B"

text\noindent
  we show that all words in termS contain the same number of terma's and termb's. Since the definition of termS is by mutual
 , so is the proof: we show at the same time that all words in
 termA contain one more terma than termb and all words in termB contain one more termb than terma.
 


lemma correctness:
  "(w S size[xw. x=a] = size[xw. x=b])
   (w A size[xw. x=a] = size[xw. x=b] + 1)
   (w B size[xw. x=b] = size[xw. x=a] + 1)"

txt\noindent
  propositions are expressed with the help of the predefined termfilter function on lists, which has the convenient syntax [xxs. P
 ]
, the list of all elements termx in termxs such that propP x
 . Remember that on lists size and length are synonymous.

  proof itself is by rule induction and afterwards automatic:
 


by (rule S_A_B.induct, auto)

text\noindent
  may seem surprising at first, and is indeed an indication of the power
  inductive definitions. But it is also quite straightforward. For example,
  the production $A \to b A A$: if $v,w \in A$ and the elements of $A$
  one more $a$ than~$b$'s, then $bvw$ must again contain one more $a$
 ~$b$'s.

  usual, the correctness of syntactic descriptions is easy, but completeness
  hard: does termS contain \emph{all} words with an equal number of
 terma's and termb's? It turns out that this proof requires the
  lemma: every string with two more terma's than termb's can be cut somewhere such that each half has one more terma than
 termb. This is best seen by imagining counting the difference between the
  of terma's and termb's starting at the left end of the
 . We start with 0 and end (at the right end) with 2. Since each move to the
  increases or decreases the difference by 1, we must have passed through
  on our way from 0 to 2. Formally, we appeal to the following discrete
  value theorem @{thm[source]nat0_intermed_int_val}
 {thm[display,margin=60]nat0_intermed_int_val[no_vars]}
  termf is of type 🍋nat ==> int, 🍋int are the integers,
 . is the absolute value function\footnote{See
 ~\ref{tab:ascii} in the Appendix for the correct \textsc{ascii}
 .}, and term1::int is the integer 1 (see \S\ref{sec:numbers}).

  we show that our specific function, the difference between the
  of terma's and termb's, does indeed only change by 1 in every
  to the right. At this point we also start generalizing from terma's
  termb's to an arbitrary property termP. Otherwise we would have
  prove the desired lemma twice, once as stated above and once with the
  of terma's and termb's interchanged.
 


lemma step1: "i < size w.
  (int(size[xtake (i+1) w. P x])-int(size[xtake (i+1) w. ¬P x]))
   - (int(size[xtake i w. P x])-int(size[xtake i w. ¬P x])) 1"

txt\noindent
  lemma is a bit hard to read because of the coercion function
 int :: nat ==> int. It is required because termsize returns
  natural number, but subtraction on type~🍋nat will do the wrong thing.
  termtake is predefined and termtake i xs is the prefix of
  termi of termxs; below we also need termdrop i xs, which
  what remains after that prefix has been dropped from termxs.

  proof is by induction on termw, with a trivial base case, and a not
  trivial induction step. Since it is essentially just arithmetic, we do not
  it.
 


apply(induct_tac w)
apply(auto simp add: abs_if take_Cons split: nat.split)
done

text
  we come to the above-mentioned lemma about cutting in half a word with two more elements of one sort than of the other sort:
 


lemma part1:
 "size[xw. P x] = size[xw. ¬P x]+2 ==>
  isize w. size[xtake i w. P x] = size[xtake i w. ¬P x]+1"

txt\noindent
  is proved by force with the help of the intermediate value theorem,
  appropriately and with its first premise disposed of by lemma
 {thm[source]step1}:
 


apply(insert nat0_intermed_int_val[OF step1, of "P" "w" "1"])
by force

text\noindent

  @{thm[source]part1} tells us only about the prefix termtake i w.
  easy lemma deals with the suffix termdrop i w:
 



lemma part2:
  "[size[xtake i w @ drop i w. P x] =
    size[xtake i w @ drop i w. ¬P x]+2;
    size[xtake i w. P x] = size[xtake i w. ¬P x]+1]
   ==> size[xdrop i w. P x] = size[xdrop i w. ¬P x]+1"
by(simp del: append_take_drop_id)

text\noindent
  the proof we have disabled the normally useful lemma
 begin{isabelle}
 {thm append_take_drop_id[no_vars]}
 rulename{append_take_drop_id}
 end{isabelle}
  allow the simplifier to apply the following lemma instead:
 {text[display]"[xxs@ys. P x] = [xxs. P x] @ [xys. P x]"}

  dispose of trivial cases automatically, the rules of the inductive
  are declared simplification rules:
 


declare S_A_B.intros[simp]

text\noindent
  could have been done earlier but was not necessary so far.

  completeness theorem tells us that if a word has the same number of
 terma's and termb's, then it is in termS, and similarly
  termA and termB:
 


theorem completeness:
  "(size[xw. x=a] = size[xw. x=b] w S)
   (size[xw. x=a] = size[xw. x=b] + 1 w A)
   (size[xw. x=b] = size[xw. x=a] + 1 w B)"

txt\noindent
  proof is by induction on termw. Structural induction would fail here
 , as we can see from the grammar, we need to make bigger steps than
  appending a single letter at the front. Hence we induct on the length
  termw, using the induction rule @{thm[source]length_induct}:
 


apply(induct_tac w rule: length_induct)
apply(rename_tac w)

txt\noindent
  rule parameter tells induct_tac explicitly which induction
  to use. For details see \S\ref{sec:complete-ind} below.
  this case the result is that we may assume the lemma already
  for all words shorter than termw. Because the induction step renames
  induction variable we rename it back to w.

  proof continues with a case distinction on termw,
  whether termw is empty or not.
 


apply(case_tac w)
 apply(simp_all)
(*<*)apply(rename_tac x v)(*>*)

txt\noindent
  disposes of the base case and leaves only a conjunction
  two step cases to be proved:
  propw = a#v and @{prop[display]"size[xv. x=a] = size[xv. x=b]+2"} then
 propb#v A, and similarly for propw = b#v.
  only consider the first case in detail.

  breaking the conjunction up into two cases, we can apply
 {thm[source]part1} to the assumption that termw contains two more terma's than termb's.
 


apply(rule conjI)
 apply(clarify)
 apply(frule part1[of "λx. x=a", simplified])
 apply(clarify)
txt\noindent
  yields an index propi length v such that
 {prop[display]"length [xtake i v . x = a] = length [xtake i v . x = b] + 1"}
  the help of @{thm[source]part2} it follows that
 {prop[display]"length [xdrop i v . x = a] = length [xdrop i v . x = b] + 1"}
 


 apply(drule part2[of "λx. x=a", simplified])
  apply(assumption)

txt\noindent
  it is time to decompose termv in the conclusion propb#v A
  termtake i v @ drop i v,
 


 apply(rule_tac n1=i and t=v in subst[OF append_take_drop_id])

txt\noindent
 the variables termn1 and termt are the result of composing the
  @{thm[source]subst} and @{thm[source]append_take_drop_id})
  which the appropriate rule of the grammar reduces the goal
  the two subgoals proptake i v A and propdrop i v A:
 


 apply(rule S_A_B.intros)

txt
  subgoals follow from the induction hypothesis because both termtake i
 
and termdrop i v are shorter than termw:
 


  apply(force simp add: min_less_iff_disj)
 apply(force split: nat_diff_split)

txt
  case propw = b#v is proved analogously:
 


apply(clarify)
apply(frule part1[of "λx. x=b", simplified])
apply(clarify)
apply(drule part2[of "λx. x=b", simplified])
 apply(assumption)
apply(rule_tac n1=i and t=v in subst[OF append_take_drop_id])
apply(rule S_A_B.intros)
 apply(force simp add: min_less_iff_disj)
by(force simp add: min_less_iff_disj split: nat_diff_split)

text
  conclude this section with a comparison of our proof with
 \index{Hopcroft, J. E.} and Ullman's\index{Ullman, J. D.}
 citep.\ts81 in HopcroftUllman.
  a start, the textbook
 , for no good reason, excludes the empty word, thus complicating
  just a little bit: they have 8 instead of our 7 productions.

  importantly, the proof itself is different: rather than
  the two directions, they perform one induction on the
  of a word. This deprives them of the beauty of rule induction,
  in the easy direction (correctness) their reasoning is more
  than our auto. For the hard part (completeness), they
  just one of the cases that our simp_all disposes of
 . Then they conclude the proof by saying about the
  cases: ``We do this in a manner similar to our method of
  for part (1); this part is left to the reader''. But this is
  the part that requires the intermediate value theorem and
  is not at all similar to the other cases (which are automatic in
 ). The authors are at least cavalier about this point and may
  have overlooked the slight difficulty lurking in the omitted
 . Such errors are found in many pen-and-paper proofs when they
  scrutinized formally.%
 index{grammars!defining inductively|)}
 


(*<*)end(*>*)

Messung V0.5 in Prozent
C=9 H=89 G=62

¤ Dauer der Verarbeitung: 0.10 Sekunden  (vorverarbeitet am  2026-06-10) ¤

*© Formatika GbR, Deutschland






Wurzel

Suchen

Beweissystem der NASA

Beweissystem Isabelle

NIST Cobol Testsuite

Cephes Mathematical Library

Wiener Entwicklungsmethode

Haftungshinweis

Die Informationen auf dieser Webseite wurden nach bestem Wissen sorgfältig zusammengestellt. Es wird jedoch weder Vollständigkeit, noch Richtigkeit, noch Qualität der bereit gestellten Informationen zugesichert.

Bemerkung:

Die farbliche Syntaxdarstellung und die Messung sind noch experimentell.