cs-252/semantics-1/semantics.typ

// ============================================================
// Operational Semantics for BoolInt* Language — Typst version
// ============================================================

// ----- page setup (fullpage equivalent) ---------------------
#set page(margin: 1in)
#set text(size: 11pt)
#set par(justify: true)

// ----- helper functions -------------------------------------

// Language keywords rendered in monospace
#let kw(body) = text(font: "DejaVu Sans Mono", size: 0.85em, body)

#let tr = kw[true]
#let fl = kw[false]
#let ife(e1, e2, e3) = [#kw[if] #e1 #kw[then] #e2 #kw[else] #e3]
#let suc(e) = [#kw[succ] #e]
#let prd(e) = [#kw[pred] #e]
#let sec(e) = [#kw[secret] #e]

// Security labels
#let lH = $upright(H)$
#let lL = $upright(L)$
#let ljoin = $union.sq$

// Labeled value: (v, ℓ)
#let lval(v, l) = $(#v, #l)$

// Big-step evaluation relation: e ⇓ (v, ℓ)
#let bstep(e, v) = $#e arrow.b.double #v$

// Rule name label in small-caps
#let rel(name) = text(size: 0.9em, smallcaps[\[#name\]])

// Big-step rule: fraction with name on the left
// premises (content), conclusion (content)
#let bsrule(name, premises, conclusion) = {
  grid(
    columns: (auto, auto),
    column-gutter: 1em,
    align: (right + horizon, left + horizon),
    rel(name), math.equation(block: true, numbering: none, math.frac(premises, conclusion)),
  )
  v(0.4em)
}

// Definition head: "e ::= ..." with label
#let mydefhead(lhs, label) = {
  grid(
    columns: (auto, auto),
    column-gutter: 1fr,
    [#lhs], emph(label),
  )
}

// Definition case: indented alternative with description
#let mydefcase(alt, desc) = {
  grid(
    columns: (2em, auto, auto),
    column-gutter: 1fr,
    [], [#alt], [#desc],
  )
}

// ----- title block ------------------------------------------

#align(center)[
  #text(size: 17pt, weight: "bold")[Operational Semantics for BoolInt\* Language]

  #v(0.8em)

  Yuri Tatishchev \
  San José State University \
  iurii.tatishchev\@sjsu.edu
]

#v(1em)

// ----- abstract ---------------------------------------------

#align(center)[
  #block(width: 85%)[
    #set text(size: 10pt)
    *Abstract.* In this paper, we will provide a review of the big-step operational semantics
    for the BoolInt\* language we discussed in class.
    We extend the language with a #kw[secret] construct and security labels
    ($#lH$ for private, $#lL$ for public) to track information flow
    and prevent secret leakage.
  ]
]

#v(0.5em)

// ----- body text --------------------------------------------

BoolInt\* is a very minimal language that allows us to experiment with operational semantics.

First, we define the valid expressions in our language.
These expressions dictate the possibilities of what expressions
we may have in our source programs.
(Note that other language might also have _statements_.
Statements might not evaluate to a value;
expressions always will.)

// ----- Figure 1: language definition ------------------------

#figure(
  box(width: 85%, inset: (left: 2em))[
    #set align(left)
    #mydefhead[$e ::=$ #h(6em)][Expressions]
    #mydefcase(tr, [true value])
    #mydefcase(fl, [false value])
    #mydefcase($i$, [integers])
    #mydefcase(ife[$e$][$e$][$e$], [conditional expressions])
    #mydefcase(suc[$e$], [successor])
    #mydefcase(prd[$e$], [predecessor])
    #mydefcase(sec[$e$], [secret])

    #v(0.5em)

    #mydefhead[$v ::=$ #h(6em)][Values]
    #mydefcase(tr, [true value])
    #mydefcase(fl, [false value])
    #mydefcase($i$, [integers])

    #v(0.5em)

    #mydefhead[$ell ::=$ #h(6em)][Security Labels]
    #mydefcase($#lL$, [low / public])
    #mydefcase($#lH$, [high / private])
  ],
  caption: [The BoolInt\* language with security labels],
) <fig:lang>

@fig:lang shows the list of expressions, values, and security labels for the BoolInt\* language.
Expressions can be the value $#tr$, the value $#fl$,
integers $i$,
the conditional expression $#ife[$e$][$e$][$e$]$,
arithmetic operations $#suc[$e$]$ and $#prd[$e$]$,
and the security construct $#sec[$e$]$.
Note that conditional expressions have a recursive structure,
with 3 sub-expressions.

After evaluating a program, we produce a _labeled value_ $#lval[$v$][$ell$]$,
where $v$ is the value and $ell$ is a security label
indicating whether the result is public ($#lL$) or private ($#lH$).
(In other languages, we might hit a bad situation and need to crash instead;
that won't happen in this language.)
The valid values for BoolInt\* are $#tr$, $#fl$, and $i$.

We define a _join_ operation $#ljoin$ on security labels as follows:
$ #lL #ljoin #lL = #lL #h(3em) #lL #ljoin #lH = #lH #ljoin #lL = #lH #ljoin #lH = #lH $
Intuitively, any interaction with a private value taints the result as private.

With our expressions and values defined,
we can now specify the semantics for our language.
To do so, we will use the following big-step evaluation relation:

$ #bstep[$e$][$#lval[$v$][$ell$]$] $

The above line should be read as
"the expression $e$ evaluates to the value $v$ with security label $ell$".

// ----- Figure 2: big-step semantics -------------------------

#figure(
  box(width: 85%, inset: (left: 1em))[
    #set align(left)
    #text(weight: "bold")[Evaluation Rules:#h(1em)]#box(stroke: 0.5pt, inset: 4pt, $#bstep[$e$][$#lval[$v$][$ell$]$]$)

    #v(0.6em)

    #bsrule("B-Value", $emptyset$, $#bstep[$v$][$#lval[$v$][$#lL$]$]$)

    #bsrule("B-Secret", $#bstep[$e$][$#lval[$v$][$ell$]$]$, $#bstep[#sec[$e$]][$#lval[$v$][$#lH$]$]$)

    #bsrule(
      "B-IfTrue",
      $#bstep[$e_1$][$#lval[$#tr$][$ell_1$]$] #h(2em) #bstep[$e_2$][$#lval[$v$][$ell_2$]$]$,
      $#bstep[#ife[$e_1$][$e_2$][$e_3$]][$#lval[$v$][$ell_1 #ljoin ell_2$]$]$,
    )

    #bsrule(
      "B-IfFalse",
      $#bstep[$e_1$][$#lval[$#fl$][$ell_1$]$] #h(2em) #bstep[$e_3$][$#lval[$v$][$ell_3$]$]$,
      $#bstep[#ife[$e_1$][$e_2$][$e_3$]][$#lval[$v$][$ell_1 #ljoin ell_3$]$]$,
    )

    #bsrule("B-Succ", $#bstep[$e$][$#lval[$i$][$ell$]$]$, $#bstep[#suc[$e$]][$#lval[$i + 1$][$ell$]$]$)

    #bsrule("B-Pred", $#bstep[$e$][$#lval[$i$][$ell$]$]$, $#bstep[#prd[$e$]][$#lval[$i - 1$][$ell$]$]$)
  ],
  caption: [Big-step semantics for BoolInt\* with information flow],
) <fig:bigstep>

@fig:bigstep shows the big-step evaluation rules for the BoolInt\* language
with information flow tracking.

The #rel("B-Value") rule applies when the expression (to the left of "$arrow.b.double$")
is also a value, as defined in @fig:lang.
There are no premises for this rule (above the line),
meaning that it is an _axiom_.
This rule states that a bare value evaluates to itself with a public label $#lL$,
so that #tr evaluates to $#lval[$#tr$][$#lL$]$ and $3$ evaluates to $#lval[$3$][$#lL$]$.

The #rel("B-Secret") rule handles the #kw[secret] construct.
It evaluates the sub-expression $e$ and forces the security label to $#lH$,
regardless of the original label.
For example, $#sec[$3$]$ evaluates to $#lval[$3$][$#lH$]$.

Two different rules are needed for handling conditional expressions.
Which rule applies depends on the premises.
Critically, both rules _join_ the condition's label $ell_1$ with the branch result's label.

For the #rel("B-IfTrue") rule,
the premise states that $e_1$ evaluates to $#lval[$#tr$][$ell_1$]$
and $e_2$ evaluates to $#lval[$v$][$ell_2$]$.
The result carries $ell_1 #ljoin ell_2$ as its label.
The structure of #rel("B-IfFalse") is similar.

This join is essential for preventing _implicit flows_.
Consider the expression $#ife[#sec[$#tr$]][$1$][$0$]$.
Without the join, this would produce $#lval[$1$][$#lL$]$ — leaking which branch was taken
and thus revealing the secret condition.
With the join, the condition evaluates to $#lval[$#tr$][$#lH$]$,
so the result becomes $#lval[$1$][$#lH #ljoin #lL$] = #lval[$1$][$#lH$]$,
correctly tainting the output as private.

We also have rules for handling the integer operations.
The #rel("B-Succ") rule states that if the operand $e$ evaluates to $#lval[$i$][$ell$]$,
then the expression $#suc[$e$]$ evaluates to $#lval[$i + 1$][$ell$]$.
This propagates the security label faithfully.
For example, $#suc[#sec[$3$]]$ evaluates to $#lval[$4$][$#lH$]$,
whereas $#suc[$3$]$ evaluates to $#lval[$4$][$#lL$]$.
Similarly, #rel("B-Pred") evaluates $#prd[$e$]$ to $#lval[$i - 1$][$ell$]$ if $e$ evaluates to $#lval[$i$][$ell$]$.

In summary, under these semantics, there is no expression an attacker
can write that produces an $#lL$-labeled value
whose content depends on any $#lH$-labeled (secret) input.
The join in the conditional rules ensures that even _implicit_ information flows
through branch selection are tracked and labeled private.