Bisimulation for Conditional Modalities

We give a definition of bisimulation for conditional modalities interpreted on selection functions and prove the correspondence between bisimilarity and modal equivalence, generalizing the Hennessy–Milner Theorem to a wide class of conditional operators. We further investigate the operators and semantics to which these results apply. First, we show how to derive a solid notion of bisimulation for conditional belief, behaving as desired both on plausibility models and on evidence models. These novel definitions of bisimulations are exploited in a series of undefinability results. Second, we treat relativized common knowledge, underlining how the same results still hold for a different modality in a different semantics. Third, we show the flexibility of the approach by generalizing it to multi-agent systems, encompassing the case of multi-agent plausibility models.


Introduction
The Modal Logic literature offers a number of examples of conditional modalities, developed for a variety of reasons: conditionals from conditional logic, conditional belief, relativized common knowledge, to name a few. Yet there has been little work so far in developing model-theoretic tools to study such operators, which have been used mainly for the purpose of modelling our intuitions. The notable exception is conditional belief. The problem of finding the right notion of bisimulation for conditional belief has been the focal point of some recent publications in the field of formal epistemology [1,2,3,13,14].
In this paper we attempt to understand what is conditional about conditional modalities, proposing a framework that covers all the aforementioned operators. The cornerstone of our approach is a general notion of bisimulation for conditional modalities, where the latter are interpreted on selection functions. Conditional logics, together with selection functions, have a long history and tradition in philosophical logic [12,22,23,25]; they have been used in various applications such as non-monotonic inference, belief change and the analysis of intentions and desires. We thus tackle the problem at a high level of generality; this bird's eye perspective enables a streamlined presentation of the main arguments, avoids repetitions and highlights the crucial assumptions.
Presented by Heinrich Wansing; Received April 18, 2016
To ensure that the notion of bisimulation is a good fit for the logic, the key result that one would like to obtain is the classical theorem establishing the correspondence between bisimilarity and modal equivalence, usually on some restricted class of models, echoing the analogous theorem for basic modal logic. In other words, one wants to characterize exactly when two models are indistinguishable by means of a conditional modality.
Such a result is however not the end of the story: a well-behaved notion of bisimulation should also satisfy the following list of desiderata:
1. The bisimulation should be structural, that is, it should not make reference to formulas of the modal language besides the atomic propositions featuring in the basic condition "if w and w' are bisimilar then for all p we have w ∈ V(p) iff w' ∈ V(p)".
2. Ideally such a bisimulation should be closed under unions and relational composition. The former ensures the existence of a largest bisimulation, while the latter guarantees that the related notion of bisimilarity is transitive.
3. The definition of such a bisimulation should be in principle independent from additional parts of the structure that do not appear in the semantics of the conditional modality: two states should be indistinguishable only if they behave in the same way with respect to the features that the conditional modality can "detect". This characteristic makes the bisimulation modular, allowing us to add further conditions to it in order to take care of additional operators in the language and still retain the correspondence with modal equivalence.
4. When the unconditional modality is amenable to different semantics, the bisimulation for the conditional version should generalize the bisimulation for the unconditional modality uniformly across semantics.
We use this list as a benchmark to assess the quality of a notion of bisimulation. In this paper we provide a notion of bisimulation for conditional modalities that complies with the list and prove the correspondence between bisimilarity and modal equivalence for the semantics on selection functions.
In the rest of the paper we showcase the versatility of our framework along three directions of applications. First, we demonstrate that it applies to the same operator interpreted on different semantics (as for point 4 in our list), discussing how this approach provides a solid notion of bisimulation for conditional belief. Second, to display that our approach covers more than just conditional belief, we treat the case of another important operator, namely relativized common knowledge. Finally we explain how the central definition and results are amenable for a multi-agent generalization.
As a semantics, we consider selection functions of type W × ℘(W) → ℘(W), along the lines of [22]. Similar considerations can be cast in the more general framework proposed by Chellas in [12], but the generality of neighborhood selection functions is not really needed here, neither to prove our results nor to encompass the examples we mentioned; we thus limit ourselves to Lewis' original proposal.
Definition 1. (Conditional model) A conditional model is a tuple M = ⟨W, f, V⟩ with W a non-empty set of worlds, f : W × ℘(W) → ℘(W) a selection function and V a valuation function, such that:
1. for all w ∈ W we have f(w, X) ⊆ X;
2. for all w ∈ W, if X ⊆ Y and f(w, Y) ⊆ X then f(w, X) = f(w, Y).
The intuition behind the selection function is that f(w, X) selects the worlds in X that are 'relevant' at w. For a given model M, the semantics of the language is defined recursively via an interpretation function ⟦−⟧_M : L → ℘(W), where for the propositional part of the language the clauses are the usual ones and for conditionals we have the Stalnaker-Lewis semantics:
⟦ψ ⇝ φ⟧_M = {w ∈ W | f(w, ⟦ψ⟧_M) ⊆ ⟦φ⟧_M}
This encodes the idea that "φ is the case, conditional on ψ" in a world w iff all the worlds in ⟦ψ⟧_M that are relevant at w according to f are worlds that satisfy φ. As customary, via the interpretation function ⟦−⟧_M we can define a satisfaction relation ⊨ ⊆ W × L putting M, w ⊨ ψ iff w ∈ ⟦ψ⟧_M; we will freely switch between the two notations.
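On finite models, the two requirements on selection functions and the semantic clause for conditionals can be checked by brute force. The following is only a minimal sketch: it assumes the second clause reads "if X ⊆ Y and f(w, Y) ⊆ X then f(w, X) = f(w, Y)", which is the form used in later proofs, and the names `is_conditional_model`, `conditional_extension`, `f_least` and `f_bad` are hypothetical helpers introduced for illustration.

```python
from itertools import combinations


def powerset(worlds):
    """Return all subsets of a finite set of worlds as frozensets."""
    items = sorted(worlds)
    return [frozenset(c) for r in range(len(items) + 1)
            for c in combinations(items, r)]


def is_conditional_model(worlds, f):
    """Brute-force check of the two clauses on a selection function f(w, X)."""
    subsets = powerset(worlds)
    for w in worlds:
        for y in subsets:
            if not f(w, y) <= y:  # clause 1: f(w, Y) is a subset of Y
                return False
            for x in subsets:
                # clause 2 (assumed form): X ⊆ Y and f(w, Y) ⊆ X
                # force f(w, X) = f(w, Y)
                if x <= y and f(w, y) <= x and f(w, x) != f(w, y):
                    return False
    return True


def conditional_extension(worlds, f, psi_ext, phi_ext):
    """Worlds where 'phi, conditional on psi' holds: f(w, [[psi]]) ⊆ [[phi]]."""
    psi_ext, phi_ext = frozenset(psi_ext), frozenset(phi_ext)
    return frozenset(w for w in worlds if f(w, psi_ext) <= phi_ext)


WORLDS = frozenset({0, 1, 2})


def f_least(w, xs):
    """Hypothetical selection function: the least-numbered world of X."""
    return frozenset({min(xs)}) if xs else frozenset()


def f_bad(w, xs):
    """Hypothetical function that ignores X, violating clause 1."""
    return WORLDS
```

Here `f_least` passes both checks, while `f_bad` already fails the first clause on the empty precondition. Since the check enumerates pairs of subsets, it is only practical for very small models.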
To motivate our semantic clauses above, let us first recall that Gabbay [17] argues that our most general intuitions about non-monotonic derivations are captured by consequence relations |∼ satisfying the following three conditions, that he calls Reflexivity, Cut and Cautious Monotonicity:
Reflexivity: φ |∼ φ
Cut: if φ |∼ ψ and φ ∧ ψ |∼ θ then φ |∼ θ
Cautious Monotonicity: if φ |∼ ψ and φ |∼ θ then φ ∧ ψ |∼ θ
The Cut condition is obviously only a very special case of Gentzen's Cut rule, and it is sometimes called Cautious Transitivity. We will adopt this last terminology, in order to avoid any confusion with the standard Cut rule. In terms of our conditional language, these requirements amount to claiming the validity of the following schemas:
φ ⇝ φ
((φ ⇝ ψ) ∧ ((φ ∧ ψ) ⇝ θ)) → (φ ⇝ θ)
((φ ⇝ ψ) ∧ (φ ⇝ θ)) → ((φ ∧ ψ) ⇝ θ)
In terms of selection functions, the semantic clauses corresponding to these validities are:
f(w, X) ⊆ X
if f(w, X) ⊆ Y and f(w, X ∩ Y) ⊆ Z then f(w, X) ⊆ Z
if f(w, X) ⊆ Y and f(w, X) ⊆ Z then f(w, X ∩ Y) ⊆ Z
Indeed, it is easy to see that these clauses are exactly what is needed to validate the above three schemas. Moreover, they are more general than most other settings for conditional logic, conditional beliefs etc. (see footnote 4). Such clauses are in fact equivalent to our requirements on conditional models, which constitute a more compact presentation.
Proposition 2. Conditional models are exactly those satisfying Gabbay's requirements, when formulated in terms of selection functions.
It is clear that Reflexivity is exactly our clause (1) in Definition 1; the following two lemmas show that, in the presence of Reflexivity, Cautious Transitivity and Cautious Monotonicity correspond to the two inclusions in our clause (2).
Lemma 3. Cautious Transitivity entails the left-to-right inclusion of condition (2) in Definition 1. In presence of Reflexivity, the latter condition entails Cautious Transitivity.
in the definition of Cautious Transitivity: the premises are now f (w, Y ) ⊆ X, which we have by assumption, and f (w, X ∩ Y ) = f (w, X) ⊆ f (w, X), which is trivially the case. By Cautious Transitivity we can then conclude f (w, Y ) ⊆ f (w, X), as desired.
For the other direction, assume . Notice now that Y and X ∩ Y satisfy the antecedent of the second condition: on Thus applying the second condition we obtain f (w, Y ) ⊆ f (w, X ∩ Y ) and we are done.
Lemma 4. Cautious Monotonicity entails the right-to-left inclusion of condition (2) in Definition 1. In presence of Reflexivity, the converse also holds.
Footnote 4: One can show that Lewis' 'sphere models' are an example of conditional models. The later modification due to Grove [18], in order to model belief revision, is also a special case; interestingly, the appropriate selection function is suggested by Grove himself in [18] p. 159. As we will show, our clauses are weaker than the semantic requirements of conditional doxastic logic. Further examples are the models for non-monotonic logics: our conditions are more general than the models of, for example, the non-monotonic system P of Kraus, Lehmann and Magidor [21] or the conditional logic introduced by Halpern in [19].
. The former is given by assumption and the latter is a tautology, so applying Cautious Monotonicity we can conclude f (w,
We now turn to the definition of bisimulation for conditional modalities, that is, the notion that is supposed to capture when two models are indistinguishable from the perspective of our conditional language. First we lay out some notation: given a relation Z ⊆ W_1 × W_2 and a set X ⊆ W_1, we write Z[X] for the image {y ∈ W_2 | (x, y) ∈ Z for some x ∈ X}, and Z⁻¹ for the converse relation.
Definition 5. (Bisimulation) Given two conditional models M_1 and M_2, a conditional bisimulation is a non-empty relation Z ⊆ W_1 × W_2 such that, if (w, w') ∈ Z, then
• w and w' satisfy the same atomic propositions;
• for all X ⊆ W_1 and X' ⊆ W_2 such that Z[X] ⊆ X' and Z⁻¹[X'] ⊆ X we have that for every x ∈ f_1(w, X) there exists a y ∈ f_2(w', X') (where f_2 is the selection function in M_2) such that (x, y) ∈ Z, and vice versa.
The non-standard part of this definition, namely the quantification over subsets X and X' together with the additional requirement Z[X] ⊆ X' and Z⁻¹[X'] ⊆ X, is meant to handle the precondition ψ in the conditional ψ ⇝ φ. One would want the sets X and X' in the definition to be modally definable. However, to ensure that those sets are modally definable we would have to quantify over the formulas in the language and this would clash with the desideratum of having a structural bisimulation. Our solution is to replace "modally definable" with a structural condition that is close enough.
The relation of (conditional) bisimilarity is defined as the existence of a conditional bisimulation: two states w and w' are bisimilar iff there exists a conditional bisimulation Z such that (w, w') ∈ Z. In other words, the relation of bisimilarity between models M_1 and M_2 is the union of all the bisimulation relations between these models. The next result implies that bisimilarity is itself a bisimulation, and hence it is the largest bisimulation between two given models.
Proof. Given a family of conditional bisimulations {Z_i ⊆ W_1 × W_2}_{i∈I}, consider their union ⋃_{i∈I} Z_i. Suppose (w, w') ∈ ⋃_{i∈I} Z_i and take two sets X ⊆ W_1 and X' ⊆ W_2 such that (⋃_{i∈I} Z_i)[X] ⊆ X' and (⋃_{i∈I} Z_i)⁻¹[X'] ⊆ X. To establish that ⋃_{i∈I} Z_i is a conditional bisimulation we need to show that for every x ∈ f_1(w, X) there is y ∈ f_2(w', X') such that (x, y) ∈ ⋃_{i∈I} Z_i. Notice that from (w, w') ∈ ⋃_{i∈I} Z_i we can deduce that there is an index i for which (w, w') ∈ Z_i. We also know that Z_i[X] ⊆ (⋃_{i∈I} Z_i)[X] ⊆ X' and Z_i⁻¹[X'] ⊆ (⋃_{i∈I} Z_i)⁻¹[X'] ⊆ X. Therefore X and X' also satisfy the preconditions for the relation Z_i: applying the property of conditional bisimulation we obtain that for every x ∈ f_1(w, X) there is y ∈ f_2(w', X') such that (x, y) ∈ Z_i. The latter fact entails (x, y) ∈ ⋃_{i∈I} Z_i, and we are done. The converse direction is proved symmetrically.
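For finite models the bisimulation property can be tested directly by enumerating all pairs of preconditions. The sketch below is illustrative only: it assumes models are given as triples (worlds, selection function, valuation), that the atomic condition compares the sets of atoms true at the two worlds, and that the helper names (`image`, `preimage`, `is_conditional_bisimulation`, `f_all`) are hypothetical.

```python
from itertools import combinations


def powerset(worlds):
    """Return all subsets of a finite set of worlds as frozensets."""
    items = sorted(worlds)
    return [frozenset(c) for r in range(len(items) + 1)
            for c in combinations(items, r)]


def image(z, xs):
    """Z[X]: worlds related by Z to some element of X."""
    return frozenset(b for (a, b) in z if a in xs)


def preimage(z, ys):
    """Z^-1[X']: worlds related by Z to some element of X'."""
    return frozenset(a for (a, b) in z if b in ys)


def is_conditional_bisimulation(z, model1, model2):
    """Check non-emptiness, the atomic condition and the conditional clause."""
    worlds1, f1, val1 = model1
    worlds2, f2, val2 = model2
    if not z:
        return False
    for (w, v) in z:
        if val1[w] != val2[v]:  # same atoms at related worlds
            return False
        for xs in powerset(worlds1):
            for ys in powerset(worlds2):
                # only pairs (X, X') with Z[X] ⊆ X' and Z^-1[X'] ⊆ X matter
                if not (image(z, xs) <= ys and preimage(z, ys) <= xs):
                    continue
                sel1, sel2 = f1(w, xs), f2(v, ys)
                forth = all(any((x, y) in z for y in sel2) for x in sel1)
                back = all(any((x, y) in z for x in sel1) for y in sel2)
                if not (forth and back):
                    return False
    return True


def f_all(w, xs):
    """Hypothetical selection function: every world of X is relevant."""
    return frozenset(xs)


M1 = (frozenset({0, 1}), f_all, {0: frozenset({"p"}), 1: frozenset({"p"})})
M2 = (frozenset({"a"}), f_all, {"a": frozenset({"p"})})
Z_GOOD = {(0, "a"), (1, "a")}
Z_BAD = {(0, "a")}
```

In this toy example `Z_GOOD` is a conditional bisimulation, whereas `Z_BAD` is not: for the preconditions X = {0, 1} and X' = {a} the world 1 is selected in the first model but has no Z-related counterpart.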
The last proposition secures only half of our second desideratum for a notion of bisimulation (see list in the "Introduction"). We postpone the matter of relational composition to the next subsection. The next thing to check is that our definition is suited to our conditional language: bisimilar states satisfy the same conditional formulas.
Proof. The proof is by induction on the structure of formulas; the cases of p, ¬ and ∧ are treated as usual, so we only show the case of the conditional modality. Suppose Z is a conditional bisimulation, (w, w') ∈ Z and M_1, w ⊨ ψ ⇝ φ. Note that by induction hypothesis on ψ we have that ⟦ψ⟧_{M_1} and ⟦ψ⟧_{M_2} satisfy the right requirements and therefore can act as X and X' in the preconditions of the bisimulation property. Because of w ⊨ ψ ⇝ φ we have f_1(w, ⟦ψ⟧_{M_1}) ⊆ ⟦φ⟧_{M_1}. Take now any v' ∈ f_2(w', ⟦ψ⟧_{M_2}). By the vice versa of the bisimulation property we know that there exists a v ∈ f_1(w, ⟦ψ⟧_{M_1}) such that (v, v') ∈ Z. By assumption and induction hypothesis on φ we get M_2, v' ⊨ φ. Since v' was generic we can conclude that f_2(w', ⟦ψ⟧_{M_2}) ⊆ ⟦φ⟧_{M_2}, that is, M_2, w' ⊨ ψ ⇝ φ. For the converse use the other direction of the bisimulation property.
Our next theorem is the key result of this paper, providing a partial converse to the previous result. This is an analogue of the Hennessy-Milner-van Benthem theorem from modal logic, saying that on finite models bisimilarity completely captures L-equivalence.
Proof. We show that the relation Z of L-equivalence is a (conditional) bisimulation. First a preliminary observation. Suppose X and X' are two sets satisfying Z[X] ⊆ X' and Z⁻¹[X'] ⊆ X. We show how to build a formula α that plays the role of X and X' as precondition. Notice that we can divide the domain of M_1 into three disjoint parts:
• X
• A, the set of elements outside X having some L-equivalent counterpart in X
• W_1 \ (X ∪ A)
Notice how the conditions on X and X' ensure that the elements in A do not have any counterpart in M_2: a ∈ A cannot have an L-equivalent counterpart in X', or otherwise a would already be in X; on the other hand a cannot have an L-equivalent counterpart in W_2 \ X', or X itself would violate the first precondition. A symmetric partition can be defined on the model M_2, switching the roles of X and X'; we will indicate with A' the corresponding region in M_2.
Since the image of X under Z lies within X , we know that the elements in X are not L -equivalent to the elements outside X , thus the elements in X ∪ A are also not L -equivalent to the elements outside X . Since we are dealing with finite models we can enumerate the elements in X ∪A, call them x 1 , . . . , x n . Similarly, we can put the elements of W 2 \X and W 1 \(X ∪ A) all together in a finite list y 1 , . . . , y m . By our assumptions and definition of the partition we know that every element in X ∪ A is not L -equivalent to any element in W 2 \X or W 1 \(X ∪ A). So for each i and j, with 1 ≤ i ≤ n and 1 ≤ j ≤ m, there is a formula ψ ij such that x i ψ ij and y j ψ ij . We can thus construct a formula . Symmetrically, there must be a formula γ that is true at X ∪ A and false at W 1 \X ∪ (W 2 \(X ∪ A )). Now consider the formula Let us have a closer look at the extension α M 1 of α in M 1 . We have that γ is false outside X, hence its extension lies within X. As for γ, we know it is true at X ∪ A and false in W 1 \(X ∪ A). Thus the extension of γ ∨ γ , and therefore of the formula α itself, is X ∪ A. We can make an analogous argument to show that the interpretation of α in M 2 is X ∪ A .
Say now that (w, w ) ∈ Z and suppose Z does not satisfy the bisimulation property for sets X and X : this means that there is an We first prove that by the first property of selection functions, we know that z must be either in X or in A. If z ∈ f 1 (w, α M 1 ) is in A, since we know that elements in A are not L -equivalent to any element in W 2 , we can build a formula β that is false at z and true everywhere in W 2 , thus a fortiori in f 2 (w , α M 2 ). This gives us the contradiction that we want: w ¬(α β) and w α β. We can thus conclude that f 1 (w, α M 1 ) ⊆ X. This is enough to apply the second property of selection functions and conclude that This ensures that the element x ∈ f (w, X) given by assumption is indeed also in f 1 (w, α M 1 ). If we now look at the set f 2 (w , α M 2 ), repeating a reasoning similar to the one just outlined we can conclude that . By assumption we have that x is not Lequivalent to any y ∈ f 2 (w , X ). We can thus build a formula β that is false at x and true everywhere in f 2 (w , α M 2 ); this gives us the contradiction w ¬(α β) and w α β.

Closure Under Composition
Closure under relational composition turns out to be more tricky: we need bisimulation to 'transfer' preconditions in a coherent manner. In this subsection we propose a sufficient condition to obtain closure under relational composition.
If f(w, X) selects the worlds in X that are 'relevant' at w, the set W^w is the collection of all the relevant worlds for w, taking into account all possible preconditions. A conditional model is grounded when, given a precondition X that is consistent with the collection of all worlds relevant for w, the selection function returns a non-empty set of relevant worlds for w in X. The idea that conditioning with sets that are consistent with the current information should yield consistent results is widespread in Formal Epistemology, see for example Lewis in [22] and Board in [11]. The following equivalent definition of grounded models will be useful in later sections.
Proof. The new condition is a special case of the main definition when instantiated to singletons, so one direction is given. For the right-to-left direction, suppose by contradiction that X ∩ W^w ≠ ∅ and f(w, X) = ∅. Let x ∈ X ∩ W^w: we have x ∈ W^w and thus f(w, {x}) ≠ ∅. However, {x} ⊆ X and f(w, X) = ∅ ⊆ {x} trigger the second condition on conditional models, which states that f(w, X) = f(w, {x}), contradiction.
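The two formulations of groundedness can be compared on small finite models. The sketch below rests on an assumption about notation: W^w is read as the union of f(w, X) over all X ⊆ W, which matches the informal description of "all the relevant worlds for w". The function names and the two example selection functions are hypothetical.

```python
from itertools import combinations


def powerset(worlds):
    """Return all subsets of a finite set of worlds as frozensets."""
    items = sorted(worlds)
    return [frozenset(c) for r in range(len(items) + 1)
            for c in combinations(items, r)]


def relevant_worlds(worlds, f, w):
    """W^w (assumed definition): union of f(w, X) over all X ⊆ W."""
    result = frozenset()
    for xs in powerset(worlds):
        result |= f(w, xs)
    return result


def is_grounded(worlds, f):
    """Main definition: X ∩ W^w non-empty implies f(w, X) non-empty."""
    for w in worlds:
        rel = relevant_worlds(worlds, f, w)
        for xs in powerset(worlds):
            if xs & rel and not f(w, xs):
                return False
    return True


def is_grounded_on_singletons(worlds, f):
    """Singleton version: x ∈ W^w implies f(w, {x}) non-empty."""
    for w in worlds:
        for x in relevant_worlds(worlds, f, w):
            if not f(w, frozenset({x})):
                return False
    return True


WORLDS = frozenset({0, 1})


def f_all(w, xs):
    """Hypothetical grounded selection function: f(w, X) = X."""
    return frozenset(xs)


def f_top_only(w, xs):
    """Hypothetical non-grounded function: only the full domain is selected."""
    return frozenset(xs) if xs == WORLDS else frozenset()
```

Both example functions satisfy the two clauses on selection functions, and on each of them the two tests agree: `f_all` is grounded, while `f_top_only` is not, because world 0 is relevant (it is selected from the full domain) yet nothing is selected from {0}.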
, and vice versa. The idea of diffuse bisimulations is that every element belongs to a set of relevant worlds which is connected to the other model.
Lemma 14. Any diffuse conditional bisimulation between grounded models is two-ways surjective.
Proof. Let M_1 and M_2 be such models and suppose Z ⊆ W_1 × W_2 is a diffuse conditional bisimulation. Suppose moreover that Z is not two-ways surjective, say because there is an x ∈ W_1 with no counterpart in W_2. Take {x} and ∅ and notice that they fulfill the preconditions of the property of conditional bisimulation: Z[{x}] = ∅ ⊆ ∅ and Z⁻¹[∅] = ∅ ⊆ {x}. Since the bisimulation is diffuse we know that there are w ∈ W_1 and w' ∈ W_2 such that (w, w') ∈ Z and x ∈ W_1^w. From the latter fact we infer that {x} ∩ W_1^w ≠ ∅, thus by the fact that M_1 is grounded we conclude that f_1(w, {x}) ≠ ∅, that is, f_1(w, {x}) = {x}. However, by the first condition on selection functions we have f_2(w', ∅) ⊆ ∅, so there can be no counterpart for x, contradiction. The other direction is proved analogously.
Proposition 15. Restricted to any class of grounded models, the notion of diffuse conditional bisimulation is closed under relational composition.
Proof. Suppose M_1, M_2 and M_3 are three grounded models and Z_1 ⊆ W_1 × W_2 and Z_2 ⊆ W_2 × W_3 are two diffuse conditional bisimulations connecting them. To show that their relational composition Z_1 ; Z_2 is also a diffuse conditional bisimulation we first need to show that it is not empty. Since Z_1 is not empty we know that there is (w, w') ∈ Z_1. By the previous Lemma we know that Z_1 and Z_2 are two-ways surjective. The latter fact ensures that there is some w'' such that (w', w'') ∈ Z_2, thus (w, w'') ∈ Z_1 ; Z_2.
For the property of conditional bisimulation, suppose (w, w'') ∈ Z_1 ; Z_2. By definition it means that there is a w' such that (w, w') ∈ Z_1 and (w', w'') ∈ Z_2. Now consider two sets X ⊆ W_1 and X'' ⊆ W_3 such that (Z_1 ; Z_2)[X] ⊆ X'' and (Z_1 ; Z_2)⁻¹[X''] ⊆ X. What we need to show is that for every x ∈ f_1(w, X) there is a z ∈ f_3(w'', X'') such that (x, z) ∈ Z_1 ; Z_2, and vice versa. Define the intermediate set X' := Z_1[X] ∪ Z_2⁻¹[X''] ⊆ W_2. We first check that X and X' satisfy the preconditions for Z_1, namely Z_1[X] ⊆ X' and Z_1⁻¹[X'] ⊆ X.
The first item holds by definition of X'. For the second one suppose (x, y) ∈ Z_1 and y ∈ X'. By two-ways surjectivity of Z_2 we know that there is a z such that (y, z) ∈ Z_2, hence (x, z) ∈ Z_1 ; Z_2. By definition of X' we can now make a case distinction. In the first case there is an element x' ∈ X such that (x', y) ∈ Z_1. We can then conclude that (x', z) ∈ Z_1 ; Z_2 and thus by assumption (Z_1 ; Z_2)[X] ⊆ X'' we have z ∈ X''. But then by the latter fact and (x, z) ∈ Z_1 ; Z_2, coupled with (Z_1 ; Z_2)⁻¹[X''] ⊆ X, we can infer that x ∈ X. In the second case we have that there is a z' ∈ X'' such that (y, z') ∈ Z_2. This gives us immediately that (x, z') ∈ Z_1 ; Z_2 and thus by assumption (Z_1 ; Z_2)⁻¹[X''] ⊆ X we obtain x ∈ X.
Since X and X' fulfill the preconditions of the property of conditional bisimulation for Z_1, we can deduce that for every x ∈ f_1(w, X) there is a y ∈ f_2(w', X') such that (x, y) ∈ Z_1. We can now repeat the same proof strategy for X' and X'' and apply the property of Z_2 to obtain that for every y ∈ f_2(w', X') there is a z ∈ f_3(w'', X'') such that (y, z) ∈ Z_2. Concatenating this with the previous result we get the desired conclusion: for every x ∈ f_1(w, X) there is a z ∈ f_3(w'', X'') such that (x, z) ∈ Z_1 ; Z_2. The converse is proved symmetrically.
Proposition 16. Restricted to grounded models and diffuse conditional bisimulations, the relation of bisimilarity is an equivalence relation.
Proof. We need to show that the relation of bisimilarity is reflexive, symmetric and transitive. For reflexivity, it is immediate to see that the identity relation is a diffuse conditional bisimulation. The definition of diffuse conditional bisimulation is itself symmetric, hence the converse of a diffuse conditional bisimulation is always a diffuse conditional bisimulation; the symmetry for bisimilarity follows. As for transitivity, Proposition 15 ensures that if there are two diffuse conditional bisimulations Z_1 and Z_2 such that (w, w') ∈ Z_1 and (w', w'') ∈ Z_2 then there is a diffuse conditional bisimulation containing the pair (w, w''), namely the relational composition Z_1 ; Z_2.
We will see that in the next two sections these restrictions vanish, because in those particular settings all models are grounded and all bisimulations are diffuse. In later sections we will encounter examples where the restriction does limit the scope of our results; we then characterize the grounded models and diffuse bisimulations in those particular contexts.

Plausibility Models
We now turn to applications, discussing our first example of conditional modality: conditional belief interpreted on plausibility models. Plausibility models are widely used in formal epistemology [4,6], while their introduction can be traced back at least to [22]. They consist of a carrier set, to be understood as a collection of possible worlds, and a preorder for each world, intuitively representing how an agent ranks the possible scenarios in terms of plausibility, from the perspective of the current world. The strict relation <_w is defined as usual from ≤_w. Given a set X ⊆ W, let
Min_w(X) = {v ∈ X | there is no u ∈ X such that u <_w v}
We can think of Min_w(X) as the set of most plausible worlds in X with respect to w. When we want to specify the ordering we write Min_{≤_w}(X).
Among the variety of operators that are studied in the setting of plausibility models, a prominent part is played by the operator of conditional belief, usually written as B^ψ φ. The standard belief operator can be defined via the conditional one as B^⊤ φ. On plausibility models the semantic clauses for belief and conditional belief are:
M, w ⊨ Bφ iff Min_w(W) ⊆ ⟦φ⟧_M
M, w ⊨ B^ψ φ iff Min_w(⟦ψ⟧_M) ⊆ ⟦φ⟧_M
The notion of bisimulation for the standard belief operator on plausibility models, together with the corresponding theorem, are both folklore.
Definition 18. Given two plausibility models M_1 and M_2, a plausibility B-bisimulation is a non-empty relation Z ⊆ W_1 × W_2 such that, if (w, w') ∈ Z, then
• w and w' satisfy the same atomic propositions;
• for every x ∈ Min_w W_1 there is y ∈ Min_{w'} W_2 such that (x, y) ∈ Z, and vice versa.
Theorem 19. Bisimilarity with respect to plausibility B-bisimulation entails modal equivalence with respect to the language with only the belief operator. On models having finitely many minimal elements, modal equivalence with respect to the latter language entails bisimilarity for plausibility B-bisimulation.
The proof is a straightforward variation of the standard Hennessy-Milner argument.

Plausibility CB-Bisimulation
To obtain a bisimulation for conditional belief on plausibility models we show how the latter are an instance of conditional models; this move will indicate a systematic way to specialize the results of Section 2 to this particular context.
Proof. We need to check that the newly defined f fulfills the prerequisites of selection functions in Definition 1. The first condition on selection functions is fulfilled by the very definition of Min_w. For the second one, suppose X ⊆ Y, Min_w Y ⊆ X and take x ∈ Min_w Y. Since X ⊆ Y, if there is no element below x in Y then a fortiori there is no element below it in the subset X, thus in this circumstance x ∈ Min_w X. For the other inclusion take x' ∈ Min_w X; we show x' is also minimal for Y. By contradiction, suppose there is z ∈ Y \ X such that z <_w x'. Since we are in a well-founded model there must be a minimal element z' ∈ Min_w Y such that z' ≤_w z; but by assumption Min_w Y ⊆ X, hence z' ∈ X and z' <_w x', contradicting the fact that x' is minimal in X.
Notice that, setting f (w, X) = Min w X, the definition of the satisfaction relation for conditional belief becomes an instance of the satisfaction relation for conditional modalities given in Section 2. If we now replace the new f in Definition 5, we obtain a new notion of bisimulation for conditional belief on plausibility models.
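The passage from plausibility orders to selection functions can be illustrated on a small finite model. This is a sketch under stated assumptions: the preorder is given as a set of pairs (u, v) meaning u ≤ v, Min(X) collects the elements of X with no strictly smaller element in X, and the names `min_worlds`, `conditional_belief_extension` and `leq_at` are hypothetical.

```python
def strictly_below(leq, u, v):
    """u < v: u is at least as plausible as v, but not the other way round."""
    return (u, v) in leq and (v, u) not in leq


def min_worlds(leq, xs):
    """Min(X): elements of X with no strictly more plausible element in X."""
    return frozenset(v for v in xs
                     if not any(strictly_below(leq, u, v) for u in xs))


def conditional_belief_extension(worlds, leq_at, psi_ext, phi_ext):
    """Worlds w where Min_w([[psi]]) ⊆ [[phi]], i.e. f(w, X) = Min_w(X)."""
    psi_ext, phi_ext = frozenset(psi_ext), frozenset(phi_ext)
    return frozenset(w for w in worlds
                     if min_worlds(leq_at(w), psi_ext) <= phi_ext)


WORLDS = frozenset({0, 1, 2})
# Hypothetical uniform ordering: lower numbers are more plausible.
CHAIN = {(u, v) for u in WORLDS for v in WORLDS if u <= v}


def leq_at(w):
    """The same plausibility order at every world (an assumption of the example)."""
    return CHAIN
```

With this ordering, conditioning on {1, 2} selects world 1, so a formula true exactly at world 1 is believed conditional on {1, 2} at every world, while a formula true only at world 2 is not.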
Definition 22. Given two plausibility models M_1 and M_2, a plausibility CB-bisimulation is a non-empty relation Z ⊆ W_1 × W_2 such that, if (w, w') ∈ Z, then
• w and w' satisfy the same atomic propositions;
• for all X ⊆ W_1 and X' ⊆ W_2 such that Z[X] ⊆ X' and Z⁻¹[X'] ⊆ X we have that for every x ∈ Min_w X there exists a y ∈ Min_{w'} X' such that (x, y) ∈ Z, and vice versa.
Since finite plausibility models are well-founded, we can now transfer the results of Section 2 on the correspondence between bisimilarity and modal equivalence. Throughout this section and the following one we use 'modal equivalence' meaning with respect to the language of conditional belief.
We can also import the results concerning the closure under union and relational composition. First note that, with the current definition of f, the notation W^w trivializes: W^w = W for every world w.
Proof. Given a well-founded plausibility model M and X ⊆ W, if X ∩ W^w ≠ ∅ then X ∩ W ≠ ∅, so actually X ≠ ∅. So by well-foundedness f(w, X) = Min_w X ≠ ∅. This shows that the model is grounded. For the second part of the claim, let Z ⊆ W_1 × W_2 be a plausibility CB-bisimulation and x ∈ W_1. Since the bisimulation is non-empty, there are (w, w') ∈ Z and furthermore x ∈ W_1 = W_1^w, hence Z is diffuse.
Proposition 25. On the class of well-founded plausibility models, the notion of plausibility CB-bisimulation is closed under arbitrary unions and relational composition.

Undefinability
In this subsection we put the new notion of bisimulation to use, addressing the problem of inter-definability between conditional belief and other widely used operators. For the rest of this section we employ plausibility models where ≤_w is the same for all w (we thus remove the subscript). We begin with the operator of safe belief introduced in [5]:
M, w ⊨ [≤]φ iff M, v ⊨ φ for all v ≤ w
The dual operator is customarily defined as ⟨≤⟩φ := ¬[≤]¬φ.
Proposition 26. On plausibility models, safe belief is not definable in terms of the conditional belief operator.
Proof. Suppose ⟨≤⟩p is definable by a formula α in the language of conditional belief. Consider the two models depicted on the left and right side of the following picture (we use rounded rectangles to set apart the worlds of the left-hand model), where we draw a ← b to mean a ≤ b. We omit reflexive arrows. We indicate within parentheses the propositional atoms that are true at every world and with Z a CB-bisimulation between the two models:
[Figure: two plausibility models and the CB-bisimulation Z between them]
It is easy to see that the minimal elements of these pairs are connected by the bisimulation. Given that α is a formula in the language of conditional belief, it will be invariant between states that are bisimilar according to a CB-bisimulation. However, ⟨≤⟩p is true in the second model at 4 but false in the first model at 1, as the reader can check; contradiction.
Notice that the CB-bisimulation Z of this counterexample is not a bisimulation for safe belief, since it fails to satisfy the zig-zag condition: there are worlds 1, 5 and 4 such that (1, 4) ∈ Z and 5 ≤ 4 but no world w such that w ≤ 1 and (w, 5) ∈ Z. We now address the case of the strong belief operator, also introduced in [5].

Proposition 27. On plausibility models, strong belief is not definable in terms of the conditional belief operator.
Proof. Again, suppose Sbp is definable by a formula α in the language of conditional belief. Consider the two models displayed below, where Z is a CB-bisimulation and the propositional variables are attached to worlds as before:
[Figure: two plausibility models and the CB-bisimulation Z between them]
The formula α in the language of conditional belief will be invariant between states that are bisimilar according to a CB-bisimulation; nevertheless, Sbp is true in the first model at 1 but false in the second model at 4, thus α will be true in one world and not in the other: contradiction.
We now turn our attention to the definability of the conditional belief operator itself. We first warm up with a definition and two auxiliary observations.
Definition 28. A BSB-bisimulation, a bisimulation for standard belief and safe belief, is a B-bisimulation satisfying an additional condition, namely the usual zig-zag condition for the ≤ relation: given two plausibility models M and M' and two worlds w and w' in the respective models, if (w, w') ∈ Z then for every v ≤ w there is a v' ≤ w' such that (v, v') ∈ Z, and vice versa.
Proposition 29. On plausibility models, if two states w and w' are in a BSB-bisimulation then they are modally equivalent with respect to the language containing the belief and safe belief operators.
Proof. Straightforward induction on the complexity of the formula.

Proposition 30. On plausibility models, conditional belief is not definable in terms of the language containing the operators of safe belief and standard belief.
Proof. Suppose B^{¬p} q is definable by a formula α in the language of belief and safe belief. Consider the two models displayed below, where Z is a BSB-bisimulation and the propositional variables are attached to worlds as before:
[Figure: two plausibility models and the BSB-bisimulation Z between them]
Since 2 and 4 are in a BSB-bisimulation, by Proposition 29 they are modally equivalent in the language of belief and safe belief. Thus we can conclude 2 ⊨ α iff 4 ⊨ α. But B^{¬p} q holds at one of the worlds 2 and 4 and fails at the other, contradiction.
Notice that the bisimulation used in this counterexample is not a plausibility CB-bisimulation.

Evidence Models
We now change the semantics of the belief operator to evidence models, showing how the passage to conditional belief in this different setting follows the same pattern as in plausibility models; this allows us to conclude that the generalization from un-conditional to conditional modality works uniformly across semantics (see item 4 in our checklist in the "Introduction").
Evidence models, introduced in [8], are structures capturing the evidence available to an agent in different possible worlds. The evidence available at a world w is represented via a family of sets of possible worlds: intuitively each set in the family constitutes a piece of evidence that the agent can use to draw conclusions at w. They constitute a generalization over plausibility models, but can be collapsed to plausibility models by considering the specialization preorder induced by the sets of evidence, however not without loss of information.
Definition 31. An evidence model is a tuple M = ⟨W, E, V⟩ with W a non-empty set of worlds, a function E : W → ℘(℘(W)) and V : W → ℘(At) a valuation function.
We write E(w) for the family of sets that E assigns to w. We furthermore assume W ∈ E(w) and ∅ ∉ E(w) for all w ∈ W.
The last requirement ensures that at every possible world the agent has trivial evidence, namely the whole set W, and does not have inconsistent evidence, i.e. the empty set.
Definition 32. A w-scenario is a maximal family 𝒳 ⊆ E(w) having the finite intersection property (abbreviated as 'f.i.p.'), that is, for each finite subfamily {X_1, ..., X_n} ⊆ 𝒳 we have ⋂_{1≤i≤n} X_i ≠ ∅. Given a set X ⊆ W and a collection 𝒳 ⊆ E(w), the latter has the f.i.p. relative to X if for each finite subfamily {X_1, ..., X_n} ⊆ 𝒳^X = {Y ∩ X | Y ∈ 𝒳} we have ⋂_{1≤i≤n} X_i ≠ ∅. We say that 𝒳 is a w-X-scenario if it is a maximal family with the f.i.p. relative to X.
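On a finite evidence model, scenarios can be enumerated explicitly. The sketch below is only illustrative and relies on an observation specific to the finite case: a finite family has the f.i.p. relative to X exactly when the intersection of all its members with X is non-empty. The empty family is treated here as having intersection X, which is a convention adopted for this example; the names `has_fip`, `scenarios` and the sample evidence sets are hypothetical.

```python
from itertools import combinations


def subfamilies(evidence):
    """All subfamilies of a finite collection of evidence sets."""
    items = list(evidence)
    return [frozenset(c) for r in range(len(items) + 1)
            for c in combinations(items, r)]


def has_fip(family, xs):
    """Finite case: f.i.p. relative to X iff X ∩ (all members) is non-empty."""
    common = frozenset(xs)
    for piece in family:
        common &= piece
    return bool(common)


def scenarios(evidence, xs):
    """Maximal subfamilies of the evidence with the f.i.p. relative to X."""
    consistent = [fam for fam in subfamilies(evidence) if has_fip(fam, xs)]
    return {fam for fam in consistent
            if not any(fam < other for other in consistent)}


WORLDS = frozenset({0, 1, 2})
EV_A = frozenset({0, 1})
EV_B = frozenset({2})
EVIDENCE = [EV_A, EV_B, WORLDS]  # evidence available at some fixed world w
```

Taking X to be the whole domain gives the plain w-scenarios, here {EV_A, W} and {EV_B, W}; relative to X = {0, 1} the piece EV_B becomes incompatible and only {EV_A, W} remains. The enumeration is exponential in the number of evidence sets, so it is meant for toy examples only.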
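The semantics for belief and conditional belief on evidence models is:
M, w ⊨ Bφ iff for every w-scenario 𝒳 we have ⋂𝒳 ⊆ ⟦φ⟧_M
M, w ⊨ B^ψ φ iff for every w-⟦ψ⟧_M-scenario 𝒳 we have ⋂𝒳^{⟦ψ⟧_M} ⊆ ⟦φ⟧_M
The notion of bisimulation for the standard belief operator on evidence models establishes a connection between the scenarios of the two models:
Definition 33. Given two evidence models M_1 and M_2, an evidence B-bisimulation is a non-empty relation Z ⊆ W_1 × W_2 such that, if (w, w') ∈ Z, then
• w and w' satisfy the same atomic propositions;
• for every w-scenario 𝒳 and x ∈ ⋂𝒳 there is a w'-scenario 𝒴 and y ∈ ⋂𝒴 such that (x, y) ∈ Z, and vice versa.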
The following result can be proven via the standard line of reasoning.
Theorem 34. Bisimilarity with respect to evidence B-bisimulation entails modal equivalence with respect to the language with only the belief operator. On finite models, modal equivalence with respect to the latter language entails bisimilarity for evidence B-bisimulation.
(Footnote 9, continued: of [18] also constitute an example of neighborhood models with a close tie to relational structures.)

Evidence CB-Bisimulation
We first show that finite evidence models are an example of conditional models by means of two auxiliary lemmas.
Lemma 35. On finite models, suppose $Y \supseteq X$. Then for every w-

Proof. Let $\mathcal{X}$ be a w-X-scenario. Clearly $\mathcal{X}$ already has the f.i.p. relative to $Y$. Enumerate the sets $K$ in $E(w)$ (there are finitely many), then proceed following the enumeration: if $K \in \mathcal{X}$ or $\mathcal{X} \cup \{K\}$ has the f.i.p. relative to $Y$ then put $K$ in $\mathcal{Y}$, otherwise not. Because of the first condition we get $\mathcal{X} \subseteq \mathcal{Y}$, while from the second one we obtain that $\mathcal{Y}$ is a w-Y-scenario.

For the second claim, enumerate the sets in $\mathcal{Y}$: $K_0, \dots, K_m$. Construct $\mathcal{X}$ in stages beginning from $\mathcal{X}_0 = \emptyset$ and putting

To see that $\mathcal{X}$ is maximal with the f.i.p. relative to $X$, suppose that there is $K \notin \mathcal{X}$ such that $\bigcap \mathcal{X}^X \cap K \neq \emptyset$. By construction, if $\bigcap \mathcal{X}^X \cap K \neq \emptyset$ and $K \notin \mathcal{X}$ then $K \notin \mathcal{Y}$, hence by the maximality of $\mathcal{Y}$ it must be that $\bigcap \mathcal{Y}^Y \cap K = \emptyset$. Since $\bigcap \mathcal{X}^X \subseteq \bigcap \mathcal{Y}^Y$ by construction, we get a contradiction. Therefore $\mathcal{X}$ is maximal with the f.i.p. relative to $X$.
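Proof. Say $y \in \bigcap \mathcal{Y}^Y$ and $y \notin Y \setminus X$. Then, since $y \in Y$, it must be that $y \in X$. Since $y \in \bigcap \mathcal{Y}^Y$ we have that $y \in K$ for all $K \in \mathcal{Y}$, and hence $y \in K$ for all $K \in \mathcal{X}$. So $y \in \bigcap \mathcal{X}^X$. We can thus conclude that, if $y \notin Y \setminus X$ for all $y \in \bigcap \mathcal{Y}^Y$, then $\bigcap \mathcal{X}^X \supseteq \bigcap \mathcal{Y}^Y$. For the other inclusion suppose $z \in \bigcap \mathcal{X}^X$ but not in $\bigcap \mathcal{Y}^Y$. Then there must be $K \in \mathcal{Y}$ such that $K \notin \mathcal{X}$ and $z \notin K$. By maximality of $\mathcal{X}$ it must be that $K$ has empty intersection with $\bigcap \mathcal{X}^X$. Under the assumption that no element $y \in \bigcap \mathcal{Y}^Y$ is in $Y \setminus X$, the latter fact entails that $\bigcap \mathcal{Y}^Y$ must be empty, a contradiction. Hence there can be no element $z$ that is in $\bigcap \mathcal{X}^X$ but not in $\bigcap \mathcal{Y}^Y$.

Proposition 37. Finite evidence models are conditional models, where $f(w, X) = \bigcup \{\bigcap \mathcal{X}^X \mid \mathcal{X} \text{ is a w-X-scenario}\}$.

Proof. The satisfaction of the first property is ensured by the definition of $\bigcap \mathcal{X}^X$: since each $\bigcap \mathcal{X}^X$ lies within $X$, the union will also be contained in $X$.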
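For the second property suppose $Y \supseteq X$ and $f(w, Y) \subseteq X$. If $x \in f(w, Y)$ then there is a w-Y-scenario $\mathcal{Y}$ such that $x \in \bigcap \mathcal{Y}^Y$. By Lemma 35 we know there is a w-X-scenario $\mathcal{X}$ such that $\mathcal{X} \subseteq \mathcal{Y}$. By Lemma 36 either $x \in \bigcap \mathcal{X}^X$ or $x \in Y \setminus X$. But the latter cannot be the case because $x \in X$ by assumption, so $x \in \bigcap \mathcal{X}^X$. Then we can conclude that $x \in f(w, X)$.
, so by the second part of Lemma 36 we can conclude that $\bigcap \mathcal{X}^X = \bigcap \mathcal{Y}^Y$. This gives us $x \in \bigcap \mathcal{Y}^Y$ and thus $x \in f(w, Y)$.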
Notice that, setting $f(w, X) = \bigcup \{\bigcap \mathcal{X}^X \mid \mathcal{X} \text{ is a w-X-scenario}\}$, the definition of the satisfaction relation for conditional belief on evidence models becomes an instance of the satisfaction relation for conditional modalities given in Section 2. Substituting this $f$ in Definition 5, we obtain a new notion of bisimulation for conditional belief on evidence models.
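Definition 38. Given two evidence models $M_1$ and $M_2$, an evidence CB-bisimulation is a non-empty relation $Z \subseteq W_1 \times W_2$ such that, whenever $(w, w') \in Z$,
• for all $X \subseteq W_1$ and $X' \subseteq W_2$ such that $Z[X] \subseteq X'$ and $Z^{-1}[X'] \subseteq X$ we have that for every w-X-scenario $\mathcal{X}$ and $x \in \bigcap \mathcal{X}^X$ there is a w'-X'-scenario $\mathcal{Y}$ and $y \in \bigcap \mathcal{Y}^{X'}$ such that $(x, y) \in Z$, and vice versa.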
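On finite models the scenarios of Definition 32, and the selection function built from them, can be computed by brute force. The following Python sketch is only an illustration: the function names `scenarios` and `select` and the example family are assumptions made here, not part of the paper, and the empty set $X$ is treated by convention as having no scenarios.

```python
from itertools import combinations

def scenarios(evidence, X):
    """Maximal subfamilies of `evidence` with the f.i.p. relative to X.

    `evidence` plays the role of the finite family E(w). On a finite family,
    the f.i.p. relative to X holds exactly when X ∩ ⋂family is non-empty.
    """
    pieces = list({frozenset(e) for e in evidence})
    found = []
    # Visit larger families first, so that every non-maximal family
    # meets a strict superset that is already in `found`.
    for size in range(len(pieces), 0, -1):
        for combo in combinations(pieces, size):
            family = frozenset(combo)
            core = frozenset(X).intersection(*family)
            if core and not any(family < bigger for bigger in found):
                found.append(family)
    return found

def select(evidence, X):
    """f(w, X): the union of X ∩ ⋂family over all w-X-scenarios."""
    result = set()
    for family in scenarios(evidence, X):
        result |= frozenset(X).intersection(*family)
    return result

# Hypothetical evidence family E(w) on W = {1, 2, 3}.
E_w = [{1, 2, 3}, {1, 2}, {2, 3}, {3}]
print(sorted(select(E_w, {1, 2})))   # [2]
print(sorted(select(E_w, {1, 3})))   # [1, 3]
```

In the first call world 1 is dropped because the pieces {1, 2} and {2, 3} can be combined into a single scenario relative to {1, 2}; in the second call they cannot, so both 1 and 3 survive. The enumeration is exponential in the size of E(w) and is meant for small examples only.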
We can now specialize the results of Section 2: bisimilarity in the sense of Definition 38 corresponds to modal equivalence on finite evidence models.
Theorem 39. Given two evidence models $M_1$ and $M_2$, if $(w, w') \in Z \subseteq W_1 \times W_2$, where $Z$ is an evidence CB-bisimulation, then $w$ and $w'$ are modally equivalent. On finite evidence models, if $w$ and $w'$ are modally equivalent then $(w, w') \in Z \subseteq W_1 \times W_2$, where $Z$ is an evidence CB-bisimulation.
As for plausibility models, we can infer the results concerning closure under unions and relational composition. Also in this context the definition of $f$ renders the notation $W_w$ trivial.

$f(w, \{x\})$ is not empty. To find the desired w-$\{x\}$-scenario $\mathcal{X}$, take the family of all the sets in $E(w)$ containing $x$. This family is non-empty, since $W \in E(w)$ for every $w$ in the domain of the model. Clearly this family is maximal with the f.i.p. relative to $\{x\}$ (indeed, it is the only such family), so we are done.

We can thus derive that, for this particular $f$, $W_w = W$ for every $w \in W$. In other words, all the worlds in the model are relevant for every $w \in W$.
Lemma 41. Every evidence model is a grounded conditional model and every evidence CB-bisimulation is diffuse.
Proof. Thanks to the previous Lemma we can appeal to Lemma 11 and conclude that evidence models are grounded. For the second part of the claim, let $Z \subseteq W_1 \times W_2$ be an evidence CB-bisimulation and $x \in W_1$. Since the bisimulation is non-empty, there are $(w, w') \in Z$ and furthermore $x \in W_w$.

Proposition 42. The notion of evidence CB-bisimulation is closed under arbitrary unions and relational composition.

Undefinability
Thanks to the now clearly defined bisimulation for conditional belief, we can give a precise argument for the undefinability of conditional belief in terms of standard belief.
Proposition 43. On evidence models, conditional belief is not definable in terms of the standard belief operator.
Proof. Suppose $B^p q$ is definable by a formula $\alpha$ in the language of standard belief. Consider the two models depicted on the left and right side of the following picture, where we indicate within parentheses the propositional atoms that are true at every world and with $Z$ an evidence B-bisimulation between the two models:

[Figure: the two evidence models, with the relation $Z$ between them.]

The evidence available at each world is:

The reader can check that the relation $Z$ is an evidence B-bisimulation. Given that $\alpha$ is a formula in the language of belief, it will be invariant between states that are bisimilar according to a B-bisimulation. However, $B^p q$ is true in the second model at 4 but false in the first model at 1: there is

Hence we obtain a contradiction.
Note that the relation $Z$ is not an evidence CB-bisimulation: the sets of worlds satisfying $p$ in the two models satisfy the prerequisites, since they are sent into each other by $Z$, but fail with respect to the main property, since there is a $1$-$[\![p]\!]_{M_1}$-scenario $\mathcal{X}$ and an element in $\bigcap \mathcal{X}^{[\![p]\!]_{M_1}}$, namely 2, that has no bisimilar counterpart in the second model.
Another important operator to describe the features of evidence models is the so-called evidence modality [8].
It was shown in [8] that, on evidence models, standard belief cannot be defined in terms of the evidence modality. Since standard belief is definable in terms of conditional belief, we can conclude that also conditional belief is not definable via the evidence modality. Here we show that also the converse is the case.
Proposition 44. On evidence models, the evidence modality is not definable in terms of the conditional belief operator.
Proof. Suppose $\Box p$ is definable by a formula $\alpha$ in the language of conditional belief. Consider the two models depicted on the left and right side of the following picture, where we indicate within parentheses the propositional atoms that are true at every world and with $Z$ a CB-bisimulation between the two models:

[Figure: the two evidence models, with the relation $Z$ between them.]

We take both models to be uniform, where

The reader can check that with this evidence the relation $Z$ is a CB-bisimulation. Given that $\alpha$ is a formula in the language of conditional belief, it will be invariant between states that are bisimilar according to a CB-bisimulation. Nevertheless, $\Box p$ is true in the first model at 1 but false in the second model at 3: in the first model there is an evidence set contained in the extension of $p$, namely $\{2\}$, while there is no such set in the second model; contradiction.

Relativized Common Knowledge
We now introduce a third example, the conditional modality known as relativized common knowledge, defined in [9]. Let $M = \langle W, \{R_a\}_{a \in A}, V \rangle$ be a multi-agent Kripke model, where $W$ is a non-empty set of worlds, $R_a \subseteq W \times W$ is a relation for each agent $a \in A$, and $V : W \to \wp(At)$ is a valuation function. Put $R := \bigcup_{a \in A} R_a$ and denote by $R^+$ its transitive closure. The operator of relativized common knowledge, denoted by $C(\varphi, \psi)$, is meant to capture the intuition that every $R$-path which consists exclusively of $\varphi$-worlds ends in a world satisfying $\psi$. Formally:

Moreover, our semantics for conditionals for this $f$ coincides with the above semantics for $C(\varphi, \psi)$.
Proof. Again we check the prerequisites of selection functions in Definition 1. Clearly all the worlds reachable with a path in $X$ will also lie in $X$, hence the first condition on selection functions is satisfied. For the second one,

. Hence there is a chain of $Y$-worlds leading to $x'$. We show $x' \in f(w, X)$ by induction on the length of the chain. The base case: by reflexivity $w \in f(w, Y) \subseteq X$, so we also have a chain of length 0 contained in $X$, i.e. $w \in f(w, X)$. Suppose now $x \in f(w, X)$ for all $x \in f(w, Y)$ reachable with a chain of $Y$-worlds of length $\le n$. Now say $x' \in f(w, Y)$ is reachable with a chain of $Y$-worlds of length $n + 1$. By $x' \in f(w, Y) \subseteq X$ we know that also $x' \in X$, thus the whole chain is in $X$ and $x' \in f(w, X)$.
For the other inclusion, it is straightforward to see that X ⊆ Y immediately entails f (w, X) ⊆ f (w, Y ).
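The selection function for relativized common knowledge, $f(w, X) = \{v \mid (w, v) \in (R \cap (W \times X))^+\}$, is a reachability computation, so the two conditions checked in the proof can be tested exhaustively on a small model. The sketch below is illustrative only: the names `rck_select` and `satisfies_selection_conditions` and the example relation are assumptions, and the two conditions are encoded as they are used in the proofs of this paper (first, $f(w, X) \subseteq X$; second, $X \subseteq Y$ and $f(w, Y) \subseteq X$ imply $f(w, X) = f(w, Y)$).

```python
from itertools import combinations

def rck_select(R, w, X):
    """Worlds reachable from w by a non-empty R-path whose steps all land in X.

    R is the union of the agents' relations, given as a set of pairs.
    """
    reached, frontier = set(), [w]
    while frontier:
        u = frontier.pop()
        for (a, b) in R:
            if a == u and b in X and b not in reached:
                reached.add(b)
                frontier.append(b)
    return reached

def powerset(worlds):
    items = list(worlds)
    return [set(c) for n in range(len(items) + 1) for c in combinations(items, n)]

def satisfies_selection_conditions(R, worlds):
    """Brute-force check of the two conditions on selection functions."""
    for w in worlds:
        for Y in powerset(worlds):
            f_Y = rck_select(R, w, Y)
            if not f_Y <= Y:                      # first condition
                return False
            for X in powerset(Y):                 # all X ⊆ Y
                if f_Y <= X and rck_select(R, w, X) != f_Y:
                    return False                  # second condition fails
    return True

# Hypothetical model: a path 1 -> 2 -> 3 -> 4 with a back edge 4 -> 2.
R = {(1, 2), (2, 3), (3, 4), (4, 2)}
print(sorted(rck_select(R, 1, {2, 3})))           # [2, 3]
print(sorted(rck_select(R, 1, {3, 4})))           # []
print(satisfies_selection_conditions(R, {1, 2, 3, 4}))  # True
```

The second call returns the empty set because the only edge leaving world 1 leads to world 2, which lies outside the chosen set.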
Substituting this $f$ in Definition 5, we obtain a notion of bisimulation for relativized common knowledge on Kripke models.
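Definition 46. Given two Kripke models $M_1$ and $M_2$, a bisimulation for relativized common knowledge, or RCK-bisimulation, is a non-empty relation $Z \subseteq W_1 \times W_2$ such that, whenever $(w, w') \in Z$,
• for all $X \subseteq W_1$ and $X' \subseteq W_2$ such that $Z[X] \subseteq X'$ and $Z^{-1}[X'] \subseteq X$ we have that for every $x$ such that $(w, x) \in (R_1 \cap (W_1 \times X))^+$ there exists a $y$ such that $(w', y) \in (R_2 \cap (W_2 \times X'))^+$ and $(x, y) \in Z$, and vice versa.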
We can now derive our previous results for this specific setting. In this section 'modal equivalence' is understood with respect to the language containing only the usual propositional connectives and the relativized common knowledge operator.
Theorem 47. Given two Kripke models $M_1$ and $M_2$, if $(w, w') \in Z \subseteq W_1 \times W_2$, where $Z$ is an RCK-bisimulation, then $w$ and $w'$ are modally equivalent. On finite models, if $w$ and $w'$ are modally equivalent then they are RCK-bisimilar.
The closure under unions also follows. As for composition, note that the notion of relevant worlds for $w$, denoted by $W_w$, starts to play a significant part, limiting the scope of our general results. Putting together the definition $W_w = \bigcup_{Y \subseteq W} f(w, Y)$ and $f(w, X) := \{v \mid (w, v) \in (R \cap (W \times X))^+\}$, $W_w$ becomes the set of all the worlds reachable from $w$ via an $R$-path (just substitute $W$ for $X$ in the definition of $f(w, X)$). Formally, $W_w = \{v \mid (w, v) \in R^+\}$. We can then characterize the grounded Kripke models.
Proposition 48. A Kripke model M is grounded iff, for every w, x ∈ W , if (w, x) ∈ R + then there is an agent a such that (w, x) ∈ R a .
Proof. Let M be grounded. By Lemma 11 This entails that there is an edge (w, x) ∈ R, thus there must be an agent a such that (w, x) ∈ R a .
For the other direction, let $X \subseteq W$ and $w \in W$ and suppose $X \cap W_w \neq \emptyset$. Then there is $x \in X$ such that $(w, x) \in R^+$. By our assumption on the model we know there is an agent $a$ such that $(w, x) \in R_a$. This is enough to conclude $(w, x) \in R$ and thus $x \in f(w, X)$.

In this context, a bisimulation $Z \subseteq W_1 \times W_2$ is diffuse if, for every $x \in W_1$, there are $w \in W_1$ and $w' \in W_2$ such that $(w, w') \in Z$ and $x$ can be reached from $w$ via an $R$-path (and vice versa).
Proposition 49. On grounded Kripke models, diffuse bisimulations are closed under relational composition.
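The characterization in Proposition 48 says that a Kripke model is grounded exactly when every pair in $R^+$ is already an edge of some agent, that is, when $R^+ \subseteq R$. The following sketch checks this on finite models; the dictionary representation of the agents' relations and the names `transitive_closure`, `relevant_worlds` and `is_grounded` are assumptions made for illustration.

```python
def transitive_closure(R):
    """Smallest transitive relation containing R (naive fixpoint iteration)."""
    closure = set(R)
    while True:
        new = {(a, d) for (a, b) in closure for (c, d) in closure if b == c}
        new -= closure
        if not new:
            return closure
        closure |= new

def relevant_worlds(relations, w):
    """W_w = {v | (w, v) in R+}, with R the union of the agents' relations."""
    R = set().union(*relations.values())
    return {v for (u, v) in transitive_closure(R) if u == w}

def is_grounded(relations):
    """Condition of Proposition 48: every pair in R+ belongs to some R_a."""
    R = set().union(*relations.values())
    return transitive_closure(R) <= R

# Hypothetical two-agent models on worlds 1, 2, 3.
chain = {"a": {(1, 2)}, "b": {(2, 3)}}
closed = {"a": {(1, 2), (1, 3)}, "b": {(2, 3)}}
print(sorted(relevant_worlds(chain, 1)))  # [2, 3]
print(is_grounded(chain))                 # False: (1, 3) is in R+ but in no R_a
print(is_grounded(closed))                # True
```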

Generalization to Multi-agent Models
We have seen how our framework covers different conditional modalities, even when the same operator is interpreted on different semantics. Now we address the question: can we extend the analysis of Section 3 to cover the multi-agent case? Given a set of agents $A$, the language we are interested in will look like

where the conditional operator indexed by $a$ denotes the modality for agent $a$. This leads to an easy generalization of conditional models. The set of agents is nothing more than a set of labels for different selection functions, co-existing in the same models but essentially independent from each other. Instead of different agents, different labels could indicate different operators expressing distinct features of the models, depending on the interpretation. The semantic clause for the conditional modalities becomes:

for every $a \in A$. Likewise, the bisimulation can also be relativized in the same fashion.
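Definition 51. (Multi-agent Conditional Bisimulation) Given two multi-agent conditional models $M_1$ and $M_2$ based on the same set of agents, a multi-agent conditional bisimulation is a non-empty relation $Z \subseteq W_1 \times W_2$ such that, whenever $(w, w') \in Z$,
• for all $X \subseteq W_1$ and $X' \subseteq W_2$ such that $Z[X] \subseteq X'$ and $Z^{-1}[X'] \subseteq X$ we have that, for every $a \in A$, for every $x \in f^1_a(w, X)$ there exists a $y \in f^2_a(w', X')$ (where the $f^2_a$ are the selection functions in $M_2$) such that $(x, y) \in Z$, and vice versa.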
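On small finite models the quantification over pairs of subsets in this definition can be checked exhaustively. The sketch below is a hedged illustration, not the authors' procedure: the function name, the representation of selection functions as Python callables indexed by agent, and the example models are assumptions. The atomic clause (related worlds satisfy the same atoms) is not visible in the text of the definition as it appears here; the code assumes the standard one and marks it as such.

```python
from itertools import combinations

def subsets(worlds):
    items = list(worlds)
    return [frozenset(c) for n in range(len(items) + 1)
            for c in combinations(items, n)]

def image(Z, X):
    return {y for (x, y) in Z if x in X}

def preimage(Z, Y):
    return {x for (x, y) in Z if y in Y}

def is_conditional_bisimulation(Z, W1, W2, f1, f2, V1, V2, agents):
    """Brute-force check of the zig and zag clauses for every agent."""
    if not Z:
        return False
    # Pairs (X, X') meeting the prerequisites Z[X] ⊆ X' and Z⁻¹[X'] ⊆ X.
    pairs = [(X, Y) for X in subsets(W1) for Y in subsets(W2)
             if image(Z, X) <= Y and preimage(Z, Y) <= X]
    for (w, v) in Z:
        if V1[w] != V2[v]:        # ASSUMED atomic clause (standard, not in the text)
            return False
        for X, Y in pairs:
            for a in agents:
                left, right = f1[a](w, X), f2[a](v, Y)
                zig = all(any((x, y) in Z for y in right) for x in left)
                zag = all(any((x, y) in Z for x in left) for y in right)
                if not (zig and zag):
                    return False
    return True

# Hypothetical one-agent example with the trivial selection function f(w, X) = X.
W1, W2 = {1, 2}, {"a"}
f1 = {"i": lambda w, X: set(X)}
f2 = {"i": lambda w, X: set(X)}
V1 = {1: {"p"}, 2: {"p"}}
V2 = {"a": {"p"}}
print(is_conditional_bisimulation({(1, "a"), (2, "a")}, W1, W2, f1, f2, V1, V2, ["i"]))  # True
print(is_conditional_bisimulation({(1, "a")}, W1, W2, f1, f2, V1, V2, ["i"]))            # False
```

The second relation fails because the pair $X = \{2\}$, $X' = \emptyset$ satisfies the prerequisites, yet world 2 is selected on the left and has no counterpart on the right. The check is exponential in the number of worlds of both models.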
The proofs of the following results are a straightforward generalization of the proofs of the analogous single-agent statements.
Theorem 52. Given two multi-agent conditional models $M_1$ and $M_2$, if $(w, w') \in Z \subseteq W_1 \times W_2$, where $Z$ is a multi-agent conditional bisimulation, then $w$ and $w'$ are modally equivalent with respect to the logic of conditionals. On finite multi-agent conditional models, if $w$ and $w'$ are modally equivalent then $(w, w') \in Z \subseteq W_1 \times W_2$, where $Z$ is a multi-agent conditional bisimulation.
Proposition 53. Multi-agent conditional bisimulations are closed under arbitrary unions.
The definitions of grounded models and diffuse bisimulation have to be generalized accordingly.
Definition 55. A multi-agent conditional bisimulation $Z \subseteq W_1 \times W_2$ is diffuse if for every $x \in W_1$ there are $a \in A$, $w \in W_1$ and $w' \in W_2$ such that $(w, w') \in Z$ and $x \in W_w^{1,a}$, and vice versa.
Proposition 56. Restricted to any class of multi-agent grounded models, the notion of multi-agent diffuse conditional bisimulation is closed under relational composition.

Multi-agent Plausibility Models
We now turn to our fourth and last example, meant to display how the general definitions unfold in the multi-agent case. Our structure of choice is multi-agent plausibility models, a popular device used to model the knowledge and beliefs of different agents [4]. The semantics of the multi-agent belief and conditional belief operators on (well-founded) multi-agent plausibility models is given by:

Proposition 58. Well-founded multi-agent plausibility models are multi-agent conditional models, where $f_a(w, X) = \mathrm{Min}_{\le_{a,w}}(X \cap [w]_{\sim_a})$.
Proof. We want to ascertain that the newly defined $f_a$ fulfills the prerequisites of selection functions in Definition 1. The first condition is again given by definition. For the second one, suppose

Thus, since $x \in X$ and there is no element below $x$ in $Y \cap [w]_{\sim_a}$, a fortiori there is no element below $x$ in $X \cap [w]_{\sim_a}$.

Proposition 60. Well-founded multi-agent plausibility models are grounded.
In this setting a multi-agent plausibility CB-bisimulation $Z \subseteq W_1 \times W_2$ is diffuse if, for every $x \in W_1$, there are an agent $a \in A$, $w \in W_1$ and $w' \in W_2$ such that $(w, w') \in Z$ and $x$ is in the information cell $[w]_{\sim_a}$ (and vice versa).
Proposition 61. On well-founded multi-agent plausibility models, diffuse multi-agent plausibility CB-bisimulations are closed under relational composition.
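The selection function of Proposition 58 picks the most plausible worlds among those in $X$ that lie in the agent's information cell. A minimal sketch on finite models follows; the names `most_plausible` and `select`, and the encoding of the order as a set of pairs $(x, y)$ read as "$x$ is at least as plausible as $y$", are assumptions made for illustration. The caller supplies the cells and the order for the agent (and world) under consideration.

```python
def most_plausible(S, leq):
    """Minimal elements of S: those with no element of S strictly below them.

    leq is a preorder given as a set of pairs (x, y), read as
    'x is at least as plausible as y'.
    """
    return {x for x in S
            if not any((y, x) in leq and (x, y) not in leq for y in S)}

def select(w, X, cell, leq):
    """f_a(w, X) = Min(X ∩ [w]) for one fixed agent a.

    cell maps each world to its information cell for that agent.
    """
    return most_plausible(set(X) & cell[w], leq)

# Hypothetical agent with cells {1} and {2, 3}, where 3 is more plausible than 2.
cell = {1: {1}, 2: {2, 3}, 3: {2, 3}}
leq = {(1, 1), (2, 2), (3, 3), (3, 2)}
print(select(2, {1, 2, 3}, cell, leq))  # {3}
print(select(2, {2}, cell, leq))        # {2}
print(select(1, {2, 3}, cell, leq))     # set()
```

The last call returns the empty set because no world of $X$ lies in the information cell of world 1.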

Related Work
In light of the examples treated in the previous sections, one may wonder whether there are conditional modalities that do not fall under the scope of our framework. One example is relevant implication. The proponents of this connective intend to overcome the counterintuitive properties of material implication by a notion of entailment that considers the relevance of the antecedent with respect to the consequent. This particular kind of entailment, which we denote by $\varphi \Rightarrow \psi$, is interpreted on ternary relations with a semantics that goes back at least to [24].
$M, w \models \varphi \Rightarrow \psi$ iff for all $v, v'$ we have that $M, v \models \varphi$ and $(w, v, v') \in R$ entail $M, v' \models \psi$.
A possible selection function for this conditional modality could be

It is not hard to see, however, that such a selection function fails to satisfy the first requirement on conditional models: nothing in the definition of $f(w, X)$ ensures that $v' \in X$. This observation aligns with the intentions of the advocates of relevant implication, who contemplate the possibility of $p \Rightarrow p$ failing at some worlds.

A different notion of bisimulation for conditional belief on multi-agent plausibility models was recently introduced in [3]. The authors prove the correspondence between bisimilarity and modal equivalence for three languages, containing respectively conditional belief and knowledge, safe belief and knowledge, and degrees of belief and knowledge. That analysis, however, is confined to doxastic logic. Our approach has the following two distinctive features. First, the bisimulation for conditional belief stems from a general analysis of conditional modalities and is not tailored to a specific application. This generality has the pleasant consequence that the key notions and proofs are relatively simple and transparent. Second, the notion of bisimulation for conditional belief offered here is modular, in the sense that it can be merged with other conditions when we consider languages with additional operators. In contrast, some results in [3] depend crucially on the existence of the knowledge operator.¹⁰

A notion of bisimulation containing a quantification over subsets was originally proposed in [20], adapted in [16] to epistemic lottery models and later reshaped again to work in the context of epistemic neighborhood models in [15]. Such bisimulations were introduced to deal with probabilities and weights, not conditional modalities. The main difference with the present approach lies in the structure of the quantification. In our case the zig and zag conditions share the same preconditions, a universal quantification over pairs of subsets satisfying certain prerequisites. In the aforementioned papers each direction has a ∀∃ quantification, stating that for each subset in the first model (usually within the current information cell) there exists a subset in the second model fulfilling certain properties.
Finally, we touch on the connection with the standard Hennessy-Milner result. This result holds for an un-conditional modality, namely the box operator on Kripke models. For un-conditional modalities the proof of 'modal equivalence entails bisimilarity' simplifies considerably: it carries through with the usual technique just by assuming the finiteness of $f(w, X)$ for all $w$. When $f(w, X) = \{v \mid wRv\}$, where $R$ is the relation of the Kripke model, we obtain a conditional model for the box operator; in this circumstance the finiteness of $f(w, X)$ for all $w$ is precisely the condition of being finitely branching.

Conclusion and Further Work
In this paper we proposed a general notion of bisimulation for conditional modalities interpreted on selection functions and proved general results including a Hennessy-Milner theorem. We applied this to a series of examples: plausibility models for conditional beliefs, evidence models, relativized common knowledge and multi-agent conditionals. We used these notions to obtain some new undefinability results.
The first open problem concerns the extension of our results on the closure under relational composition. Our results could be strengthened at the general level of conditional models or in the specific settings, where the selection functions may enjoy additional properties (e.g., the selection function for relativized common knowledge is fully monotonic).
The second open question concerns infinite models: does modal equivalence entail bisimilarity on some natural class of infinite conditional models? We have seen an example of how, in the case of multi-agent plausibility models, the particular structure of the model can determine this answer, but we do not have an answer in the general case yet. We may furthermore ask how many 'classical' results of the model theory for basic modal logic we can obtain in the setting of conditional modalities. One natural example would be a version of the van Benthem characterization theorem.
Another group of questions arises from considering the new notion of bisimulation from a category-theoretic point of view. From this perspective bisimulations can be regarded as arrows in a suitable category of models. The closure under relational composition, together with the obvious fact that the identity relation is itself a bisimulation, ensures that we indeed obtain a category. This could enable a comparison between categories of models, for example between evidence and plausibility models, allowing for a systematic study of what has been called tracking [7], namely the matching of corresponding information dynamics in different classes of models.
anonymous reviewers, their comments were extremely helpful in improving the paper.
Open Access. This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made.