Tappitytap.info

Equality

What is it?

Recently I took an online test to measure my C# knowledge, not a 2 hour exam, just 15 minutes of my time whilst waiting for something to finish. One of the questions was about the difference between the Equals method and ==. I looked at the answers; none of them seemed to be correct. I started to question myself: is my knowledge about this wrong? I had thought they should do the same thing. I did go for one of the multiple choice answers, as I had come across this claim when working with another dev.

When .NET first arrived I spent some time learning the basic concepts; one of those concepts was equality.

There is value equality, known as equivalence, and reference equality, known as identity. To test for equality we generally use == or the Equals method; depending on the type, it will either compare the contents or compare the references. To test for identity you can also use the System.Object.ReferenceEquals method.

Very commonly I have found it stated that == tests using the reference to the item and Equals tests using the value. From my experience this was not true, but then again I had never really looked under the hood and had taken it on trust that they would return the same result. So what is going on under the hood?


object A = "Hello";
object B = new string("Hello");

bool Q1 = A == B;                               //False
bool Q2 = A.Equals(B);                          //True
bool Q3 = System.Object.ReferenceEquals(A, B);  //False
		

This is the example that I have seen used to demonstrate that the claim is true, but I'm going to take a lot more convincing. One reason for investigating is the guidance on operator overloads: if you override the Equals method then you should also overload the == operator, and the two should have the same outcome.

The default Equals method is generally considered quite inefficient on value types, as it boxes its argument and may fall back to reflection to compare the fields, so most people stick to using == on the built-in types. A struct you write yourself has no == operator until you define one. When you define your own equality checks, the == operator quite often just calls your Equals method, since your own implementation can be made much more efficient than the default.
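As a minimal sketch of keeping Equals and == consistent on a struct, something like the following could be used. The Point type and the PointDemo class are hypothetical names made up for illustration, and HashCode.Combine assumes .NET Core 2.1 or later.

```csharp
using System;

// Hypothetical value type used only for illustration.
public readonly struct Point : IEquatable<Point>
{
    public int X { get; }
    public int Y { get; }

    public Point(int x, int y)
    {
        X = x;
        Y = y;
    }

    // Strongly typed Equals: no boxing and no reflection.
    public bool Equals(Point other) => X == other.X && Y == other.Y;

    // Override of Object.Equals so boxed comparisons agree with the typed one.
    public override bool Equals(object obj) => obj is Point other && Equals(other);

    // Equal values must produce equal hash codes.
    public override int GetHashCode() => HashCode.Combine(X, Y);

    // The operators delegate to Equals so both routes give the same answer.
    public static bool operator ==(Point left, Point right) => left.Equals(right);
    public static bool operator !=(Point left, Point right) => !left.Equals(right);
}

public static class PointDemo
{
    public static void Main()
    {
        var a = new Point(1, 2);
        var b = new Point(1, 2);

        Console.WriteLine(a == b);              // True
        Console.WriteLine(a.Equals(b));         // True
        Console.WriteLine(a.Equals((object)b)); // True
    }
}
```

Implementing IEquatable<Point> is a design choice rather than a requirement; it gives callers a typed Equals that avoids boxing the argument.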

Finding out the truth

So, keeping it simple, I need to create a test and somehow work out how equality is evaluated. To do this I wrote the simplest piece of code I could and then checked the MSIL code.


string A = "Hello";
string B = "Hello";

bool Q1 = A == B;                               //True
bool Q2 = A.Equals(B);                          //True
bool Q3 = System.Object.ReferenceEquals(A, B);  //True

.method private hidebysig static void  Main(string[] args) cil managed
{
  .entrypoint
  // Code size       44 (0x2c)
  .maxstack  2
  .locals init (string V_0, string V_1, bool V_2, bool V_3)
  IL_0000:  nop
  IL_0001:  ldstr      "Hello"
  IL_0006:  stloc.0
  IL_0007:  ldstr      "Hello"
  IL_000c:  stloc.1
  IL_000d:  ldloc.0
  IL_000e:  ldloc.1
  IL_000f:  call       bool [System.Runtime]System.String::op_Equality(string, string)
  IL_0014:  stloc.2
  IL_0015:  ldloc.0
  IL_0016:  ldloc.1
  IL_0017:  callvirt   instance bool [System.Runtime]System.String::Equals(string)
  IL_001c:  stloc.3
  IL_001d:  ldloc.2
  IL_001e:  call       void [System.Console]System.Console::WriteLine(bool)
  IL_0023:  nop
  IL_0024:  ldloc.3
  IL_0025:  call       void [System.Console]System.Console::WriteLine(bool)
  IL_002a:  nop
  IL_002b:  ret
} // end of method Program::Main

Looking at the MSIL, the == calls ::op_Equality(string, string) and the Equals method calls ::Equals(string). At first glance these would seem to be different operations, but ::op_Equality simply hands over to the static String.Equals(string, string), which deals with nulls first and then compares the contents, so both routes end up doing the same value comparison.
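The one practical difference between the two string calls is the null handling. A small sketch, using a hypothetical NullEqualityDemo class, shows that == copes with a null on the left while the instance Equals method does not:

```csharp
using System;

public static class NullEqualityDemo
{
    // Uses the == operator, which is a static call and tolerates nulls.
    public static bool OperatorEquals(string a, string b) => a == b;

    // Uses the instance method, which needs a non-null receiver.
    public static bool InstanceEquals(string a, string b) => a.Equals(b);

    public static void Main()
    {
        string missing = null;

        Console.WriteLine(OperatorEquals(missing, "Hello")); // False
        Console.WriteLine(OperatorEquals(missing, null));    // True

        try
        {
            InstanceEquals(missing, "Hello");
        }
        catch (NullReferenceException)
        {
            Console.WriteLine("Equals on a null reference throws");
        }
    }
}
```

When either side may be null, the static string.Equals(a, b) or the == operator is the safer choice.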


int A = 10;
int B = 10;

bool Q1 = A == B;                               //True
bool Q2 = A.Equals(B);                          //True
bool Q3 = System.Object.ReferenceEquals(A, B);  //False

// another variation of the code
object A = new object();
object B = new object();

bool Q1 = A == B;                               //False
bool Q2 = A.Equals(B);                          //False
bool Q3 = System.Object.ReferenceEquals(A, B);  //False

Here equality is checked on an int, a value type, and on an object, a reference type. By default, value types are tested for equivalence and reference types are tested for identity. The integer values are the same, so == and Equals return true, but ReferenceEquals takes object parameters, so each int is boxed into its own object on the heap and the two references differ. The checks on the objects return false as there are 2 discrete instances of the object type and of course their addresses are different; the instances are stored on the heap, while the references to them are held in local variables on the stack.
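One way to see where the False from ReferenceEquals on the integers comes from is to pass the very same variable twice. This is only a sketch, and BoxingDemo is a hypothetical name:

```csharp
using System;

public static class BoxingDemo
{
    public static bool SameIntTwice()
    {
        int a = 10;
        // Each argument is boxed separately, so two different objects are compared.
        return object.ReferenceEquals(a, a);
    }

    public static bool SameBoxTwice()
    {
        // Boxed once; both arguments are the same reference.
        object boxed = 10;
        return object.ReferenceEquals(boxed, boxed);
    }

    public static void Main()
    {
        Console.WriteLine(SameIntTwice()); // False
        Console.WriteLine(SameBoxTwice()); // True
    }
}
```

ReferenceEquals is therefore never a meaningful test on an unboxed value type, as every call creates fresh boxes.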

Boxing

In the first code example the string was cast to object, so does this have an effect on how the tests are performed? Strings are not value types and have a more complex structure, so I decided to use integers to see if the inconsistent behaviour would still occur.


object A = (object)10;
object B = (object)10;

bool Q1 = A == B;                               //False
bool Q2 = A.Equals(B);                          //True
bool Q3 = System.Object.ReferenceEquals(A, B);  //False

It would seem that they are doing different tests, or is it that we are testing different things?

The == operator works differently on value types and reference types: for a built-in value type it compares the values, and for a reference type that does not overload it, it compares the addresses of the instances. Although the implementations of == and Equals are different, they should give the same answer, i.e. be consistent. What is actually happening is not that == is by default a reference check and Equals a value check. The == operator is chosen by the compiler from the declared types of the variables, which here are both object, whereas Equals is a virtual method that is dispatched at runtime on the actual type of the instance, which here is Int32.

Looking at the MSIL code for the boxed integer example, you can see that the numbers are being boxed, and what is important is what happens when the values are compared. The ceq instruction is where the == test is done. What ceq does is compare the two values on top of the evaluation stack, and at that moment those are references to the boxed integers, which are objects, so this results in a reference check, i.e. an identity check. The callvirt to Object::Equals is dispatched to the Int32 override, which unboxes the argument and compares the values for equivalence.


.method private hidebysig static void  Main(string[] args) cil managed
{
  .entrypoint
  // Code size       31 (0x1f)
  .maxstack  2
  .locals init (object V_0, object V_1, bool V_2, bool V_3)
  IL_0000:  nop
  IL_0001:  ldc.i4.s   10
  IL_0003:  box        [System.Runtime]System.Int32    // box instruction
  IL_0008:  stloc.0
  IL_0009:  ldc.i4.s   10
  IL_000b:  box        [System.Runtime]System.Int32    // box instruction
  IL_0010:  stloc.1
  IL_0011:  ldloc.0
  IL_0012:  ldloc.1
  IL_0013:  ceq                                        // ceq instruction
  IL_0015:  stloc.2
  IL_0016:  ldloc.0
  IL_0017:  ldloc.1
  IL_0018:  callvirt   instance bool [System.Runtime]System.Object::Equals(object)
  IL_001d:  stloc.3
  IL_001e:  ret
} // end of method Program::Main
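The same effect can be reproduced with strings and no boxing at all, which suggests it is the declared type of the variables that decides what == does. The sketch below uses a hypothetical StaticTypeDemo class and compares the same two instances three ways:

```csharp
using System;

public static class StaticTypeDemo
{
    // Declared as string: the compiler binds == to String.op_Equality.
    public static bool CompareAsStrings(string a, string b) => a == b;

    // Declared as object: the compiler emits a plain reference comparison.
    public static bool CompareAsObjects(object a, object b) => a == b;

    // Virtual call: dispatched at runtime to String.Equals(object).
    public static bool CompareWithEquals(object a, object b) => a.Equals(b);

    public static void Main()
    {
        string first = "Hello";
        // Built from a char array so it is a separate instance, not the interned literal.
        string second = new string(new[] { 'H', 'e', 'l', 'l', 'o' });

        Console.WriteLine(CompareAsStrings(first, second));  // True
        Console.WriteLine(CompareAsObjects(first, second));  // False
        Console.WriteLine(CompareWithEquals(first, second)); // True
    }
}
```

Nothing about the two instances changes between the calls; only the parameter types differ.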

Summary

This inconsistent behaviour is not really desirable and is something to be wary of. The issue was born out of the need for good default performance when checking value types, and for the most part it is not a problem; just be aware of unintentional implicit boxing.

  • == is not implicitly a reference check

  • Equals method is not implicitly a value check

  • Equality is dependent on what you are checking

  • On a boxed value type, == checks identity and Equals checks equivalence.

So there it is: the question was wrong, or at least the answers were.