1. Default Object Lifetime Is Non-Deterministic
In most object-oriented languages, there is a very specific time when an object constructor is called (namely, when an object is instantiated) and when its destructor is called (namely, when it falls out of scope).
In C#, they have taken the "garbage collection" paradigm one step too far. Not only does memory management rely on it, but even the object destructor is called "somewhen", at an unpredictable time! In a previous version of this document I wrote "somewhen after the object falls out of scope", but it turns out to be even worse, so I'll devote a separate section to the gruesome truth below.
In any case, this means that handy constructs such as an AutoLock can no longer work (example in C++):
class AutoLock
{
public:
    AutoLock(Mutex& m): m_mutex(m) { m_mutex.Lock(); }
    ~AutoLock() { m_mutex.Unlock(); }
private:
    Mutex& m_mutex;
};
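To make the point concrete, here is a minimal sketch of how such a guard behaves in C++. The Mutex below is a hypothetical stand-in that only records its own state (a real one would wrap an operating system primitive), and Work and BothReleasedAfterWork are names invented for this illustration. It takes two locks in one scope and shows that both are released on every exit path, including an exception.

```cpp
#include <cassert>
#include <stdexcept>

// Hypothetical Mutex stand-in that only records its state, so the effect
// of AutoLock is observable. A real program would wrap an OS mutex.
class Mutex {
public:
    void Lock()   { assert(!m_locked); m_locked = true; }
    void Unlock() { assert(m_locked);  m_locked = false; }
    bool IsLocked() const { return m_locked; }
private:
    bool m_locked = false;
};

class AutoLock {
public:
    explicit AutoLock(Mutex& m) : m_mutex(m) { m_mutex.Lock(); }
    ~AutoLock() { m_mutex.Unlock(); }
    AutoLock(const AutoLock&) = delete;
    AutoLock& operator=(const AutoLock&) = delete;
private:
    Mutex& m_mutex;
};

// Takes two locks, then either returns normally or throws.
// Both mutexes are released on every path, in reverse order of acquisition.
void Work(Mutex& a, Mutex& b, bool fail) {
    AutoLock lockA(a);
    AutoLock lockB(b);
    if (fail) {
        throw std::runtime_error("something went wrong");
    }
}

// Returns true if both mutexes are unlocked after Work(), with or without
// an exception having been thrown.
bool BothReleasedAfterWork(bool fail) {
    Mutex a, b;
    try {
        Work(a, b, fail);
    } catch (const std::runtime_error&) {
        // Swallowed on purpose: we only care about the lock state.
    }
    return !a.IsLocked() && !b.IsLocked();
}
```

Note that adding a second guarded object costs one extra line and no extra nesting.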
In a typical Microsoft way of thinking, they added a "special case" for this particular example by means of the lock keyword (which, incidentally, would be trivial to simulate in C++ should you like the keyword-taste of it). However, other "automatic" resource management using object lifetime (for example, for handles, GDI objects, etc.) still won't work.
To resolve this, objects can implement the IDisposable interface, which has a Dispose() method. When object lifetime is important to you, you should put the relevant cleanup code in the Dispose() implementation and remember to call Dispose() on the object yourself. It is good practice to have the destructor call Dispose() on the object too, but as Professional C#, 2nd Edition puts it: "The destructor is only there as a backup mechanism in case some badly behaved client doesn't call Dispose()" (emphasis mine). You see, only badly behaved clients would forget to clean up after themselves, so I guess only badly behaved clients would need a garbage collector in the first place, right?
The proposed "better solutions" for this are the using keyword, like so:
using (AutoLock theLock = new AutoLock(m_lock))
{
    // your protected code here
}
or using the finally clause (which is often recommended over the using statement), like so:
AutoLock theLock = new AutoLock(m_lock);
try
{
    // your protected code here
}
finally
{
    theLock.Dispose();
}
In the former case, it becomes a nuisance if I have more than one object which I'd like to have a deterministic lifetime for, and in the latter case (which, incidentally, is very similar to the code that gets emitted by the compiler when you use the lock keyword) I still need to remember to type Dispose() by hand.
And it gets worse! Even program termination doesn't trigger proper cleanup. You can verify this with the following program:
using System;
using System.IO;
class TestClass
{
    static void Main(string[] args)
    {
        StreamWriter sw = File.CreateText("C:\\foo.txt");
        sw.WriteLine("Hello, World?");
        // Note: We "forget" sw.Close().
        // Incidentally, StreamWriter.Dispose(bool) is protected,
        // so we can't call it directly.
    }
}
The foo.txt file will be created, but it will be empty. Note that even C specifies that all unflushed data is written out, and files will be closed, at normal program termination. And even if I did remember to call Close() myself (I wouldn't want to be a badly behaved client, now would I?), this wouldn't be exception-safe. I am supposed to remember to use using, or litter my code with finally blocks.
2. Object Lifetime Is Not Determined by Scope
I wrote above that I initially thought that objects are destroyed "somewhen after they go out of scope", but in reality it seems to be far, far worse. As it turns out, the JIT compiler can do "lookahead optimization", and may mark any object for collection after what it considers its "last use", ignoring scope!
I have had a colleague ask me about the following code:
{
    ReadAccessor access = new ReadAccessor(image);
    IntPtr p = access.GetPtr();
    // lengthy piece of code here doing stuff with the pixels from the image
}
A ReadAccessor is an object which provides access to the pixel data in an image, which is stored in a memory-mapped file for performance reasons. When a ReadAccessor is constructed, it maps in the memory, after which you can call GetPtr() to get at the actual data. Once it goes out of scope, it unmaps the memory again. So, the "validity" of the data is guaranteed for the lifetime of the ReadAccessor.
Incidentally, there is also a WriteAccessor, which makes sure that there is only a single writer at any given time. Of course, people using this code in C# quickly found out that they had to dispose of these WriteAccessors manually, because otherwise they'd get the error that the previous WriteAccessor would still be sitting in the garbage bin while they were trying to acquire a new one. But that's the problem mentioned in the item above. This one is far, far worse.
The colleague told me that his code crashed somewhere in the pixel-processing code.
It took me a while to figure out what was happening: The JIT optimizer looked ahead a little bit, decided that access wasn't being used after the GetPtr() call, and marked it for collection. Later on in the code, in the same scope, mind you, the GC apparently decided it was a good time to destroy the ReadAccessor, which unmapped the memory still being used by the code.
I still find this hard to believe (even C# can't be this stupid), but the crash went away by modifying the code like so:
{
    ReadAccessor access = new ReadAccessor(image);
    IntPtr p = access.GetPtr();
    // lengthy piece of code here doing stuff with the pixels from the image
    System.GC.KeepAlive(access);
}
This particular item is so mind-boggling that I hope some dear reader can tell me it's just a bad dream and scope is, in fact, honored by the GC.
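For contrast, this is the behavior the C++ version of the code relies on: the destructor of a local object runs at the closing brace of its scope, however early its last use was. A minimal sketch follows. The Image and ReadAccessor here are simplified, hypothetical stand-ins (a flag instead of a real memory mapping), not the classes from the project described above.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for the image; `mapped` mimics "memory is mapped in".
struct Image {
    std::vector<int> pixels{1, 2, 3, 4};
    bool mapped = false;
};

// Hypothetical accessor: maps on construction, unmaps on destruction.
class ReadAccessor {
public:
    explicit ReadAccessor(Image& image) : m_image(image) { m_image.mapped = true; }
    ~ReadAccessor() { m_image.mapped = false; }
    ReadAccessor(const ReadAccessor&) = delete;
    ReadAccessor& operator=(const ReadAccessor&) = delete;
    const int* GetPtr() const { return m_image.pixels.data(); }
private:
    Image& m_image;
};

// Sums the pixels. The accessor is last *used* at GetPtr(), yet the mapping
// stays valid until the closing brace of the inner scope.
int SumPixels(Image& image) {
    int sum = 0;
    {
        ReadAccessor access(image);
        const int* p = access.GetPtr();   // last use of `access`
        for (std::size_t i = 0; i < image.pixels.size(); ++i) {
            assert(image.mapped);         // still mapped while we read
            sum += p[i];
        }
    }                                     // destructor runs here, not earlier
    assert(!image.mapped);
    return sum;
}
```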
3. Every Function Must Be A Method
C# imposes an object-oriented paradigm and enforces it by prohibiting the definition of stand-alone functions: every function must be a member of a class.
If you take object-orientation to the extreme, you would not say
float b = sin(a);
but rather
float b = a.sin();
This is clearly impractical. (Ignore the question of how you would take the sine of a number instead of a variable.)
C# (and Java, for that matter) still try to go about halfway there by making the sine function a member of the Math class (or namespace; I can never tell them apart in C#):
double b = Math.Sin(a);
If I want to add my own mathematical functions, I either have to extend the Math class (which I can't, because it's sealed) or put up with the strange distinction that I need to write
double h = Math.Sqrt(a*a + b*b);
but
double h = MyMath.Hypot(a, b);
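In C++, by comparison, a user-defined function can sit next to the library ones and be called in the same way. A minimal sketch, in which the namespace mymath and the names hypot2 and Diagonal are made up for this illustration:

```cpp
#include <cassert>
#include <cmath>

namespace mymath {
// Hypothetical user-defined addition: a free function, no class required.
inline double hypot2(double a, double b) {
    assert(a >= 0.0 && b >= 0.0);  // side lengths, for this sketch
    return std::sqrt(a * a + b * b);
}
}  // namespace mymath

// With using-declarations, library and user functions are called alike.
inline double Diagonal(double a, double b) {
    using std::sqrt;
    using mymath::hypot2;
    double scale = sqrt(1.0);        // a library function, unqualified
    return scale * hypot2(a, b);     // a user function, called the same way
}
```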
It gets even more scary if you look at the OracleNumber class, which also has a Sin method. Luckily, it's static, and you can't call static member functions on instances.
This is related to the following item, but that is bad enough that I think it warrants its own item:
4. Containers Have Algorithms As Methods
The popular ArrayList container (an auto-resizing container, comparable to C++'s vector template) has a Sort() method. And a Reverse() method. But not a Randomize() method. Why should some algorithms be member functions, but not others? The answer is that no algorithms should be member functions. What if I wanted to use a different sorting algorithm than the one the original implementers of ArrayList had in mind?
Note that an ArrayList sorts itself, while Array.Sort(...) is a static member function of the Array class.
If I decide, late in a project, at the performance-tuning stage perhaps, that I could better use an ArrayList for some particular collection than the Array I used up to now, I will likely have to modify my code in multiple places.
Note that this is not a shortcoming of the language, but it is partly a consequence of item number 3, above. Also note that C# shares this problem with some other languages – even C++ has a few quirks here (the string class comes to mind).
5. Default Comparison Behavior Is Dangerous
Given a class Vector, which doesn't overload the comparison operator==, I can still write
Vector a, b;
if (a == b)
{
    ...
}
In C++, the compiler will have the courtesy of telling me there is no operator== defined for Vectors; in C#, this will simply compile, but it means "compare the references a and b", i.e. it is true when a and b are the same Vector, not when their values are equal. Also, because of the following item, you can't add such an operator yourself without altering the Vector class:
6. Operator Overloading Is Severely Broken
In C++, given a class Foo, you can define an operator for adding two Foos without altering the Foo class itself:
class Foo {};
Foo operator+(const Foo& lhs, const Foo& rhs)
{
    return Foo(/* whatever it means to add two Foos */);
}
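A concrete version of the same idea, as a minimal sketch: the Vector struct below is a hypothetical stand-in for a class handed to you by someone else, and the operators (including the comparison from the previous item) are added from the outside, without touching it. Example is a name invented for this illustration.

```cpp
#include <cassert>

// Stand-in for a class handed to you by someone else; it is not modified below.
struct Vector {
    double x;
    double y;
};

// Free-standing operators, defined outside the class.
inline Vector operator+(const Vector& lhs, const Vector& rhs) {
    return Vector{lhs.x + rhs.x, lhs.y + rhs.y};
}

inline bool operator==(const Vector& lhs, const Vector& rhs) {
    return lhs.x == rhs.x && lhs.y == rhs.y;
}

inline bool operator!=(const Vector& lhs, const Vector& rhs) {
    return !(lhs == rhs);
}

// Small usage check: comparison is by value, and Vector itself is unchanged.
inline void Example() {
    Vector a{1.0, 2.0};
    Vector b{3.0, 4.0};
    Vector sum = a + b;
    assert(sum == (Vector{4.0, 6.0}));
}
```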
In C#, this is not possible without altering the Foo class itself. Because of the limitation mentioned in item number 3, above, you cannot make this operator a "free-standing" one. Of course, adding this operator has nothing to do with the interface to the Foo class, so you'd probably try something like this:
public class FooOps
{
    public static Foo operator+(Foo lhs, Foo rhs)
    {
        return new Foo(/* whatever it means to add two Foos */);
    }
}
but this doesn't work. You'll get the error "One of the parameters of a binary operator must be the containing type". In other words, if someone hands you, say, a Vector class without overloaded operators, you'll have to modify the class itself, also introducing a dependency of your class on the module which happens to implement these operators.
But wait, there's more.
Note that when you overload the operator==, you also have to overload operator!= – but we'll forgive the compiler for not being able to auto-generate it. It will do a similar "helpful" trick with arithmetic and bitwise assignment operators – when it most definitely shouldn't. You cannot overload the arithmetic and bitwise assignment operators +=, -=, etc. Instead, they are evaluated in terms of other operators that can be overloaded. This is exactly the wrong way around; most C++ programmers implement an operator+ in terms of operator+=.
Suppose you have a class Image, representing an image. Also, suppose you have some kind of image processing library, offering functionality to add two images together. For performance reasons, this library will likely have separate functions for adding one image in-place, overwriting the old contents, and for returning a new image containing the result of the addition:
public class ImageProcessing
{
    public static Image Add(Image lhs, Image rhs);
    public static void AddInPlace(Image lhs, Image rhs);
}
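The C++ convention mentioned above (operator+ written in terms of operator+=) can be sketched as follows. The Image class here is a hypothetical, minimal one holding a flat pixel buffer, purely for illustration: the in-place operator does the real work and allocates nothing, and the copy is made only when the caller actually asks for a new image.

```cpp
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

// Hypothetical minimal Image: just a flat pixel buffer.
class Image {
public:
    explicit Image(std::vector<int> pixels) : m_pixels(std::move(pixels)) {}

    // In-place addition: overwrites this image, allocates nothing.
    Image& operator+=(const Image& rhs) {
        assert(m_pixels.size() == rhs.m_pixels.size());
        for (std::size_t i = 0; i < m_pixels.size(); ++i) {
            m_pixels[i] += rhs.m_pixels[i];
        }
        return *this;
    }

    const std::vector<int>& Pixels() const { return m_pixels; }

private:
    std::vector<int> m_pixels;
};

// operator+ written in terms of operator+=: the one copy happens here
// (lhs is taken by value), and only when the caller asks for a new image.
inline Image operator+(Image lhs, const Image& rhs) {
    lhs += rhs;
    return lhs;
}
```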
You may decide that it's a nice service to clients of your Image class to offer operators for this, so they can write code like
Image a, b, c;
c = a + b; // really c = ImageProcessing.Add(a, b)
a += b; // really ImageProcessing.AddInPlace(a, b)
(Of course, you'll have to send them a new Image class, because you have to modify it for this; in addition, your Image class can now not be used without the ImageProcessing class.) You would think you'd overload operator+ for ImageProcessing.Add() and operator+= for ImageProcessing.AddInPlace(), but you can't. Instead, when your client types a += b, a whole temporary Image will have to be constructed, holding the result of the addition, after which the left operand is replaced with the result. Goodbye, performance!
Update: In the 3.0 version of the language, a new feature called "extension methods" has appeared. It is now possible to add methods to classes without modifying the original class file, so you could make img.AddInPlace(otherImage) work. However, extension methods don't work together with operator overloading.
7. Events Without Subscribers Raise Exceptions
If a tree falls down in the woods and there is nobody there to hear it, does it still make a sound? C# has a very interesting view on this popular Philosophy 101 question.
In C#, there is a concept called delegates. A multicast delegate is a set of methods to be called successively when the delegate is called. When the set of methods is empty, trying to call the delegate raises an exception.
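To make the "set of methods" idea concrete, here is a minimal C++ sketch of a multicast callback list. The Event class, its member names and the CountHeard helper are invented for this illustration; this is not how .NET implements delegates. Note that in this sketch, raising an event with no subscribers simply loops zero times.

```cpp
#include <cassert>
#include <functional>
#include <utility>
#include <vector>

// Hypothetical multicast callback list.
class Event {
public:
    void Subscribe(std::function<void()> listener) {
        m_listeners.push_back(std::move(listener));
    }
    // Calls every listener in subscription order; with no listeners this
    // is a no-op rather than an error.
    void Raise() const {
        for (const auto& listener : m_listeners) {
            listener();
        }
    }
private:
    std::vector<std::function<void()>> m_listeners;
};

class Tree {
public:
    Event Fell;
    void Fall() { Fell.Raise(); }
};

// Subscribes the given number of listeners, lets the tree fall once,
// and returns how many listeners heard it.
inline int CountHeard(int listeners) {
    assert(listeners >= 0);
    Tree tree;
    int heard = 0;
    for (int i = 0; i < listeners; ++i) {
        tree.Fell.Subscribe([&heard] { ++heard; });
    }
    tree.Fall();
    return heard;
}
```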
However, events are implemented in terms of multicast delegates, too. You declare a delegate and an event like so:
public delegate void TreeListener();
class Tree
{
    public event TreeListener Fell;
    public void Fall()
    {
        // Fall down, and make some noise. To be discussed.
    }
}
The idea is that clients interested in hearing trees fall can subscribe themselves to the event using a very fancy syntax:
class Client
{
    public Client(Tree tree)
    {
        tree.Fell += new TreeListener(TreeFell);
    }
    private void TreeFell() // This will be called when the tree falls.
    {
        Console.WriteLine("I heard it!");
    }
}
In the Tree.Fall() implementation, you'd simply call the event:
class Tree
{
    public event TreeListener Fell;
    public void Fall()
    {
        // Fall down, and make some noise:
        Fell();
    }
}
So, now comes the important question. What if nobody has subscribed to the Tree.Fell event? In that case, the multicast delegate will be empty, and calling it will raise an exception. You heard it right (or did you?): Trees simply aren't supposed to fall over when nobody's around.
The suggested solution is to check whether anybody's listening first (if the event is empty, it will be null):
class Tree
{
    public event TreeListener Fell;
    public void Fall()
    {
        if (Fell != null)
            Fell();
    }
}
This, of course, is not thread-safe. To make it thread-safe, you will have to define your own event add/remove functions and take a lock in them, taking the same lock around the if (Fell != null) above.
Conclusion
C# is very nice for quickly building GUI applications. Especially programmers used to MFC can't seem to praise C# loudly enough. But then again, if you are used to rusty pins being driven under your fingernails daily, the prospect of being kicked in the groin at unpredictable times but only once a week must sound really attractive.
Mayur Raiyani