czwartek, 24 września 2009

Stay away from these rocks we'd be a walking disaster

Some time ago my friend wrote a post about open - close principle - Open – Close principle / Object – Orientated Design in PHP. I hate to say that but there is something fundamentally wrong about the example in this post. I thought about it for a while and here are my thoughts on it.

Open - close principle doesn't encourage particular design - e.g. it doesn't say that it should be achieved through inheritance. Unfortunately, when you're into OOP, sometimes classes and inheritance and all that stuff cover the real solution. Imho, in example provided Image content type should not be hard-coded in particular classes. For me, content type is just a resource. For example, if you want to implement different language versions, you do it using resources, not through inheritance. You don't implement abstract Caption class and then EnglishCaption, FrenchCaption or PolishCaption. The same goes with content type of an image.
If you create base classes and mark methods as virtual (or you don't mark them as virtual because they are virtual by default like in Java), the point of doing it is to implement different behavior in particular classes. If you do it just to return different string and you don't implement any different logic, then maybe there is something wrong with the design.

Ok, that's the most important problem with the given example, but there also some more:
  • abstract class is called AnImage, ImageBase or sth similar would be far better
  • in every SendFileHeader in every derived class you are forced to write header("some type"), GetContentType as an abstract method and implementing SendFileHeader in abstract class as header(GetContentType()) would allow avoiding logic duplication
  • and there is that SendToBrowserFunction that was better before the refactoring than after the refactoring, I'll describe it in more details below

Before the refactoring SendToBrowser looked like this:
public function SendToBrowser()
{
$strFileExtention = $this->GetFileExtention();
switch ( $strFileExtention )
{
case ‘gif’ :
header("Content-type: image/gif");
break;
case ‘jpg’ :
header("Content-type: image/jpg");
break;
}
$strFileContent = file_get_contents($this->fPath);
echo $strFileContent;
die();
}

and after the refactoring it looks like this:
public function SendToBrowser()
{
$this->SendFileHeader();
$strFileContent = file_get_contents($this->fPath);
echo $strFileContent;
die();
}

Before and after refactoring it has this nasty echo and even worse die. Yep, there is also that header function, but I guess changing these methods would be scheduled for further refactoring, because the author writes:
Our first implementation of Image class did not follow one other object orientated design principle. The Single Responsibility Principle.

Ok, let's get back to that switch that didn't follow open - close principle. Am I wrong or the switch logic looks for me like:
header("Content-type: image/" . $this->GetFileExtention());

What is more, the author states:
Why this class breaks the open-closed rule? If we need to send PNG image, we would need to change behaviour of SendToBrowser method and add new case to our switch statement.

Well, if we got rid of switch statement, the above code would also work for PNG. And if we came up to the conclusion that it doesn't work for some particular extension, I still think it should be implemented as a resource (some config file), not using inheritance.
Furthermore, GetFileExtension should be moved to File class, not completely removed.

I would also like to share some general thoughts. Imho, interfaces provide much better decoupling than abstract classes, especially in languages where you can inherit only one base class. Creating abstract class is more risky than creating interface. Also, too many abstract methods in abstract class lead to derived classes that have empty implementation of these methods. Creating interfaces is easier and more flexible.
Interfaces are also essential to loose coupling and technology that has been a real buzzword for some time now - dependency injection.

Eventually, I want to say that this post is not criticism of the author of the Image example. I'm quite sure that he understands open - close principle really well, but he just chose a wrong example. Or maybe I'm totally wrong and the cure is worse than evil? If so Hit me! Come on, hit me!

niedziela, 6 września 2009

Programming - It's hard

I've just finished Martin Fowler's Refactoring: Improving the Design of Existing Code.
Great book with even better examples. That's the best one:
double getSpeed() {
switch (_type) {
case EUROPEAN:
return getBaseSpeed();
case AFRICAN:
return getBaseSpeed() - getLoadFactor() * _numberOfCoconuts;
case NORWEGIAN_BLUE:
return (_isNailed) ? 0 : getBaseSpeed(_voltage);
}
throw new RuntimeException ("Should be unreachable");
}

Imho, this book shows the most important thing about code that I have learnt during three years of working as a full-time programmer - quality of code DOES matter.
Heck, it matters a lot.
For me, the real revolution started after reading Ron Jeffries' Extreme Programming Adventures in C# or some time around it. I don't clearly remember if I was first into XP techniques and then read that book or whether I became a fiend of it after reading the book.
Clean, object oriented, readable code is my definition of pretty code.
When I read something like...
Use short names (like i, x) for variables with a short, local scope or ubiquitous names. Use medium length for member names. Use longer names only for global scope (like distant or OS interfaces). Generally: the tighter the scope, the shorter the name. Long, descriptive names may help the first time reader but hinder him every time thereafter.

... I just cannot understand how variable name "x" can do anything good. I really don't like variable names like i, x, var1, result11. They clearly show that someone wasn't thinking about the purpose of a variable. Imho, when you create a loop, call a variable index if it's index or at least idx. If you do that and begin to think about every variable name, you will improve your code. If you can't find appropriate name, maybe you should substitute it with a method call or change the way of thinking about solving a given problem.
The excerpt I quoted is from Seven Pillars of Pretty Code by Christopher Seiwald. I can't fight the feeling that the author tries to find a complicated solution instead of naming things appropriately and extracting functions. When I look at jam.c example, I can find many adjectives describing that code, but adjective pretty just can't come up to my mind.
The article by Christopher Seiwald is more than ten years old and it's not my aim to criticize the author. Maybe this article shows the way people were thinking about source code 10 or 15 years ago. I just wanted to show another view on code structure and other conventions that people follow.
In my opinion, source code should show programmer's intention. And here is another great example by Eric Lippert C# Round up.

Ok, but let's get back to refactoring. Martin Fowler says that before you start to refactor, you should have good unit tests. I cannot disagree with him, but I often find writing good unit tests really hard. Most systems I'm working with have architecture that makes unit testing a real pain. What is more, I often look at source code, and I see that it is properly designed and I can understand the author. The real reason is that unit testing is very very very hard and requires lots of experience. Here's a nice example - unit tests for Microsoft Enterprise Library (link).
Data — 56 tests will fail if you do not have Oracle installed. If you do happen to have oracle installed, you’ll need to manually open the Data\Tests\TestConfigurationContext.cs file and change your oracle connection settings.
Logging — EmailSinkFixture.LogMessageToEmail will fail, since you do not have access to our internal mail server. You can fix this by changing Logging\Tests\EnterpriseLibrary.LoggingDistributor.config on line 22 to reference different smtpServers and to and from addresses.
Security.ActiveDirectory — Tests will fail because you cannot access our Active Directory server. There are instructions about how to set up ADAM in Security.ActiveDirectory.Configuration.ADAM Setup. You’ll also need to change line 53 in Security.ActiveDirectory.Tests.TestConfigurationContext to reflect your ADAM setup.
EnterpriseLibrary — It is normal for several of the tests to occasionally fail. There are a few unit tests that are timing-dependent, especially in Caching and Configuration. These tests are testing whether or not something happens during a particular time interval, and sometimes the system delays the actions too long and those tests fail. If you rerun the test, it should work the next time. Additionally, our tests write to the event log, which occasionally fills up. If you begin to see a number of tests failing, check that your application event log is not full.

That's a nice suite of unit tests. Some DO fail, some don't. Run them if you like it...
Ok, it sounds like I'm making fun of Microsoft guys, but the point is that I also can't write good unit tests. The biggest problem I have is isolation of datasources, such as database of some kind or directory service.
Maybe using some kind of IoC would solve the problem, but it would require redefining the project design just because I want to write unit tests.
Currently, I'm refactoring without unit tests. I don't have confidence that I don't introduce new bugs or break existing functionalities. Sad but true. I cannot isolate directory service to some interface (without big changes in design) and I also don't have time to create and maintain domain controllers just for the sake of running unit tests on build server.
I hoped that I could find some unit test examples on Directory Programming .NET (especially here: http://directoryprogramming.net/files/3/csharp/entry24.aspx) but there are no interesting examples. Maybe because it's an example of CRUD and it would require integration tests, not unit tests.

When I started programming I thought it was simple. When I started working as a programmer and met real-life applications, it became clear that I didn't know anything about programming. Then there was some time when I thought that I learnt a lot and I really knew a lot about writing code and design, not about every possible technology, but I thought that technologies are similar.
Today, I feel that I have so much to learn and I see so many things that I should improve. I feel a bit overwhelmed, but, first of all, I feel excited and eager to learn all that stuff. Maybe "Programming - It's hard" should be changed to something as kitschy as "Programming - A quest for perfect code" ;)

Tomorrow I'm going on a two-week vacation. It's been a while since I last had a vacation and it could be hard not to think about programming at all (what bad could possibly happen?). Especially, when I'll be reading Pragmatic Programmer, The: From Journeyman to Master (talking about kitschy titles...) ;)

czwartek, 3 września 2009

Don't add uninitialised data or check if it is not in md_rand...

Recently I blogged about static code analysis. I am learning more about it and when I have some more experience with nDepend I'll share some knowledge on this blog.
More than a year ago there was THAT story about random number generator in debian. Well, it was pretty funny and pretty serious. For me mostly funny ;)



And this is the famous revision: diff.
In comments the developer claims that Purify made him do it ;) I'm not saying that this is the effect of code analysis, it's just a funny story. Still...
The funniest part for me is that this bug existed for about two years before it was spotted.

For your consideration ;)