Saturday, 13 June 2009

Regex dollars can't buy me love

A short note about regex dollars.
I always thought that the regex dollar sign $ in single-line mode (that is, with the Multiline option off) means the end of the character string. I thought of it like the 0 (zero) at the end of a C string, some kind of delimiter that ends the string. However, I found out that a regex like ^[0-9]+$ matches not only strings of digits but also strings like "123\n". In other words, it matches the strings described by the pattern and also the same strings followed by a single \n.
Lately, I had to validate user input from a web page, and I ended up checking that the string matches the regex and also that it doesn't end with \n. I didn't like that solution, and with great help from Stack Overflow users I found out that the only way to mark the very end of the string is \z. After a bit more research I was able to find some more explanation:

The difference between ‹\Z› and ‹\z› comes into play when the last character in your subject text is a line break. In that case, ‹\Z› can match at the very end of the subject text, after the final line break, as well as immediately before that line break. The benefit is that you can search for ‹omega\Z› without having to worry about stripping off a trailing line break at the end of your subject text. When reading a file line by line, some tools include the line break at the end of the line, whereas others don’t; ‹\Z› masks this difference. ‹\z› matches only at the very end of the subject text, so it will not match text if a trailing line break follows. The anchor ‹$› is equivalent to ‹\Z›, as long as you do not turn on the “^ and $ match at line breaks” option. This option is off by default for all regex flavors except Ruby. Ruby does not offer a way to turn this option off. Just like ‹\Z›, ‹$› matches at the very end of the subject text, as well as before the final line break, if any.


Regular Expressions Cookbook by Jan Goyvaerts and Steven Levithan. Copyright 2009 Jan Goyvaerts and Steven Levithan, 978-0-596-2068-7

To sum up, the only safe way to mark the end of a character string is to use \z instead of $. At least if the input comes from a web page or any other source where the user doesn't press Enter to confirm their input.
Weird... I was really used to $.
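
Here is a minimal sketch of the difference, assuming .NET's System.Text.RegularExpressions with default options (no RegexOptions.Multiline). The class and method names are made up for illustration; only the three patterns come from the discussion above.

```csharp
using System;
using System.Text.RegularExpressions;

public static class AnchorDemo
{
    // $ matches at the end of the string, or just before one trailing "\n".
    public static bool MatchesWithDollar(string input)
    {
        return Regex.IsMatch(input, @"^[0-9]+$");
    }

    // \Z behaves like $ when the Multiline option is off.
    public static bool MatchesWithUpperZ(string input)
    {
        return Regex.IsMatch(input, @"^[0-9]+\Z");
    }

    // \z matches only at the absolute end of the string.
    public static bool MatchesWithLowerZ(string input)
    {
        return Regex.IsMatch(input, @"^[0-9]+\z");
    }

    public static void Main()
    {
        string[] inputs = { "123", "123\n", "123\n\n" };
        foreach (string input in inputs)
        {
            // Show the line break as the two characters \n so the output stays on one line.
            Console.WriteLine("{0}: $={1} \\Z={2} \\z={3}",
                input.Replace("\n", "\\n"),
                MatchesWithDollar(input),
                MatchesWithUpperZ(input),
                MatchesWithLowerZ(input));
        }
    }
}
```

For "123" all three patterns match. For "123\n" the $ and \Z versions still match and only the \z version rejects the input. With two trailing line breaks none of them match, because $ and \Z tolerate only a single final \n.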

Saturday, 6 June 2009

Fancy stuff

Tell me something mister...
If I read everything about:
- design patterns
- dependency injection
- aspect oriented programming
- mocking
- OOP principles
- extreme programming
Read The Mythical Man-Month. Twice.
And Code Complete. Three times.
Read all the documents about the CLR.
Get to know all the animals from the O'Reilly series.
Read all of Jon Skeet's articles and answers.
Pass all Microsoft exams.
Become an MVP in everything from SharePoint to Compact Framework.
Install VS2010 and .NET 4.0 on Virtual PC running Windows 7.

Will I be able to write a single line of code that does exactly what I want it to?
Will it be error free?
Will it be easy to modify?
Will it meet the open/closed principle?

Yes? So I will do it! ;)

Links

.net

Regex character classes


asp.net

inline tags


unrelated

nippon kazauwa

There's something about linq... PART 1

In one of my previous posts I wrote that linq2sql (DLINQ, whatever you call it) is an example of poor abstraction. After some time spent thinking about and using linq2sql, I must say that I've changed my mind... It is not an example of poor abstraction, it's a goddamn disaster.

At first I thought that linq2sql might be a problem because using the same language to communicate with a List object, an XML document and a database seems like mission impossible. What makes me wonder is that all these smart-ass guys at Microsoft thought it was an excellent idea. Well, you can communicate with people from different countries without using their native language, but that's kinda hard and you just can't express everything. That's the problem linq suffers from. There are some more analogies. Some people may learn a common language like English, and then they can communicate pretty well with each other. However, that requires people to learn a language other than their native one. Same thing with linq providers. If every data source had a well-written linq provider, maybe it would be really nice to query everything with linq. Well, I highly doubt that there are good linq providers for even the most important data sources. What is more, even if you can speak English because you've learnt it, in most cases you won't be able to communicate as fluently as in your native language, and expressing more complicated or specific thoughts may still be a problem. And again, I'm experiencing the same thing with linq. You just can't create a common interface for every data source without creating some methods that will be ignored by most data sources because they exist only for the sake of one particular medium.
Enough analogies, I will give some examples.

In this part I want to show one problem that I ran into lately. It may seem a bit rare, but I think it opens up a whole range of problems that you may encounter using linq2sql. In the second part I will provide examples of problems that you may encounter even in simple projects.
I would like to thank Stack Overflow, and especially sambo99, for helping me solve this problem in general.
The details of the problem are here: http://stackoverflow.com/questions/959501/problem-with-deadlock-on-select-update.
To sum up, I need to get a record from the database and then update it. However, I ran into a problem when more than one thread tried to do this at the same time. Some more explanation: suppose there are two threads, thread A and thread B. The sequence of events may be as follows:
A: select record
B: select record
B: update record
A: update record
As you can see, thread A was updating data that had already been changed by thread B: an obvious conflict.
OK, I thought. A transaction at the Serializable isolation level will solve the issue. So I created a transaction, and what did I get?
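
The A/B sequence above can be replayed step by step on a plain variable. This is only an in-memory illustration, not database code: the class name and the idea that the update is "add one to the value that was read" are assumptions made for the example.

```csharp
using System;

public static class LostUpdateDemo
{
    // Replays the A/B sequence on a local variable standing in for the record.
    public static int Replay(int initial)
    {
        int record = initial;

        int readByA = record;   // A: select record
        int readByB = record;   // B: select record
        record = readByB + 1;   // B: update record
        record = readByA + 1;   // A: update record, computed from a stale read

        return record;
    }

    public static void Main()
    {
        // Two increments were attempted, but B's update is overwritten by A.
        Console.WriteLine(Replay(10)); // prints 11
    }
}
```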
Exception: Transaction (Process ID 59) was deadlocked on lock resources with another process and has been chosen as the deadlock victim. Rerun the transaction.
Wtf? After reading some articles about deadlocks, changing indexes on the table, rewriting the code for datasets and then even for plain SqlCommands, and creating a simple test table, I went to Stack Overflow and fortunately got an answer. SQL Server needs a little hint: I need to select the row with (updlock). I had tried some other hints before but they didn't work; finally updlock worked. Hurray.
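
A rough sketch of that select-then-update with the hint, written against the plain System.Data interfaces rather than linq. Everything specific here is an assumption for illustration: the id column, the idea that the update adds one to some_field, and the class and method names. It needs an already open connection (for example a SqlConnection) to do anything, and it keeps the Serializable level from the story above.

```csharp
using System;
using System.Data;

public static class UpdLockSketch
{
    // Hypothetical table and columns. WITH (UPDLOCK) makes the SELECT take an
    // update lock, so a second transaction waits at its own SELECT instead of
    // both readers later fighting over the same row when they update it.
    public const string SelectSql =
        "SELECT some_field FROM some_table WITH (UPDLOCK) WHERE id = @id";
    public const string UpdateSql =
        "UPDATE some_table SET some_field = @value WHERE id = @id";

    // 'connection' must already be open.
    public static void IncrementField(IDbConnection connection, int id)
    {
        using (IDbTransaction tx = connection.BeginTransaction(IsolationLevel.Serializable))
        {
            int current;
            using (IDbCommand select = connection.CreateCommand())
            {
                select.Transaction = tx;
                select.CommandText = SelectSql;
                AddParameter(select, "@id", id);
                current = Convert.ToInt32(select.ExecuteScalar());
            }

            using (IDbCommand update = connection.CreateCommand())
            {
                update.Transaction = tx;
                update.CommandText = UpdateSql;
                AddParameter(update, "@id", id);
                AddParameter(update, "@value", current + 1);
                update.ExecuteNonQuery();
            }

            // If anything above throws, the transaction is disposed without a commit.
            tx.Commit();
        }
    }

    private static void AddParameter(IDbCommand command, string name, object value)
    {
        IDbDataParameter parameter = command.CreateParameter();
        parameter.ParameterName = name;
        parameter.Value = value;
        command.Parameters.Add(parameter);
    }
}
```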
Wait... Hurray? But what about linq? Awww....
Whispering: You just can't do it with linq.
What?
You just can't do it with linq.
Damn.

But the smart guys from Microsoft came up with some interesting ideas.
Solution 1: create a stored procedure (you must be kidding me... I'm using your wonderful ORM just to hear: don't use it, write a stored procedure instead)
Solution 2: you may still write a query by hand and execute it on the database, something like IEnumerable version = (IEnumerable)dc.ExecuteQuery(typeof(string), "select top 1 some_field from some_table with (tablockx)"); (ok, you MUST be kidding... do you really call this a solution?)
Some more details:
http://social.msdn.microsoft.com/forums/en-US/linqprojectgeneral/thread/2d6fdb2e-e17e-4a4c-8da0-6968e60ef855
http://social.msdn.microsoft.com/Forums/en-US/linqprojectgeneral/thread/24890e00-f224-4f6a-a596-f9b2f524aa39
http://stackoverflow.com/questions/190666/linq-to-sql-and-concurrency-issues

More examples in part 2, coming soon ;)

Happy coding.