Will the real programmer please stand up?

Over the last year or two, I’ve been reading articles that describe this strange world where software developers are Gods. They are paid so much money that it’s ridiculous. They get flown across the US and invited to parties with naked girls in spa pools and all the booze in the world.

Hipster

Image sourced from http://www.quora.com/Brogramming/How-does-a-programmer-become-a-brogrammer

Ok. I feel like I’m missing something here. I’m a software developer. I’ve been doing this for the past 10 years or so. I’ve worked in NZ, Australia, London and now Amsterdam, just to mix things up. I’ve even worked primarily in the e-commerce space. So why on earth am I not living this crazy extravagant lifestyle?! Is it because I don’t live in the US? Waaaaahh!!

Then I read this article and the penny dropped.

Who are the “fat guys who know C++”, or as someone else put it, “the guys with neckbeards, who keep Google’s servers running”?

What?? What do you mean who are they? Isn’t this a perfect description of a software developer? If it isn’t, then what is??

The more I read, the more it felt like this article was describing me, right down to the “games nights”. And that’s when I got it – the aforementioned articles weren’t talking about the people that I know as software developers. They were talking about this new breed of ‘hipster’ developers: the cool front-end scripters, the JavaScript, PHP, Ruby and Rails developers. They weren’t talking about the back-end developers who code in one of those archaic programming languages like C, C++, C# or Java, a group that happens to include me and most of the software developers I know. Bummer, I guess that’s why I still haven’t been invited to any of these crazy parties.

Until this point, I’d been living in a world where I thought the developers that apparently belong to this ‘secret Guild of nomadic craftsmen’ were the real developers, the ones that people refer to when they mention software developers. Apparently I was wrong, and I’m now part of something that not everyone is even aware exists! To most people, developers are actually cool hipsters who have a great fashion sense and certainly aren’t Spock-like.

When did this shift in definitions happen?


Macro Photography – Take 1

For Christmas, my awesome partner Jason got me a Tamron SP AF 90mm f/2.8 Di Macro 1:1 lens for my Nikon D90. I really enjoy taking close-up photos of insects and flowers, and although the kit lens I already had does a pretty good job at it, a macro lens can do much better. I haven’t really learnt how to use it very well yet, but here is my first attempt – a few photos I took at the Changi Airport ‘Butterfly Garden’ in Singapore.

Apart from re-sizing these photos, I haven’t edited them at all.

Butterfly
Butterfly
Butterfly
Butterfly

All imagery is copyright Annie Luxton 2007-2012. Images may not be reproduced without prior permission.


SQL Server Performance Tuning

Things in the “software development” world have changed a fair bit over the last 10 years. In the past, developers had to think carefully about the SQL they wrote, making sure that they used well-optimized stored procedures on well-designed databases with the right indexes on the right columns, because let’s face it, the database was often a bottleneck.

Nowadays, with tools like ORMs being used more and more liberally, I get the feeling that some of us are getting a bit lazy and forgetting that the database is still a potential bottleneck. I bet that a lot of developers reading this won’t have had to hand-roll any SQL for a while, relying instead on whatever their ORM of choice generates for them. Mature frameworks like NHibernate and Entity Framework do a great job of providing us with CRUD statements, but if we have a complex domain, a lot of rows, a heavy load or unique requirements, then perhaps they aren’t good enough.

Although I’m by no means a DBA or SQL / SQL Server expert, I think it’s important for developers to take some responsibility for data retrieval when writing applications. To get the most out of your application and hardware, you should be aware of which queries run most often and which take the longest to complete. In a perfect world, your DBAs would provide you with this information on a weekly basis, but there are also some tools that you as a developer can use to preempt performance problems. In this post I’m going to list some of the tools I use to help me performance tune and monitor SQL Server, the DBMS I’m most familiar with.

Use stored procedures over executing SQL statements

In previous versions of SQL Server (version 6.5 and earlier), one of the main advantages of using stored procedures over executing SQL statements was that SQL Server would partially precompile a single execution plan for a stored procedure upon its creation and cache it for reuse. However, the last couple of versions of SQL Server no longer precompile execution plans for stored procedures upon creation, but instead upon their first execution, just as they do for any other SQL statement. Also, execution plans for all T-SQL statements (stored procedures and SQL statements alike) are now stored in the procedure cache, not just stored procedure execution plans. These changes reduce the overall performance benefit that used to be gained by using stored procedures.

There are still some other performance gains to be made by using stored procedures though. For example, when you call a stored procedure from code, you only have to send EXECUTE stored_procedure_name and its arguments down the wire instead of the whole statement. You can also perform a lot more logic within one stored procedure, saving you many round trips between the client and the server.
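As a rough illustration (the procedure, table and parameter names here are entirely made up), the difference in what travels over the network looks something like this:

-- Instead of the client sending the full statement every time:
SELECT a.Title, a.Body, u.UserName
FROM dbo.Articles a
INNER JOIN dbo.Users u ON u.UserId = a.AuthorId
WHERE a.PublishedDate > '2012-01-01'

-- ...it only sends the procedure call and its arguments:
EXECUTE dbo.GetRecentArticles @PublishedAfter = '2012-01-01'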

There are also other non-performance-related benefits to using stored procedures over SQL statements, such as maintainability, abstraction and security.

Execution Plans

Before releasing a new or updated stored procedure, it’s a good idea to have a look at the estimated and actual execution plans for it. If you’re making a change, you should look at the execution plan before and after your change to make sure you haven’t made anything worse! The main things to look for here are:

  • Table or clustered index scans. If the table being scanned is small, this isn’t generally a problem. If it is a large table, it may mean that the table is missing an index, that you need additional indexes, or that the optimizer has ignored your index for some reason. If you really need to (and you know what you’re doing!), you could use a query hint to force SQL Server to use a particular index, as in the sketch after this list.
  • Repetitive scans. Try to avoid any repetitive table or index scans by rewriting your queries. There is usually more than one way of writing any batch of SQL statements.
  • Highest cost queries. Concentrate on optimizing the queries with the highest relative cost.
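
Here’s a minimal sketch of such a query hint (the table and index names are invented, and again, only do this if you’re sure the optimizer is getting it wrong):

-- Force the optimizer to use a specific (hypothetical) index:
SELECT Title, PublishedDate
FROM dbo.Articles WITH (INDEX(IX_Articles_AuthorId))
WHERE AuthorId = 42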

Traces – SQL Profiler vs Server Side

Traces record what’s going on in your database and are invaluable when it comes to analyzing behaviour under the covers, especially when there is a performance issue.

SQL Profiler is a great GUI for running traces and analyzing how your code is actually communicating with your database(s). Even if the queries themselves are well optimized and running smoothly, repeatedly calling them hundreds of times in a row may still cause performance issues. However, as a developer, you should only ever use SQL Profiler to run traces on development and test databases. Running a trace on a production database from the SQL Profiler GUI can seriously overload it. You should also make sure you’re capturing the information you need and not much more, otherwise it becomes impossible to find what you’re looking for in all the trace output. I generally set mine up like this:

SQL Server Profiler

According to a very talented SQL Server DBA ex-colleague and friend of mine, @trademe_dave, you can also run traces on the server side, which apparently has much less impact than running them through the SQL Profiler GUI, but it really all depends on how many events you trace (3 or 4 should do the trick) and how long you leave it running for. The safest option is to leave this sort of thing up to the DBAs :)
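
For the curious, a minimal server-side trace looks roughly like the sketch below. Treat it as a sketch only: the file path is made up, and you should verify the event and column IDs against sys.trace_events and sys.trace_columns before relying on them.

DECLARE @TraceID int, @MaxFileSize bigint, @On bit
SET @MaxFileSize = 50
SET @On = 1

-- Option 2 enables file rollover; SQL Server appends .trc to the file name.
EXEC sp_trace_create @TraceID OUTPUT, 2, N'C:\Traces\MyTrace', @MaxFileSize

-- Capture a few columns for event 12 (SQL:BatchCompleted):
EXEC sp_trace_setevent @TraceID, 12, 1, @On   -- TextData
EXEC sp_trace_setevent @TraceID, 12, 13, @On  -- Duration
EXEC sp_trace_setevent @TraceID, 12, 16, @On  -- Reads

-- Start the trace; later, status 0 stops it and status 2 closes it.
EXEC sp_trace_setstatus @TraceID, 1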

Dynamic Management Views

If you can’t get your hands on a trace, you can try using Dynamic Management Views to show you some execution statistics (including CPU time, physical and logical reads, etc.) for all queries. From what I understand, it isn’t perfect – for example, it only shows you aggregated data on queries that have actually finished executing – but that’s still much better than nothing! Here is a great article on how to use sys.dm_exec_query_stats to show you the top (most often run, longest execution time) queries.
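
As a starting point, something along these lines (written from memory, so double-check it before relying on it) will list the ten statements with the highest total elapsed time:

SELECT TOP 10
    qs.execution_count,
    qs.total_elapsed_time / qs.execution_count AS avg_elapsed_time,
    qs.total_logical_reads / qs.execution_count AS avg_logical_reads,
    SUBSTRING(st.text, (qs.statement_start_offset / 2) + 1,
        ((CASE qs.statement_end_offset
            WHEN -1 THEN DATALENGTH(st.text)
            ELSE qs.statement_end_offset
          END - qs.statement_start_offset) / 2) + 1) AS statement_text
FROM sys.dm_exec_query_stats qs
CROSS APPLY sys.dm_exec_sql_text(qs.sql_handle) st
ORDER BY qs.total_elapsed_time DESC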

Analyze Statistical Information

If you execute the statement SET STATISTICS IO ON before running a query or executing a stored proc (in SQL Server Management Studio), SQL Server will output some information regarding the amount of disk activity generated by the SQL statements being executed. To turn it off again, you simply execute SET STATISTICS IO OFF. These statements affect the current connection, so if you don’t execute a SET STATISTICS IO OFF command, any other queries you run using the same connection will also produce this performance tuning data.

The syntax looks a little like this:

SET STATISTICS IO ON

SELECT * FROM dbo.Articles

SET STATISTICS IO OFF

The information it outputs regarding the disk activity can be found in the ‘Messages’ tab, right next to the ‘Results’ tab in Management Studio:

SET STATISTICS IO

What you’re looking for in particular are the scan counts and logical reads. For a simple query or stored proc that selects data out of only one table, you’ll generally find a scan count of 1. However, if you’re joining tables then you’ll see that some of the tables involved might have much larger scan counts. I don’t know off the top of my head what a good value for ‘scan count’ is, but your DBA should be able to help with that. More importantly, while you fine-tune your query / stored proc you should keep an eye on this value to see if it goes up or down with the changes you make – the higher the number, the worse the performance.

With respect to the logical reads, they are a measure of the number of pages that SQL Server had to read from the data cache in order to produce the results specified by your query. Again, a higher number is worse than a lower number, so while you are tuning your query / stored proc, you should make sure that you don’t end up with a higher number of logical reads than you initially had!

I’ve also read that it’s a good idea to issue the following two commands before executing your query / stored proc with SET STATISTICS IO ON / OFF around it, so that you clear out SQL Server’s data and procedure caches (needless to say, never do this on a production server, as it flushes the caches for everyone):

DBCC DROPCLEANBUFFERS
DBCC FREEPROCCACHE

This article has a lot more detail on SET STATISTICS IO ON / OFF (and SET STATISTICS TIME ON / OFF, which I personally haven’t used). And here is an MSDN article on SET STATISTICS IO, just for good measure.

That’s all for now folks!

I’m sure there are many other tools that developers can and frequently do use to performance tune their SQL queries and stored procedures. Those listed above are just the ones I use often – if you know of any others please feel free to leave a comment below!

But please, developers, don’t forget that under all your awesome code there is still a database struggling to keep up with the load you’re throwing at it. Be smart about how you access your data and don’t stop analyzing, benchmarking and optimizing your SQL statements.


How asynchronous is SmtpClient.SendAsync?

In a previous role, I was tasked with writing a newsletter email sender. How hard can this be, I thought to myself, and set off to complete my mission.

Initial Thoughts

We were probably going to be sending tens of thousands of emails at a time, so although I figured I’d need to use some threading, I thought I’d start with the asynchronous version of the SmtpClient.Send method, SmtpClient.SendAsync, instead of the blocking SmtpClient.Send. I figured that way I’d be able to send batches of emails asynchronously and get through them all super fast.

The Test

I wrote some basic prototype code using SmtpClient.SendAsync and ran a test that sent a couple of hundred emails. Although my test ran without errors, I pretty quickly discovered that I wasn’t receiving all the emails that I was apparently sending by calling SmtpClient.SendAsync! At first I wondered if I’d somehow overflowed my inbox, but that didn’t seem to be the problem. Then I did a bit more research and discovered something which explained the behaviour I was seeing…

The Findings

From the ‘remarks’ section of the MSDN documentation on SmtpClient.SendAsync:

After calling SendAsync, you must wait for the e-mail transmission to complete before attempting to send another e-mail message using Send or SendAsync.

And…

To receive notification when the e-mail has been sent or the operation has been canceled, add an event handler to the SendCompleted event.

OH! So in order to successfully send multiple emails asynchronously, I must add an event handler to the SmtpClient.SendCompleted event and wait until the first SmtpClient.SendAsync has completed before triggering the next one. Hmm… this does not seem all that asynchronous to me! I realize that there must be good reasons why it was implemented this way, but in practice, using SmtpClient.SendAsync to send multiple emails really isn’t all that much more asynchronous than lining up a bunch of synchronous calls to SmtpClient.Send.
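
A rough sketch of what that chaining ends up looking like (the class and host names here are mine, not from the documentation):

// Hedged sketch: sending emails one at a time, chained off SendCompleted.
using System;
using System.Collections.Generic;
using System.ComponentModel;
using System.Net.Mail;

class ChainedSender
{
    private readonly SmtpClient _client = new SmtpClient("smtp.example.com"); // made-up host
    private readonly Queue<MailMessage> _queue = new Queue<MailMessage>();

    public void SendAll(IEnumerable<MailMessage> messages)
    {
        foreach (var message in messages)
            _queue.Enqueue(message);

        _client.SendCompleted += OnSendCompleted;
        SendNext();
    }

    private void SendNext()
    {
        if (_queue.Count > 0)
            _client.SendAsync(_queue.Dequeue(), null);
    }

    private void OnSendCompleted(object sender, AsyncCompletedEventArgs e)
    {
        // Only once this event fires are we allowed to call SendAsync again.
        SendNext();
    }
}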

Conclusion

I guess it all comes down to your interpretation of what ‘asynchronous’ means – in this case, SmtpClient.SendAsync is indeed asynchronous in that it allows the program to carry on executing without blocking. This is great in most cases, unless what you want to do next is send another email.

So to sum up, it seems the only way to send multiple emails at the same time using .NET’s SmtpClient is to use threading after all. Spin up a few worker threads with a separate instance of the SmtpClient in each, and just send the emails using SmtpClient.Send.
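
A minimal sketch of that approach (the host name is made up, and there’s no error handling):

// Hedged sketch: parallel sending with one SmtpClient per worker thread.
using System.Collections.Concurrent;
using System.Net.Mail;
using System.Threading;

class ParallelSender
{
    public static void SendAll(ConcurrentQueue<MailMessage> messages, int workerCount)
    {
        var threads = new Thread[workerCount];
        for (int i = 0; i < workerCount; i++)
        {
            threads[i] = new Thread(() =>
            {
                // Each worker thread gets its own SmtpClient instance.
                var client = new SmtpClient("smtp.example.com"); // made-up host
                MailMessage message;
                while (messages.TryDequeue(out message))
                    client.Send(message); // blocking, but only blocks this worker
            });
            threads[i].Start();
        }

        foreach (var thread in threads)
            thread.Join();
    }
}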


ASP.NET Response.Redirect

In ASP.NET web applications, the Response.Redirect(string url) method is often used to control the flow of an application (both user browsing and logic) by redirecting the client to a different URL. For example, on a page that requires an authenticated user, it is quite standard to first check whether the user is logged in and, if they aren’t, call Response.Redirect(“login.aspx”). This is all great, except that many developers probably don’t know what’s actually going on under the covers.

Under the covers

Internally, Response.Redirect (and Server.Transfer for that matter) calls Response.End to end the page’s execution (rather abruptly) and then shifts execution to the Application_EndRequest event in the application’s Global.asax. We rely on this behaviour to bail out of wherever we were when we decided to call Response.Redirect. What most people probably don’t realize is that Response.End raises a ThreadAbortException. Exceptions are expensive, and if you’re calling Response.Redirect often, you might find this particular behaviour detrimental to the performance of your site.

So, how do I avoid these exceptions?

Well, Response.Redirect actually has an overload that takes two parameters – Response.Redirect(string url, bool endResponse). If you use this overload instead and pass false to the endResponse parameter, the internal call to Response.End will be suppressed and, as such, no ThreadAbortException will be raised.

Great! But wait, there’s more…

There is a side effect! Using this overload and passing false so that Response.End doesn’t get called means that any code after the call to Response.Redirect will now be executed where it previously wouldn’t have been. This makes perfect sense – if Response.End isn’t called then execution isn’t shifted to the Application_EndRequest event, and if execution isn’t shifted, then it carries on wherever it currently is. That’s right – although you no longer get ThreadAbortException exceptions being thrown, Response.Redirect no longer controls the flow of your logic.

This is fine if you are calling Response.Redirect at the end of a method at the bottom of your call stack, because there was probably no more code to be executed anyway. On the other hand, if you are calling it from inside a method that isn’t at the bottom of your call stack, you have to be careful about the rest of the call stack. Even if you call return straight after you call Response.Redirect, where are you ‘returning’ to? Another method that executes more code? Eeek!
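
To make that concrete, here’s a minimal sketch (the page logic and helper name are invented):

// Hedged sketch: redirecting without Response.End.
protected void Page_Load(object sender, EventArgs e)
{
    if (!Request.IsAuthenticated)
    {
        // Passing false suppresses Response.End, so no ThreadAbortException...
        Response.Redirect("login.aspx", false);

        // ...but execution carries on, so we have to bail out ourselves.
        // Note: any method higher up the call stack will still run its
        // remaining code after this one returns.
        return;
    }

    LoadMembersOnlyContent(); // hypothetical helper
}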

In conclusion

None of this is news – it’s all explained in this Microsoft Support article. However, I really don’t feel that it emphasizes enough the negative effects that switching over to the Response.Redirect overload may have on your application. You simply can’t go through your application and replace every Response.Redirect with its overload without thinking carefully about what this might do to the logic of your application. The ‘flow’ may still look like it’s working as it should, because the user will still be redirected to another page, but under the covers you might find that you’re now executing code that you weren’t before, and that may have some pretty nasty unwanted effects.
