One useful characterization of the work of a product manager is that we solve problems for other people in the organization. So it’s important that we understand how to do this, and understand the types of problems that we’re likely to run into. The hard ones are often meta-problems – even though they are presented as “How do I do thing A?” the real answer ends up being “You should do thing Z instead of thing A.”
Performance issues with your product are a good breeding ground for these types of problems. I mentioned in last week’s post a situation where an engineering manager might come to you with the following decision to make, in his or her words: “either we can ignore the performance problems, or we can work on optimizing the queries.”
As I mentioned in the previous article, the answer is never just one of “ignore” or “optimize queries.”
First, you have to understand if you have a problem, and if you have a problem, what kind of problem it is. For a performance issue, does performance get worse with the number of elements in the query? With the number of elements in the database? What part of the bad response time is constant, no matter how much data is returned or is in the database? So, that’s one whole line of questioning without which you cannot solve the performance problem.
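One way to answer those questions is to time the same interaction at several result sizes and fit a straight line through the measurements: the intercept estimates the constant cost and the slope estimates the cost per row. Here is a minimal sketch, assuming the relationship is roughly linear. The timings and the `fit_line` helper are made up for illustration; they are not from a real system.

```python
from statistics import mean

def fit_line(sizes, seconds):
    """Least-squares fit of: seconds = fixed + per_row * size."""
    mx, my = mean(sizes), mean(seconds)
    sxx = sum((x - mx) ** 2 for x in sizes)
    sxy = sum((x - mx) * (y - my) for x, y in zip(sizes, seconds))
    per_row = sxy / sxx           # slope: extra seconds per additional row
    fixed = my - per_row * mx     # intercept: time spent no matter how many rows
    return fixed, per_row

# Hypothetical measurements: rows returned -> response time in seconds.
sizes = [10, 100, 500, 1000]
seconds = [0.32, 0.50, 1.30, 2.30]

fixed, per_row = fit_line(sizes, seconds)
print(f"constant part: {fixed:.2f} s, per-row part: {per_row * 1000:.1f} ms/row")
```

With these invented numbers the fit gives about 0.3 seconds of constant cost and about 2 milliseconds per row, so at 1,000 rows roughly 2 of the 2.3 seconds come from the rows themselves. Real measurements will be noisier, and you would repeat the exercise against database size as well.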
But there’s also another line of questioning, that’s completely orthogonal, and that is related to the design of the interaction itself. Perhaps you’ve found that displaying 1,000 rows of data in your UI is always slow, but for some reason it’s “necessary” to show all 1,000 entries. Well, right there you have a clue (that I gave you) – why is it “necessary?” What part of “showing 1,000 rows” is necessary? What are the particular use cases where this is important? What does it enable? Is it possible that showing 1,000 entries is actually not necessary? Or that you only ever need to show the names of the 1,000 entries, and you don’t ever have to show 1,000 detail rows? Or the names and owner or some other similar inexpensive subset of data?
The point of this, and I hope it was as obvious as a Mack truck, is that optimizing will not solve this performance problem. Displaying 1,000 complex records will always be challenging from a performance point of view, so if you want to make the experience significantly better, you have to figure out how to design the interaction so that you don’t have to get and display all those records. In fact, your approach to performance should be completely different, and if anyone ever says “optimize the query” to you, you should be very, very wary!
Real performance improvements, the ones that change the way users experience your application and that take it from being a “have to use” to a “want to use” application, are order-of-magnitude changes. For example, if the current interaction takes 2 seconds, then improving that by 50% makes it a 1-second transaction. That’s a lot better, but it’s still noticeable as a delay. It’s much better to improve the performance by a factor of ten, taking a 2-second transaction down to 0.2 seconds. That changes the transaction from a noticeable one to an unnoticeable one. And that changes the user’s perception of the application.
The “Fundamental Theorem” of Performance
The key point is this. A very smart architect once told me this basic rule of thumb:
You can’t achieve significant performance improvements by optimizing.
Optimizing can get you percentage performance improvements – like 10% or even 50% in some unusual cases. But to get the big changes, the ones that turn the user’s experience into a smooth flow, you need to change the algorithms.
What does this mean? There are any number of ways to improve algorithms, but here are a few to get you thinking:
- Don’t get all the data when you go back to the database, only get the data that’s changed (this was the fundamental way that Ajax enabled Web 2.0 – I don’t have to take the overhead of reloading and redisplaying the whole web page when there’s a change, I only redisplay the changed portion)
- Don’t go back to the database at all – keep all the data cached on the client and only reload it when necessary to maintain data consistency (you have to handle the issues raised by the CAP Theorem, but this is a well understood problem with various solutions.)
- Preload the data that the user is likely to want before the user even asks for it, then just display it client-side when he/she clicks the button (this requires a guess on the part of the system, but there are good guessing algorithms)
- Don’t use a query to retrieve the data, use a materialized view
- Don’t show a list of data, show only one datum (i.e., never return 10,000 rows, only ever return one)
- If you have to show a list, load only the data that will show on the current screen, and only load more data when the user navigates to the next screenful (aka “lazy loading”)
- Determine where the user is going to be looking at any given time, and only load the data that will be “in focus” for the user, loading the rest of the data in the background (or not at all, if possible)
The point is that to get really significant performance gains, you have to make changes like the ones above, changes to the algorithms and even the fundamental design of the interaction. This requires a conversation with an architect who knows his or her stuff, because he or she is going to have to step outside the box of “show me the code, and a profiler, and I’ll fix it.” (Not that a profiler isn’t useful in this conversation – it’s very important to know what part of the interaction is actually causing the performance issue. The worst thing to do is to optimize a query or a routine that’s only responsible for 5% of the time taken.)
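The arithmetic behind that 5% warning is usually called Amdahl's law, and it is short enough to sketch. The numbers below are assumptions chosen to match the examples in this post (a 2-second interaction, a tenfold local speedup), and the function names are mine.

```python
def overall_speedup(fraction, local_speedup):
    """Amdahl's law: speedup of the whole interaction when only
    `fraction` of its time is made `local_speedup` times faster."""
    return 1.0 / ((1.0 - fraction) + fraction / local_speedup)

def new_time(total_seconds, fraction, local_speedup):
    """Total time of the interaction after the local improvement."""
    return total_seconds / overall_speedup(fraction, local_speedup)

# A 2-second interaction where the suspect query is only 5% of the time,
# and we somehow make that query ten times faster.
small = new_time(2.0, 0.05, 10)

# The same interaction if the profiler shows the query is 90% of the time.
large = new_time(2.0, 0.90, 10)

print(f"query is 5% of the time:  2.00 s -> {small:.2f} s")
print(f"query is 90% of the time: 2.00 s -> {large:.2f} s")
```

In the first case the interaction goes from 2 seconds to about 1.91 seconds, which nobody will notice. In the second it drops to about 0.38 seconds. Same effort on the query, completely different result, and only the profiler can tell you which case you are in.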
Over to You
Have you had this conversation with your engineers? Do you find yourself on the horns of a dilemma where it seems you only have two choices for a decision, whether it’s about performance or something else? What techniques have you found to resolve these dilemmas? Let me know your thoughts in the comments – I’d love to hear them!