Speed Testing - A Beginner's Primer
For my final project at Flatiron, I scraped NFL scores off of ESPN's website, basically by downloading the entire html of the webpages I needed. This meant that I had a lot of text to search through to find the scores that I needed - more than 20,000+ characters - and I needed to do this up to 32 times to find each team's score. My initial attempt to parse the html involved converting the html to a string and using .slice to create a bunch of smaller strings that I could pick through looking for markups that indicated a team score was nearby. This was slow - it took up to 20 seconds to find everything which was unacceptably slow, especially given that this one merely one step in getting users their score data.
My solution?
Using Ruby's .index command to find the markups I was looking for, and only then slicing off the data I needed. Using this method, I reduced the time to generate my score data from 20 seconds to less than 1.
I've always been interested in seeing how fast I could make code go ever since I found that my classmates and I could run speed tests to compare our different solutions to various whiteboard problems. The differences usually only amounted to a couple hundred milliseconds at most, though, so the importance of writing speedy code never seemed real to me. Once I ran my NFL score scraping code, though, I had an actual concrete example of the value - cementing my passion for speedy bytes.
Speed testing has some pitfalls though - it's really easy to get incorrect results when you're testing competing ways of coding a problem or feature. Here are my biggest tips:
1. Make sure to run an adequate number of tests
The internal data that commands like Date.now() return are not completely accurate - they can often be off by 15 milliseconds or so. If you're comparing code that takes only 3 or 4 milliseconds to run, the runtime that your computer calculates is going to be much more influenced by that measurement error than the actual runtime of your code. How do you fix this? Run your code again and again and again. Here's how I compared two different profit maximization algorithms:
2. Make sure that your computer is only working on one thing at once
While we like to think of code as living in some beautiful, mathematical ideal world of logic, in reality it runs on your computer. In meatspace, sometimes objects have different properties from one moment to the next. And if your computer is streaming a youtube video while running one bit of code and not another, that's going to slow down your code. A fair test is a test where your computer is not multitasking. You might even want to run your speedtest a couple different times just to make sure you don't have background processes messing with your results.
3. Test in different environments (bonus points)
Different code runs different in different environments - sometimes a function runs quickly in Chrome but slowly in Safari. If you don't test it out in all sorts of different environments, you may inadvertently have optimized your code for an environment your users don't even use.
B. Avoid converting inputs from one data type to another - Less work for your computer means faster running time.
C. Look for early exits - If there are edge cases (or even regular cases) where your program can finish up early, give it the ability to do so. For instance, if you have a function that returns a false value if your input array is empty, don't waste your computer's time performing all sorts of calculations on that array - stick an if-function in there and tell it to get the hell out of dodge and return false if it's fed that empty array.
Good luck out there, racers!
Comments
Post a Comment