Choosing an SSD
Before I started my new job I had an inordinate amount of free time and, for the majority of it, nothing to spend it on[1]. I was still thinking about my desktop wishlist[2] and about choosing a better SSD than the one I had previously selected[3].
A long time ago, when I was following the HDD market because I was looking to buy some bulk storage, I wrote a PHP script which loaded Newegg's product list, based on search parameters you provided, from Newegg's productlist.xml[4]. The script would then parse the list and sort it by price per gigabyte, which is useful when you're in the market for capacity[5].
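The original PHP script is long gone, but the price-per-gigabyte sort it performed is easy to sketch. The drive names, capacities and prices below are made up purely for illustration, and `price_per_gb` is a hypothetical helper, not something from the old script.

```python
# Hypothetical drives; names, capacities and prices are invented for illustration.
drives = [
    {"name": "Drive A", "capacity_gb": 1000, "price": 89.99},
    {"name": "Drive B", "capacity_gb": 1500, "price": 109.99},
    {"name": "Drive C", "capacity_gb": 2000, "price": 199.99},
]


def price_per_gb(drive):
    """Dollars paid for each gigabyte of capacity (lower is better)."""
    return drive["price"] / drive["capacity_gb"]


# Cheapest storage per gigabyte first
ranked = sorted(drives, key=price_per_gb)

for drive in ranked:
    print("%-8s $%.4f/GB" % (drive["name"], price_per_gb(drive)))
```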
I decided to do more or less the same thing with SSDs, except this time I did it in Python since I'm rusty on PHP and I didn't want to mess with setting up a web server to test on. So I got started by doing a power search on Newegg for the specific flavor of SSD I was looking for.
The search parameters are as follows:
- 2.5" Form Factor
- SATA II/III
- 120GB or Greater
- Less than $300
- Retail or OEM
- Support TRIM Command
As of this writing those particular search parameters narrow the results to 17 SSDs. Now comes the code. Before I started coding I needed some way to sort them according to what I thought was important. The metric is as follows:
After looking closer at the scores this produces I noticed that it heavily penalizes drives with large differences between read and write speeds, which effectively weeds out drives that still have acceptable read/write speeds. So I removed that section of the metric, producing:
score = (read × write × capacity) / price
The basic idea behind this scoring measure is that sequential read and write speeds are important, as well as capacity. Price and the difference between sequential read and write speeds are considered bad[6]. In the equation, read and write refer to sequential read and write speeds. The ratio produces a score for the SSD's overall performance in terms of capacity, read/write speeds and price.
The code is relatively simple in purpose. Load the data and parse it into a dictionary then sort based on the metric above.
```python
import urllib2, re

# url = "http://www.newegg.com/Product/ProductList.aspx?Submit=Property&Subcategory=636&Description=&Type=&N=100008120&IsNodeId=1&srchInDesc=&MinPrice=&MaxPrice=&OEMMark=1&OEMMark=0&PropertyCodeValue=4213:30854&PropertyCodeValue=4214:30848&PropertyCodeValue=4214:39416&PropertyCodeValue=4214:30849&PropertyCodeValue=4214:39415&PropertyCodeValue=4215:55552&PropertyCodeValue=4215:41071&PropertyCodeValue=4215:46319"
# data = open("temp.html", "w")
# data.write(urllib2.urlopen(url).read())
# data.close()

raw = open("temp.html").read()

item_re = re.compile(r'<div class="itemCell".*?>(.*?)<br class="clear".*?</div>')
feature_re = re.compile(r"<li> (.*?)</li>")
feature_list_re = re.compile(r'<b>(.*?)\s?\#?\s?:\s?</b>\s?(.*?)</li>')
speed_re = re.compile(r"(up to )?(\d+).*?MB/s")
capacity_re = re.compile(r"(\d+)GB")
price_re = re.compile(r"</span>\$<strong>(\d+)</strong><sup>.(\d+)</sup>")

item_list = []
valid = ['Read', 'Item', 'Interface', 'Capacity', 'Model', 'Write', 'Size']

for item in item_re.findall(raw):
    current = {}
    no_label = []

    # The first three unlabeled features are size, capacity and interface
    features = feature_re.findall(item)
    current["Size"] = features[0]
    current["Capacity"] = features[1]
    current["Interface"] = features[2]

    # Labeled features, e.g. "Sequential Access - Read"
    for feature in feature_list_re.findall(item):
        if feature[1].find("\r") != -1:
            current[feature[0]] = feature[1].split("\r")[0]
        else:
            current[feature[0]] = feature[1]

    current["Read"] = int(speed_re.findall(current["Sequential Access - Read"])[0][1])
    current["Write"] = int(speed_re.findall(current["Sequential Access - Write"])[0][1])
    current["Capacity"] = int(capacity_re.findall(current["Capacity"])[0])

    # Drop everything that isn't needed for the report
    for feature in current.keys():
        if feature not in valid:
            del current[feature]

    current["Price"] = float('.'.join(price_re.findall(item)[0]))
    current["Item"] = "http://www.newegg.com/Product/Product.aspx?Item=%s" % (current["Item"])

    item_list.append(current)

# Score each drive and index it by its score
sorted = {}
for item in item_list:
    ratio = (item["Read"] * item["Write"] * item["Capacity"]) / (item["Price"])
    sorted[ratio] = item

# Highest score first
sort_order = sorted.keys()
sort_order.sort()
sort_order.reverse()

for key in sort_order:
    #print '\t'.join(map(lambda x: str(x), sorted[key].keys()))
    print '\t'.join(map(lambda x: str(x), sorted[key].values()))
```
Now given that there is quite a lot of data to present and analyze all at once I've decided it would be easiest to just provide you with a pretty graph[7]:

If you look closely at the scores of all the disks in the query, you'll notice that there is a noticeable gap between the top 3 and the rest. They are as follows:
| Manufacturer: | A-DATA | Patriot | G.Skill |
| Series: | S599 | Inferno | Phoenix Series |
| Capacity: | 128GB | 120GB | 120GB |
| Read: | 280MB/s | 285MB/s | 285MB/s |
| Write: | 270MB/s | 275MB/s | 275MB/s |
| Item: | N82E16820211471[8] | N82E16820220510[9] | N82E16820231372[10] |
| Price: | $295.99 | $289.99 | $299.00 |
I noticed that if you ignore capacity in the metric then the Patriot Inferno is the clear winner here. So as it turns out the Western Digital SiliconEdge I had selected when I first wrote the wishlist wasn't the best drive for my needs. But then I've always had a soft-spot for Western Digital. But now I'm convinced that the Patriot Inferno is the SSD I'll be getting unless by the time I get around to buying one there are better options[11].
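As a quick check of that claim, here is a small sketch that scores the three drives from the table with and without capacity in the numerator. The `score` and `ranking` helpers and the `use_capacity` switch are names introduced here for illustration; the figures come straight from the table.

```python
# Figures from the table: (read MB/s, write MB/s, capacity GB, price USD)
top_three = {
    "A-DATA S599": (280, 270, 128, 295.99),
    "Patriot Inferno": (285, 275, 120, 289.99),
    "G.Skill Phoenix": (285, 275, 120, 299.00),
}


def score(read, write, capacity, price, use_capacity=True):
    """Higher is better: speed (and optionally capacity) per dollar."""
    numerator = read * write * (capacity if use_capacity else 1)
    return numerator / price


def ranking(use_capacity):
    """Drive names ordered from best to worst score."""
    return sorted(
        top_three,
        key=lambda name: score(*top_three[name], use_capacity=use_capacity),
        reverse=True,
    )


with_capacity = ranking(True)
without_capacity = ranking(False)
print(with_capacity)
print(without_capacity)
```

With capacity included the A-DATA's extra 8GB puts it first; drop capacity and the Patriot Inferno comes out on top.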
1. Nothing worthwhile anyway.
2. See previous post: Wishlist.
3. Western Digital SiliconEdge 128GB SSD
4. Which no longer exists in its original form.
5. Which I was.
6. Although we're excluding read/write speed difference.
7. Scores have been normalized to 100%.
8. A-Data S599
9. Patriot Inferno
10. G.Skill Phoenix Series
11. Which there probably will be.
Matplotlib and Live Data: A Tale of Two Technologies
Being unemployed over the summer is never usually a good thing for me. I get bored very easily if I don't have something to occupy myself with. This last bout of boredom led me to unpack some of my electronics. Dusted off my multimeter, Arduino and a digital thermometer I bought a little while ago. Figured I could use these to solve one of my current problems.
Living in Laramie usually subjects people to harsh winters, which leaves most housing developments without central air conditioning installed since, well, it's never really needed except maybe one or two days over the summer when it gets above 85 °F. This summer has apparently been hotter than previous summers and it's left my condo in an "uncomfortable state". Mind you, I'm used to living in hot weather so this isn't such a terrible thing to me.
What I'm not used to is not having AC, and it cooling off enough at night that it's worthwhile to open a few windows and stick a fan in one of them. Which leaves me with this problem: when is the optimal time to open the windows and turn on the fan to get my condo cooled off earliest/fastest?
In comes my Arduino + digital thermometer[1]. Once I rigged up the proper power/data connections on a breadboard for my Arduino I set out to find code for the thermometer. I've set up the thermometer with a sketch on my Arduino before; I just didn't feel like wasting a few hours trying to do it from scratch again. Soon enough I found some code[2] that worked perfectly. So I trimmed out some code I didn't need for the project and set it up to just write the temperature as fast as possible[3] to the serial port it's connected to.
After that I wrote a logging program on my desktop in Python to record temperatures sent via serial to my desktop. The program is incredibly simple and uses the pySerial library[4] to read temperatures from the serial port of my desktop and append them to a temperature log. I used a simple Windows command to do the appending since it wouldn't lock the file, so I could read data from it simultaneously. There are still occasional collisions where the processing program locks the file and the logger can't write to it, but these are rare enough to be negligible in my situation.
```python
import serial, os

ser = serial.Serial(2)

while True:
    os.system("echo %s>>out.txt" % (ser.readline().strip()))
```
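For comparison, here is a rough sketch of the same logger without shelling out to `echo`. It opens the file in append mode for each reading and closes it again right away, so the file is only held open for an instant. `log_lines` is a hypothetical helper, and it takes any iterable of lines so it can be tried without a serial port attached; I haven't checked whether this avoids the locking collisions any better than the `echo` approach.

```python
import os
import tempfile


def log_lines(source, path="out.txt"):
    """Append each non-empty reading from `source` to `path`, one per line."""
    for raw in source:
        reading = raw.strip()
        if not reading:
            continue
        if isinstance(reading, bytes):
            # Serial ports hand back bytes; the log is plain text
            reading = reading.decode("ascii", "replace")
        # Open, append and close per reading so the file is never held open
        with open(path, "a") as log:
            log.write(reading + "\n")


# Stand-in for the serial port: two readings and one blank line
demo_path = os.path.join(tempfile.mkdtemp(), "out.txt")
log_lines([b"21.50\r\n", b"\r\n", "21.56\n"], demo_path)

with open(demo_path) as log:
    logged = log.read()
print(logged)
```

With a real port, something like `log_lines(iter(ser.readline, None))` should keep feeding it readings for as long as the program runs, assuming `ser` is the pySerial object from the original logger.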
The next step in this project was visualizing the data. I've used matplotlib[5] before and this time I wanted to see if I could write the program to update live as it receives data. My first foray into this goal was a miserable disaster. Most of the solutions I could find involved just setting up an infinite loop with a short time delay in it. Which works great except that it sleeps the thread running the plot, which makes it impossible to resize the plot or do anything at all with the GUI for that matter. So obviously this wouldn't work at all.
After poking around for different solutions to this and crashing my computer once from spawning an infinite number of instances of the plot I gave up for a bit, only to discover that there was an example in the documentation which wasn't obviously named. I quickly discovered the best way to do this. I even added some pretty annotations and such.
```python
import gobject
import matplotlib
matplotlib.use('GTKAgg')

import matplotlib.pyplot as plt

current_pos = 0
temps = []
pad = 5.0

f = plt.figure()

def update(vars):
    # Unpack variables that need to be persistent between
    # executions of this method.
    temps = vars[0]
    current_pos = vars[1]
    pad = vars[2]

    # Open the data file and get any new data points since
    # the last time we read from this file
    data = open("out.txt", "r")
    data.seek(current_pos)
    new_temps = map(lambda x: float(x) * (1 + 4.0/5.0) + 32.0, data.read().split("\n")[:-1])
    current_pos = data.tell()
    data.close()

    # If we got new data then append it to the list of
    # temperatures and trim to 750 points
    if len(new_temps) > 0:
        temps.extend(new_temps)
        temps = temps[-750:]

    # Nothing to plot yet, try again on the next tick
    if len(temps) == 0:
        return True

    f.clear()
    f.suptitle("Live Temperature")
    a = f.add_subplot(111)
    a.grid(True)
    l, = a.plot(temps)
    plt.xlabel("Time (Seconds)")
    plt.ylabel(r'Temperature $^{\circ}$F')

    # Get the minimum and maximum temperatures these are
    # used for annotations and scaling the plot of data
    min_t = min(temps)
    max_t = max(temps)

    # Add annotations for minimum and maximum temperatures
    a.annotate(r'Min: %0.2f$^{\circ}$F' % (min_t),
        xy=(temps.index(min_t), min_t),
        xycoords='data', xytext=(20, -20),
        textcoords='offset points',
        bbox=dict(boxstyle="round", fc="0.8"),
        arrowprops=dict(arrowstyle="->",
        shrinkA=0, shrinkB=1,
        connectionstyle="angle,angleA=0,angleB=90,rad=10"))

    a.annotate(r'Max: %0.2f$^{\circ}$F' % (max_t),
        xy=(temps.index(max_t), max_t),
        xycoords='data', xytext=(20, 20),
        textcoords='offset points',
        bbox=dict(boxstyle="round", fc="0.8"),
        arrowprops=dict(arrowstyle="->",
        shrinkA=0, shrinkB=1,
        connectionstyle="angle,angleA=0,angleB=90,rad=10"))

    # Set the axis limits to make the data more readable
    a.axis([0,len(temps), min_t - pad,max_t + pad])

    f.canvas.draw_idle()

    # Repack variables that need to be persistent between
    # executions of this method. Update the shared dict in place;
    # rebinding the local name would lose the new file position.
    vars[0], vars[1], vars[2] = temps, current_pos, pad

    return True

vars = {0: temps, 1: current_pos, 2: pad}

# Execute update method every 500ms
gobject.timeout_add(500, update, vars)

# Display the plot
plt.show()
```
This code generates a plot which updates every 500ms. This is based on an example in the matplotlib examples[6]. An example of the program's output can be seen below.

I imagine that I could have made this simpler by not using the GTK libraries which are a pain to install since there are 3 or 4 modules you have to install in order to make all this work including the GTK+ runtime. I may come back later and post a version written using TK since it can be used without installing extra modules and stuff.
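Whichever toolkit ends up driving the timer, the part worth keeping is the incremental read: remember where the last read stopped and only parse what was appended since. Below is a standalone sketch of just that piece, with no plotting involved. `read_new_temps` and `celsius_to_fahrenheit` are names made up for this example, and leaving a half-written last line for the next pass is an addition of mine rather than something the plotting script above does.

```python
import os
import tempfile


def celsius_to_fahrenheit(celsius):
    return celsius * 9.0 / 5.0 + 32.0


def read_new_temps(path, pos):
    """Return (temps_in_F, new_pos) for complete lines written since `pos`."""
    with open(path, "rb") as data:
        data.seek(pos)
        chunk = data.read()

    # Only consume up to the last newline so that a reading which is
    # still being written is picked up whole on the next call
    end = chunk.rfind(b"\n") + 1

    temps = []
    for line in chunk[:end].split(b"\n"):
        text = line.strip().decode("ascii", "replace")
        if not text:
            continue
        try:
            temps.append(celsius_to_fahrenheit(float(text)))
        except ValueError:
            pass  # skip garbled readings
    return temps, pos + end


# Demo with a temporary log; the third reading is only half written at first
path = os.path.join(tempfile.mkdtemp(), "out.txt")
with open(path, "w") as log:
    log.write("20.0\n25.0\n30")
first, pos = read_new_temps(path, 0)

with open(path, "a") as log:
    log.write(".5\n")
second, pos = read_new_temps(path, pos)

print(first, second)
```

A timer callback would then only need to call `read_new_temps`, extend its list of temperatures and redraw.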
1. DS18S20 Digital Thermometer Datasheet
2. Temperature Measurement using the Dallas DS18B20 by Peter H. Anderson
3. Somewhere in the range of 750ms between readings since it is in parasite mode, may change this later to run in non-parasite mode.
4. pySerial Python Library
5. matplotlib Python Library
6. Animation example code: simple_anim_gtk.py
WriteMonkey and Markdown
Recently Download Squad had a post[1] about a practical way to get features and support for open-source programs, specifically through donations. The post was about a program called WriteMonkey, a minimalistic writing program that the author had written about previously[2]. Think of the best code editing program you know of; mine is Notepad++[3]. Now take that program and refactor it specifically for writing articles or blog posts, and you've just created WriteMonkey[4].
Something that interested me about WriteMonkey was that the Download Squad author's post specifically mentioned writing posts using Markdown[5] syntax. Markdown is a simple plain-text syntax which is parsed into HTML, removing the need to tediously enter HTML[6] as you write. At first glance it didn't seem like it would help all that much when it came to writing blog posts. But I was completely wrong and am better off for it. The especially useful part is that WriteMonkey supports this completely and has a very useful shortcut for parsing and copying HTML straight from Markdown source. This is incredibly useful since I can then just go to my website, paste the resulting HTML into a blog post, hit save and be done with it.
As I looked through the program I realized, this is much much more than just a Markdown IDE. It includes all sorts of useful features like a "progress bar" which tells you how far along you are in a certain quota you specify in the preferences. This led me to write a little bit of SQL[7] to calculate the average word-count of posts in my blog. Excluding the outliers it came out to ~350 words per post. So I just set the quota to 350 words and it displays a bar at the top or bottom of the screen depending on what you choose showing your current progress on the quota.
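The SQL itself isn't shown here, but the calculation is simple enough to sketch in Python. The word counts below are invented, and the outlier rule (anything more than 1.5 interquartile ranges outside the quartiles) is an assumption; I may well have trimmed the outliers differently in the actual query. It needs Python 3.8 or later for `statistics.quantiles`.

```python
import statistics


def typical_word_count(counts):
    """Mean word count after dropping values far outside the quartiles."""
    q1, _, q3 = statistics.quantiles(counts, n=4)
    spread = 1.5 * (q3 - q1)
    kept = [c for c in counts if q1 - spread <= c <= q3 + spread]
    return statistics.mean(kept)


# Hypothetical per-post word counts; the last one is a deliberate outlier
post_lengths = [300, 320, 340, 350, 360, 380, 400, 2500]

quota = typical_word_count(post_lengths)
print(round(quota))
```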
It also does several other useful things, like displaying current battery life as a percentage in the progress bar and showing the file you're writing in. There's also a feature called repository/main. This allows you to store text clippings in repository and then write the blog post in main. When exported as HTML the repository is ignored and only main is copied. That makes it useful for keeping notes alongside a post while authoring it, and it's easy enough to switch between the two. For this post I just made a list of points I wanted to cover.
After using WriteMonkey for an hour or so I think I've found the new environment I'll be writing all my posts in for the foreseeable future.
1. Download Squad: Amazing software tip: Pay free software developers to get stuff fixed!
2. Download Squad: WriteMonkey is an unbelievable full-screen text editor
3. Notepad++
4. WriteMonkey
5. Markdown
6. HTML: HyperText Markup Language, is the predominant markup language for web pages.
7. SQL: Structured Query Language
FreeNAS Users Rejoice!
Unetbootin[1] now supports FreeNAS! Take a look at these awesome little snippets of code:
```cpp
//distrolst.cpp
if (nameDistro == "FreeNAS") {
	if (isarch64) {
		cpuarch = "amd64";
	} else {
		cpuarch = "i386";
	}
	instIndvfl("memdisk", QString("%1ubnkern").arg(targetPath));
	if (islivecd) {
		downloadfile(QString("http://downloads.sourceforge.net/sourceforge/lubi/FreeNAS-%1-LiveCD-%2.img.gz").arg(cpuarch, relname), QString("%1ubninit").arg(targetPath));
	} else {
		downloadfile(QString("http://sourceforge.net/projects/freenas/files/stable/0.7/FreeNAS-%1-embedded-%2.img/download").arg(cpuarch, relname), QString("%1ubninit").arg(targetPath));
	}
}
```
```cpp
//distrover.cpp
distroselect->addItem("FreeNAS", (QStringList() << "0.7.4919" <<
unetbootin::tr("<b>Homepage:</b> <a href=\"http://freenas.org/\">http://www.freenas.org</a><br/>"
	"<b>Description:</b> FreeNAS is an embedded open source NAS (Network-Attached Storage) distribution based on FreeBSD.<br/>"
	"<b>Install Notes:</b> The LiveCD version creates a RAM drive for FreeNAS, and uses a FAT formatted floppy disk or USB key for saving the configuration file. The embedded version allows installation to hard disk.") <<
"0.7.4919" << "0.7.4919_x64" << "0.7.1.5024_Live" << "0.7.1.4997_Live_x64"));
```
Segue:
I'm actually considering forking the unetbootin project to add support for a master distro list which can be updated remotely eliminating the requirement for users to download a new copy of the program if they wish to get the latest version of the list of pre-configured distros.
This has a little bit to do with the fact that I'll be required to take a few C++ courses at the University of Wyoming, since Java was the standard language taught at the University of Arizona while I was there and I've never used C++ before. Can't be that hard, right?
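To make the master list idea a bit more concrete, here is a rough sketch of the logic I have in mind, written in Python rather than C++ since that's what I know. Everything in it is hypothetical: the JSON layout, the `version` field and the `load_distro_list` helper are invented for illustration and have nothing to do with how unetbootin actually stores its list.

```python
import json

# List compiled into the program, used whenever nothing newer is available
BUILTIN_LIST = {"version": 1, "distros": {"FreeNAS": ["0.7.4919", "0.7.4919_x64"]}}


def load_distro_list(fetch, builtin=BUILTIN_LIST):
    """Return the newer of the built-in list and whatever `fetch()` provides.

    `fetch` is any callable returning the remote list as JSON text; it may
    raise OSError when the network is unavailable.
    """
    try:
        remote = json.loads(fetch())
        if remote["version"] > builtin["version"] and isinstance(remote["distros"], dict):
            return remote
    except (OSError, ValueError, KeyError, TypeError):
        pass  # unreachable or malformed: fall back to the built-in list
    return builtin


def fake_fetch():
    return json.dumps({"version": 2, "distros": {"FreeNAS": ["0.7.1.5024_Live"]}})


def broken_fetch():
    raise OSError("no network")


updated = load_distro_list(fake_fetch)
fallback = load_distro_list(broken_fetch)
print(updated["version"], fallback["version"])
```

The point of the fallback is that the program keeps working offline with whatever list it shipped with.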
Quote of the Day and Something Cool
So I was sitting in my discrete structures analysis class today when a student asked a question about the homework. It went a little like this:
Student: How should we format pseudo-code in the homework?
Professor: Ah see pseudo-code is in that grey area. It's somewhere in between code and english.
Professor: You see me do it one way in class, the book does it another way and the homework assignment does it an even different way.
Professor: Think of it this way, pseudo-code is a lot like pornography: you'll know it when you see it.
Professor: So I'm not very worried about how you do your pseudo-code as long as I understand it.
The other cool thing that happened today was that I found out the owner of my favorite coffee shop is a pro-gun person. Apparently some of his relatives own a gun shop and he's done some shooting competitions. All very cool. It definitely explains why none of the employees ever freak out about me carrying my 1911 which looks giant on my hip compared to others. I'm a tall skinny guy if you didn't already know.
Computer Science Professor
I think I may have found my new favorite professor. After having 3 lectures total with him I've noticed that he rates the cleverness of his proofs/examples in terms of how much beer you could win by betting others at a pub on the outcomes.
Second More Useless Plugin
Bored again which seems to be the usual for the summer I sat down with one purpose in mind: To write another plugin. Didn't really matter to me if it was useful or not I just wanted to write another one. I decided to ask my friend Pete who snarkily replied "Write one that randomly inserts horse pr0n[1] in your blog." to which I immediately replied "not horse pr0n, ASCII pr0n!" which probably made him choke on his drink and immediately remember rule 34. But this made me think to myself: Why not a random plugin?
So I did just that, I wrote a random plugin. One so useless that I don't think I'm even going to activate it on my blog save for days like April Fools Day. The plugin chooses at random a word from each post using the the_content hook and censors it out with <censored>. Funny huh? This will ignore html tags so it won't break links and things like that. The code is as follows:
```php
<?php
/*
Plugin Name: Random Censor
Plugin URI: http://www.bemasher.net/
Description: Picks a common word at random from posts at display and replaces with <censored>. Ignores urls and other semi-important things.
Version: 0.01
Author: BeMasher
Author URI: http://www.bemasher.net/
*/

function random_censor($content) {
	$oldcontent = $content;
	// Strip HTML tags so only visible words are candidates
	$content = preg_replace("/<[^<]+?>/", "", $content);
	preg_match_all("/\b\w+\b/", $content, $words);
	if (empty($words[0])) {
		return $oldcontent;
	}
	// rand() is inclusive at both ends, so the upper bound is count - 1
	$word = $words[0][rand(0, sizeof($words[0]) - 1)];
	$oldcontent = preg_replace("/\b$word\b/i", " <censored> ", $oldcontent);
	return $oldcontent;
}

add_filter("the_content", "random_censor");
?>
```
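One thing the plugin glosses over: the word is picked from the tag-free text, but the replacement still runs over the full HTML, so a word that also appears inside a link could get censored there too. Here is a rough Python sketch of the same idea that only ever touches the text between tags. This is not the plugin's code; `TAG_RE`, `WORD_RE` and the `rng` parameter are names introduced for the example.

```python
import random
import re

TAG_RE = re.compile(r"(<[^<>]+>)")
WORD_RE = re.compile(r"\b\w+\b")


def random_censor(content, rng=random):
    """Replace one randomly chosen word with <censored>, leaving tags alone."""
    # Splitting on a capturing group alternates text, tag, text, tag, ...
    # so even indexes are always text and odd indexes are always tags
    parts = TAG_RE.split(content)

    words = []
    for index in range(0, len(parts), 2):
        words.extend(WORD_RE.findall(parts[index]))
    if not words:
        return content

    word = rng.choice(words)
    pattern = re.compile(r"\b%s\b" % re.escape(word), re.IGNORECASE)
    for index in range(0, len(parts), 2):
        parts[index] = pattern.sub("<censored>", parts[index])
    return "".join(parts)


html = '<a href="http://example.com/word">word link</a> and another word'
censored = random_censor(html, random.Random(0))
print(censored)
```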
1. Mind you this is a joke, relating to the fact that horse pr0n happens to rank pretty high on the list of strange things on the internet.
My First WordPress Plugin
I've been meaning to do this for a while: I'm writing a plugin for WordPress to automatically capitalize the right letter in the words I'm generally lazy about typing. Things like I, I'm, I've and the like[1].
Took me a little while but I did finally find some documentation on all the different hooks WordPress has for filters. I found one in particular that does what I want, called content_save_pre, which applies a particular filter to the content of a post any time I save or edit it. So for things like drafts, actual posting and updating posts it would fire the function I registered as a filter.
The first problem I ran into is that for some strange reason it didn't seem to want to replace contractions with the single quote character. I later found out that when displaying the single-quote it converts it to the HTML entity &#8217;, which shows up as an apostrophe. So I tried working around that but that didn't seem to work either. Eventually I just set up a small test post and modified the function to email me the plain-text contents of the post that the filter would receive. I noticed that it escaped single quotes, probably through the use of the PHP function mysql_escape_string[2]. So anything with a single-quote would show up with a backslash just before the single-quote. This of course broke the regular expression I was using and I couldn't seem to figure out how to get it to check for that character, so I gave up and just used the negated word-character class \W.
Anyway after fiddling around with it a little more and adding a few new cases to the regex I arrived at this code:
```php
<?php
/*
Plugin Name: Lazy Errors
Plugin URI: http://www.bemasher.net/
Description: Replaces errors from lazy typing, things like: i, i'd, i've, i'm will be replaced with proper case.
Version: 0.02
Author: BeMasher
Author URI: http://www.bemasher.net/
*/

function lazy_errors($content) {
	// Match a lowercase i; the replacement supplies the capital
	return preg_replace("/i(?(?=\W')(\W')(ve|m|d|)|(\b))/", "I$1$2$3", $content);
}

add_filter("content_save_pre", "lazy_errors");
?>
```
The regular expression reads like so: if there's an i followed by any non-word character and a single-quote, then make sure it's got a proper contraction following the single-quote. Else make sure there's a word boundary (a space, period, comma, colon or semi-colon) following the i. Then replace with a capitalized I and the matching group from the conditional.
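The same rule can be sketched in Python, which doesn't need the conditional group. This is not the plugin's regex: it adds a leading `\b` so that an i at the end of a longer word is left alone, accepts an optional backslash before the quote to cope with the escaping described above, and also handles `'ll`, which the plugin doesn't. `LAZY_I` and `fix_lazy_i` are names made up for the example.

```python
import re

# A standalone i, optionally followed by (possibly escaped) 've, 'm, 'd or 'll
LAZY_I = re.compile(r"\bi(\\?'(?:ve|m|d|ll))?\b")


def fix_lazy_i(text):
    """Capitalize a lone lowercase i and the contractions built on it."""
    return LAZY_I.sub(lambda m: "I" + (m.group(1) or ""), text)


sample = "i think i'm right, and i\\'ve said hi; it is a semi-colon"
fixed = fix_lazy_i(sample)
print(fixed)
```

Words like "hi", "it" and "semi-colon" come through untouched because the i in them is not a word on its own.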
I should probably work out some code to make it ignore sections of text I don't want it to filter. A prime example of this would be in the comments of the plugin and especially in any code as code examples copied from my site would then be broken if the regex I wrote matched anything in the code.
1. Notice they look normal to you because I've gotten my script working.
2. http://www.php.net/manual/en/function.mysql-escape-string.php
Installing MikTeX on Windows 7
One of the only things that kept me from installing Windows 7 permanently during the school year was that the few times I tried, I had never gotten MikTeX[1] to work. This of course was a major problem since nearly all of my assignments are done with MikTeX now. When installing MikTeX I always ran into a BSOD that I ignored because I figured that it was only because Windows 7 was only an RC[2].
I've had Windows 7 on my desktop now for about 2 weeks and up until this point I've been making do with the MikTeX Portable Edition, which is pretty buggy to say the least. More than half of the time it would hang while compiling a document to PDF because it didn't think initexmf.exe was an operable program. And of course upon googling this problem, nothing of use could be found.
Well tonight I decided I'd give the install another go to see if either Windows 7 had been patched to fix this issue, or if MikTeX had fixed the problem. On my first try it did exactly as it had always done before: it BSOD'd. Since I had never set Windows to not automatically restart upon catastrophic system failure[3], it would just instantly restart without giving me enough time to read the type of error. I fixed this and ran the installer once more. The BSOD was a PAGE_FAULT_IN_NON_PAGED_AREA error, which was pretty vague as usual, but I figured it had to do with system paging, so I disabled virtual memory, restarted and ran the installer once more. This time it worked exactly as it should.
On another interesting note, I discovered that pdflatex is significantly faster than texify. I found this out when I was trying different methods of compiling my TeX documents into PDFs using the MikTeX Portable Edition, which was giving me fits with my old method.
I used to use the following in Notepad++'s NppExec plugin to compile a PDF and view it in Adobe Reader:
```
C:\Program Files\MiKTeX 2.7\miktex\bin\texify.exe -c -p "$(FULL_CURRENT_PATH)" "$(NAME_PART).pdf"
C:\Program Files\Adobe\Reader 9.0\Reader\AcroRd32.exe "$(CURRENT_DIRECTORY)\$(NAME_PART).pdf"
```
Now I use pdflatex:
```
C:\Program Files\MiKTeX 2.7\miktex\bin\pdflatex.exe "$(FULL_CURRENT_PATH)"
C:\Program Files\Adobe\Reader 9.0\Reader\AcroRd32.exe "$(CURRENT_DIRECTORY)\$(NAME_PART).pdf"
```
LCD Hello World!
I finally got impatient with my progress on the LCD Shield I'm designing and decided to solder the headers onto the LCD and give it a try on the breadboard, which has turned out to be a good idea. The original pin-mapping I set up in the design wouldn't have worked and I would have been very frustrated at wasting ~$20 on getting a PCB printed. I got a lot of great help from http://www.alfonsomartone.itb.it/kwztcq.html which, almost problem-for-problem, outlined the same issues I had in the same order, excluding problems #2, #4 and #8.
```cpp
#define RS 11
#define RW 2
#define E 3
#define D0 4
#define D1 5
#define D2 6
#define D3 7
#define D4 14
#define D5 15
#define D6 16
#define D7 17

LiquidCrystal lcd(RS,RW,E,D0,D1,D2,D3,D4,D5,D6,D7);
```
Above is what the pin-mapping ended up being. I also discovered I had no pots to use for the contrast pin on the LCD at all so that's going to have to be fixed in the future. So for the time being the contrast is a little out of whack. But I was able to get it to print the traditional "Hello World!" string as you can see in the photo. Also you'll note that I used pins 14 through 17 and you're probably scratching your head as to which pins those are. They're actually the row of pins marked as analog in, analog in 0 is 14 and can actually be used as a digital IO pin as well.
Something else that I've also noticed that I'll need to fix is that the rows seem to be written to out of order. Writing order is as follows: Row 1, Row 3, Row 2, Row 4. Which I'm sure I can fix somehow but I don't know exactly why it's doing that just yet.
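One likely explanation, assuming the display uses an HD44780-compatible controller (which is what the LiquidCrystal library targets): on 4-row modules the rows are not laid out one after another in display memory. Rows 1 and 3 share the first block of addresses and rows 2 and 4 share the second, so text written without moving the cursor runs off the end of row 1 straight into row 3. The sketch below models that layout; the 20-column default is an assumption about my particular LCD, and the function names are made up for the example.

```python
def row_offsets(cols):
    """DDRAM start address of rows 1-4 on a typical HD44780 4-row module."""
    return (0x00, 0x40, cols, 0x40 + cols)


def row_for_address(addr, cols):
    """Row (1-4) that a DDRAM address is displayed on, or None if off-screen."""
    for row, base in enumerate(row_offsets(cols), start=1):
        if base <= addr < base + cols:
            return row
    return None


def fill_order(cols=20):
    """Order in which rows fill when writing without repositioning the cursor."""
    order = []
    # The address counter runs through 0x00-0x27, then continues at 0x40-0x67
    for addr in list(range(0x00, 0x28)) + list(range(0x40, 0x68)):
        row = row_for_address(addr, cols)
        if row is not None and row not in order:
            order.append(row)
    return order


print(fill_order(20))
```

If that is what's going on, setting the cursor explicitly at the start of each row (the library's `setCursor(col, row)`) instead of letting text run on should put the rows back in order.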

