<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:wfw="http://wellformedweb.org/CommentAPI/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
	xmlns:slash="http://purl.org/rss/1.0/modules/slash/"
	>

<channel>
	<title>Language Hacker &#124; Robert Elwell&#039;s Blog</title>
	<atom:link href="http://robertelwell.info/blog/feed/" rel="self" type="application/rss+xml" />
	<link>http://robertelwell.info/blog</link>
	<description>PHP Web Development, Computational Linguistics, and Nerdy Miscellany</description>
	<lastBuildDate>Tue, 15 May 2012 19:59:54 +0000</lastBuildDate>
	<language>en</language>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
	<generator>http://wordpress.org/?v=3.3.2</generator>
		<item>
		<title>Just Keep Learning</title>
		<link>http://robertelwell.info/blog/just-keep-learning/</link>
		<comments>http://robertelwell.info/blog/just-keep-learning/#comments</comments>
		<pubDate>Tue, 15 May 2012 19:59:54 +0000</pubDate>
		<dc:creator>Robert</dc:creator>
				<category><![CDATA[Uncategorized]]></category>

		<guid isPermaLink="false">http://robertelwell.info/blog/?p=180</guid>
		<description><![CDATA[Jeff Atwood blew up the Internet today by making the statement that not everyone should learn to program. I think maybe one of the biggest complaints here is that his arguments read as elitist, and exclusionary, whereas the real kernel &#8230; <a href="http://robertelwell.info/blog/just-keep-learning/">Continue reading <span class="meta-nav">&#187;</span></a>]]></description>
			<content:encoded><![CDATA[<p>Jeff Atwood blew up the Internet today by making the statement that <a href="http://www.codinghorror.com/blog/2012/05/please-dont-learn-to-code.html">not everyone should learn to program</a>. I think maybe one of the biggest complaints here is that his arguments read as elitist, and exclusionary, whereas the real kernel of his argument is that not everyone needs to be able to program to be good at what they do. Truly, that&#8217;s a good point. But I also think it&#8217;s a totally valid point that anyone who wants to take the time to learn how to program should have that opportunity, and not be driven out by harsh attitudes. </p>
<p>Alternately, learning to program should require some real learning by doing that isn&#8217;t informed by some gold-star method or gamification. I am of the opinion that these approaches only develop a false sense of mastery in their users. This is the kind of pernicious marketing that Jeff Atwood is railing against. You&#8217;re not suddenly a PHP expert because you spent ten hours on Codecademy. </p>
<p>Learning to program encapsulates a lot of different skills. It&#8217;s an iterative process, and you should approach it as with any new skill of similar profundity: with care, interest, and a knowledge of your own limitations. Instead of assuming you can tackle any task related to your achievement badges, you should aim to constantly question whether what you&#8217;re doing now is adequate, performant, scalable, and extensible. This is a great behavior to get into for any programmer, at any time. It&#8217;s the reason why I look at diffs before ever committing code, and it&#8217;s one reason why I get better at what I do as time goes on &#8212; because I take the time to review my practices and look for better ones. </p>
<p>But this isn&#8217;t just good advice for programming as a skill. There are plenty of great skills that you could do professionally that you can learn at home. Look at the vast array of talents required in home improvement. There are certified individuals who have the requisite education and experience to build you a house, set up your wiring, install insulation, or configure your home network. But that doesn&#8217;t mean that we should leave all aspects of these skills exclusively to the pros. That&#8217;s where Atwood oversteps his bounds with respect to programming as well. While it&#8217;s not a good idea to wire your entire house if you don&#8217;t know what you&#8217;re doing, it&#8217;s well worth knowing how to switch out your outlets, or to reset your router if it&#8217;s acting up. Not everyone may need to have the programming chops to develop a database schema or roll their own MVC platform, sure.  But that doesn&#8217;t mean that they might not get good use out of some text processing that a basic education in regular expressions with a lightweight interpreted language could lend. </p>
<p>Most of the real nuts-and-bolts stuff I&#8217;ve learned about software, I have learned on the job, by doing. I&#8217;ve probably aced more job interviews based on my knowledge of design patterns (which exclusively involved reading every design pattern listed on Wikipedia, and then trying to identify them in real code) than by talking about the technical aspects of my Master&#8217;s thesis. </p>
<p>The availability of information on the Internet means that the empowerment of passionate amateurs is a trend in all skilled professions. It shouldn&#8217;t be railed against; we should just be happy that it&#8217;s easier for people who want to learn to find resources that get them adequately situated. Sure, you can watch a YouTube video about how to use a smoker, but you&#8217;re not a pit master until you&#8217;ve made it a lifestyle. And there will always be that dichotomy for all hobbies or professions that require learning and practice. </p>
<p>I would even go so far as to say that some skilled professions such as law and medicine should start looking for ways to better incorporate this additional tier of semi-experts into the profession. The issue here, as programming has feared in the past, is that it is likely to drive down the low end of the market. But I think that software in general has become better for having people from different backgrounds with a passion for the practice, if not so much the college major, and the availability of new technology for these fields, as with programming, allows individuals to more easily sit on the shoulders of giants. But again, there will always be a difference between those who can identify the signs of a flu, and those who know what other symptoms to look for that could result in a more grave diagnosis. </p>
]]></content:encoded>
			<wfw:commentRss>http://robertelwell.info/blog/just-keep-learning/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Search V2 Now Live on Wikia!</title>
		<link>http://robertelwell.info/blog/search-v2-now-live-on-wikia/</link>
		<comments>http://robertelwell.info/blog/search-v2-now-live-on-wikia/#comments</comments>
		<pubDate>Wed, 09 May 2012 21:54:00 +0000</pubDate>
		<dc:creator>Robert</dc:creator>
				<category><![CDATA[Uncategorized]]></category>

		<guid isPermaLink="false">http://robertelwell.info/blog/?p=178</guid>
		<description><![CDATA[I&#8217;m proud to announce that the search solution and interface I&#8217;ve been working on at Wikia is now live! This introduces a new UI, grouped inter-wiki search for the global site, and improved relevance. If you&#8217;re curious about more details, &#8230; <a href="http://robertelwell.info/blog/search-v2-now-live-on-wikia/">Continue reading <span class="meta-nav">&#187;</span></a>]]></description>
			<content:encoded><![CDATA[<p>I&#8217;m proud to announce that the search solution and interface I&#8217;ve been working on at <a href="http://www.wikia.com/">Wikia</a> is now live! </p>
<p>This introduces a new UI, grouped inter-wiki search for the global site, and improved relevance. If you&#8217;re curious about more details, please take a look at <a href="http://community.wikia.com/wiki/User_blog:Dopp/New_Global_Search,_plus_Updates_to_Local_Search">today&#8217;s search announcement</a> as well as a <a href="http://community.wikia.com/wiki/User_blog%3ADaniel_Baran%2FSearch_Developments%3A_Big_Picture">long-term view of search at Wikia</a>. </p>
<p>I&#8217;ve had a lot of fun developing this stuff, and I&#8217;m looking forward to our next big moves!</p>
]]></content:encoded>
			<wfw:commentRss>http://robertelwell.info/blog/search-v2-now-live-on-wikia/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>MTO ON BLAST: A language model for a gossip blog</title>
		<link>http://robertelwell.info/blog/mto-on-blast-a-language-model-for-a-gossip-blog/</link>
		<comments>http://robertelwell.info/blog/mto-on-blast-a-language-model-for-a-gossip-blog/#comments</comments>
		<pubDate>Mon, 30 Apr 2012 04:06:01 +0000</pubDate>
		<dc:creator>Robert</dc:creator>
				<category><![CDATA[Uncategorized]]></category>

		<guid isPermaLink="false">http://robertelwell.info/blog/?p=159</guid>
		<description><![CDATA[Maybe I don&#8217;t advertise it much on my website, but I&#8217;m a total nerd about hip hop music. A proficient stalker may have noticed my over-analytical writings over on Rap Wiki. Anyone unacquainted might enjoy my contributions to the pages &#8230; <a href="http://robertelwell.info/blog/mto-on-blast-a-language-model-for-a-gossip-blog/">Continue reading <span class="meta-nav">&#187;</span></a>]]></description>
			<content:encoded><![CDATA[<p>Maybe I don&#8217;t advertise it much on my website, but I&#8217;m a total nerd about hip hop music. A proficient stalker may have noticed my over-analytical writings over on <a href="http://rap.wikia.com">Rap Wiki</a>. Anyone unacquainted might enjoy my contributions to the pages like <a href="http://rap.wikia.com/wiki/DJ_Screw">DJ Screw</a> and <a href="http://rap.wikia.com/wiki/Lil_B">Lil B</a>. As with everything I&#8217;m passionate about, I tend to perform a fairly exhaustive breadth-first search on various information sources about the things I like and what they are related to until I accidentally find myself steeped in the deep-end of esotericism. As a result of this long-term obsession, I visit <a href="http://allhiphop.com/">All Hip Hop for music news</a>, <a href="http://www.worldstarhiphop.com/">World Star Hip Hop for new videos</a>, <a href="http://datpiff.com">DatPiff for the latest mixtapes</a>, and, of course, <a href="http://cdn.mediatakeout.com/index.html">MediaTakeOut for all the juicy gossip</a>. </p>
<p>I&#8217;m always looking for ways to combine things that excite me. Considering the fact that I spend most of my work day bouncing around in my chair to one mixtape or another while coding, it was only a matter of time before I was hit with the following epiphany: I should make a language model for MediaTakeOut. MediaTakeOut is an interesting website, because its headlines have a lot of entertaining, edgy constructions that are relatively unique to its site. I wanted to use a language model to see whether the prevailing patterns I saw could actually be reconstructed using the laws of probability. Also, let&#8217;s be honest, here: language models aren&#8217;t really that useful, but they can be <i>hilarious</i>. The reason why they&#8217;re funny is that they decontextualize the underlying qualities of a given textual corpus (thus generating grammatical nonsense). This is particularly telling when that corpus uniquely subsumes a specific author or domain. A lot of the headlines on MTO are already written to be funny or outrageous, so one could imagine what you might get from inserting a little bit of randomness into the mixture.</p>
<p>So a couple of nights&#8217; worth of work, and I wrote a <a href="http://robertelwell.info/mediatakeout-headline-generator/">MediaTakeOut Headline Generator</a>, lovingly styled after the website it parodies. I open-sourced the code at GitHub under the project name <a href="https://github.com/relwell/MTO-ON-BLAST">MTO-ON-BLAST</a>. Though maybe that&#8217;s not a fair name. Putting someone or something &#8220;on blast&#8221; means to give it a hard time, such as <a href="http://cdn.mediatakeout.com/55527/r-b-singer-monica-gets-put-on-blast-by-her-husbands-babys-mother-the-babys-mom-and-her-girls-are-laughing-at-monica-pics.html">&#8220;R&#038;B singer Monica gets put ON BLAST&#8221;</a>. I used the term since those were the kind of entertaining constructions I was looking to generate in the language model. I would posit that the <a href="http://rubberducky.org/cgi-bin/chomsky.pl">Chomsky Bot</a> does a better job putting Chomsky on blast than my project does putting MTO on blast. </p>
<p>Anyhow, I&#8217;m really writing this blog post because I wanted to describe the components of the project, and how each worked. I first cut my teeth on real programming using Python in grad school, so this was an opportunity to play around with Python after years in PHP and a couple of months slogging through some intense Perl. Let&#8217;s be honest, though: this isn&#8217;t much more advanced than a graduate Computational Linguistics I project from an implementation perspective. I could have written my own language model rather than use an off-the-shelf implementation if I wanted to impress <a href="http://www.jasonbaldridge.com/">my thesis advisor</a>, but I was really only looking to have some fun. </p>
<p>First off, I wrote a script called <b>mto-scrape.py</b>, which uses the <a href="http://lxml.de/">lxml library</a> to record headlines in the MTO archives by accessing elements in the DOM. We iterate through each page of the archive, scraping the page&#8217;s HTML, parsing it into a DOM object, and pulling out headlines from appropriate DOM elements. I also ended up using this basic approach to <a href="https://github.com/relwell/nn2s-scrape">scrape comics</a> from <a href="http://www.mitchclem.com/nothingnice/">Nothing Nice To Say</a>, an old favorite webcomic of mine, for offline viewing. So the same approach is fairly reusable.</p>
<p>Since the above script is relatively naive, I stripped escape character slashes and excessive whitespace using a command line script. You can diff headlines.txt and headlinesprepped.txt if you want to get a feel for the changes. I should probably add a bash script to automate this step, but I trust that anyone who actually wants to repeat my results would know enough to handle this rather quickly. </p>
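<p>For anyone who would rather not improvise that cleanup step, here is a minimal sketch of what it might look like in Python. This is not the command I actually ran; the function name <b>prep_headline</b> is made up for illustration, and it assumes the only damage is stray backslashes and runs of whitespace.</p>

```python
import re

def prep_headline(raw):
    """Clean one scraped headline (hypothetical helper, not part of the repo)."""
    # Drop the escape-character slashes left behind by the scrape.
    without_slashes = raw.replace("\\", "")
    # Collapse any run of whitespace to a single space and trim the ends.
    return re.sub(r"\s+", " ", without_slashes).strip()

print(prep_headline("SHE\\'S  BACK \n"))  # prints: SHE'S BACK
```

<p>Mapping this over every line of headlines.txt would be one way to produce something like headlinesprepped.txt.</p>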
<p>The next step is the script <b>mto-analyze.py</b>, which uses the <a href="http://www.nltk.org/">Natural Language Toolkit</a> to consume the prepped text file and store the necessary data to create the language model. It also displays the top 100 unigrams, bigrams, and trigrams for the corpus. For the uninitiated, these are N-grams, where N is the number of adjacent tokens to constitute a unique type. Consider the following <a href="https://twitter.com/#!/lilbthebasedgod/status/195703875737632769">tweet from Lil B</a>:</p>
<blockquote><p>
I MAKE SURE TO TURN OFF MY LIGHTS WHEN IM NOT USING THEM TO SAVE THE ENERGY OF THE WORLD, WE ARE VERY LUCKY RIGHT NOW #BASED &#8211; Lil B
</p></blockquote>
<p>The first three unigrams would be: [I], [MAKE], [SURE]. The first three bigrams would be [I MAKE], [MAKE SURE], [SURE TO]. The first three trigrams would be [I MAKE SURE], [MAKE SURE TO], [SURE TO TURN]. These tokens are then stored as types, with individual counts associated with them. For instance, the unigram types [TO] and [THE] would both have a token count of two, since they are seen twice in this tweet. This means that they are more likely to appear than other words, from a probabilistic standpoint. This is, without getting too academic, the prevailing concept behind a language model. The more data that we have, the larger the constructions we can attempt to generalize against. With almost 37,000 headlines scraped from MTO, with an average of 16 words (or about 96 characters) per headline, it&#8217;s relatively safe for us to attempt to build a trigram model. That&#8217;s a model that uses information from the previous two words to select a relatively probable next word. </p>
<p>Many language models use <i>conditional probability</i> based on the <i>prior probability</i> of observed events. Finding the conditional probability when constructing a trigram language model would be answering the question, <i>given the last two words provided, what is the probability distribution of all possible next words?</i> We use the prior probability of observed trigrams in our headlinesprepped.txt file to inform this decision. This is what offers the illusion of grammaticality when using a relatively flat approach to sentence construction. By inserting randomness into which probability-weighted type is selected, we make each generated sentence variable, rather than just always generating the most probable sentence.</p>
<p>The script <b>mto-languagemodel.py</b> uses the data stored from the analyze script to create a language model using a <a href="http://en.wikipedia.org/wiki/Additive_smoothing">smoothed</a> Bayesian approach available in NLTK. It spits out N sentences, which are then stored on my site&#8217;s server. For performance purposes, I don&#8217;t do any headline generation on the fly. Instead, I grab a random line out of a large flat file. When searching for a subject, I&#8217;m basically adding a grep step before pulling out the random line. A post-processing regular expression handles appropriate spacing for punctuation tokens, which are treated just like normal tokens in the language model.</p>
<p>So there you have it. Now all you need to do is <a href="http://robertelwell.info/mediatakeout-headline-generator">go to the generator</a> and <a href="http://robertelwell.info/mediatakeout-headline-generator/?subject=jeezy">search for a particular term</a>. The fun part about language models is how they help you discover the emergent characteristics of a given corpus in a more novel way than simple charts or graphs (like what the analyzer component provides). So under all the fun, this is some real computational linguistics in action.</p>
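<p>To make the N-gram idea concrete, here is a small standard-library sketch that counts them for the tweet above. It is only an illustration and not the code in mto-analyze.py, which leans on NLTK; the helper name <b>ngrams</b> and the naive whitespace tokenization are my assumptions here.</p>

```python
from collections import Counter

def ngrams(tokens, n):
    """Return every run of n adjacent tokens as a tuple."""
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]

# Naive whitespace tokenization; punctuation stays attached (e.g. "WORLD,").
tweet = ("I MAKE SURE TO TURN OFF MY LIGHTS WHEN IM NOT USING THEM TO SAVE "
         "THE ENERGY OF THE WORLD, WE ARE VERY LUCKY RIGHT NOW #BASED")
tokens = tweet.split()

# Types are the distinct n-grams; the Counter stores a token count per type.
unigram_counts = Counter(ngrams(tokens, 1))
bigrams = ngrams(tokens, 2)
trigrams = ngrams(tokens, 3)

print(bigrams[:3])   # [('I', 'MAKE'), ('MAKE', 'SURE'), ('SURE', 'TO')]
print(trigrams[:3])  # [('I', 'MAKE', 'SURE'), ('MAKE', 'SURE', 'TO'), ('SURE', 'TO', 'TURN')]
print(unigram_counts[("TO",)], unigram_counts[("THE",)])  # 2 2
```

<p>A real tokenizer would split punctuation into its own tokens, which is closer to how the language model treats it.</p>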
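<p>Here is a rough sketch of that idea: count which words follow each pair of words, then sample the next word in proportion to those counts. This is a bare, unsmoothed stand-in and not the NLTK model the project uses; the padding tokens, function names, and toy headlines are all made up for illustration.</p>

```python
import random
from collections import Counter, defaultdict

# Hypothetical padding tokens marking the start and end of a headline.
START, END = "*START*", "*END*"

def train(headlines):
    """Map each two-word context to a Counter of the words seen after it."""
    model = defaultdict(Counter)
    for headline in headlines:
        tokens = [START, START] + headline.split() + [END]
        for i in range(len(tokens) - 2):
            context = (tokens[i], tokens[i + 1])
            model[context][tokens[i + 2]] += 1
    return model

def generate(model, rng, max_words=30):
    """Sample one headline, weighting each next word by its observed count."""
    context = (START, START)
    words = []
    for _ in range(max_words):
        counts = model[context]
        choices = list(counts.keys())
        weights = list(counts.values())
        word = rng.choices(choices, weights=weights)[0]
        if word == END:
            break
        words.append(word)
        context = (context[1], word)
    return " ".join(words)

# Invented toy corpus; the real input would be headlinesprepped.txt.
toy_headlines = [
    "RAPPER GETS PUT ON BLAST BY HIS EX",
    "SINGER GETS PUT ON BLAST BY HER FANS",
    "RAPPER GETS CAUGHT ON CAMERA",
]
model = train(toy_headlines)
print(generate(model, random.Random()))
```

<p>Because every context reached during generation was seen in training, there is always at least one candidate next word. Smoothing matters once you want to assign probability to trigrams the corpus never contained.</p>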
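<p>The serving side can be sketched in a few lines as well. This is not the code running on my server, just a minimal Python illustration of the same steps: filter by subject (the grep step), pick a random line, and tidy the spacing around punctuation tokens. The function names and the exact regular expression are assumptions.</p>

```python
import random
import re

def fix_punctuation_spacing(line):
    """Remove the space the tokenized output leaves before punctuation."""
    return re.sub(r"\s+([,.!?:;])", r"\1", line)

def random_headline(lines, subject=None, rng=random):
    """Pick a random pre-generated headline, optionally mentioning subject."""
    if subject:
        # Case-insensitive substring match, standing in for the grep step.
        needle = subject.lower()
        lines = [line for line in lines if needle in line.lower()]
    if not lines:
        return None
    return fix_punctuation_spacing(rng.choice(lines))

# Invented sample lines; in practice these would be read from the flat file.
generated = [
    "JEEZY GETS PUT ON BLAST , AND HE IS NOT HAPPY !",
    "SINGER CAUGHT ON CAMERA . . .",
]
print(random_headline(generated, subject="jeezy"))
# prints: JEEZY GETS PUT ON BLAST, AND HE IS NOT HAPPY!
```

<p>Reading the whole file into a list is fine for a sketch; a very large file might call for sampling a line without loading everything into memory.</p>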
]]></content:encoded>
			<wfw:commentRss>http://robertelwell.info/blog/mto-on-blast-a-language-model-for-a-gossip-blog/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>The English Language: A Fractal of Bad Design</title>
		<link>http://robertelwell.info/blog/english-fractal/</link>
		<comments>http://robertelwell.info/blog/english-fractal/#comments</comments>
		<pubDate>Tue, 10 Apr 2012 21:55:09 +0000</pubDate>
		<dc:creator>Robert</dc:creator>
				<category><![CDATA[Uncategorized]]></category>

		<guid isPermaLink="false">http://robertelwell.info/blog/?p=156</guid>
		<description><![CDATA[Written in response to PHP: A Fractal of Bad Design. English speakers: please try not to take this personally. You&#8217;re in awful company. Preface I&#8217;m cranky. I complain about a lot of things. There&#8217;s a lot in the world of &#8230; <a href="http://robertelwell.info/blog/english-fractal/">Continue reading <span class="meta-nav">&#187;</span></a>]]></description>
			<content:encoded><![CDATA[<p><i>Written in response to <a href="http://me.veekun.com/blog/2012/04/09/php-a-fractal-of-bad-design/">PHP: A Fractal of Bad Design</a>. English speakers: please try not to take this personally. You&#8217;re in awful company.</i></p>
<h3>Preface</h3>
<p>I&#8217;m cranky. I complain about a lot of things. There&#8217;s a lot in the world of natural language that I don&#8217;t like. Most natural languages were invented by complete amateurs who didn&#8217;t have half a clue what they were doing when they started joining noun with verb. Combine that with the <a href="http://en.wikipedia.org/wiki/Linguistic_relativity">Sapir-Whorf hypothesis</a>, and you&#8217;ve got a whole gang of self-congratulating numbskulls who can&#8217;t even think past the foolish paradigms they&#8217;ve constructed to truly subject their puny minds to logical thinking.</p>
<p>This is not the same. English is an aberration. It&#8217;s not merely awkward to speak or ill-suited for what I want, or sub-optimal, or against my religion. I could tell you something I like about most languages I don&#8217;t speak, even though I have good reasons not to speak them. But English is the lone exception. Every time I try to compile a list of gripes about the English language, I get stuck in this depth-first search of discovering more and more appalling trivia. (Hence, fractal.)</p>
<p>English is an embarrassment, a blight upon my Broca&#8217;s area. It&#8217;s so broken, but so lauded by every empowered amateur who&#8217;s yet to learn anything else, as to be maddening. It has so few redeeming qualities that I would prefer to forget it exists at all. </p>
<p>But I&#8217;ve got to get this out of my system. So here&#8217;s one last try.</p>
<h3>An Analogy (not really)</h3>
<p>Say you were learning English for the first time, and someone told you that the rule for forming the past tense of a verb would be to add <i>-ed</i> to the end of a word. That&#8217;s a productive rule that applies to all persons and numbers. <i>I walked; John slipped; they skipped; we flipped.</i> Now try to apply that to <i>be</i>, <i>do</i>, <i>buy</i>, <i>eat</i>, etc.  You think to yourself, &#8220;Hey, this is ridiculous! There are all these rules I have to learn that don&#8217;t work for all of the verbs I want to use the most. What a stupid language. How on Earth did the people speaking this make such a mess out of their most common verbs?&#8221; </p>
<p>Now in order to use English proficiently, you have to memorize all of these exceptions. Most of the time, you&#8217;re dealing in these exceptions, and not the rules. However, since you took the time to memorize the productive rules, you find that ultimately, you&#8217;re able to generalize for a large set of vocabulary you never even memorized. The more you practice, the better you get. Eventually, you&#8217;re able to explain the subtleties of some of these stranger subsets of rules. Some non-native experts can even explain to you why there are subsets of verbs that exhibit ablauting, and why certain subsets of Latin-inherited words exhibit prosody and tonality that violates iambic pentameter. And the fact that some native speakers can&#8217;t shouldn&#8217;t be too much of a surprise, either. Sometimes, it&#8217;s hard to articulate the hard stuff, but it doesn&#8217;t make them any worse at speaking English. It just confirms their status as an empowered amateur.</p>
<p>But in the end, it&#8217;s still English. And I would never speak English, because I&#8217;m better than that. I have a degree in communication sciences and am very active on ChatHub.</p>
<h2>Stance</h2>
<p>English is a mess. It&#8217;s a mess because long ago, an island populated by Celtic people was partly occupied by the Roman Empire. Sometime after the Romans left, those people were overrun by a few populations of Germanic people. Later still, in 1066 A.D., William the Conqueror effectively made England a French colony. For much of its history, English was a language of subservience, or a language in flux due to one or more events of occupation or contact. </p>
<p>English became a language of historical significance, and a tool with critical mass, through a perfect storm of events that began in the 15th century and culminated in the present day. The discovery of coal reserves, the establishment of the British Empire, the Protestant Reformation, the colonization of the New World and India, and ultimately a scientific history that reaches from Newton through to Turing positioned English as a major international language for trade, science, and engineering.</p>
<p>This is terrible. The concept of linguistic relativity suggests that using a language subjects you to the subconscious paradigms expressed in it. The fact that English misses a number of aspects, tenses, person and number distinctions screams for a solution created by people who are too smart to use a natural language as their lingua franca. </p>
<p>What it really all boils down to is the fact that English is a great language for complete amateurs to learn. The overwhelming majority of people who speak English as their only language learned it as their first language and stopped there. Most of them are terrible at it and make egregious mistakes in its grammar and usage every day. Do we really want to continue to allow these groundlings of such low acumen to continue to pollute the linguistic landscape?</p>
<p>My position is thus:</p>
<ul>
<li>English is full of surprises: <i>ox -> oxen; be / was / am; the pronunciation of the word &#8220;subtle&#8221;</i></li>
<li>English is inconsistent: <i>food vs. good; read (present) vs. read (past); one sheep two sheep&#8230;</i></li>
<li>English is flaky: <i>English in Canada is like a completely different language from the English spoken in India or South Africa. And  <a href="http://www.youtube.com/watch?v=9_ejSOsr08Y">I&#8217;m still trying to figure out what the lady in the beginning of &#8220;Let Me Ride&#8221; from Dr. Dre is saying</a>, but it&#8217;s allegedly also English.</i></li>
<li>English is opaque: <i>I before E except after C doesn&#8217;t even freaking <b>work</b> most of the time</i></li>
</ul>
<h2>Don&#8217;t comment with these things</h2>
<p>I&#8217;ve been in arguments about the English language <i>a lot</i>. I hear very generic counter-arguments for why we shouldn&#8217;t scrap English altogether:</p>
<ul>
<li>Don&#8217;t tell me you were born speaking it. Learn something else. Yeesh.</li>
<li>Don&#8217;t tell me everybody&#8217;s using it. If everyone jumped off a cliff, would you? Yeah, I didn&#8217;t think so.</li>
<li>Don&#8217;t tell me it&#8217;s an international lingua franca for science, engineering, and trade. If it was so great, then why did they have to borrow the word <i>lingua franca</i> from Latin? And why would <i>lingua franca</i> mean &#8220;French language&#8221;? See, you peel back the onion, and you&#8217;re just left with more and more layers of absurdity. And tears. </li>
<li>Don&#8217;t tell me a subset of its rules were inherited from Latin and French. Really, what&#8217;s the point of speaking some weird wrapper around Latin when we can just speak Latin? Then we wouldn&#8217;t have to worry about appropriately converting noun declensions to their place in English (type safety). How else will anyone understand a Latin loan word, if we&#8217;re not properly declining the term into its ablative case? And the Germanic part is like Perl. And Perl is hard.</li>
<li>Don&#8217;t tell me that Shakespeare, Milton,  Twain, or Meyer (yes, I went there) wrote in English. I&#8217;m aware! They could write in pictograms, for all I care. You&#8217;ll always find smart people who can overcome the shortcomings of their platform. </li>
<li>Ideally, don&#8217;t tell me anything. Hearing or reading too much English on any given day is literally enough to send me into a flying rage. So I wrote a simple script in Erlang that filters out all email messages not written in Esperanto, so odds are that I won&#8217;t even see any emails you send me.</li>
</ul>
<p>Side observation: I <i>loooooove</i> Esperanto. It&#8217;s got a great spec; it&#8217;s completely logical; there&#8217;s this very smart committee of fantastically wealthy, extremely popular people who spend all of their waking hours monitoring its usage to prevent it from getting too illogical. I mean, look at <a href="http://en.wikipedia.org/wiki/George_Soros">George Soros</a>. That guy was raised speaking Esperanto and he gets by just fine.</p>
<h3>ENOUGH!</h3>
<p>Have I made my point? Almost any sufficiently utilized system develops irregularities, particularly around areas of high frequency of interaction. It&#8217;s one of the reasons that we have irregular verbs and irregular plurals. PHP&#8217;s development, without any kind of an academically or professionally defined specification, is as organic as the growth of natural language. It&#8217;s only recently stopped being a pidgin of C and Perl over the last decade and entered a status as a well-used but poorly regarded creole. That&#8217;s exactly the status of English before the Globe Theatre or the first translation of the Bible into what we refer to as Modern English. </p>
<p>Specifications, or the lack thereof, do not define the utility of a language. Prestige does not define the value of a language. If you don&#8217;t like a language &#8212; either natural or programming &#8212; don&#8217;t use it, and leave it at that. </p>
<p>Could you imagine if I wrote this about the fundamental shortcomings of a language from a developing nation, or a low-prestige dialect of any language? Everyone would rightly call me a bigot. I&#8217;m suggesting that griping about the shortcomings of a language without trying to offer solutions does roughly amount to technological bigotry. It&#8217;s not productive, and it&#8217;s not welcome. Participation and dialogue are always welcome. But you&#8217;re not opening a productive dialogue by denigrating the subject of discussion.</p>
]]></content:encoded>
			<wfw:commentRss>http://robertelwell.info/blog/english-fractal/feed/</wfw:commentRss>
		<slash:comments>3</slash:comments>
		</item>
		<item>
		<title>Search Haters Gonna Hate?</title>
		<link>http://robertelwell.info/blog/search-haters-gonna-hate/</link>
		<comments>http://robertelwell.info/blog/search-haters-gonna-hate/#comments</comments>
		<pubDate>Thu, 05 Apr 2012 17:25:23 +0000</pubDate>
		<dc:creator>Robert</dc:creator>
				<category><![CDATA[Search]]></category>

		<guid isPermaLink="false">http://robertelwell.info/blog/?p=145</guid>
		<description><![CDATA[So I just had my morning derailed by some polemic about search over on the MSDN blog. Don&#8217;t worry &#8212; it was linked to from Hacker News; I wouldn&#8217;t normally go there of my own volition. Dr. James Whittaker, a &#8230; <a href="http://robertelwell.info/blog/search-haters-gonna-hate/">Continue reading <span class="meta-nav">&#187;</span></a>]]></description>
			<content:encoded><![CDATA[<p>So I just had my morning derailed by some <a href="http://blogs.msdn.com/b/jw_on_tech/archive/2012/03/15/why-i-hate-search.aspx">polemic about search</a> over on the MSDN blog. Don&#8217;t worry &#8212; it was linked to from Hacker News; I wouldn&#8217;t normally go there of my own volition. Dr. James Whittaker, &#8220;a technology executive focused on making the web a better place for users and developers&#8221;, wrote an article called &#8220;Why I Hate Search&#8221;. Okay, James. Why do you hate search?</p>
<blockquote><p>The word &#8216;search&#8217; is a negative word. It fairly reeks of loss and effort. You lose your car keys and you search for them. Your pet runs away and you search for her. Having to search implies loss. It implies effort. Search is a means to an end. You search to rescue; you seek to find. There is little that is pleasant about the process itself. The only time to feel good about a search is when it ends, successfully.</p></blockquote>
<p>With a heavy sigh, I continue reading. I slog through a subjective, nebulous paragraph espousing the author&#8217;s opinion on the connotations of the term &#8220;search&#8221; versus the term &#8220;find&#8221;. He cites &#8220;searching&#8221; for keys when you lose them and &#8220;searching&#8221; for pets when they run away as reasons why the term &#8220;search&#8221; is so awful. Let&#8217;s pause for a minute. The difference between calling an application a &#8220;search engine&#8221; versus a &#8220;find engine&#8221; is even more trivial than pragmatics or semantics; it is a marketing problem, and has no effect on the ultimate functionality of such a utility.</p>
<p>So beyond marketing issues, which sites like Bing and Ask have previously attempted to tackle (in my opinion, rather unsuccessfully), what is the problem with search? Dr. Whittaker feels that search is broken, in summation, for the following reasons:</p>
<ul>
<li>Search engines serve up search results pages &#8212; they don&#8217;t send you directly to what you&#8217;re looking for</li>
<li>Search engines do this because their revenue is largely based on ad delivery hosted on the search results page</li>
<li>Large companies that are centered on a search application have no incentive to innovate beyond this paradigm, because doing so would cause a loss of revenue. This means we must look to outsiders for further innovation on search (e.g. Apple&#8217;s Siri)</li>
</ul>
<p>This line of reasoning is patently absurd. This assumes that for most queries:</p>
<ol>
<li>There is only one correct page for any given query.</li>
<li>The user knows exactly what result they want.</li>
<li>The user is 100% right that this result is appropriate both for himself or herself and for all other users. Furthermore, they are right that no other result is remotely as relevant as their desired best result for the same query.</li>
</ol>
<p>I&#8217;m a little surprised that the kind of ambiguities that make serving a results page necessary escape the author. He is a self-proclaimed &#8220;former Googler, former professor and former startup founder&#8221;. Dr. Whittaker readily conflates search, which is as much about discovery as it is about navigation, with mind-reading. To be fair, I probably should have just stopped reading after he devoted more than one melodramatic paragraph to the semantics of the term &#8220;search&#8221;.</p>
<p>Here&#8217;s an excellent example of everything I find wrong with this article: after having some questions about the actual usage statistics in these contexts, I used Google to get some numbers on how many documents match these exact phrases. I used the verbatim tool and found numbers identical to the default setting, so let&#8217;s naively assume that it&#8217;s applicable across all of the hidden variables for every other user in the search engine.</p>
<p>&#8220;looking for my keys&#8221;: 313,000 results<br />
&#8220;searching for my keys&#8221;: 104,000 results</p>
<p>&#8220;looking for my pet&#8221;:  4,460,000 results<br />
&#8220;searching for my pet&#8221;: 538,000 results</p>
<p>
I may not have combed through every result to make sure that they relate to the context of lost pets and keys, but at least I&#8217;m using real data to inform my opinion. In both cases, &#8220;looking for&#8221; outnumbers &#8220;searching for&#8221; by a wide margin: roughly 3 to 1 for keys and 8 to 1 for pets. Using this information, we can see that the exact straw men the author constructed don&#8217;t hold up under their own weight. If he had tried to be in any way experimental about his argument, he could have saved at least two people from writing an acerbic blog post. Dr. Whittaker makes the same mistake that the generative linguists have made for decades: basing what they believe to be objective arguments on intuition alone.</p>
<p>The fact that I was able to access this information through a search engine unravels Whittaker&#8217;s perceived intentions of the general purpose of search. Let&#8217;s not even get into the &#8220;discovery&#8221; use case, whereby an individual seeks to learn about a topic by accessing multiple relevant documents on an identical term (really, what is the <i>one right page</i> for &#8220;Napoleon&#8221; or &#8220;The Beatles&#8221;?). I think one example of what makes search a robust and useful platform in its current state is counterpoint enough.</p>
<p>In my use case, I made a search not in hopes of accessing a single page, but in search of information about the term itself, and its presence on the web &#8212; search metadata, if you will, that is extremely helpful in making decisions about future search behavior, and the importance of each document relative to the other. Google is not just a &#8220;finding&#8221; engine; it&#8217;s a tool for research, and a warehouse of knowledge. If you want to get a single page and skip the SERPs, just click the &#8220;I&#8217;m feeling lucky&#8221; button and quit complaining.</p>
]]></content:encoded>
			<wfw:commentRss>http://robertelwell.info/blog/search-haters-gonna-hate/feed/</wfw:commentRss>
		<slash:comments>2</slash:comments>
		</item>
		<item>
		<title>What I&#8217;m Up To Lately</title>
		<link>http://robertelwell.info/blog/what-im-up-to-lately/</link>
		<comments>http://robertelwell.info/blog/what-im-up-to-lately/#comments</comments>
		<pubDate>Tue, 24 Jan 2012 20:38:24 +0000</pubDate>
		<dc:creator>Robert</dc:creator>
				<category><![CDATA[Uncategorized]]></category>

		<guid isPermaLink="false">http://robertelwell.info/blog/?p=140</guid>
		<description><![CDATA[So I&#8217;ve been a little quiet on this blog lately. I&#8217;ve been busy devoting a lot of my free time to some interesting contracting opportunities. I&#8217;m currently working with Time Doctor, LLC, a company working on the sites TimeDoctor.com and &#8230; <a href="http://robertelwell.info/blog/what-im-up-to-lately/">Continue reading <span class="meta-nav">&#187;</span></a>]]></description>
			<content:encoded><![CDATA[<p>So I&#8217;ve been a little quiet on this blog lately. I&#8217;ve been busy devoting a lot of my free time to some interesting contracting opportunities.</p>
<p>I&#8217;m currently working with Time Doctor, LLC, a company working on the sites TimeDoctor.com and Staff.com, as a retained technical advisor. It&#8217;s a great opportunity to work on different software in my spare time and continue honing my craft. I get the opportunity to analyze code quality and application architecture, and solve some interesting problems on a useful stack of productivity and hiring tools. I&#8217;d see myself continuing on in this capacity for the foreseeable future, since I really enjoy doing it in my free time.</p>
<p>Here and there, I&#8217;ve also been providing some additional time to Earmilk.com, a really cool cutting-edge music blog that&#8217;s got some interesting new features on the up and up. Keep an eye out for those changes to come into effect real soon.</p>
<p>My big announcement is that I will be changing my full-time employment from Software Architect at AetherQuest Solutions to Senior Software Engineer at Wikia. I&#8217;m making this change for a lot of reasons. Working at a top-50 website is definitely up there. And I&#8217;m really excited to get into Big Data again now that the technology has progressed a bit. I&#8217;m excited to use my search knowledge again, too, after spending so much time on other sorts of enterprise-level concerns in my current role.</p>
<p>I&#8217;ll continue strengthening my skills in application architecture, technical definition, and project management on the side as I work through some important issues with Global Workforce. This is important to me, because I don&#8217;t want to lose any of the momentum I&#8217;ve achieved over the last year and change.</p>
<p>I had a lot of great opportunities to learn and grow at AQS, and I&#8217;m really proud of what I accomplished there. But it&#8217;s about time I put the &#8220;language&#8221; back in &#8220;Language Hacker&#8221;!</p>
]]></content:encoded>
			<wfw:commentRss>http://robertelwell.info/blog/what-im-up-to-lately/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>SLA-Driven Development</title>
		<link>http://robertelwell.info/blog/sla-driven-development/</link>
		<comments>http://robertelwell.info/blog/sla-driven-development/#comments</comments>
		<pubDate>Wed, 28 Sep 2011 14:41:34 +0000</pubDate>
		<dc:creator>Robert</dc:creator>
				<category><![CDATA[Uncategorized]]></category>

		<guid isPermaLink="false">http://robertelwell.info/blog/?p=134</guid>
		<description><![CDATA[A service-level agreement is a useful method for setting expectations on deliverability, and incentivizing quick turnaround. It&#8217;s also a great way to motivate a development team. SLAs encourage goal-setting and collective buy-in internally. They improve your users&#8217; perception of your &#8230; <a href="http://robertelwell.info/blog/sla-driven-development/">Continue reading <span class="meta-nav">&#187;</span></a>]]></description>
			<content:encoded><![CDATA[<p>A service-level agreement is a useful method for setting expectations on deliverability, and incentivizing quick turnaround. It&#8217;s also a great way to motivate a development team. SLAs encourage goal-setting and collective buy-in internally. They improve your users&#8217; perception of your software and the people who maintain it. Best of all, they provide concrete, measurable criteria to evaluate the success of your team at a larger-scale level than issue-by-issue, or the regressions introduced in a given release. Whether your business model is software as a service or software as a product (free products included), you can easily use the fundamental components of an SLA to reap positive benefits. In this post, I will describe some guidelines for implementing an SLA in your team.</p>
<p><span id="more-134"></span></p>
<h3>But we don&#8217;t have &#8220;clients&#8221;. Why should we have an SLA?</h3>
<p>SLAs are useful for everyone. They give your team an opportunity to sit down and decide what is a sensible turnaround for important issues with your software. And if you feel like you don&#8217;t have any clients, think again; if your software is a product that fuels other business within your organization, <em>you are your own client!</em> This goes even more so for a client-oriented business &#8212; you are your foremost client, since unsuccessfully serving external clients is a disservice to your own business. </p>
<p>You may already have a release process. Well, you should. You may already have a process for prioritizing issues. But an important component to introducing an SLA into the mix is the line of communication it creates. It gives you an opportunity to introduce a dialogue between all groups affected over what your team&#8217;s priorities should be at any given time, what is a reasonable time for arriving at a solution, and what sanctions (if any) exist against failure or to promote success. This brings me to my first guideline:</p>
<h3>Keep It Collaborative</h3>
<p>Think of all of the people involved in your product: developers, managers, users, designers, QA engineers, etc. If you don&#8217;t have an SLA, try this simple exercise: walk up to one person in each of these roles and ask them how long it should reasonably take to fix a bug that completely broke the ability to make money with your product. Now ask them what will happen if it takes longer than that to fix that bug. You&#8217;ll get as many different answers as people you ask.</p>
<p>With an SLA in place, everyone who interacts with the use or development of the product should be on the same page with respect to what the expectations are. If they&#8217;re unfamiliar with the specifics, they will at least know they can reference a document and find out without asking around.  When was the last time you got everyone in a room together, or looking over the same document, to agree on how long it should take to fix something that could make or break your business? Wouldn&#8217;t you feel a bit safer if you knew everyone had an idea of how long was too long? And what if, perhaps, certain scenarios were incentivized for or against in advance? If for instance, you were to give developers a nominal bonus (say $25) any time a bug of the highest severity was addressed within the SLA (or perhaps even within a shorter time limit), would you feel more or less comfortable that the issue would be addressed in time to prevent harm to your business?</p>
<p>Now think about the different kinds of people involved in the software from another aspect: there are the users or clients, obviously; then there are those involved in implementing user requests &#8212; technical or managerial. Obviously, in most organizations, a layer of customer- or client-facing individuals interface the need of the users, who generally want everything right away, with the group tasked with delivering what the users want &#8212; preferably at a pace that doesn&#8217;t keep them burning the midnight oil or rushing their work. Everyone needs an open line of communication when developing an SLA &#8212; either as an agreement within a single organization or a contract between more than one. This is a method of achieving mutual understanding and reaching a compromise on what can be determined a reasonable turnaround. It also improves the likelihood that the process in the SLA will be followed, since fewer parties will feel that the agreement has been forced upon them, but rather reached through a negotiation they were involved in.</p>
<h3>Keep It Granular</h3>
<p>A well-written SLA should have a description of the various scenarios from the reporting of a bug, to its prioritization, all the way to the concrete turnaround time for delivery of its solution. This means you need at least three separate classifications for reported bugs:</p>
<ul>
<li>A high-severity issue
<ul>
<li>Usually financially impacting, but can also be highly time-sensitive</li>
<li>Needs a fast turnaround time</li>
<li>Generally requires a hot-fix or off-cycle release, and sometimes a patch</li>
<li>These may require additional process in advance of the solution (such as reverting to a previously stable release) in order to ensure the product&#8217;s integrity</li>
</ul>
</li>
<li>A medium-severity issue
<ul>
<li>A broken, but not financially vital or time-sensitive functionality</li>
<li>Should be addressed during the current in-development release</li>
</ul>
</li>
<li>A low-severity issue
<ul>
<li>Minor UI issues, or an unexpected result in an edge case</li>
<li>Not financially impacting in any way, still usable</li>
<li>Can be prioritized to the backlog, to be addressed within an agreed-upon number of releases</li>
</ul>
</li>
</ul>
<p>If you have <b>N</b> levels of granularity, you should have <b>N</b> different turnaround times, and <b>N</b> different criteria. So make sure that you&#8217;re not overcomplicating things, but also make sure that the scenarios you describe effectively capture the issues you encounter from the origination of the bug report to its resolution and incorporation into a stable release.</p>
<h3>Keep It Reasonable</h3>
<p>This really only applies to issues of a higher level of severity. Anything that can be prioritized for the current in-process releases is not that difficult to handle. If there&#8217;s one positive thing the Agile methodology has brought to software development, it&#8217;s the philosophy that an in-development feature with a threatened deadline has the option of changing deadline or changing scope. Don&#8217;t be afraid to iterate more or delay a new feature if it means addressing a medium-severity issue of greater importance to the client or the organization. Make sure your SLA includes language describing the risk to meeting deadlines for agreed-upon feature changes or requests.</p>
<p>Different groups may have different expectations on what is a reasonable turnaround time to address an issue of a given severity level. However, in the words of Lil Wayne, numbers don&#8217;t lie. Promising to deploy a hotfix within 4 hours of an initial bug report, if your average issue resolution time is 3 hours and time to prepare a release is a half hour, is a dangerous game to play; it will only set you up for failure.</p>
<p>It only takes one vexing bug to leave your team looking incompetent. It&#8217;s better to under-promise and over-deliver here. Don&#8217;t give yourself an excess of padding, but make sure that you&#8217;ve taken into account the worst-case development time for your highest-severity issues, and the time it might take to QA the solution and manage a release for it.</p>
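<p>The arithmetic behind that warning can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name <code>sla_slack</code> and the worst-case and QA figures are hypothetical numbers of mine. Only the 4-hour promise, the 3-hour average resolution time, and the half-hour release prep come from the example above.</p>

```python
from datetime import timedelta


def sla_slack(promised, resolution, release_prep, qa=timedelta(0)):
    """Return the time left over after the work the SLA has to cover.

    A negative result means the promise cannot be kept even on paper.
    """
    return promised - (resolution + release_prep + qa)


def hours(delta):
    """Express a timedelta as a plain number of hours."""
    return delta.total_seconds() / 3600


# Figures from the example: 4-hour promise, 3-hour average fix, 30-minute release.
average_case = sla_slack(
    promised=timedelta(hours=4),
    resolution=timedelta(hours=3),
    release_prep=timedelta(minutes=30),
)

# Hypothetical worst case: a 5-hour fix plus an hour of QA (assumed numbers).
worst_case = sla_slack(
    promised=timedelta(hours=4),
    resolution=timedelta(hours=5),
    release_prep=timedelta(minutes=30),
    qa=timedelta(hours=1),
)

print("average-case slack (hours):", hours(average_case))
print("worst-case slack (hours):", hours(worst_case))
```

<p>With these inputs, the average case leaves half an hour of slack, and the assumed worst case overshoots the promise by two and a half hours. A quick check like this, run against your own team&#8217;s numbers, is one way to see whether a proposed turnaround leaves any room for QA and release management.</p>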
<p>Since an issue of the highest severity is a stop-the-presses kind of issue, it&#8217;s okay to entertain options here that may otherwise be outside of process &#8212; for instance, patching the solution in advance of deploying a hotfix or off-cycle release &#8212; in order to meet the SLA. You should include language that enumerates these options and explains the risk for each scenario. If a solution is found within the SLA, and the difference between on-time and late delivery comes down to whether one of the riskier scenarios is chosen, the parties involved should again be informed of the choices. At this point, the SLA should be considered met, whether or not the riskier option is invoked.</p>
<h3>Incentivize Your Team</h3>
<p>An SLA isn&#8217;t much more than a mission statement unless there is language that explains what happens if the terms are not met for a given issue. This portion of the document might differ between what&#8217;s provided to, say, a developer compared to a client. For a client, it may involve future discounts or credits. For a developer, it may affect something like bonuses or vacation time. That&#8217;s not to say that every time a developer does his or her job, you should be shelling out extra cash. This is really just an example of one way to put some spurs on your agreement.</p>
<p>For the good kind of developer, the kind you should hope to have in your organization, meeting the agreement and maintaining the integrity of the product and the company that builds it should be incentive enough. However, if you&#8217;re in an organization that does performance bonuses, it may be worth using the SLA as an indicator of performance. This is a good way to introduce a level of transparency that can serve as a motivating factor. Another passing thought on this is perhaps to include language in the &#8220;developer version&#8221; of the SLA that comps time spent outside of an eight-hour work day to deliver a high-priority issue at a certain rate &#8212; for instance, one hour of paid vacation for every four hours spent.</p>
<p>Personally, we don&#8217;t do any of that for the developers in our group. None of this kind of stuff is going to buy a sense of ownership, but there are a lot of organizations that modulate their compensation based on performance, so I thought I&#8217;d give the option. My opinion is that paying a fair wage and making sure the people who do the brunt of the work have an opportunity to contribute to the process of developing the agreement should be a better motivator. Involvement builds ownership. A bonus program in the wrong hands can be executed poorly or arbitrarily, and is just as much a dividing factor as it is a motivating factor.</p>
<h3>It&#8217;s All About Integrity</h3>
<p>A service-level agreement is, in the end, an opportunity to put some real concrete language around what constitutes scenarios that derive from the normal process of maintaining a software product. It also sets expectations for deliverability on issues of routine maintenance. By further defining how issues are evaluated, and agreeing on a reasonable turnaround time, you can set your team up for success &#8212; both in the eyes of stakeholders, as well as in their own eyes. Another great thing about this exercise is that it will make developers feel protected from the whims of an emotional client. Doing this for your team will allow them a sense of breathing room, and the peace of mind to know that they are being looked out for in the grand scheme of the process of product development. </p>
<p>Think about it: you have an opportunity to turn &#8220;I need it here yesterday&#8221; into &#8220;I should expect this up and running in 12 (or 24, or 48) hours&#8221;. You&#8217;re protecting the integrity of your organization by preventing the attitude that deliverability should be instant. Without something in place to modulate this, you&#8217;re already fighting a losing battle at the initiation of a bug report.</p>
<p>You have an opportunity to keep middling UI issues that a single stakeholder thinks are the end of the world from causing yet another late night in the office for no reason other than his own peace of mind. <i>Expect it to be handled by noon tomorrow &#8212; not midnight tonight. Remember that SLA you helped us develop?</i> You&#8217;re protecting the morale of your team, and thus its integrity as a working unit. There&#8217;s no need to fatigue your employees with long hours and the fear that they may be kept late, so long as expectations are adequately set between all groups internally.</p>
<p>Every organization can use an SLA. It&#8217;s the starting point at which all issues, and their solutions, are prioritized. It introduces transparency, and it gives everyone a bit of breathing room. And, executed effectively, it&#8217;s probably the closest to collective bargaining we&#8217;re going to see in the world of software development. So take advantage of it.</p>
]]></content:encoded>
			<wfw:commentRss>http://robertelwell.info/blog/sla-driven-development/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Novel Methodologies for Distributed Development Teams</title>
		<link>http://robertelwell.info/blog/distributed-development-methodologies/</link>
		<comments>http://robertelwell.info/blog/distributed-development-methodologies/#comments</comments>
		<pubDate>Fri, 02 Sep 2011 06:34:34 +0000</pubDate>
		<dc:creator>Robert</dc:creator>
				<category><![CDATA[Uncategorized]]></category>

		<guid isPermaLink="false">http://robertelwell.info/blog/?p=130</guid>
		<description><![CDATA[I work in the Bay Area at a company based out of the Washington, DC area. I lead a development team with the manpower roughly divided between both offices. On top of that, we have certain stakeholders that work in &#8230; <a href="http://robertelwell.info/blog/distributed-development-methodologies/">Continue reading <span class="meta-nav">&#187;</span></a>]]></description>
			<content:encoded><![CDATA[<p>I work in the Bay Area at a company based out of the Washington, DC area. I lead a development team with the manpower roughly divided between both offices. On top of that, we have certain stakeholders that work in other cities out of a home office. This isn&#8217;t unusual these days in tech. However, it puts a lot of strain on traditional methods of project management and release management. I&#8217;ve used it as an opportunity to implement some nifty methods that some might call agile to keep pushing our product forward in a rapid but guided manner.<span id="more-130"></span></p>
<p>The challenge of coordinating productivity between locations up to three hours apart can be rough at times &#8212; particularly when it comes to the days approaching the creation of a release candidate from the development branch, or when a &#8220;should have been done yesterday&#8221; hotfix floats to the top of the ticket pile. However, there are some great perks. You get to show up to work in the morning with more work collectively done than when you left.  For clients on the East Coast, we can work further into the night to get them what they need when they get in the door first thing the next morning. My favorite part of having a distributed dev team, though, is the technologies we have implemented in order to keep communication active and instant. Here are a few guidelines that I have found really enable a development team to get a lot done and stay in the know.</p>
<h3>Get on IRC</h3>
<p>At least for my use case, there&#8217;s no need to pay money for something like Basecamp or HipChat when you&#8217;ve got a little spare server space. It should take you maybe a few hours to set up an IRC channel for your developers. Require that they stay on this channel any time they&#8217;re working. I have a bot I call Hal (cliched, I know) in the channel that keeps the channel open and logs all conversations. Bots are generally easy to extend, and are available in most languages you&#8217;re likely to be comfortable working in. For instance, I added a module to Hal so that he polls our Git repository and announces when code has been pushed to a branch. We use Jira for issue tracking, and any time a ticket is mentioned by its key, he will give a link to the ticket&#8217;s page as well as its title. If we&#8217;re worried about our various servers, I can have him poll them for load average and CPU usage. IRC is a great productivity tool, because it&#8217;s a simple protocol with plenty of extendable, open-source bots. You can tailor your communication experience to make sure everyone can stay abreast of what&#8217;s going on throughout the day with a minimum of distraction.</p>
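<p>To give a feel for how small a bot module like the ticket linker can be, here is a minimal sketch in Python. It is not Hal&#8217;s actual code: the base URL <code>jira.example.com</code> and the function name <code>ticket_links</code> are made up for illustration, and it assumes ticket keys look like an uppercase project prefix, a hyphen, and a number. Looking up the ticket&#8217;s title would need a call to the tracker itself, so that part is left out.</p>

```python
import re

# Hypothetical base URL; Jira serves a ticket's page under /browse/KEY.
JIRA_BASE_URL = "https://jira.example.com"

# Assumed key shape: an uppercase project prefix, a hyphen, and a number.
TICKET_KEY = re.compile(r"\b[A-Z][A-Z0-9]+-\d+\b")


def ticket_links(message, base_url=JIRA_BASE_URL):
    """Return a browse URL for each distinct ticket key in a chat message."""
    seen = []
    for key in TICKET_KEY.findall(message):
        if key not in seen:  # Announce each ticket once per message.
            seen.append(key)
    return ["{}/browse/{}".format(base_url, key) for key in seen]


print(ticket_links("Pushed a fix for WEB-142, still looking at WEB-97 and WEB-142"))
```

<p>A bot would call something like this on every channel message and post whatever links come back. The pattern is deliberately loose, so strings such as &#8220;UTF-8&#8221; would also match; restricting it to your own project prefixes is an easy refinement.</p>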
<h3>Ditch the Standup and Do Your Daily Scrum <i>in IRC</i></h3>
<p>Even if you&#8217;re all in the same office, what are you doing making people get out of their seats, stand up, and provide their uninterrupted attention to something that they&#8217;re only half listening to and will likely forget anyway? Get what people are up to in the logs. Let them keep working on what they&#8217;re working on. Meetings kill developer concentration. I hold a 9:15am Pacific scrum meeting every day. That&#8217;s 12:15pm on the East Coast. If my developers are busy working on something important over there, I&#8217;m not going to make them get out of their seats, walk into a conference room, and dial into a GoToMeeting call just to chat about their day. By digitizing the daily scrum, you keep all the value of a constant contact point at a steady interval without breaking anyone&#8217;s stride. You also get the benefit of logging as well as the additional value of customizing the scrum experience with an IRC bot. And of course, the best benefit of all: it doesn&#8217;t really feel like a meeting, and we developers loathe meetings, don&#8217;t we?</p>
<h3>Continuous Integration on the QA Instance</h3>
<p>Since we&#8217;re all distributed, I think it&#8217;s a much better idea to cut out the risk of multiple people attempting to deploy development code on the QA server at the same time. This is, of course, where all the code for the next release is going, and we want to make sure that it can build by itself at all times. I use Hudson to poll the development branch of our Git repository, which will then update the code and automatically build everything &#8212; rolling back and running new database migrations, clearing the cache, etc. I prefer to keep an eye on Hudson&#8217;s builds via RSS, since Hal is already announcing pushes to the development branch. However, it&#8217;s your IRC channel; do what you want with it!  You can easily consume Hudson&#8217;s RSS feed to create a bot module that announces whether a build was successful or not. Most continuous integration solutions accommodate some way of automatically gleaning build data, so don&#8217;t feel like you have to use Hudson to get this level of value.</p>
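<p>As a rough sketch of that bot module idea, the Python below turns a build feed into one-line announcements. Treat the details as assumptions rather than a description of Hudson&#8217;s feed: it assumes an Atom-flavoured feed whose entry titles look like &#8220;job-name #42 (stable)&#8221;, and the function name <code>build_announcements</code> is hypothetical. Check your own CI server&#8217;s feed and adjust the pattern to match.</p>

```python
import re
import xml.etree.ElementTree as ET

# Namespace prefix for Atom elements in ElementTree's notation.
ATOM = "{http://www.w3.org/2005/Atom}"

# Assumed title shape: "job-name #42 (status text)".
TITLE = re.compile(r"^(.+) #(\d+) \(([^)]*)\)$")

# Assumed status strings that count as a healthy build.
HEALTHY = ("stable", "back to normal")


def build_announcements(feed_xml):
    """Turn an Atom build feed into one-line messages for an IRC channel."""
    root = ET.fromstring(feed_xml)
    messages = []
    for entry in root.iter(ATOM + "entry"):
        title = entry.findtext(ATOM + "title", default="")
        match = TITLE.match(title.strip())
        if match is None:
            continue  # Skip entries that do not look like build results.
        job, number, status = match.groups()
        verdict = "succeeded" if status in HEALTHY else "needs attention"
        messages.append("Build {} of {} {} ({})".format(number, job, verdict, status))
    return messages
```

<p>The bot would fetch the feed on a timer, pass the text to this function, and post any messages it has not announced before. Keeping the parsing separate from the fetching makes it easy to swap in a different CI server later.</p>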
<h3>Code Reviews and Release Notes</h3>
<p>Because everyone should be aware of the direction of the product and have a stake in its level of quality, we encourage code reviews for all code submitted. Another developer has to pass your code before it can go into a release candidate &#8212; present company included. I also keep other developers updated on the details of the release notes so that they can skim functional changes and recent fixes in the product. Internal release notes are available as a plaintext file in the code base, but I also send an email out both for the release candidate and the final internal release. This has helped catch snafus or missed must-have items on more than one occasion. </p>
<h3>Daily Summaries</h3>
<p>In a perfect world, you could go into Jira and get an hourly breakdown of what everyone did that day. The reality is that most developers are not consistent time recorders, even when it&#8217;s important, and even when you hammer it into them &#8212; again, present company included.  The reality is also that sometimes developers work on things that do not have or require a ticket. Encourage your developers to record what they worked on in a short summary via some kind of internal publishing tool. In our case, each developer has an internal blog they will record a daily summary into, detailing what tickets were worked on. This is useful for splitting out whether a developer worked on maintenance release items, or maybe a major functional item that will be worked on in parallel with multiple sprints. They may have even had all their time diverted to a hotfix. It&#8217;s hard to know &#8212; especially when you&#8217;re filtering things by fix version &#8212; without the &#8220;executive summary&#8221;.</p>
<h3>Leverage The Time Difference; Don&#8217;t Bemoan It</h3>
<p>With developers a few time zones ahead of you, sometimes it&#8217;s hard to make sure everything is going to get done. It&#8217;s not cool to require people to ask permission to head home at a reasonable time, and they shouldn&#8217;t have to. If something you wanted to get done by the end of the day Eastern Time doesn&#8217;t get done, don&#8217;t expect someone to stay late just because it&#8217;s only 2:00 or 3:00 Pacific.  Set goals that are reasonable for the time zone of the person working on it. Make sure you&#8217;re utilizing your resources with the time difference in mind. If a developer is working on something that they know should be completed that day, make sure that they&#8217;re in contact about its status at the end of the day. If it&#8217;s not completed, and it&#8217;s part of a release that has to get shipped out ASAP, have a set protocol, or at least a contingency in place to coordinate a hand-off. A good guideline here is to make sure that all in-progress branches get pushed at the end of the day, and that tickets get commented with details on the work done thus far so that anyone else can pick it up.</p>
<h3>Conclusion</h3>
<p>For my team, these guidelines have done a lot in terms of improving the quality of our code and the quality of our product. They have also improved the opinion our users have not only of the product, but of the people behind it. You can use most of this information without having more than one office. If you have staggered shifts or people on different schedules, you can even use the last guideline. The most important part here is that we&#8217;re actually improving how we manage developers&#8217; priorities and output while reducing the amount of time they&#8217;re not working. We&#8217;re not breaking anybody&#8217;s stride, and we&#8217;re actually encouraging active communication without anyone ever having to leave their desk. The payoff is clear: developers are happy because their meeting load is minimized; you can actually have a higher frequency of meetings since you don&#8217;t have to coordinate a bi-coastal call; everyone knows the immediate status of a release, so anyone can work on it at any time &#8212; shortening development time and increasing client responsiveness.</p>
<p>If you&#8217;re stuck slogging through call after call trying to coordinate developers in more than one place, or if you&#8217;re just sick of having to get out of your chair for your daily standup meeting, give these methodologies a shot, and see if they work for you.</p>
]]></content:encoded>
			<wfw:commentRss>http://robertelwell.info/blog/distributed-development-methodologies/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>The Highlighter Incident</title>
		<link>http://robertelwell.info/blog/the-highlighter-incident/</link>
		<comments>http://robertelwell.info/blog/the-highlighter-incident/#comments</comments>
		<pubDate>Fri, 15 Jul 2011 22:49:46 +0000</pubDate>
		<dc:creator>Robert</dc:creator>
				<category><![CDATA[Uncategorized]]></category>

		<guid isPermaLink="false">http://robertelwell.info/blog/?p=124</guid>
		<description><![CDATA[When it comes to &#8220;The Real World&#8221;, one of my biggest learning experiences was getting fired from a temp job at a Verizon Wireless store the summer before grad school. It&#8217;s 2005. I had just graduated from SUNY Albany, and &#8230; <a href="http://robertelwell.info/blog/the-highlighter-incident/">Continue reading <span class="meta-nav">&#187;</span></a>]]></description>
			<content:encoded><![CDATA[<p>When it comes to &#8220;The Real World&#8221;, one of my biggest learning experiences was getting fired from a temp job at a Verizon Wireless store the summer before grad school.</p>
<p><span id="more-124"></span>It&#8217;s 2005. I had just graduated from SUNY Albany, and had my grad school plans all figured out. I&#8217;d be going to UT Austin to continue my studies in linguistics. With a few months before my move to Austin, Texas, I picked up some temp work that landed me at a Verizon Wireless store at Crossgates Mall, just outside of Albany, New York. It was a mall job, but as far as mall jobs go, it wasn&#8217;t terribly bad. Sure, I still felt over-qualified, but that was before I had much of an idea of the value of a bachelor&#8217;s degree in liberal arts (or the lack thereof, as we&#8217;ve all come to learn since).</p>
<p>The official role was greeter; I would catch people as they came in and try to get them time with either a sales associate or a customer service rep. If necessary, I would sign them up for tech help from one of a revolving squad of anti-social metalheads in the back room. It was a lot of standing, and a lot of schmoozing.</p>
<p>Verizon not-so-subtly used their temporary labor service as a cheap HR department. Most of the non-temporary employees were at one time temps, and management regularly put individuals designated as greeters into training for customer service roles. I got a lot of experience with the variety of roles available at a cell phone store, and even had an opportunity to do a little off-the-cuff salesmanship when things got a little busy.</p>
<p>All of my exposure to the different roles and responsibilities got me thinking about what I could do to make the place run smoother. This was at the forefront for me, as I was primarily responsible for corralling people to different sections of the store. As a result, customers would often hold me responsible for not getting individuals with their salesperson or CSR in a timely manner.</p>
<p>The issue that I first targeted involved the potential logjam that occurred in the sales queue. Salespeople had to spend a lot of face time with potential new contracts or new phone purchases. For every individual they spent time with, they had to fill out a large form about their encounter, outlining what items were bought, and other sundry details about the interaction. Each form took about five minutes. It was something they could stop and go back to, and they would often develop a significant backlog if they had a full queue of customers.</p>
<p>The customer queue was tracked by a simple form on a clipboard, at the podium where I would stand. Names were written in pen, and then, once handed off to a salesperson, were crossed off. For a salesperson to begin serving a customer, they had to leave their desk and closely inspect the form. This meant that they would often have to stop filling out their backlog of forms just to check if the scribbles on the clipboard were crossed-out names or new customers awaiting service.</p>
<p>My &#8220;big idea&#8221; was idiotically simple, if you think about it. Grab two highlighters from the back room: one pink, and one blue. For a new customer, who had not yet been served, highlight their name pink. The color could be seen from the other side of the room easily, so a salesperson could just catch a glimpse of the clipboard, and, if there were any pink lines available, they would then know to stop what they were doing and start helping customers. Customers would be taken off the queue by highlighting the pink lines with the blue highlighter, which resulted in a dark purple, providing an excellent contrast to the light pink.</p>
<p>This idea was a big hit with the sales staff. It helped them stay on top of their paperwork, and kept them almost instantly available to customers at all times. However, one of the younger managers, whose primary responsibility was sales, took severe issue with it. He felt that their motivation should be the commission they stood to make. I wasn&#8217;t sure how these two concepts clashed. In the end, it may have just been that he didn&#8217;t like the idea. He told me to stop doing it.</p>
<p>And then a regional manager came by for a visit. The sales staff told him all about my great idea, and how much easier it made everything for them. This manager loved my idea so much he literally took two highlighters, put them in my hand, and told me to keep doing what I was doing &#8212; after the assistant manager had already told me to stop. The pecking order seemed pretty clear, so I went back to doing what I had been doing, and witnessed a notable improvement in efficiency return to the sales queue.</p>
<p>The next week, I showed up at work only to immediately be summoned to the back office. The head manager was literally holding up the highlighters as if I had brought contraband to the workplace. I wasn&#8217;t fired, because I was an employee of a temp agency, but my contract as a greeter with Verizon was terminated prematurely due to insubordination.</p>
<p>At the time, I was pretty dismayed. It was the first time a professional engagement ended in such an ignominious way for me. Now, I just joke about it whenever the topic of innovation or cell phones comes up. Of course I didn&#8217;t want to be a lifer at Verizon Wireless. I was just hoping to spend a few more weeks making money before I moved across the country, and maybe even transfer to a store there for some part-time work during my studies. I suppose telling them I would be moving in a month didn&#8217;t really give them a reason to keep me around anyway.</p>
<p>I could spend this paragraph using my personal experience as an anecdotal example of our failing broadband and wireless infrastructure. I could vehemently rail against the big telecommunications companies in this country, how they wantonly absorb government subsidies without any returns in the form of improved end-user service, but I&#8217;ll skip it. I&#8217;d rather not contribute to the pool of vitriol you&#8217;ll see in most hackers&#8217; blog posts about wireless phone or broadband service and the companies that purvey it. But I will say that these are organizations that clearly have a cultural problem of discouraging innovation. There are bigger companies that do better, and there are smaller companies that do worse. It&#8217;s not their industry; it&#8217;s not their size.</p>
<p>I learned a great deal from this fiasco &#8212; lessons about the interpersonal and bureaucratic sides of working in a company that stick with me to this day:</p>
<ol>
<li><strong>Just because an idea is good doesn&#8217;t automatically give it traction with management. </strong>This speaks for itself. If you implement something useful without getting approval in a workplace that values management over traction, no amount of justifying yourself will keep you from getting read the riot act.</li>
<li><strong>Multiple managers make for noisy signals. </strong>If you absolutely need more than one person managing someone else, you need to make absolutely sure that the employee knows who&#8217;s in charge of them for what aspect of their job. Otherwise, you&#8217;re just confusing people, and making it harder for them to do their job effectively. If you&#8217;re a co-manager of an employee, make sure you&#8217;re staying in your sphere of influence. Contradicting another manager could make more trouble for the employee than for you, and that&#8217;s particularly problematic from an ethical standpoint.</li>
<li><strong>Efficiency is not the only motivation.</strong> The sales manager&#8217;s interest was in motivating the sales staff to stay on top of their queue of customers, and to be proactive in servicing customers. He valued customer service over effective data entry. It was his prerogative to identify and rank the sales team&#8217;s priorities, and I can understand how staying up on your forms could take a back seat to making sales. There are countless applications for this lesson in the software industry. Your approach may optimize for all the vectors you&#8217;ve identified, but that doesn&#8217;t mean there isn&#8217;t one you haven&#8217;t taken inventory of. Your approach may impose an untenable cost on that &#8220;unknown unknown&#8221;. Collaboration usually helps catch these before they become a problem.</li>
</ol>
<p>When all is said and done, I&#8217;m kind of glad I got &#8220;fired&#8221; from this job. As a student, I hadn&#8217;t really encountered real failure before then, and it was my first exposure to real rejection therapy. It was also my first experience learning that things aren&#8217;t always fair in the workplace, and sometimes a paycheck is better than being right.</p>
<p>I&#8217;ll still never use Verizon though.</p>
]]></content:encoded>
			<wfw:commentRss>http://robertelwell.info/blog/the-highlighter-incident/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Future-Proof Your Database Change Log</title>
		<link>http://robertelwell.info/blog/future-proof-db-changelog/</link>
		<comments>http://robertelwell.info/blog/future-proof-db-changelog/#comments</comments>
		<pubDate>Tue, 31 May 2011 23:54:44 +0000</pubDate>
		<dc:creator>Robert</dc:creator>
				<category><![CDATA[Uncategorized]]></category>

		<guid isPermaLink="false">http://robertelwell.info/blog/?p=114</guid>
		<description><![CDATA[Adding a change log to your database is the best way to make sure you&#8217;re working on a version of your web application that adequately reflects a given state of your product. However, when working with a branching-and-merging development environment, &#8230; <a href="http://robertelwell.info/blog/future-proof-db-changelog/">Continue reading <span class="meta-nav">&#187;</span></a>]]></description>
			<content:encoded><![CDATA[<p>Adding a change log to your database is the best way to make sure you&#8217;re working on a version of your web application that adequately reflects a given state of your product. However, when working with a branching-and-merging development environment, where two different developers may be working on a migration at the same time, we often encounter a race condition that makes it hard to keep development-level environments as pristine as those that only deploy stable code. Here, I outline a methodology that allows both the flexibility of distributed development and sane migration management in an application with an evolving database schema.<br />
<span id="more-114"></span></p>
<h3>Object-Oriented Rollbacks</h3>
<p>I&#8217;m a big believer in using <a href="http://blog.cherouvim.com/a-table-that-should-exist-in-all-projects-with-a-database/">some kind of database change log</a> for any project you&#8217;re working on. As someone firmly entrenched in the PHP world, I&#8217;m used to working with serial migrations such as you&#8217;d see in the DbDeploy functionality of Phing. So our DB change log table basically lists the major-level component that the logged migration acted on, and the migration number. Furthermore, with each release, we have a Version class that always lists the current version of the product and the most stable migration number for each type of migration.</p>
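<p>To make the shape of that table concrete, here is a minimal sketch. It is written in Python with SQLite rather than our PHP stack, and every name in it (the <span style="font-family: monospace;">db_changelog</span> table, its columns, the contents of <span style="font-family: monospace;">Version</span>) is an assumption made up for illustration, not our actual schema.</p>

```python
import sqlite3


# Hypothetical names throughout: the table, its columns, and the Version
# class are illustrative stand-ins, not the original PHP code.
class Version:
    PRODUCT = "2.4.0"
    # Most stable migration number for each type of migration.
    STABLE = {"core": 12, "reporting": 5}


def create_changelog(conn):
    conn.execute(
        """
        CREATE TABLE IF NOT EXISTS db_changelog (
            component  TEXT    NOT NULL,  -- major-level component touched
            migration  INTEGER NOT NULL,  -- serial migration number
            applied_at TEXT    NOT NULL DEFAULT CURRENT_TIMESTAMP,
            PRIMARY KEY (component, migration)
        )
        """
    )


def current_migration(conn, component):
    """Return the highest migration number logged for a component (0 if none)."""
    row = conn.execute(
        "SELECT COALESCE(MAX(migration), 0) FROM db_changelog WHERE component = ?",
        (component,),
    ).fetchone()
    return row[0]


conn = sqlite3.connect(":memory:")
create_changelog(conn)
# Pretend migrations 1 through 14 have been deployed for the "core" component.
for n in range(1, 15):
    conn.execute(
        "INSERT INTO db_changelog (component, migration) VALUES (?, ?)", ("core", n)
    )

# How far past the stable migration is this database?
ahead = current_migration(conn, "core") - Version.STABLE["core"]
print(f"core is {ahead} migration(s) past stable")
```

<p>Comparing the highest logged migration against the stable number in <span style="font-family: monospace;">Version</span> is all it takes to tell whether a database has drifted ahead of stable.</p>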
<p>This basic table allows us a couple of different operations:</p>
<ul>
<li><strong>Update to latest migration available:</strong> This is used when we deploy stable code, or when we&#8217;re updating our local databases to the current iteration of development code</li>
<li><strong>Update only to the latest stable migration:</strong> This is used both for updating to a stable version of the database schema and for returning the database schema to its last stable state.</li>
</ul>
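<p>One way to picture the two operations is as a planning step that decides which migration numbers to roll back and which to deploy. This is a rough Python sketch under my own assumptions: the function name <span style="font-family: monospace;">plan</span> and its parameters are hypothetical, and it assumes a single serial sequence of migration numbers.</p>

```python
def plan(current, stable, latest, to_latest):
    """Return (migrations to roll back, migrations to deploy), in execution order."""
    # Anything past the stable migration is rolled back first, newest first.
    rollbacks = list(range(current, stable, -1))
    # After rolling back, the database sits at whichever is lower.
    start = min(current, stable)
    # Then roll forward to either the latest or the stable migration.
    target = latest if to_latest else stable
    deploys = list(range(start + 1, target + 1))
    return rollbacks, deploys


# A database at migration 14, with stable at 12 and 15 available on disk.
print(plan(current=14, stable=12, latest=15, to_latest=True))
# prints ([14, 13], [13, 14, 15])
```

<p>With <span style="font-family: monospace;">to_latest=False</span>, the same call would only roll back 14 and 13 and deploy nothing.</p>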
<p>For both of the above actions, if the database&#8217;s change log shows that it is at a migration later than the most stable one, we roll back every migration past the stable one, removing the latest row from the change log each time to reflect where we are in the state of the database&#8217;s schema.  Rollback logic was originally just a required method of each migration class that implemented a &#8220;Reversible&#8221; interface. The migration would run a <span style="font-family: monospace;">deploy()</span> method when rolling forward, and a <span style="font-family: monospace;">rollback()</span> method when rolling back. Here are some issues with utilizing this methodology off the shelf:</p>
<ol>
<li>Before pulling down new code in your development branch, you&#8217;ll have to roll back to the stable branch in case you need to, for instance, ameliorate a code conflict.  Sometimes, this unnecessarily destroys test data.</li>
<li>If you forget to roll back to stable, you now might try to roll back code that never ran if migrations have been reordered or changed</li>
<li>Your database only knows whether it ran <span style="font-family: monospace;">Migration_N</span>, and not whether calling <span style="font-family: monospace;">Migration_N::rollback()</span> will actually roll back the logic created by <span style="font-family: monospace;">Migration_N::deploy()</span>. It&#8217;s entirely possible that <span style="font-family: monospace;">Migration_N</span> is now <span style="font-family: monospace;">Migration_N+1</span> or even <span style="font-family: monospace;">Migration_N-1</span></li>
</ol>
<p>So how do we guard against these possibilities? Ideally, we would still like to tie rollback logic to the same migration class that is deploying the changes to the database schema. However, we don&#8217;t want to make the mistaken assumption that calling <span style="font-family: monospace;">Migration_N::rollback()</span> will always undo the change that posted the original increment to the database change log.</p>
<h3>Pairing Deployments to Rollbacks</h3>
<p>My solution to this problem is to <strong>insert rollback logic directly into your database change log</strong>. I added a <span style="font-family: monospace;">TEXT</span> field called &#8220;rollback&#8221; to the database change log table. I then changed the Migration&#8217;s <span style="font-family: monospace;">rollback()</span> method to a <span style="font-family: monospace;">getRollbackSQL()</span> method that returns the fully evaluated query that we store in the change log.  We now take the following steps when rolling back a migration:</p>
<ol>
<li>Check if the change log entry contains rollback SQL. If so, run the query, delete the row, and continue iterating</li>
<li>If there&#8217;s no rollback SQL in the change log entry, we try to call <span style="font-family: monospace;">Migration::getRollbackSQL()</span> for the migration number we&#8217;re on.</li>
<li>Worst case scenario: there&#8217;s nothing available for us to do. For instance, the change may have been irreversible, data cleanup, or just asymmetrical. Just remove the entry from the DB change log, and hope that we</li>
</ol>
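<p>The steps above can be sketched roughly as follows. This is Python with SQLite, not our PHP code, and the names are assumptions: the column is called <span style="font-family: monospace;">rollback_sql</span> here (the post calls the field &#8220;rollback&#8221;), and <span style="font-family: monospace;">MIGRATIONS_ON_DISK</span> is a made-up stand-in for the migration classes on the checked-out branch.</p>

```python
import sqlite3

# Hypothetical registry standing in for migration classes on the checked-out
# branch: migration number -> rollback SQL (None if the class offers none).
MIGRATIONS_ON_DISK = {3: "DROP TABLE widgets", 4: None}


def get_rollback_sql(number):
    """Stand-in for calling getRollbackSQL() on the current code."""
    return MIGRATIONS_ON_DISK.get(number)


def rollback_to(conn, stable):
    """Roll back every logged migration above `stable`, newest first."""
    rows = conn.execute(
        "SELECT migration, rollback_sql FROM db_changelog "
        "WHERE migration > ? ORDER BY migration DESC",
        (stable,),
    ).fetchall()
    for number, stored_sql in rows:
        # Step 1: prefer the SQL recorded when the migration was deployed.
        # Step 2: otherwise fall back to whatever the current code provides.
        sql = stored_sql or get_rollback_sql(number)
        if sql:
            conn.execute(sql)
        # Step 3: even with nothing to run, the entry leaves the change log.
        conn.execute("DELETE FROM db_changelog WHERE migration = ?", (number,))


conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE db_changelog (migration INTEGER PRIMARY KEY, rollback_sql TEXT)"
)
conn.execute("CREATE TABLE widgets (id INTEGER)")  # made by logged migration 3
conn.execute("CREATE TABLE gadgets (id INTEGER)")  # made by logged migration 5
conn.executemany(
    "INSERT INTO db_changelog VALUES (?, ?)",
    [(2, None), (3, None), (4, None), (5, "DROP TABLE gadgets")],
)

rollback_to(conn, stable=2)
remaining = [r[0] for r in conn.execute("SELECT migration FROM db_changelog")]
tables = {
    r[0] for r in conn.execute("SELECT name FROM sqlite_master WHERE type='table'")
}
```

<p>In this toy run, migration 5 is undone from the SQL stored in the log, migration 3 falls back to the code on disk, and migration 4 has nothing to run, so its entry is simply removed.</p>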
<p>This behavior is encapsulated in the parent Migration class&#8217;s <span style="font-family: monospace;">rollback()</span> method. So how about the old migrations that may have utilized <span style="font-family: monospace;">rollback()</span> in a way that (here&#8217;s hoping rarely) flouts the paradigm of just executing a query against the database? And how about future cases where a simple DB query may not be all we have to do during the migration? Well, for these cases, we can just extend the <span style="font-family: monospace;">rollback()</span> method itself instead of the <span style="font-family: monospace;">getRollbackSQL()</span> method. This allows us freedom when we need it, and simplicity when we don&#8217;t.</p>
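<p>The split between the two extension points might look something like this. It is a Python sketch, not our PHP classes; the class names, the <span style="font-family: monospace;">run_query</span> callback, and the queries themselves are all invented for illustration.</p>

```python
class Migration:
    """Hypothetical parent class; method names mirror the ones in the post."""

    def get_rollback_sql(self):
        # Default: no rollback query available.
        return None

    def rollback(self, run_query):
        # Simple case: execute whatever query the subclass supplies.
        sql = self.get_rollback_sql()
        if sql:
            run_query(sql)


class AddIndex(Migration):
    # Simple migration: only the query needs to be supplied.
    def get_rollback_sql(self):
        return "DROP INDEX idx_orders_user"


class MoveUploads(Migration):
    # Not expressible as a single query, so rollback() itself is overridden.
    def __init__(self, log):
        self.log = log

    def rollback(self, run_query):
        self.log.append("restore uploads directory")  # stand-in for file work
        run_query("UPDATE settings SET upload_root = 'old'")


# Collect the queries instead of running them against a real database.
executed, side_effects = [], []
AddIndex().rollback(executed.append)
MoveUploads(side_effects).rollback(executed.append)
```

<p>Most migrations only ever touch the query method; the override is there for the rare case that needs more than SQL.</p>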
<h3>Application</h3>
<p>Let&#8217;s walk through a scenario where this methodology provides significant value over a process that requires us to always roll our database back to a stable state before updating our development branch.</p>
<p>Joe Developer creates a migration in a branch to address a ticket during a maintenance sprint. He finishes development on the ticket, and merges his completed work, including the migration, into the development branch. As he&#8217;s finishing this up, Joe gets a call from his boss. Someone needs a hot fix, pronto. Joe immediately switches over to work on said hot fix, branching off of the most stable tag of the software.</p>
<p>As it turns out, the fix also needs a migration, but Joe never rolled his changes back. Switching back to the development branch just to roll back to stable would break his stride, but it&#8217;s something he&#8217;d have to do now that he&#8217;s working on the hot fix, which doesn&#8217;t have an instance of a development-level migration. However, regardless of what branch he&#8217;s got checked out, if he&#8217;s working under the new paradigm, he can easily roll back any unstable migrations (provided they all use the <span style="font-family: monospace;">getRollbackSQL()</span> methodology), and can be sure that he&#8217;s rolling back the old migrations, and not any migration he&#8217;s currently coding under the same migration number. Furthermore, when he merges his completed hot fix into the development branch, if he&#8217;s using continuous integration, he can be assured that the migration deployment process won&#8217;t get confused by any migration reordering. Otherwise, there&#8217;s a chance that on some instances that use development code, <span style="font-family: monospace;">Migration_N_hotfix</span> (the real <span style="font-family: monospace;">Migration_N</span>) would never run, or that the rollback method would be called against <span style="font-family: monospace;">Migration_N_hotfix</span> when we&#8217;d actually want it to run against <span style="font-family: monospace;">Migration_N+1_Dev</span> now.</p>
<h3>Conclusion</h3>
<p>Obviously, this is just one way to handle the issue, particularly if you&#8217;re using migration classes rather than simple SQL scripts. Reconciling server-side application code with database-level logic will always introduce areas where you can improve your coordination. Some people just opt to avoid this trouble altogether and write their migrations as .SQL scripts rather than as classes, but there is value in handling migrations at the application layer, through objects. For instance, in our application, one type of migration is actually iterated over a variety of databases, so we can act on the same table in a number of databases by creating a single class that supplies the appropriate values for each dynamic script.</p>
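<p>As a rough illustration of that last point, here is what a single class acting on the same table across several databases could look like. This is a Python and SQLite sketch; the class, the table, the column, and the per-database defaults are all hypothetical.</p>

```python
import sqlite3


class AddLastLoginColumn:
    """Hypothetical migration applied to the same table in several databases."""

    TABLE = "users"

    def deploy_sql(self, db_name):
        # The class supplies per-database values; here only the default differs.
        default = "1970-01-01" if db_name == "archive" else "2011-01-01"
        return (
            f"ALTER TABLE {self.TABLE} "
            f"ADD COLUMN last_login TEXT DEFAULT '{default}'"
        )

    def deploy_all(self, connections):
        # One class, one loop, many databases.
        for db_name, conn in connections.items():
            conn.execute(self.deploy_sql(db_name))


connections = {name: sqlite3.connect(":memory:") for name in ("main", "archive")}
for conn in connections.values():
    conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY)")
    conn.execute("INSERT INTO users (id) VALUES (1)")

AddLastLoginColumn().deploy_all(connections)
defaults = {
    name: conn.execute("SELECT last_login FROM users").fetchone()[0]
    for name, conn in connections.items()
}
```

<p>A plain .SQL script would need one copy per database to do the same thing.</p>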
<p>Whenever there&#8217;s an opportunity for iterating your software, there&#8217;s often an opportunity for iterating your process. The original approach was developed for lack of having any database rollback whatsoever &#8212; changes were in effect permanent, leading the shared development instance to diverge significantly from the stable instance in unexpected and irreversible ways. I&#8217;m sure there will be opportunity in the future to improve things even further. Altering your database schema invariably introduces some level of risk. <em>So what are some of the tricks you use to safely iterate your database?</em></p>
]]></content:encoded>
			<wfw:commentRss>http://robertelwell.info/blog/future-proof-db-changelog/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
	</channel>
</rss>

