<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:wfw="http://wellformedweb.org/CommentAPI/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
	xmlns:slash="http://purl.org/rss/1.0/modules/slash/"
	xmlns:georss="http://www.georss.org/georss" xmlns:geo="http://www.w3.org/2003/01/geo/wgs84_pos#" xmlns:media="http://search.yahoo.com/mrss/"
	>

<channel>
	<title>Signal/Noise &#187; coding</title>
	<atom:link href="http://billpetti.com/tag/coding/feed/" rel="self" type="application/rss+xml" />
	<link>http://billpetti.com</link>
	<description>Trying to separate the signal from the noise, one post at a time.</description>
	<lastBuildDate>Fri, 30 Sep 2011 21:49:18 +0000</lastBuildDate>
	<language>en</language>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
	<generator>http://wordpress.com/</generator>
<cloud domain='billpetti.com' port='80' path='/?rsscloud=notify' registerProcedure='' protocol='http-post' />
<image>
		<url>http://0.gravatar.com/blavatar/6cef924e9e2296437300917a41fb5f9c?s=96&#038;d=http%3A%2F%2Fs2.wp.com%2Fi%2Fbuttonw-com.png</url>
		<title>Signal/Noise &#187; coding</title>
		<link>http://billpetti.com</link>
	</image>
	<atom:link rel="search" type="application/opensearchdescription+xml" href="http://billpetti.com/osd.xml" title="Signal/Noise" />
	<atom:link rel='hub' href='http://billpetti.com/?pushpress=hub'/>
		<item>
		<title>Coding the Sentiment of Web 2.0</title>
		<link>http://billpetti.com/2009/09/19/coding-the-sentiment-of-web-2-0/</link>
		<comments>http://billpetti.com/2009/09/19/coding-the-sentiment-of-web-2-0/#comments</comments>
		<pubDate>Sat, 19 Sep 2009 12:25:17 +0000</pubDate>
		<dc:creator>Bill Petti</dc:creator>
				<category><![CDATA[Uncategorized]]></category>
		<category><![CDATA[Brands]]></category>
		<category><![CDATA[coding]]></category>
		<category><![CDATA[data]]></category>
		<category><![CDATA[market research]]></category>
		<category><![CDATA[marketing]]></category>
		<category><![CDATA[Sentiment]]></category>

		<guid isPermaLink="false">http://billpetti.com/?p=616</guid>
		<description><![CDATA[Kevin Randall at FastCompany pens an interesting piece on the rising tide of sentiment analysis&#8211;the players, the technologies, the possibilities, and the current pitfalls.  The idea behind sentiment analysis is pretty simple (but the execution is difficult): to identify and code attitudes, whether written or verbal, towards particular topics.  The explosion of activity on the [...]]]></description>
			<content:encoded><![CDATA[<p>Kevin Randall at FastCompany <a href="http://www.fastcompany.com/blog/kevin-randall/integrated-branding/market-research-30-here-attitudes-meet-algorithms-sentiment-a?1253361650" target="_blank">pens an interesting piece</a> on the rising tide of sentiment analysis&#8211;the players, the technologies, the possibilities, and the current pitfalls.  The idea behind sentiment analysis is pretty simple (but the execution is difficult): to identify and code attitudes, whether written or verbal, towards particular topics.  The explosion of activity on the web (blogs, social media platforms, etc.) has created an enormous amount of data that typically includes some <img class="alignleft" src="http://www.icofree.com/userfiles/images/VistaStyleEmoticons.jpg" alt="" width="200" height="166" />kind of feeling towards the topic.  This is a researcher&#8217;s and marketer&#8217;s dream&#8211;a plethora of opinion to mine and analyze.  The key, however, is being able to collect, code, and analyze that data easily.  The most difficult of these three steps is coding&#8211;how do you efficiently classify millions of utterances on the web in terms of their &#8220;polarity (positive or negative), intensity, and subjectivity&#8221;?  Randall notes the initial problems with accuracy as well as other open questions:</p>
<blockquote><p>Computer deciphering of word meaning is not always accurate and tone can be completely missed. Even the leading vendors acknowledge that the data is 70-80% reliable. For example, we may know that the phrase &#8220;quite interesting&#8221; means one thing in America, another in Britain, <a href="http://commetrics.com/articles/fails-validity-test/">but the computer would see the same meaning</a>. Note some of the long-standing issues with voice recognition technology.</p>
<p>There are questions about how robust or representative the data is. Are a brand&#8217;s tweeters the key WOM influencers or are they just a small vocal segment?</p>
<p>Some brands and products may be under the radar for this technology. Yes we love to chat about Apple but do we also regularly, enjoy blogging and tweeting about Charmin or business insurance?</p>
<p>There are conflicting approaches, metrics and offerings; over time a common Microsoft, Google, Nielsen type platform may emerge.</p></blockquote>
<p>The notion of accurate sentiment analysis is very intriguing, but, as Randall notes, it is far from a finalized technology.<span id="more-616"></span>  </p>
<p>On the one hand, we now have access to an unprecedented amount of data about people&#8217;s opinions, data that is in constant flux and is constantly being updated.  In business (and, I would argue, life) the key is narrowing your information gaps and reducing the information asymmetries you face.  Often this can be accomplished by finding a way to take the private information people hold (e.g. opinions about a product or brand, their preferences and priorities, etc.) and make it visible.  This is the essence of market/consumer research.  The current environment makes collecting that data much easier and more cost-effective, especially at high volumes.</p>
<p>However, the only way to derive usable, reliable information from this ocean of data is to code it properly.  If we can develop reliable technology that overcomes some of the current shortcomings, we will be in a position to literally visualize the collective mind, and to do so in real time.  That is a very exciting prospect, but one that will be difficult to achieve.</p>
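<p>To make the coding step concrete, here is a toy Python sketch that assigns polarity, intensity, and subjectivity to an utterance using a small word list.  The lexicon, its scores, and the function name are all made up for illustration; this is not how any particular vendor does it, and real systems are far more sophisticated.</p>
<pre>
```python
import re

# Hypothetical toy lexicon: word -> (polarity sign, intensity from 0 to 1).
LEXICON = {
    "love": (1, 0.9),
    "great": (1, 0.7),
    "fine": (1, 0.3),
    "bad": (-1, 0.6),
    "hate": (-1, 0.9),
}


def code_utterance(text):
    """Code one utterance for polarity, intensity, and subjectivity."""
    words = re.findall(r"[a-z']+", text.lower())
    hits = [LEXICON[word] for word in words if word in LEXICON]
    if not hits:
        # No opinion words found, so treat the utterance as objective.
        return {"polarity": "neutral", "intensity": 0.0, "subjective": False}
    # Net score: positive and negative words pull in opposite directions.
    score = sum(sign * strength for sign, strength in hits)
    if score > 0:
        polarity = "positive"
    elif score == 0:
        polarity = "mixed"
    else:
        polarity = "negative"
    # Intensity here is simply the strongest opinion word that appeared.
    intensity = max(strength for _, strength in hits)
    return {"polarity": polarity, "intensity": intensity, "subjective": True}


if __name__ == "__main__":
    for utterance in ("I love my Mac", "quite interesting"):
        print(utterance, code_utterance(utterance))
```
</pre>
<p>Note that this sketch codes &#8220;quite interesting&#8221; as neutral no matter who wrote it, which is exactly the kind of missed tone Randall describes.</p>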
]]></content:encoded>
			<wfw:commentRss>http://billpetti.com/2009/09/19/coding-the-sentiment-of-web-2-0/feed/</wfw:commentRss>
		<slash:comments>1</slash:comments>
	
		<media:content url="http://0.gravatar.com/avatar/83d0c69bc078d64ebe36a701cbf755b2?s=96&#38;d=http%3A%2F%2F0.gravatar.com%2Favatar%2Fad516503a11cd5ca435acc9bb6523536%3Fs%3D96&#38;r=G" medium="image">
			<media:title type="html">billpetti</media:title>
		</media:content>

		<media:content url="http://www.icofree.com/userfiles/images/VistaStyleEmoticons.jpg" medium="image" />
	</item>
		<item>
		<title>Crowdsourcing Data Coding</title>
		<link>http://billpetti.com/2009/09/16/crowdflower-live-from-techcrunch50/</link>
		<comments>http://billpetti.com/2009/09/16/crowdflower-live-from-techcrunch50/#comments</comments>
		<pubDate>Wed, 16 Sep 2009 10:17:02 +0000</pubDate>
		<dc:creator>Bill Petti</dc:creator>
				<category><![CDATA[Uncategorized]]></category>
		<category><![CDATA[Big Data]]></category>
		<category><![CDATA[coding]]></category>
		<category><![CDATA[crowdsourcing]]></category>
		<category><![CDATA[data]]></category>
		<category><![CDATA[social science]]></category>

		<guid isPermaLink="false">http://billpetti.com/2009/09/15/crowdflower-live-from-techcrunch50/</guid>
		<description><![CDATA[I just finished watching the video below of CrowdFlower&#8217;s presentation at the TechCrunch50 conference.  CrowdFlower is a platform that allows firms to crowdsource various tasks, such as populating a spreadsheet with email addresses or selecting stills from thousands of videos that have particular qualities.  The examples in the video include very labor-intensive tasks, but [...]]]></description>
			<content:encoded><![CDATA[<p>I just finished watching the video below of CrowdFlower&#8217;s presentation at the TechCrunch50 conference.  <a href="http://crowdflower.com/" target="_blank">CrowdFlower</a> is a platform that allows firms to crowdsource various tasks, such as populating a spreadsheet with email addresses or selecting stills from thousands of videos that have particular qualities.  The examples in the video include very labor-intensive tasks, but tasks that a firm is unlikely either to need again or to consider worth dedicating staff to.</p>
<p><span style="display:block;width:425px;margin:0 auto;"> <embed src='http://widgets.vodpod.com/w/video_embed/ExternalVideo.873131' type='application/x-shockwave-flash' AllowScriptAccess='sameDomain' pluginspage='http://www.macromedia.com/go/getflashplayer' wmode='transparent' flashvars='loc=%2F&autoplay=false&vid=2167086' width='425' height='350' /></span></p>
<div style="font-size:10px;">more about &#8220;<a href="http://vodpod.com/watch/2196384-crowdflower-live-from-techcrunch50?pod=">CrowdFlower, Live From TechCrunch50</a>&#8221;, posted with <a href="http://vodpod.com?r=wp">vodpod</a></div>
<p>As I was watching the video I thought about the potential to leverage such a platform for large-scale coding of qualitative data.<span id="more-608"></span>  Coming from the social sciences, we often find that large-scale research requires the massive coding of data, whether it is language from a speech, the tenor or sentiment of quotations (or newspaper articles in media studies), the nature of cases (e.g. did country A make a threat to country B, did country B back down as a result), or the responses to an open-ended survey.  Coding is an issue whether you are conducting qualitative or quantitative analysis&#8211;especially where you have captured large amounts of data.  Often the data is not inherently numerical and needs to be translated so that quantitative analysis can be conducted.  Likewise, with a qualitative approach one still needs to categorize various data points to allow for meaningful comparisons.</p>
<p>The interesting thing about a service like CrowdFlower is that it can draw on a global pool of workers who are ready and willing to do the coding at a reasonable price.  Additionally, CrowdFlower uses various real-time methods to ensure the quality of the coding.  This is achieved partly by scoring coders on three things: their past performance, how they fare on tasks that are &#8220;planted&#8221; by CrowdFlower (i.e. salting the work with tasks where the correct answer is known ahead of time), and how much agreement there is between coders on various items.</p>
<p>The final method comes up quite a bit in social science research when you have to determine how to categorize a given piece of data.  The level of agreement is crucial to coding a particular case with confidence.  I would imagine that a platform such as CrowdFlower could make that task easier and more robust by quickly tapping into a larger pool of coders.</p>
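<p>To make the planted-task and agreement checks concrete, here is a minimal Python sketch of scoring coders against known-answer items and measuring agreement on a single item.  The data, labels, and function names are all hypothetical, and this is not CrowdFlower&#8217;s actual scoring method, just one simple way the idea could work.</p>
<pre>
```python
from collections import Counter

# Hypothetical codings: coder name -> {item id: label}.
# "g1" is a planted item whose correct label is known in advance.
CODINGS = {
    "coder_a": {"q1": "threat", "q2": "no_threat", "q3": "threat", "g1": "threat"},
    "coder_b": {"q1": "threat", "q2": "threat", "q3": "threat", "g1": "threat"},
    "coder_c": {"q1": "threat", "q2": "no_threat", "q3": "no_threat", "g1": "no_threat"},
}
GOLD = {"g1": "threat"}


def gold_accuracy(answers, gold):
    """Share of planted items this coder labeled correctly (None if they saw none)."""
    scored = [item for item in gold if item in answers]
    if not scored:
        return None
    hits = sum(1 for item in scored if answers[item] == gold[item])
    return hits / len(scored)


def item_consensus(codings, item):
    """Most common label for an item and the share of coders who chose it.

    Assumes at least one coder labeled the item.
    """
    labels = [answers[item] for answers in codings.values() if item in answers]
    label, votes = Counter(labels).most_common(1)[0]
    return label, votes / len(labels)


if __name__ == "__main__":
    for coder, answers in CODINGS.items():
        print(coder, gold_accuracy(answers, GOLD))
    for item in ("q1", "q2", "q3"):
        print(item, item_consensus(CODINGS, item))
```
</pre>
<p>With this toy data, coder_c misses the planted item and so would be trusted less, while q1 gets unanimous agreement and q2 and q3 are each coded two to one.  In a real project you would probably want a chance-corrected statistic such as Cohen&#8217;s kappa rather than raw percent agreement; I have used the simpler measure here to keep the sketch short.</p>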
<p>Has anyone used a service like CrowdFlower in this way (i.e. coding data from qualitative research)?  I would be interested in your perspective.</p>
]]></content:encoded>
			<wfw:commentRss>http://billpetti.com/2009/09/16/crowdflower-live-from-techcrunch50/feed/</wfw:commentRss>
		<slash:comments>1</slash:comments>
	
		<media:content url="http://0.gravatar.com/avatar/83d0c69bc078d64ebe36a701cbf755b2?s=96&#38;d=http%3A%2F%2F0.gravatar.com%2Favatar%2Fad516503a11cd5ca435acc9bb6523536%3Fs%3D96&#38;r=G" medium="image">
			<media:title type="html">billpetti</media:title>
		</media:content>
	</item>
	</channel>
</rss>
