Just enough to have some fun but not get you in trouble with Dad
Getting to Second Base with HBase
What non-coding business analysts need to know to manipulate Big Data
NoSQL databases are a class of databases used in Big Data applications. They offer Big Data developers tremendous advantages: fault tolerance, flexible schemas, and fast query performance. HBase is one of the most popular . . .
Using Compose, Twitter, and Watson to compare to philosophical titans
This is an introduction to using Compose, a database hosting service, to quickly store and retrieve data. Compose eliminates all of the annoyances of hosting your own database server, saving you both time and money. It is also easy to use, which shortens the learning curve for new developers (like me).
So what should we do with this . . .
No Hadoop, No Problem
I have installed Spark directly on Windows, which is unusual. Most people will probably run Spark through a VM (virtual machine - a separate computer that runs as software within your computer) or a Docker container (same idea, but at a higher level of abstraction). Unlike writing MapReduce or Pig (Pig is the scripting language . . .
Using Apache Spark and Watson Analytics to analyze the most frequent words in War and Peace
This is a quick example of using Apache Spark, the big data processing engine, and Watson Analytics, a new analysis and visualization tool, to do a basic word count and analyze the results. For fun, we are using Leo Tolstoy's War and Peace. As you're likely aware, Tolstoy's magnum opus is one hefty book, so figuring out which . . .
Yes, Ghana, go find it on a map
Hackathon in Ghana
This weekend I took a surprisingly pleasant 10.5-hour direct flight from DC to tropical Accra, Ghana (I was never picked first for basketball at recess but I slept just fine in coach - winning!). I was in Ghana to help facilitate a hackathon at Ashesi University, a new school outside of Accra. For those not familiar with . . .
For the Non-developer
I recently gave a talk at Data Wranglers DC on Retrieving Big Data. The goal of this talk was to introduce Hadoop to business analysts in the barest, no-nonsense form and cover one Pig hurdle I faced. Read: I gave really shitty explanations of Hadoop core components like "All you need to know about MapReduce is . . .
To Rule them All (In Excel)
No Shortcuts in Life (but not in Excel MotherFucker!!)
You’ve probably heard there are no shortcuts in life. I’m not the fucking Buddha so I’ll leave that existential shit to someone else. In Excel, my mouse-using neophyte, there are some sweet goddamn shortcuts you need to get on quickly.
You use the mouse in Excel? No no, please tell me . . .