Nick’s Personal Projects

For most of us, the fundamental power of our data remains completely untapped.

My Passion in Data is to change that.

The majority of the data we process as individuals is completely unrefined, existing in the form of inert blocks of text—like this one you’re reading right now! All the data processing that takes place as you read and parse what I’ve typed, and then analyze it for truth and meaning, is exclusively happening inside of your brain. But that, of course, is not the only place that data can be processed.

Take, for example, calendar events. This is one area of our lives we’ve been able to (through the effort of filling out forms on digital calendar apps) fully digitize, to the point where the calendar can actively and accurately alert us when we’ve overlapped any of the dozens of meetings we’ve scheduled. This may seem trivial, but it’s really not; if we were given a list of the start and end times of 100 calendar events over the next month, it would take us some time to sit down with a pen and paper and figure out if we’ve over-booked ourselves—and that’s if we were actually paying attention to see if there was a problem in the first place!
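To make the calendar example concrete, here is a minimal sketch of the overlap check that a calendar app does for us. It is only an illustration, not how any particular calendar product works: the function name `find_overlaps` and the `(name, start, end)` tuple layout are my own assumptions.

```python
from datetime import datetime


def find_overlaps(events):
    """Return pairs of event names whose time ranges overlap.

    `events` is a list of (name, start, end) tuples (a hypothetical layout
    chosen for this sketch), with start earlier than end.
    """
    # Sort by start time so we only need to compare against events
    # that have already started.
    ordered = sorted(events, key=lambda event: event[1])
    overlaps = []
    active = []  # (name, end) of earlier events that may still be running
    for name, start, end in ordered:
        # Drop earlier events that finished at or before this start time.
        active = [(other, other_end) for other, other_end in active if other_end > start]
        # Every event still active overlaps with the current one.
        overlaps.extend((other, name) for other, _ in active)
        active.append((name, end))
    return overlaps


events = [
    ("Standup", datetime(2024, 1, 8, 9, 0), datetime(2024, 1, 8, 10, 0)),
    ("Design review", datetime(2024, 1, 8, 9, 30), datetime(2024, 1, 8, 10, 30)),
    ("Lunch", datetime(2024, 1, 8, 10, 30), datetime(2024, 1, 8, 11, 0)),
]
print(find_overlaps(events))  # [('Standup', 'Design review')]
```

Back-to-back events (one ending exactly when the next begins) are treated as not overlapping here; that is a design choice of the sketch.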

Another existing example is matchmaking, whether in the form of job hunting or online dating. Without digitizing the underlying concepts themselves, simply digitizing “yes or no” or numeric answers about salary and benefits (for the former), or age and beliefs (for the latter), has greatly increased the efficiency of finding deal-breakers, so that time can be spent on further matchmaking instead of manually trying to suss out whether each potential match has a landmine hidden somewhere.

Now, take this concept of digitizing data, and map it onto the data of our own personal lives. What could we learn if the information of our very lives were “properly” digitized (for our eyes only)? Likewise, I believe it’s a missed opportunity that each “piece of information” on Wikipedia is essentially a large block of text that simply links to other large blocks of text. Why not digitize the site, such that the hundreds of individual concepts in just one page can connect to another page’s concepts? The sheer amount of interconnected deep learning possible with just a few blocks of text digitized in this manner is astronomical. (Note: soon after writing this, I learned that Wikidata.org was a thing. While this is amazing and exactly what I wanted, unfortunately, there is still a lack of data digitization on this platform, something I believe should change. Personally, I’d take that job…digitizing the data of everything sounds amazing.)

This level of informational power that’s unlocked when data is fully and completely digitized (and thus able to be processed) outside of the fallible human brain would revolutionize everything. What if each datum that was important to us were able to interact with ALL other such data? It could automatically provide new insights that we had never considered, or provide “out of the box thinking” by re-imagining any situation we were in by putting it into the box of any other possibly-related situation logically “nearby”. This would allow us to analyze the information already present within our data—but it would go beyond Data Analysis, and analyze concepts themselves.

Finally, in a world where “Truth” is somehow becoming subjective, and Artificial Intelligence returns “hallucinated” results (those that are completely incorrect but look right on paper) to simple inputs, this kind of data-centered analysis, transparency, and mass availability is becoming all too necessary.

However, until concepts can be digitized for a computer’s open-source, clear-box analysis (with human oversight for reality checking and overriding), I will be happy to continue in the role of Data Analyst/Scientist. I believe the best way to point towards this future of data’s potential power, and to the potential of my skillset, is to show you directly. Below you can see my portfolio.

Now Breezing (v2.8.2)

Est. May 2023

A “Now Trending” account created for Bluesky Social (a social media site similar to Twitter/X and Meta’s Threads). Every 10 minutes, the bot reads the several thousand posts made in that window through the open-source AT Protocol’s “firehose” and determines which words and emojis were posted by the most people.

The bot then generates a list of the top 10 words used, creates a search hyperlink for each, and includes a word cloud and an emoji cloud. Also included are some statistics for the last 10 minutes of activity.
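The ranking step described above can be sketched roughly as follows. This is not the bot's actual code: the names `top_words` and `search_link`, the `(author_id, text)` input shape, the simple `\w+` tokenizer, and the `https://bsky.app/search?q=` URL pattern are all assumptions made for illustration, and emoji counting, word clouds, and the firehose connection are left out.

```python
import re
from collections import defaultdict
from urllib.parse import quote

# Assumed search URL pattern for this sketch; not taken from the bot's source.
SEARCH_URL = "https://bsky.app/search?q="


def top_words(posts, n=10):
    """Rank words by how many distinct authors used them.

    `posts` is an iterable of (author_id, text) pairs collected over one
    10-minute window (a hypothetical input shape).
    """
    authors_by_word = defaultdict(set)
    for author, text in posts:
        # set() so one post repeating a word still counts the author once.
        for word in set(re.findall(r"\w+", text.lower())):
            authors_by_word[word].add(author)
    # Most distinct authors first; ties broken alphabetically.
    ranked = sorted(authors_by_word.items(), key=lambda item: (-len(item[1]), item[0]))
    return [(word, len(authors)) for word, authors in ranked[:n]]


def search_link(word):
    """Build a search hyperlink for a word, using the assumed URL pattern."""
    return SEARCH_URL + quote(word)


posts = [
    ("alice", "Hello world hello"),
    ("bob", "hello there"),
    ("alice", "world again"),
]
for word, people in top_words(posts, n=2):
    print(word, people, search_link(word))
```

Counting distinct authors rather than raw occurrences means one person repeating a word many times cannot push it up the list, which matches the "posted by the most people" wording.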

The project is in continuous development, with new features planned, such as auto-translating non-English words. The goal is to pull interesting, deep knowledge out of the intricate web of interconnected words that exists on a social media platform.